| <html> |
| <head> |
| <!-- |
| |
| Copyright (c) 2010, 2021 Oracle and/or its affiliates. All rights reserved. |
| |
| This program and the accompanying materials are made available under the |
| terms of the Eclipse Distribution License v. 1.0, which is available at |
| http://www.eclipse.org/org/documents/edl-v10.php. |
| |
| SPDX-License-Identifier: BSD-3-Clause |
| |
| --> |
| |
| <title>Design of XSOM</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/> |
| <style> |
| pre { |
| background-color: rgb(240,240,240); |
| margin-left: 2em; |
| margin-right: 2em; |
| padding: 1em; |
| } |
| p { |
| margin-left: 2em; |
| } |
| dt { |
| margin-top: 0.5em; |
| margin-left: 2em; |
| font-weight: bold; |
| } |
| dd { |
| margin-left: 3em; |
| } |
| </style> |
| </head> |
| <body> |
| |
| <h1 style="text-align:center">Design of XSOM</h1> |
| <div align=right style="font-size:smaller"> |
| By <a href="mailto:kohsuke.kawaguchi@sun.com">Kohsuke Kawaguchi</a><br> |
| </div> |
| |
| <p> |
| This document describes the details you need to know to extend/maintain XSOM. |
| </p> |
| |
| <h1>Design Goals</h1> |
| <p> |
| The primary design goals of XSOM are: |
| </p> |
| <ol> |
| <li>Expose all the information defined in the schema spec |
| <li>Provide additional methods that helps simplifying client applications. |
| </ol> |
| <p> |
| Providing mutation methods was a non-goal for this project, primarily because of the added complexity. |
| </p> |
| |
| |
| <h1>Building workspace</h1> |
| <p> |
| The workspace uses Ant as the build tool. The followings are the major targets: |
| </p> |
| <dl> |
| <dt>clean</dt> |
| <dd>remove the intermediate and output files.</dd> |
| |
| <dt>compile</dt> |
| <dd>generate a parser by RelaxNGCC and compile all the source files into the bin directory.</dd> |
| |
| <dt>jar</dt> |
| <dd>make a jar file</dd> |
| |
| <dt>release</dt> |
| <dd>build a distribution zip file that contains everything from the source code to a binary file</dd> |
| |
| <dt>src-zip</dt> |
| <dd>Build a zip file that contains the source code.</dd> |
| </dl> |
| |
| <h1>Architecture</h1> |
| <p> |
| XSOM consists of roughly three parts. |
| |
| The first part is the public interface, which is defined in the <code>com.sun.xml.xsom</code> package. The entire functionality of XSOM is exposed via this interface. This interface is derived from a draft document submitted to W3C by some WG members. |
| </p><p> |
| The second part is the implementation of these interfaces, the <code>com.sun.xml.xsom.impl</code> package. These code are all hand-written. |
| </p><p> |
| The third part is a parser that reads XML representation of XML Schema and builds XSOM nodes accordingly. The package is <code>com.sun.xml.xsom.parser</code>. This part of the code is mostly generated by <a href="http://relaxngcc.sourceforge.net/">RelaxNGCC</a>. |
| </p> |
| <center> |
| <img src="architecture.png"/> |
| </center> |
| |
| |
| |
| |
| <h1>Implementation Details</h1> |
| <p> |
| Most of the implementation classes are fairly simple. Probably the only one interesting piece of code is the <code>Ref</code> class, which is a reference to other schema components. |
| </p><p> |
| The <code>Ref</code> class itself is just a place hodler and this class defined a series of inner interfaces that are specialized to hold a reference to different kinds of schema components. |
| |
| The sole purpose of this indirection layer is to support forward references during a parsing of the XML representation. |
| </p><p> |
| A typical reference interface would look like this: |
| </p> |
| <pre> |
| public static interface Term { |
| /** Obtains a reference as a term. */ |
| XSTerm getTerm(); |
| } |
| </pre> |
| <p> |
| In case this indirection is unnecessary, all implementation classes of <code>XSTerm</code> implements this <code>Ref.Term</code> interface. This applies to all the other types of the <code>Ref</code> interface. Therefore, whereever a reference is necessary, you can stimply pass a real object. In other words, a direct reference (<code>XS***Impl</code>) can be always treated as an indirect reference (<code>Ref.***</code>). |
| </p><p> |
| Implementations for forward references are placed in the <code>com.sun.xml.xsom.impl.parser.DelayedRef</code> class. The detail will be discussed later. |
| </p> |
| |
| |
| |
| <h1>Parser</h1> |
| <p> |
| The following collaboration diagram shows various objects that participate in a parsing process. |
| </p> |
| <center> |
| <img src="collaboration.png"/> |
| </center> |
| <p> |
| <code>XSOMParser</code> is the only publicly visible component in this picture. This class also keeps references to vairous other objects that are necessary to parse schemas. This includes an error handler, the root <code>SchemaSet</code> object, an entity resolver, etc. |
| </p><p> |
| Whenever the parse method is called, it will create a new NGCCRuntimeEx and configure XMLReader so that a schema file is parsed into this NGCCRuntimeEx instance. |
| |
| <code>NGCCRuntimeEx</code> derives from <code>NGCCRuntime</code>, which is a class generated by RelaxNGCC. This object will use other RelaxNGCC-generated classes and parse a document and constructs a XSOM object graph appropriately. |
| </p><p> |
| When a new XML document is referenced by an import or include statement, a new set of <code>NGCCRuntimeEx</code> is set up to parse that document. One NGCCRuntimeEx can only parse one XML document. |
| </p> |
| <h2>Forward references and back-patching</h2> |
| <p> |
| Since we use SAX to parse schemas, the referenced schema component is often unavailable when we hit a reference. Because of this, when we see a reference, we create a "delayed" reference that keeps the name of the referenced component. |
| </p><p> |
| Note that because of the way XML Schema <redefine> works, all the references by name must be lazily bound even if the component is already defined. |
| </p><p> |
| All these "delayed" references are remembered and tracked by XSOMParser. When the client calls the <code>XSOMParser.getResult</code> method, XSOMParser will make sure that they resolve to a schema component correctly. |
| "Delayed" references are available in the <code>DelayedRef</code> class. |
| </p> |
| |
| |
| <h2>RelaxNGCC</h2> |
| <p> |
| The actual parser is generated by RelaxNGCC from <code>xsom/src/*.rng</code> files. <code>xmlschema.rng</code> is the entry point and all the other files are referenced from this file. For more information about RelaxNGCC, goto <a href="http://relaxngcc.sourceforge.net/">here</a>. Or just contact me (as I'm one of the developers of RelaxNGCC.) |
| </p> |
| |
| </body> |
| </html> |