| // |
| // Copyright (c) 2020, 2021 Contributors to the Eclipse Foundation |
| // |
| |
| == Architecture |
| |
| === Introduction |
| |
| This chapter describes the architectural |
| components, comprising the XML-databinding facility, that realize the |
| goals outlined in <<Goals>>. The scope of |
| this version of specification covers many additional goals beyond those |
| in JAXB 1.0. As a result, JAXB 1.0 architecture has been revised |
| significantly. |
| |
| === Overview |
| |
| The XML data-binding facility consists of the |
| following architectural components: |
| |
| * *_schema compiler_*: A schema compiler binds a |
| _source schema_ to a set of _schema derived program elements_. The binding |
| is described by an XML-based language, *_binding language_*. |
| * *_schema generator_*: A schema generator maps a |
| set of existing program elements to a _derived schema_. The mapping is |
| described by *_program annotations_*. |
| * *_binding runtime framework_* that provides two |
| primary operations for accessing, manipulating and validating XML |
| content using either schema derived or existing program elements: + |
| ** _Unmarshalling_ is the process of reading |
| an XML document and constructing a tree of _content objects_. Each content |
| object is an instance of either a schema derived or an existing program |
| element mapped by the schema generator and corresponds to an instance in |
| the XML document. Thus, the content tree reflects the document’s |
| content. + |
| Validation can optionally be enabled as part of the |
| unmarshalling process. _Validation_ is the process of verifying that an |
| xml document meets all the constraints expressed in the schema. |
| ** _Marshalling_ is the inverse of |
| unmarshalling, i.e., it is the process of traversing a content tree and |
| writing an XML document that reflects the tree’s content. Validation can |
| optionally be enabled as part of the marshalling process. |
| |
| As used in this specification, the term |
| _schema_ includes the W3C XML Schema as defined in the XML Schema 1.0 |
| Recommendation[XSD Part 1][XSD Part 2]. <<a210>> illustrates relationships |
| between concepts introduced in this section. |
| |
| .Non-Normative Jakarta XML Binding Architecture diagram |
| [[a210]] |
| image::images/xmlb-3.png[image] |
| |
| JAXB-annotated classes are common to both |
| binding schemes. They are either generated by a schema compiler or the |
| result of a programmer adding JAXB annotations to existing Java classes. |
| The universal unmarshal/marshal process is driven by the JAXB |
| annotations on the portable JAXB-annotated classes. |
| |
| Note that the binding declarations object in |
| the above diagram is logical. Binding declarations can either be inlined |
| within the schema or they can appear in an external binding file that is |
| associated with the source schema. |
| |
| .JAXB 1.0 style binding of schema to interface/implementation classes. |
| image::images/xmlb-4.png[image] |
| |
| Note that the application accesses only the |
| schema-derived interfaces, factory methods and `jakarta.xml.bind` APIs |
| directly. This convention is necessary to enable switching between JAXB |
| implementations. |
| |
| === Java Representation |
| |
| The content tree contains instances of _bound types_, |
| types that bind and provide access to XML content. Each bound |
| type corresponds to one or more schema components. As much as possible, |
| for type safety and ease of use, a bound type that constrains the values |
| to match the schema constraints of the schema components. The different |
| bound types, which may be either schema derived or authored by a user, |
| are described below. |
| |
| *_Value Class_* A coarse grained schema |
| component, such as a complex type definition, is bound to a Value class. |
| The Java class hierarchy is used to preserve XML Schema’s “derived by |
| extension” type definition hierarchy. JAXB-annotated classes are |
| portable and in comparison to schema derived interfaces/implementation |
| classes, result in a smaller number of classes. |
| |
| *_Property_* A fine-grained schema component, |
| such as an attribute declaration or an element declaration with a simple |
| type, is bound directly to a _property_ or a _field_ within a value class. |
| |
| A property is _realized_ in a value class by |
| a set of JavaBeans-style _access methods_. These methods include the |
| usual `get` and `set` methods for retrieving and modifying a property’s |
| value; they also provide for the deletion and, if appropriate, the |
| re-initialization of a property’s value. |
| |
| Properties are also used for references from |
| one content instance to another. If an instance of a schema component |
| _X_ can occur within, or be referenced from, an instance of some other |
| component _Y_ then the content class derived from _Y_ will define a |
| property that can contain instances of _X_. |
| |
| Binding a fine-grained schema component to a |
| field is useful when a bound type does not follow the JavaBeans |
| patterns. It makes it possible to map such types to a schema without the |
| need to refactor them. |
| |
| *_Interface_* JAXB 1.0 bound schema components |
| (XML content) to schema derived content interfaces and implementation |
| classes. The interface/implementation classes tightly couple the schema |
| derived implementation classes to the Jakarta XML Binding implementation |
| runtime framework and are thus not portable. The binding of schema components to |
| schema derived interfaces continues to be supported in Jakarta XML Binding. |
| |
| [NOTE] |
| .Note |
| ==== |
| The mapping of existing Java interfaces to schema constructs is not |
| supported. Since an existing class can implement multiple interfaces, |
| there is no obvious mapping of existing interfaces to XML schema constructs. |
| |
| ==== |
| |
| *_Enum type_* J2SE 5.0 platform introduced |
| linguistic support for type safe enumeration types. Enum type are used |
| to represent values of schema types with enumeration values. |
| |
| *_Collection type_* Collections are used to |
| represent content models. Where possible, for type safety, parametric |
| lists are used for homogeneous collections. For e.g. a repeating element |
| in content model is bound to a parametric list. |
| |
| *_DOM node_* In some cases, binding XML content |
| to a DOM or DOM like representation rather than a collection of types is |
| more natural to a programmer. One example is an open content model that |
| allows elements whose types are not statically constrained by the |
| schema. |
| |
| Content tree can be created by unmarshalling |
| of an XML document or by programmatic construction. Each bound type in |
| the content tree is created as follows: |
| |
| * schema derived implementation classes that |
| implement schema derived interfaces can be created using factory methods |
| generated by the schema compiler. |
| * schema derived value classes can be created |
| using a constructor or a factory method generated by the schema |
| compiler. |
| * existing types, authored by users, are |
| required to provide a no arg constructor. The no arg constructor is used |
| by an unmarshaller during unmarshalling to create an instance of the |
| type. |
| |
| ==== Binding Declarations |
| |
| A particular binding of a given source schema |
| is defined by a set of _binding declarations_ . Binding declarations are |
| written in a _binding language_ , which is itself an application of XML. |
| A binding declaration can occur within the annotation `appinfo` of each |
| XML Schema component. Alternatively, binding declarations can occur in |
| an auxiliary file. Each binding declaration within the auxiliary file is |
| associated to a schema component in the source schema. It was necessary |
| to support binding declarations external to the source schema in order |
| to allow for customization of an XML Schemas that one prefers not to |
| modify. The schema compiler hence actually requires two inputs, a source |
| schema and a set of binding declarations. |
| |
| Binding declarations enable one to override |
| default binding rules, thereby allowing for user customization of the |
| schema-derived value class. Additionally, binding declarations allow for |
| further refinements to be introduced into the binding to Java |
| representation that could not be derived from the schema alone. |
| |
| The binding declarations need not define |
| every last detail of a binding. The schema compiler assumes _default |
| binding declarations_ for those components of the source schema that are |
| not mentioned explicitly by binding declarations. Default declarations |
| both reduce the verbosity of the customization and make it more robust |
| to the evolution of the source schema. The defaulting rules are |
| sufficiently powerful that in many cases a usable binding can be |
| produced with no binding declarations at all. By defining a standardized |
| format for the binding declarations, it is envisioned that tools will be |
| built to greatly aid the process of customizing the binding from schema |
| components to a Java representation. |
| |
| ==== Mapping Annotations |
| |
| A mapping annotation defines the mapping of a |
| program element to one or more schema components. A mapping annotation |
| typically contains one or more annotation members to allow customized |
| binding. An annotation member can be required or optional. A mapping |
| annotation can be collocated with the program element in the source. The |
| schema generator hence actually requires both inputs: a set of classes |
| and a set of mapping annotations. |
| |
| Defaults make it easy to use the mapping |
| annotations. In the absence of a mapping annotation on a program |
| element, the schema generator assumes, when required by a mapping rule, |
| a _default mapping annotation_. This, together with an appropriate choice |
| of default values for optional annotation members makes it possible to |
| produce in many cases a usable mapping with minimal mapping annotations. |
| Thus mapping annotations provide a powerful yet easy to use |
| customization mechanism. |
| |
| === Annotations |
| |
| Many of the architectural components are driven by program |
| annotations defined by this specification, _mapping annotations_. |
| |
| *_Java to Schema Mapping_* Mapping annotations |
| provide meta data that describe or customize the mapping of existing |
| classes to a derived schema. |
| |
| *_Portable Value Classes_* Mapping annotations |
| provide information for unmarshalling and marshalling of an XML instance |
| into a content tree representing the XML content without the need for a |
| schema at run time. Thus schema derived code annotated with mapping |
| annotations are portable i.e. they are capable of being marshalled and |
| unmarshalled by a universal marshaller and unmarshaller written by a |
| JAXB vendor implementation. |
| |
| *_Adding application specific behavior and data_* |
| Applications can choose to add either behavior or data to schema derived |
| code. Section <<Modifying Schema-Derived Code>> |
| specifies how the mapping annotation, `@jakarta.annotation.Generated`, |
| should be used by a developer to denote developer added/modified code |
| from schema-derived code. This information can be utilized by tools to |
| preserve application specific code across regenerations of schema |
| derived code. |
| |
| === Binding Framework |
| |
| The binding framework has been revised |
| significantly since JAXB 1.0. Significant changes include: |
| |
| * support for unmarshalling of invalid XML |
| content. |
| * removal of on-demand validation. |
| * unmarshal/marshal time validation deferring |
| to JAXP validation. |
| |
| ==== Unmarshalling |
| |
| ===== Invalid XML Content |
| |
| *_Rationale:_* Invalid XML content can arise for |
| many reasons: |
| |
| * When the cost of validation needs to be avoided. |
| * When the schema for the XML has evolved. |
| * When the XML is from a non-schema-aware processor. |
| * When the schema is not authoritative. |
| |
| Support for invalid XML content required |
| changes to JAXB 1.0 schema to java binding rules as well as the |
| introduction of a flexible unmarshalling mode. These changes are |
| described in <<Unmarshalling Modes>>. |
| |
| ==== Validation |
| |
| The constraints expressed in a schema fall |
| into three general categories: |
| |
| * A _type_ constraint imposes requirements |
| upon the values that may be provided by constraint facets in simple type |
| definitions. |
| * A _local structural_ constraint imposes |
| requirements upon every instance of a given element type, e.g., that |
| required attributes are given values and that a complex element’s |
| content matches its content specification. |
| * A _global structural_ constraint imposes |
| requirements upon an entire document, e.g., that `ID` values are unique |
| and that for every `IDREF` attribute value there exists an element with |
| the corresponding `ID` attribute value. |
| |
| A _document_ is valid if, and only if, all of |
| the constraints expressed in its schema are satisfied. The manner in |
| which constraints are enforced in a set of derived classes has a |
| significant impact upon the usability of those classes. All constraints |
| could, in principle, be checked only during unmarshalling. This approach |
| would, however, yield classes that violate the _fail-fast_ principle of |
| API design: errors should, if feasible, be reported as soon as they are |
| detected. In the context of schema-derived classes, this principle |
| ensures that violations of schema constraints are signalled when they |
| occur rather than later on when they may be more difficult to diagnose. |
| |
| With this principle in mind we see that schema |
| constraints can, in general, be enforced in three ways: |
| |
| * _Static_ enforcement leverages the type |
| system of the Java programming language to ensure that a schema |
| constraint is checked at application’s compilation time. Type |
| constraints are often good candidates for static enforcement. If an |
| attribute is constrained by a schema to have a boolean value, e.g., then |
| the access methods for that attribute’s property can simply accept and |
| return values of type `boolean`. |
| * _Simple dynamic_ enforcement performs a |
| trivial run-time check and throws an appropriate exception upon failure. |
| Type constraints that do not easily map directly to Java classes or |
| primitive types are best enforced in this way. If an attribute is |
| constrained to have an integer value between zero and 100, e.g., then |
| the corresponding property’s access methods can accept and return `int` |
| values and its mutation method can throw a run-time exception if its |
| argument is out of range. |
| * _Complex dynamic_ enforcement performs a |
| potentially costly run-time check, usually involving more than one |
| content object, and throwing an appropriate exception upon failure. |
| Local structural constraints are usually enforced in this way: the |
| structure of a complex element’s content, e.g., can in general only be |
| checked by examining the types of its children and ensuring that they |
| match the schema’s content model for that element. Global structural |
| constraints must be enforced in this way: the uniqueness of `ID` values, |
| e.g., can only be checked by examining the entire content tree. |
| |
| It is straightforward to implement both static |
| and simple dynamic checks so as to satisfy the fail-fast principle. |
| Constraints that require complex dynamic checks could, in theory, also |
| be implemented so as to fail as soon as possible. The resulting classes |
| would be rather clumsy to use, however, because it is often convenient |
| to violate structural constraints on a temporary basis while |
| constructing or manipulating a content tree. |
| |
| Consider, e.g., a complex type definition |
| whose content specification is very complex. Suppose that an instance of |
| the corresponding value class is to be modified, and that the only way |
| to achieve the desired result involves a sequence of changes during |
| which the content specification would be violated. If the content |
| instance were to check continuously that its content is valid, then the |
| only way to modify the content would be to copy it, modify the copy, and |
| then install the new copy in place of the old content. It would be much |
| more convenient to be able to modify the content in place. |
| |
| A similar analysis applies to most other sorts |
| of structural constraints, and especially to global structural |
| constraints. Schema-derived classes have the ability to enable or |
| disable a mode that verifies type constraints. JAXB mapped classes can |
| optionally be validated at unmarshal and marshal time. |
| |
| ===== Validation Re architecture |
| |
| The detection of complex schema constraint |
| violations has been redesigned to have a Jakarta XML Binding implementation to |
| delegate to the validation API in JAXP. JAXP defines a standard |
| validation API (`javax.xml.validation` package) for validating XML |
| content against constraints within a schema. Furthermore, JAXP has |
| been incorporated into J2SE 5.0 platform. Any Jakarta XML Binding implementation |
| that takes advantage of the validation API will result in a smaller |
| footprint. |
| |
| ===== Unmarshal validation |
| |
| When the unmarshalling process incorporates |
| validation and it successfully completes without any validation errors, |
| both the input document and the resulting content tree are guaranteed to |
| be valid. |
| |
| However, always requiring validation during |
| unmarshalling proves to be too rigid and restrictive a requirement. |
| Since existing XML parsers allow schema validation to be disabled, there |
| exist a significant number of XML processing uses that disable schema |
| validation to improve processing speed and/or to be able to process |
| documents containing invalid or incomplete content. To enable the JAXB |
| architecture to be used in these processing scenarios, the binding |
| framework makes validation optional. |
| |
| ===== Marshal Validation |
| |
| Validation may also be optionally performed |
| at marshal time. This is new for Jakarta XML Binding. Validation of object graph |
| while marshalling is useful in web services where the marshalled output |
| must conform to schema constraints specified in a WSDL document. This |
| could provide a valuable debugging aid for dealing with any |
| interoperability problems |
| |
| ===== Handling Validation Failures |
| |
| While it would be possible to notify a JAXB |
| application that a validation error has occurred by throwing a |
| `JAXBException` when the error is detected, this means of communicating |
| a validation error results in only one failure at a time being handled. |
| Potentially, the validation operation would have to be called as many |
| times as there are validation errors. Both in terms of validation |
| processing and for the application’s benefit, it is better to detect as |
| many errors and warnings as possible during a single validation pass. To |
| allow for multiple validation errors to be processed in one pass, each |
| validation error is mapped to a validation error event. A validation |
| error event relates the validation error or warning encountered to the |
| location of the text or object(s) involved with the error. The stream of |
| potential validation error events can be communicated to the application |
| either through a registered validation event handler at the time the |
| validation error is encountered, or via a collection of validation |
| failure events that the application can request after the operation has |
| completed. |
| |
| Unmarshalling and marshalling are the two |
| operations that can result in multiple validation failures. The same |
| mechanism is used to handle both failure scenarios. See |
| <<General Validation Processing>> for further details. |
| |
| === An example |
| |
| Throughout this specification we will refer |
| and build upon the familiar schema from [XSD Part 0], which describes a |
| purchase order, as a running example to illustrate various binding |
| concepts as they are defined. Note that all schema name attributes with |
| values in *this font* are bound by JAXB technology to either a Java |
| interface or JavaBean-like property. Please note that the derived Java |
| code in the example only approximates the default binding of the |
| schema-to-Java representation. |
| |
| [source,xml,subs="specialcharacters,quotes"] |
| ---- |
| <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> |
| <xsd:element name=*"purchaseOrder"* type=*"PurchaseOrderType"*/> |
| <xsd:element name=*"comment"* type=*"xsd:string"*/> |
| <xsd:complexType name=*"PurchaseOrderType"*> |
| <xsd:sequence> |
| <xsd:element name=*"shipTo"* type="USAddress"/> |
| <xsd:element name=*"billTo"* type="USAddress"/> |
| <xsd:element ref=*"comment"* minOccurs="0"/> |
| <xsd:element name=*"items"* type="Items"/> |
| </xsd:sequence> |
| <xsd:attribute name=*"orderDate"* type="xsd:date"/> |
| </xsd:complexType> |
| |
| <xsd:complexType name=*"USAddress"*> |
| <xsd:sequence> |
| <xsd:element name=*"name"* type="xsd:string"/> |
| <xsd:element name=*"street"* type="xsd:string"/> |
| <xsd:element name=*"city"* type="xsd:string"/> |
| <xsd:element name=*"state"* type="xsd:string"/> |
| <xsd:element name=*"zip"* type="xsd:decimal"/> |
| </xsd:sequence> |
| <xsd:attribute name=*"country"* type="xsd:NMTOKEN" fixed="US"/> |
| </xsd:complexType> |
| |
| <xsd:complexType name=*"Items"* > |
| <xsd:sequence> |
| <xsd:element name=*"item"* minOccurs="1" maxOccurs="unbounded"> |
| <xsd:complexType> |
| <xsd:sequence> |
| <xsd:element name=*"productName"* type="xsd:string"/> |
| <xsd:element name=*"quantity"* > |
| <xsd:simpleType> |
| <xsd:restriction base="xsd:positiveInteger"> |
| <xsd:maxExclusive value="100"/> |
| </xsd:restriction> |
| </xsd:simpleType> |
| </xsd:element> |
| <xsd:element name=*"USPrice"* type="xsd:decimal"/> |
| <xsd:element ref=*"comment"* minOccurs="0"/> |
| <xsd:element name=*"shipDate"* type="xsd:date" minOccurs="0"/> |
| </xsd:sequence> |
| <xsd:attribute name=*"partNum"* type="SKU" use="required"/> |
| </xsd:complexType> |
| </xsd:element> |
| </xsd:sequence> |
| </xsd:complexType> |
| |
| <!-- Stock Keeping Unit, a code for identifying products --> |
| <xsd:simpleType name=*"SKU"* > |
| <xsd:restriction base="xsd:string"> |
| <xsd:pattern value="\d{3}-[A-Z]{2}"/> |
| </xsd:restriction |
| </xsd:simpleType> |
| </xsd:schema> |
| ---- |
| |
| Binding of purchase order schema to a Java |
| representationfootnote:[In the interest of |
| terseness, Jakarta XML Binding program annotations have been ommitted.]: |
| |
| [source,java,subs="+macros"] |
| ---- |
| import javax.xml.datatype.XMLGregorianCalendar; import java.util.List; |
| public class PurchaseOrderType { |
| USAddress getShipTo() {...} void setShipTo(USAddress) {...} |
| USAddress getBillTo() {...} void setBillTo(USAddress) {...} |
| /** Optional to set Comment property. */ |
| String getComment() {...} void setComment(String) {...} |
| Items getItems() {...} void setItems(Items) {...} |
| XMLGregorianCalendar getOrderDate() void setOrderDate(XMLGregorianCalendar) |
| }; |
| public class USAddress { |
| String getName() {...} void setName(String) {...} |
| String getStreet() {...} void setStreet(String) {...} |
| String getCity() {...} void setCity(String) {...} |
| String getState() {...} void setState(String) {...} |
| int getZip() {...} void setZip(int) {...} |
| static final String COUNTRY=”USA”;footnote:creq[Appropriate |
| customization required to bind a fixed attribute to a constant value.] |
| }; |
| public class Items { |
| public class ItemType { |
| String getProductName() {...} void setProductName(String) {...} |
| /** Type constraint on Quantity setter value 0..99.footnote:[Type constraint |
| checking only performed if customization enables it and implementation supports fail-fast checking] */ |
| int getQuantity() {...} void setQuantity(int) {...} |
| float getUSPrice() {...} void setUSPrice(float) {...} |
| /** Optional to set Comment property. */ |
| String getComment() {...} void setComment(String) {...} |
| XMLGregorianCalendar getShipDate(); void setShipDate(XMLGregorianCalendar); |
| /** Type constraint on PartNum setter value "\d{3}-[A-Z]{2}".footnote:creq[] */ |
| String getPartNum() {...} void setPartNum(String) {...} |
| }; |
| |
| /** Local structural constraint 1 or more instances of Items.ItemType */ |
| List<Items.ItemType> getItem() {...} |
| } |
| public class ObjectFactory { |
| // type factories |
| Object newInstance(Class javaInterface) {...} |
| PurchaseOrderType createPurchaseOrderType() {...} |
| USAddress create USAddress() {...} |
| Items createItems() {...} |
| Items.ItemType createItemsItemType() {...} |
| // element factories |
| JAXBElement<PurchaseOrderType> createPurchaseOrder(PurchaseOrderType) {...} |
| JAXBElement<String> createComment(String value) {...} |
| } |
| ---- |
| |
| The purchase order schema does not describe |
| any global structural constraints. |
| |
| The coming chapters will identify how these |
| XML Schema concepts were bound to a Java representation. Just as in [XSD |
| Part 0], additions will be made to the schema example to illustrate the |
| binding concepts being discussed. |
| |