.. _contentModel:

Content Model
=============

PyXB's content model is used to complete the link between the
:ref:`componentModel` and the :ref:`bindingModel`.  These classes are the ones
that:

- determine what Python class attribute is used to store which XML
  element or attribute;
- distinguish those elements that can occur at most once from those that
  require an aggregation; and
- ensure that the ordering and occurrence constraints imposed by the XML
  `model group `_ are satisfied, when XML is converted to Python instances
  and vice-versa.

Associating XML and Python Objects
----------------------------------

Most of the classes involved in the content model are in the
:py:obj:`pyxb.binding.content` module.  The relations among these classes are
displayed in the following diagram.

.. _cd_contentModel:

.. image:: Images/ContentModel.jpg

In the standard code generation template, both element and attribute values
are stored in Python class fields.  As noted in
:ref:`binding_deconflictingNames` it is necessary to ensure an attribute and
an element which have the same name in their containing complex type have
distinct names in the Python class corresponding to that type.  Use
information for each of these is maintained in the type class.  This use
information comprises:

- the original :py:obj:`name ` of the element/attribute in the XML
- its :py:obj:`deconflicted name ` in Python
- the private name by which the value is stored in the Python instance
  dictionary

Other information is specific to the type of use.  The
:py:obj:`pyxb.binding.basis.complexTypeDefinition` retains maps from the
component's name to the attribute use or element use instance corresponding
to the component's use.

.. _attributeUse:

Attribute Uses
^^^^^^^^^^^^^^

The information associated with an `attribute use `_ is recorded in an
:py:obj:`pyxb.binding.content.AttributeUse` instance.
This class provides:

- The :py:obj:`name ` of the attribute
- The :py:obj:`default value ` of the attribute
- Whether the attribute value is :py:obj:`fixed `
- Whether the `attribute use `_ is :py:obj:`required ` or
  :py:obj:`prohibited `
- The :py:obj:`type ` of the attribute, as a subclass of
  :py:obj:`pyxb.binding.basis.simpleTypeDefinition`
- Methods to :py:obj:`read `, :py:obj:`set `, and :py:obj:`reset ` the
  value of the attribute in a given binding instance.

A :py:obj:`map ` is used to map from expanded names to AttributeUse
instances.  This map is defined within the class definition itself.

.. _elementUse:

Element Uses
^^^^^^^^^^^^

The element analog to an attribute use is an `element declaration `_, and the
corresponding information is stored in a
:py:obj:`pyxb.binding.content.ElementDeclaration` instance.  This class
provides:

- The :py:obj:`element binding ` that defines the properties of the
  referenced element, including its type
- Whether the use allows :py:obj:`multiple occurrences `
- The :py:obj:`default value ` of the element.  Currently this is either
  ``None`` or an empty list, depending on
  :py:obj:`pyxb.binding.content.ElementDeclaration.isPlural`
- Methods to :py:obj:`read `, :py:obj:`set `, :py:obj:`append to ` (only
  for plural elements), and :py:obj:`reset ` the value of the element in a
  given binding instance
- The :py:obj:`setOrAppend ` method, which is most commonly used to provide
  new content to a value

A :py:obj:`map ` is used to map from expanded names to ElementDeclaration
instances.  This map is defined within the class definition itself.  As
mentioned before, when the same element name appears at multiple places
within the element content the uses are collapsed into a single attribute on
the complex type; thus the map is to the :py:obj:`ElementDeclaration `, not
the :py:obj:`ElementUse `.

.. _validating-content:

Validating the Content Model
----------------------------

As of :ref:`PyXB 1.2.0 `, content validation is performed using the **Finite
Automata with Counters (FAC)** data structure, as described in
`Regular Expressions with Numerical Constraints and Automata with Counters `_,
`Dag Hovland `_, Lecture Notes in Computer Science, 2009, Volume 5684,
Theoretical Aspects of Computing - ICTAC 2009, Pages 231-245.

This structure allows accurate validation of occurrence and order constraints
without the complexity of the original back-tracking validation solution from
:ref:`PyXB 1.1.1 ` and earlier.  It also avoids the
:ticket:`incorrect rejection of valid documents <112>` that (rarely) occurred
with the greedy algorithm introduced in :ref:`PyXB 1.1.2 `.  Conversion to
this data structure also enabled the distinction between
:py:obj:`element declaration ` and :py:obj:`element use ` nodes, allowing
diagnostics to trace back to the element references in context.

The data structures for the automaton and the configuration structure that
represents a processing automaton are:

.. image:: Images/FACAutomaton.jpg

The implementation in PyXB generally follows the description in the ICTAC
2009 paper.  Calculation of first/follow sets has been enhanced to support
term trees with more than two children per node.  In addition, support for
unordered catenation as required for the `"all" model group `_ is implemented
by a state that maintains a distinct sub-automaton for each alternative,
requiring a layered approach where execution of an automaton is suspended
until the subordinate automaton has accepted and a transition out of it is
encountered.

For more information on the implementation, please see the
:py:obj:`FAC module `.  This module has been written to be independent of
PyXB infrastructure, and may be re-used in other code in accordance with the
:ref:`PyXB license `.
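
To make the role of the counters concrete, the following is a minimal sketch
of counter-based validation for a plain sequence model.  It is not the PyXB
FAC module and does not reproduce its API: ``Particle``, ``MODEL`` and
``accepts`` are hypothetical names invented for this illustration, and the
sketch assumes that adjacent particles have distinct element names so that
every step is deterministic.  The automaton described in the paper handles
the general case, which this simplification does not.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical particle description: (element name, minOccurs, maxOccurs).
# A maxOccurs of None stands for "unbounded".
Particle = Tuple[str, int, Optional[int]]

# Illustrative model: exactly one "a", any number of "b", two or three "c".
MODEL: Sequence[Particle] = [("a", 1, 1), ("b", 0, None), ("c", 2, 3)]


def accepts(model: Sequence[Particle], symbols: Sequence[str]) -> bool:
    """Return True if the symbols satisfy the sequence model.

    Assumption: adjacent particles have distinct names, so a symbol either
    belongs to the current particle or forces a move to a later one.
    """
    state = 0  # index of the particle currently being matched
    count = 0  # counter: occurrences consumed by the current particle
    for symbol in symbols:
        while True:
            if state >= len(model):
                return False  # content continues past the end of the model
            name, lower, upper = model[state]
            if symbol == name and (upper is None or count < upper):
                count += 1  # stay in this state and bump its counter
                break
            if count < lower:
                return False  # cannot leave before minOccurs is satisfied
            state += 1  # move on and reset the counter
            count = 0
    # Every remaining particle must have its minimum satisfied.
    while state < len(model):
        if count < model[state][1]:
            return False
        state += 1
        count = 0
    return True
```

With the model above, ``accepts(MODEL, ["a", "c", "c"])`` is true, while
``accepts(MODEL, ["a", "b", "b", "c"])`` is false because only one ``c`` was
seen when at least two are required.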

FAC and the PyXB Content Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As depicted in the :ref:`Content Model class diagram ` each complex type
binding class has a :py:obj:`_Automaton ` which encodes the content model of
the type as a Finite Automaton with Counters.  This representation models the
occurrence constraints and sub-element orders, referencing the specific
element and wildcard uses as they appear in the schema.  Each instance of a
complex binding supports an :py:obj:`AutomatonConfiguration ` that is used to
validate the binding content against the model.

An :py:obj:`ElementUse ` instance is provided as the metadata for automaton
states that correspond to an element declaration in the schema.  Similarly, a
:py:obj:`WildcardUse ` instance is used as the metadata for automaton states
that correspond to an instance of the `xs:any `_ wildcard schema component.
Validation in the automaton delegates through the
:py:obj:`SymbolMatch_mixin ` interface to see whether content in the form of
a complex type binding instance is conformant to the restrictions on symbols
associated with a particular state.

When parsing, a transition taken results in the storage of the consumed
symbol into the appropriate element attribute or wildcard list in the binding
instance.  In many cases, the transition from one state to the next is
uniquely determined by the content; as long as this condition holds, the
:py:obj:`AutomatonConfiguration ` instance retains a single underlying
:py:obj:`FAC Configuration ` representing the current state.

To generate the XML corresponding to a binding instance, the element and
wildcard content of the instance are loaded into a Python dictionary, keyed
by the :py:obj:`ElementDeclaration `.  These subordinate elements are
appended to a list of child nodes as transitions that recognize them are
encountered.
As of :ref:`PyXB 1.2.0 ` the first legal transition in the order imposed by
the schema is taken, and there is no provision for influencing the order in
the generated document when multiple orderings are valid.

.. ignored
   ## Local Variables:
   ## fill-column:78
   ## indent-tabs-mode:nil
   ## End:
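
The generation step described above can be pictured with the following
sketch.  It is a simplified illustration under stated assumptions, not PyXB
code: ``order_children`` is a hypothetical helper, plain strings stand in for
the element declarations used as dictionary keys, and the content model is
reduced to a fixed sequence of element names rather than an automaton.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple


def order_children(schema_order: Sequence[str],
                   content: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Emit (name, value) pairs in the order the schema imposes.

    schema_order is a hypothetical stand-in for the transitions of the
    automaton; content maps each element name to its stored values.
    """
    # Copy the values so the caller's dictionary is left untouched.
    pending = {name: deque(values) for name, values in content.items()}
    children: List[Tuple[str, str]] = []
    for name in schema_order:
        queue = pending.get(name)
        # Take the first legal "transition": consume everything stored for
        # this name before moving to the next position in the schema.
        while queue:
            children.append((name, queue.popleft()))
    leftover = sorted(name for name, queue in pending.items() if queue)
    if leftover:
        # Content that no position in the model recognizes cannot be emitted.
        raise ValueError("content not allowed by model: %s" % ", ".join(leftover))
    return children
```

For example, ``order_children(["title", "author"], {"author": ["A1", "A2"],
"title": ["T"]})`` places the ``title`` child before both ``author``
children regardless of the order in which the dictionary was filled.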