I've run into some behavior that was unexpected (to me) while building the schema for an app I'm creating. This is my first project using XSD, so it is possible that I am just misunderstanding the way substitution groups are supposed to work. I've noted the suggestion in various places to use <choice> instead, but I'd still like to understand how it works, and I do have a slight preference for substitution groups in my case. Anyway, here's what I found:
I started with a hierarchy of types that extend the same base, with the base element acting as the head of a substitution group. There is also a root-level type that has a sequence of the base (or, by implication, any combination of elements in the substitution group):
<xs:schema targetNamespace="test_1_0" elementFormDefault="qualified" xmlns="test_1_0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="basicType">
<xs:attribute name="name" type="xs:string" />
</xs:complexType>
<xs:element name="basic" type="basicType"/>
<xs:complexType name="containerType">
<xs:complexContent>
<xs:extension base="basicType">
<xs:sequence>
<xs:element ref="basic" minOccurs="1" maxOccurs="unbounded"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:element name="container" type="containerType" substitutionGroup="basic"/>
<xs:complexType name="someOtherType">
<xs:complexContent>
<xs:extension base="basicType">
<xs:attribute name="special" type="xs:boolean" />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:element name="someOther" type="someOtherType" substitutionGroup="basic"/>
<xs:complexType name="rootType">
<xs:sequence>
<xs:element ref="basic" minOccurs="1" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:element name="root" type="rootType"/>
</xs:schema>
With this schema, I would expect to be able to have a document such as:
<root>
<basic name="something"/>
<someOther name="some-other-thing" special="true" />
<container name="some-container">
<someOther name="some-nested-thing" special="false" />
<someOther name="some-other-nested-thing" special="false" />
</container>
</root>
Note in particular that basicType is not abstract and that basic is intended to be a valid element anywhere the substitution group is invoked. And indeed, the DomXmlReader can load this document just fine. But when I go to write the document back out to a file, it throws an error: "No suitable Substitution Group found for node <basicType>" (see DomXmlWriter.WriteStartElement(), line 175). It would appear that the SubstitutionGroupChildRule does not include the original element in the substitutions, implying that a substitution is mandatory.
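For comparison, here is my reading of the XSD rule itself, written as a small Python model rather than ATF code: a non-abstract head element is allowed wherever it is referenced, alongside every non-abstract member of its substitution group. The `ElementDecl` class and `allowed_elements` function are names I made up for this sketch, and it ignores the `block` and `final` attributes entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ElementDecl:
    """Hypothetical stand-in for a global xs:element declaration."""
    name: str
    type_name: str
    substitution_group: Optional[str] = None  # name of the head element, if any
    abstract: bool = False


def allowed_elements(head: str, decls: Dict[str, ElementDecl]) -> List[str]:
    """Element names that may appear where <xs:element ref="head"/> is used."""
    allowed: List[str] = []
    # The head itself is allowed unless it is declared abstract.
    if not decls[head].abstract:
        allowed.append(head)
    # Membership is transitive: walk members of members as well.
    pending = [head]
    while pending:
        current = pending.pop()
        for decl in decls.values():
            if decl.substitution_group == current:
                if not decl.abstract:
                    allowed.append(decl.name)
                pending.append(decl.name)
    return allowed


# The element declarations from the schema above.
decls = {d.name: d for d in [
    ElementDecl("basic", "basicType"),
    ElementDecl("container", "containerType", substitution_group="basic"),
    ElementDecl("someOther", "someOtherType", substitution_group="basic"),
]}

print(allowed_elements("basic", decls))
```

Under this reading, basic is in the allowed set, which is why I expected the writer to accept it.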
Q1: Is this correct? Is a substitution really meant to be mandatory?
At any rate, I discovered that I could begin to work around it by making a base type for basic:
<xs:complexType abstract="true" name="realBaseType"/>
<xs:element name="realBase" type="realBaseType"/>
<xs:complexType name="basicType">
<xs:complexContent>
<xs:extension base="realBaseType">
<xs:attribute name="name" type="xs:string" />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:element name="basic" type="basicType" substitutionGroup="realBase"/>
<!-- All other types are the same except that substitutionGroups are all changed to realBase -->
The basicType now serves as an intermediary type so that others can inherit its definition (the name attr, in this example). So now I can read and write the document without exception, but the resulting document is not as expected. All elements in the substitution group are written out as basic elements, and, of course, any data that is not defined in the basicType doesn't get written, as that would be invalid to the schema:
<root>
<basic name="something"/>
<basic name="some-other-thing" />
<basic name="some-container" />
</root>
This is happening because DomXmlWriter searches for a substitution that is assignable and picks the first one found (see line 172). The basicType is right at the top of the list, so it gets picked every time. Indeed, I can fix this by moving the basicType, in the schema, down below all of the types that extend it. This completes the workaround, but it feels a bit fragile that such an important semantic would rest on the order of types in the XSD. It also means that if I wanted to open this XSD up for extension later, all new subtypes of basic would have to be inserted before basicType, which at best is a piece of tribal knowledge that I would need to document well.
Q2: Would it make sense to have DomXmlWriter be smarter about selecting the element to write? At least checking for the exact type first?
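To make Q2 concrete, here is a rough model of the two selection strategies. It is Python, not ATF's C#, and it only mirrors the behavior I described above; `BASE_OF`, `CANDIDATES`, and the function names are all invented for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical type hierarchy from the workaround schema: type -> base type.
BASE_OF: Dict[str, Optional[str]] = {
    "realBaseType": None,
    "basicType": "realBaseType",
    "containerType": "basicType",
    "someOtherType": "basicType",
}

# (element name, declared type) pairs in schema order, with basic listed first.
CANDIDATES: List[Tuple[str, str]] = [
    ("basic", "basicType"),
    ("container", "containerType"),
    ("someOther", "someOtherType"),
]


def is_assignable(declared: str, actual: str) -> bool:
    """True if `actual` is `declared` or derives from it."""
    current: Optional[str] = actual
    while current is not None:
        if current == declared:
            return True
        current = BASE_OF.get(current)
    return False


def pick_first_assignable(actual: str,
                          candidates: List[Tuple[str, str]]) -> Optional[str]:
    """Order-dependent: return the first element whose type accepts `actual`."""
    for name, declared in candidates:
        if is_assignable(declared, actual):
            return name
    return None


def pick_exact_first(actual: str,
                     candidates: List[Tuple[str, str]]) -> Optional[str]:
    """Prefer an exact type match; fall back to the first assignable one."""
    for name, declared in candidates:
        if declared == actual:
            return name
    return pick_first_assignable(actual, candidates)


print(pick_first_assignable("someOtherType", CANDIDATES))
print(pick_exact_first("someOtherType", CANDIDATES))
```

In this model the first strategy maps a someOtherType node to basic whenever basic is listed first, while the exact-match pass picks someOther regardless of declaration order.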
I hope my findings are of interest. If anyone more familiar with XSD and ATF thinks they warrant a change, I'd be happy to assist.