DomXmlReader and DomXmlWriter behavior with SubstitutionGroups


TwainJ

May 30, 2015, 11:38:33 PM
to authoring-to...@googlegroups.com
I've discovered some behavior that was unexpected (to me) while building the schema for an app I'm creating. This is my first project using XSD schemas, so it is possible that I am just misunderstanding the way substitution groups are supposed to work. I've noted the suggestion in various places to use <choice> instead, but I'd still like to understand how it works, and I do have a slight preference for substitution groups in my case. Anyway, here's what I found:

I started with a hierarchy of types that extend the same base, using the base element as the head of a substitution group. There is also a root-level type that has a sequence of the base (or, by implication, any combination of elements in the substitution group):

<xs:schema targetNamespace="test_1_0" elementFormDefault="qualified" xmlns="test_1_0" xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:complexType name="basicType">
        <xs:attribute name="name" type="xs:string" />
    </xs:complexType>
    <xs:element name="basic" type="basicType"/>

    <xs:complexType name="containerType">
        <xs:complexContent>
            <xs:extension base="basicType">
                <xs:sequence>
                    <xs:element ref="basic" minOccurs="1" maxOccurs="unbounded"/>
                </xs:sequence>
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
    <xs:element name="container" type="containerType" substitutionGroup="basic"/>

    <xs:complexType name="someOtherType">
        <xs:complexContent>
            <xs:extension base="basicType">
                <xs:attribute name="special" type="xs:boolean" />
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
    <xs:element name="someOther" type="someOtherType" substitutionGroup="basic"/>

    <xs:complexType name="rootType">
        <xs:sequence>
            <xs:element ref="basic" minOccurs="1" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:element name="root" type="rootType"/>

</xs:schema>

With this schema, I would expect to be able to have a document such as:

<root>
    <basic name="something"/>
    <someOther name="some-other-thing" special="true" />
    <container name="some-container">
        <someOther name="some-nested-thing" special="false" />
        <someOther name="some-other-nested-thing" special="false" />
    </container>
</root>

Note in particular that the basicType is not abstract and is intended to be a valid element anywhere the substitution group is invoked. And indeed, the DomXmlReader can load this document just fine. But once I go to write the document back out to a file, it throws an error: "No suitable Substitution Group found for node <basicType>" (see DomXmlWriter.WriteStartElement(), line 175). It would appear that the SubstitutionGroupChildRule does not include the original (head) element in the substitutions, implying that a substitution is mandatory. Q1: Is this correct?
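To make the expectation concrete, here is a minimal sketch in Python (standard library only; not ATF code, and not a full XSD implementation). It assumes the usual XSD rule that the head element of a substitution group remains valid itself unless it is declared abstract="true". The function name allowed_for and the trimmed SCHEMA string are illustrative assumptions, not anything taken from ATF.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Only the top-level element declarations from the schema above; the type
# definitions are omitted because this sketch never inspects them.
SCHEMA = """
<xs:schema targetNamespace="test_1_0" xmlns="test_1_0"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="basic" type="basicType"/>
    <xs:element name="container" type="containerType" substitutionGroup="basic"/>
    <xs:element name="someOther" type="someOtherType" substitutionGroup="basic"/>
    <xs:element name="root" type="rootType"/>
</xs:schema>
"""


def local_name(qname):
    """Drop a namespace prefix such as 'tns:' from a QName."""
    return qname.split(":")[-1]


def allowed_for(schema_xml, head):
    """Hypothetical helper: names of the elements that may appear wherever
    ref=head is used, i.e. the head itself (unless abstract) plus every
    direct or transitive member of its substitution group."""
    schema = ET.fromstring(schema_xml)
    decls = {e.get("name"): e for e in schema.findall(XS + "element")}
    allowed, seen, pending = [], {head}, [head]
    while pending:
        current = pending.pop(0)
        # An abstract element cannot appear in a document, but its
        # substitutes still can.
        if decls[current].get("abstract", "false") not in ("true", "1"):
            allowed.append(current)
        for name, decl in decls.items():
            group = decl.get("substitutionGroup")
            if group and local_name(group) == current and name not in seen:
                seen.add(name)
                pending.append(name)
    return allowed


print(allowed_for(SCHEMA, "basic"))  # ['basic', 'container', 'someOther']
```

Under that assumption, "basic" appears in the result alongside its substitutes, which is the behavior the reader already seems to accept and the writer rejects.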

At any rate, I discovered that I could begin to work around it by making a base type for basic:

    <xs:complexType abstract="true" name="realBaseType"/>
    <xs:element name="realBase" type="realBaseType"/>

    <xs:complexType name="basicType">
        <xs:complexContent>
            <xs:extension base="realBaseType">
                <xs:attribute name="name" type="xs:string" />
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
    <xs:element name="basic" type="basicType" substitutionGroup="realBase"/>

    <!-- All other types are the same except that substitutionGroups are all changed to realBase -->

The basicType now serves as an intermediary type so that others can inherit its definition (the name attribute, in this example). So now I can read and write the document without an exception, but the resulting document is not as expected. All elements in the substitution group are written out as basic elements, and, of course, any data that is not defined in the basicType doesn't get written, as that would be invalid according to the schema:

<root>
    <basic name="something"/>
    <basic name="some-other-thing" />
    <basic name="some-container" />
</root>

This happens because DomXmlWriter searches for a substitution that is assignable and picks the first one found (see line 172). The basicType is right at the top of the list, so it gets picked every time. Indeed, I can fix this by moving the basicType, in the schema, down below all of the types that extend it. This completes the workaround, but it feels a bit fragile that such an important semantic would rest on the order of types in the XSD. It also means that if I wanted to open this XSD up for extension later, all new subtypes of basic would have to be inserted before basicType, which at best is a piece of tribal knowledge that I would need to document well. Q2: Would it make sense to have DomXmlWriter be smarter about selecting the element to write? At least checking for the exact type first?
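As a rough illustration of the difference between "first assignable" and "exact type first", here is a small Python sketch. It is only a model of the selection logic described above, not the DomXmlWriter implementation (which is C#); the BASE and CANDIDATES tables and all three function names are hypothetical and simply mirror the workaround schema.

```python
# Hypothetical model of the workaround schema: BASE maps each type to the
# type it extends; CANDIDATES lists (element name, type name) in schema order.
BASE = {
    "basicType": "realBaseType",
    "containerType": "basicType",
    "someOtherType": "basicType",
}
CANDIDATES = [
    ("basic", "basicType"),
    ("container", "containerType"),
    ("someOther", "someOtherType"),
]


def distance(node_type, candidate_type, base):
    """Derivation steps from node_type up to candidate_type, or None if
    node_type is not assignable to candidate_type."""
    steps, current = 0, node_type
    while current is not None:
        if current == candidate_type:
            return steps
        current = base.get(current)
        steps += 1
    return None


def first_assignable(node_type, candidates, base):
    """The order-dependent behavior described above: take the first
    candidate whose type the node is assignable to."""
    for name, type_name in candidates:
        if distance(node_type, type_name, base) is not None:
            return name
    return None


def closest_match(node_type, candidates, base):
    """Order-independent alternative: prefer an exact type match, otherwise
    the nearest ancestor type."""
    best = None
    for name, type_name in candidates:
        d = distance(node_type, type_name, base)
        if d is not None and (best is None or d < best[0]):
            best = (d, name)
    return best[1] if best else None


print(first_assignable("someOtherType", CANDIDATES, BASE))  # basic
print(closest_match("someOtherType", CANDIDATES, BASE))     # someOther
```

With the closest-match rule, reordering CANDIDATES no longer changes which element is chosen for a node of a given type, which is the property the schema-ordering workaround is trying to approximate.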

I hope my findings are of interest. If anyone more familiar with XSD and ATF thinks they warrant a change, I'd be happy to assist.

Ron2

Jun 1, 2015, 1:59:18 PM
to authoring-to...@googlegroups.com, ja...@2jdevelopment.com
Hi TwainJ,

Thank you for the in-depth research that you've done. It looks like you've identified a bug with our handling of XSD substitution groups. The substitution should not have been mandatory. The explanation and examples on this XSD website seem pretty clear about this.

On your second point, that looks like a bug, too.

To get these problems fixed in ATF, we need a unit test that demonstrates what the correct behavior should be, and then we need the fixes, too.

TwainJ, do you want to submit a pull request about this? That might be the fastest way for you to get the correct behavior that you would like, if you don't want to rely on your somewhat fragile work-around. Otherwise, I'm not sure I can get to it this week. In the long run, I would like to share ownership of ATF with non-Sony developers, to expand the "circle of trust" that will help ATF continue to be improved. Internally, we're tight on resources for ATF now; there are only two of us now (me and Alan) who will make regular contributions to ATF and we both have other primary responsibilities.

--Ron

TwainJ

Jun 1, 2015, 2:18:14 PM
to authoring-to...@googlegroups.com, ja...@2jdevelopment.com
Thanks Ron,

I'd love to help out with this. This is a side-project for me, so I can't guarantee a timeframe for getting a pull request out, but I'm pretty motivated on this project and I'll tackle this before moving on.

And thanks for that link. I'll check it out, and run my changes by you in this topic to make sure I'm approaching it correctly.

~Jason

TwainJ

Jun 8, 2015, 3:40:29 PM
to authoring-to...@googlegroups.com
Ron,

I have a fix that seems to be working pretty well. I have another unit test I want to write, to check a nuance I suspect, but that shouldn't take long.
Meanwhile, I ran the full suite of Functional and Unit tests, and ended up with a few functional tests failing mysteriously. Or at least it is mysterious to me.

- CircuitEditorTests.EditSaveCloseAndReopen has changed so that instead of completing without user input, it is opening a dialog requesting whether to save or discard the untitled.circuit file. It then times out, waiting for the input.
- DiagramEditorTests.StateChartEditSaveCloseAndReopen is failing with "Object reference not set to an instance of an object."

Through a process of elimination, I was able to reproduce the failures with only the changes I made to DomXmlWriter in place (see the simplified diff file, attached). But in trying to debug these tests, I never hit a breakpoint within that code. This may be because the test runs a Python script and I'm not debugging it properly.

At any rate, the first one puzzles me, as I am not changing any code dealing with the UI of any of the samples. Do you see any hints you could pass along to resolve these?

Thanks,
Jason
DomXmlWriter.diff

Ron AtSony

Jun 8, 2015, 6:16:26 PM
to TwainJ, authoring-to...@googlegroups.com
Hi Jason,

Thanks for the investigation and the diff file. I ran those two functional tests before and after applying the patch. They succeeded before and they failed after. :-( One potential problem is that with the change, it's possible for actualChildInfo to be null now, whereas previously, it couldn't.

The functional tests are kind of a pain to debug -- you have to attach a debugger to the running process before the Python script runs.

I'll try to figure out what's going on now, and try to fix the original problems with our handling of substitution groups. Hopefully I can have a patch by the end of the day.

--Ron



TwainJ

Jun 8, 2015, 6:37:30 PM
to authoring-to...@googlegroups.com, ja...@2jdevelopment.com
Thanks Ron,

And good point about the possible null reference. I'll spare you my thought process, but it occurred to me that my change was not honoring all of the possible inputs to the DomXmlWriter.WriteStartElement method. I was able to update my code change so that those tests are now passing.

I need to get back to my day job for the moment, but if you like, I can return to this at the end of the day, and finish up a pull request with my changes and unit tests.

Let me know, and thanks again for your help,

Jason

