Thanks for the quick response Tatu! I am delighted that at least it is not an immediate "this will not work" conclusion because of fundamental design principles.
I think discussing this here is good -- I will be out until next week now but wanted to send a quick response before that.
I appreciate your time - no rush at all.
Out of curiosity, is any work related to these issues already on the Jackson roadmap that we could piggyback on, or is there no concrete work planned in this area?
Let me zoom in a bit on (5), because you mention it is probably the trickiest, and it might be a good indication of "how far" we can go with Jackson. The use case I described (deserializing two properties that have the same name and differ only in order) is not an important use case on its own, but it becomes much more relevant in interaction with (2) (unwrapping) and (3) (substitution groups). Two use cases I have seen while building a POC for some real XSDs are described below.
a) The same property name appears on different levels in the Java POJO, and because of unwrapping the names overlap.
Example structure taken straight out of a real XSD, but simplified.
Interpretation: you either have an `issuer` element followed by a single `tradeId`
element, OR you have a `partyReference` element followed by a variable
number of `tradeId` elements.
```
<xs:complexType name="Trade">
<xs:choice>
<xs:sequence>
<xs:element name="issuer" type="IssuerId"/>
<xs:element name="tradeId" type="TradeId"/>
</xs:sequence>
<xs:sequence>
<xs:element name="partyReference" type="PartyReference"/>
<xs:element name="tradeId" type="TradeId" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:choice>
</xs:complexType>
```
We currently represent this roughly as follows in Java (using records to show the structure concisely; we actually use classes):
```
record Trade(TradeOpt1 opt1, TradeOpt2 opt2) {}
record TradeOpt1(IssuerId issuer, TradeId tradeId) {}
record TradeOpt2(PartyReference partyReference, List<TradeId> tradeIds) {}
```
where we unwrap `TradeOpt1` and `TradeOpt2`. At this point, however, when we encounter a `tradeId` element, we somehow need to know whether to set it on `TradeOpt1` or to add it to the list on `TradeOpt2`. Right now, BOTH happen. (In other situations I have seen one of the two take precedence, depending on the exact unwrapping structure.)
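To make the desired behaviour concrete, here is a minimal sketch in plain Java of routing `tradeId` based on which branch of the choice was opened. This is not Jackson API: `TradeChoiceRouter`, its fields and `onElement` are hypothetical names, and element values are simplified to strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Jackson API: routes each `tradeId` element to the
// choice branch opened by the preceding `issuer` or `partyReference` element.
final class TradeChoiceRouter {
    enum Branch { NONE, OPT1, OPT2 }

    private Branch active = Branch.NONE;
    String issuer;                                        // stands in for TradeOpt1.issuer
    String opt1TradeId;                                   // stands in for TradeOpt1.tradeId
    String partyReference;                                // stands in for TradeOpt2.partyReference
    final List<String> opt2TradeIds = new ArrayList<>();  // stands in for TradeOpt2.tradeIds

    void onElement(String name, String value) {
        switch (name) {
            case "issuer":
                active = Branch.OPT1;
                issuer = value;
                break;
            case "partyReference":
                active = Branch.OPT2;
                partyReference = value;
                break;
            case "tradeId":
                // The same element name lands in exactly one place,
                // depending on the branch seen earlier.
                if (active == Branch.OPT1) {
                    opt1TradeId = value;
                } else if (active == Branch.OPT2) {
                    opt2TradeIds.add(value);
                } else {
                    throw new IllegalStateException("tradeId before issuer or partyReference");
                }
                break;
            default:
                throw new IllegalArgumentException("Unexpected element: " + name);
        }
    }
}
```

The point is only that the decision needs state from earlier elements, which a lookup keyed purely on the property name cannot provide.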
b) A substituted name overlaps with an already existing element name on the type
Another example structure based on what I have seen in a real XSD.
Note that the element called `substituted` can be substituted by an element called `foo`.
```
<xs:complexType name="Root">
<xs:sequence>
<xs:element ref="substituted"/>
<xs:element name="inbetween" type="xs:string"/>
<xs:element name="foo" type="Foo"/>
</xs:sequence>
</xs:complexType>
<xs:element name="substituted" type="Parent"/>
<xs:element name="foo" type="Foo" substitutionGroup="substituted"/>
<!-- assume type Foo extends type Parent -->
```
In this scenario, a sample such as
```
<root>
<foo></foo>
<inbetween>value</inbetween>
<foo></foo>
</root>
```
the deserializer should be able to decide that the first `foo` element deserializes into the `substituted` property and the second `foo` element deserializes into the `foo` property, given the structure below.
```
record Root(Parent substituted, String inbetween, Foo foo) {}
```
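As an illustration of what "deciding" could look like for this example, here is a minimal sketch that walks the declared sequence with a cursor and assigns each incoming element to the first remaining slot that accepts its name. All names (`SequenceSlotMatcher`, `Slot`, `forRoot`) are hypothetical and nothing here is Jackson API.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: order-aware matching over a declared xs:sequence.
final class SequenceSlotMatcher {
    // One slot per declared element, with every element name it accepts
    // (its own name plus any substitution-group members).
    static final class Slot {
        final String property;
        final Set<String> acceptedNames;

        Slot(String property, Set<String> acceptedNames) {
            this.property = property;
            this.acceptedNames = acceptedNames;
        }
    }

    private final List<Slot> slots;
    private int cursor = 0;

    SequenceSlotMatcher(List<Slot> slots) {
        this.slots = slots;
    }

    // Returns the property for the next element name, or null if no
    // remaining slot accepts it.
    String match(String elementName) {
        for (int i = cursor; i < slots.size(); i++) {
            if (slots.get(i).acceptedNames.contains(elementName)) {
                cursor = i + 1; // slots at or before i are no longer candidates
                return slots.get(i).property;
            }
        }
        return null;
    }

    // Slots for the Root type above: `substituted` also accepts `foo`.
    static SequenceSlotMatcher forRoot() {
        return new SequenceSlotMatcher(List.of(
                new Slot("substituted", Set.of("substituted", "foo")),
                new Slot("inbetween", Set.of("inbetween")),
                new Slot("foo", Set.of("foo"))));
    }
}
```

Feeding `foo`, `inbetween`, `foo` into `forRoot()` yields `substituted`, `inbetween`, `foo`. The sketch is deliberately naive: it silently skips unmatched slots and has no notion of `minOccurs`/`maxOccurs`, so optional or repeating elements would need more than a simple cursor.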
Thoughts...
In order to support this, I think it would require work to extend how Jackson is able to identify properties. Some ideas:
- based on element index, although that does not work well if some elements are optional, or if some elements can occur multiple times.
- based on a selector which allows relative matching, e.g., "the element that comes after another element", such as XPath.
... or a drastically different approach, e.g., deserializing using recursive descent with backtracking, instead of based on property names.
Then there is thinking about how to support this without breaking other backends. Again high-level ideas I can think of:
- making matching on `PropertyName` more generic. E.g., instead of fetching a deserializer straight from a map, add a layer of abstraction that exposes a method `findMatchingProperty`, which backends can override based on their own element identification. The default implementation would lookup a property in a map using `PropertyName`.
- entirely skipping the regular Jackson way of building deserializers, and creating a custom BeanDeserializer that implements its own lookup system.
- entirely skipping the regular Jackson way of building deserializers, and creating a custom recursive descent deserializer.
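To illustrate the first idea, here is a rough sketch of what such an abstraction layer might look like. Everything in it is hypothetical: `PropertyMatcher`, `NameOnlyMatcher` and `PrecededByMatcher` are made-up names, not existing Jackson types, and the real property lookup inside `BeanDeserializer` is more involved than a plain map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: backends decide how an incoming name maps to a property.
interface PropertyMatcher<P> {
    P findMatchingProperty(String elementName);
}

// Default behaviour: a plain lookup by name, as a stand-in for today's map-based lookup.
final class NameOnlyMatcher<P> implements PropertyMatcher<P> {
    private final Map<String, P> byName;

    NameOnlyMatcher(Map<String, P> byName) {
        this.byName = new HashMap<>(byName);
    }

    @Override
    public P findMatchingProperty(String elementName) {
        return byName.get(elementName);
    }
}

// Hypothetical XML-specific override: a rule keyed on "previous>name" wins,
// otherwise it falls back to the name-only lookup.
final class PrecededByMatcher<P> implements PropertyMatcher<P> {
    private final Map<String, P> byPredecessorAndName;
    private final PropertyMatcher<P> fallback;
    private String previous = ""; // empty string means "no preceding sibling yet"

    PrecededByMatcher(Map<String, P> byPredecessorAndName, PropertyMatcher<P> fallback) {
        this.byPredecessorAndName = new HashMap<>(byPredecessorAndName);
        this.fallback = fallback;
    }

    @Override
    public P findMatchingProperty(String elementName) {
        P relative = byPredecessorAndName.get(previous + ">" + elementName);
        previous = elementName;
        return relative != null ? relative : fallback.findMatchingProperty(elementName);
    }
}
```

For the `Root` example, a single rule `">foo"` mapped to `substituted` plus the name-only fallback would send the first `foo` to `substituted` and the `foo` after `inbetween` to `foo`. Other backends would keep the name-only matcher and see no change in behaviour, which is the main appeal of this option.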
All of them seem like quite a chunk of work and require careful thought about their implications. So: any thoughts on whether this is achievable at all? Other ideas?
I think use cases (1) - (4) would be less involved than this, but as my examples show, they break when they interact with (5), which is why I want to check upfront whether (5) is doable at all.