Doubts about XML namespace handling


Axel Guckelsberger

Nov 13, 2023, 4:28:25 AM
to Smooks Users
Hi all,

I have a question regarding how the service segments are handled in the infoset XML files.

For example, Smooks created something like this for a given EDIFACT document:

<?xml version="1.0" encoding="UTF-8"?>
<D96A:Interchange xmlns:D96A="http://www.ibm.com/dfdl/edi/un/edifact/D96A" xmlns:srv="http://www.ibm.com/dfdl/edi/un/service/4.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<UNA>
  <CompositeSeparator>:</CompositeSeparator>
  <FieldSeparator>+</FieldSeparator>
  <DecimalSeparator>.</DecimalSeparator>
  <EscapeCharacter>?</EscapeCharacter>
  <RepeatSeparator/>
  <SegmentTerminator>'</SegmentTerminator>
</UNA>
<UNB>
  <S001>
    <E0001>UNOD</E0001>
    <E0002>3</E0002>
  </S001>

I find it a bit strange that the service segments are not prefixed with "srv:".

If I create such a document programmatically (without Smooks but just using DOMDocument etc.) it results in something like:

<?xml version="1.0" encoding="UTF-8"?>
<Interchange xmlns="http://www.ibm.com/dfdl/edi/un/edifact/D19B">
<UNA>
  <CompositeSeparator xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">:</CompositeSeparator>
  <FieldSeparator xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">+</FieldSeparator>
  <DecimalSeparator xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">.</DecimalSeparator>
  <EscapeCharacter xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">?</EscapeCharacter>
  <RepeatSeparator xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">*</RepeatSeparator>
  <SegmentTerminator xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">'</SegmentTerminator>
</UNA>
<UNB>
  <S001 xmlns="http://www.ibm.com/dfdl/edi/un/service/4.1">
    <E0001>UNOC</E0001>
    <E0002>4</E0002>
  </S001>

or

<?xml version="1.0" encoding="UTF-8"?>
<Interchange xmlns="http://www.ibm.com/dfdl/edi/un/edifact/D19B"
xmlns:srv="http://www.ibm.com/dfdl/edi/un/service/4.1">
<UNA>
  <srv:CompositeSeparator>:</srv:CompositeSeparator>
  <srv:FieldSeparator>+</srv:FieldSeparator>
  <srv:DecimalSeparator>.</srv:DecimalSeparator>
  <srv:EscapeCharacter>?</srv:EscapeCharacter>
  <srv:RepeatSeparator>*</srv:RepeatSeparator>
  <srv:SegmentTerminator>'</srv:SegmentTerminator>
</UNA>
<UNB>
  <srv:S001>
    <srv:E0001>UNOC</srv:E0001>
    <srv:E0002>4</srv:E0002>
  </srv:S001>


So is it possible that Smooks creates (and expects) a wrong structure? Or am I missing something? Thank you for shedding some light on this :-)
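
To make the difference concrete, here is a minimal Python sketch (not Smooks or Daffodil code) that prints the expanded names a namespace-aware parser sees for each variant. The three fragments are trimmed, closed versions of the samples above, and the helper name expanded_names is made up for illustration.

```python
import xml.etree.ElementTree as ET

D96A = "http://www.ibm.com/dfdl/edi/un/edifact/D96A"
D19B = "http://www.ibm.com/dfdl/edi/un/edifact/D19B"
SRV = "http://www.ibm.com/dfdl/edi/un/service/4.1"

# Trimmed, closed versions of the fragments from this post (illustration only).
SMOOKS_STYLE = f"""<D96A:Interchange xmlns:D96A="{D96A}" xmlns:srv="{SRV}">
  <UNA><CompositeSeparator>:</CompositeSeparator></UNA>
</D96A:Interchange>"""

DEFAULT_NS_STYLE = f"""<Interchange xmlns="{D19B}">
  <UNA><CompositeSeparator xmlns="{SRV}">:</CompositeSeparator></UNA>
</Interchange>"""

PREFIX_STYLE = f"""<Interchange xmlns="{D19B}" xmlns:srv="{SRV}">
  <UNA><srv:CompositeSeparator>:</srv:CompositeSeparator></UNA>
</Interchange>"""


def expanded_names(xml_text):
    """Return every element name in document order as '{namespace}local'.

    ElementTree resolves prefixes and default namespaces while parsing,
    so elements in no namespace come back as a bare local name.
    """
    return [element.tag for element in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    for label, text in [
        ("Smooks output", SMOOKS_STYLE),
        ("default-namespace variant", DEFAULT_NS_STYLE),
        ("srv: prefix variant", PREFIX_STYLE),
    ]:
        print(label)
        for name in expanded_names(text):
            print("   ", name)
```

The two hand-built variants give identical expanded names, so they differ only in serialization. In the Smooks-style fragment, UNA and its children are in no namespace at all, because an xmlns:srv declaration that is never used as a prefix has no effect on unprefixed elements.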

Best regards,
Axel

Claude Mamo

Nov 16, 2023, 11:52:10 PM
to smook...@googlegroups.com
Interesting question, Axel. Smooks's DFDL processor (i.e., Apache Daffodil) doesn't seem to care about the infoset's namespaces, with some notable exceptions (e.g., the global element). I experimented directly with Apache Daffodil: after I removed all the namespace declarations, the infoset was still unparsed successfully. I also prefixed the infoset elements with invalid namespaces, and the infoset was unparsed as well.

My reasoning could be wrong here, but this might be by design, because the DFDL schema is not really an XSD of the infoset. The DFDL schema gives the DFDL processor instructions for turning the EDIFACT into XML and the XML into EDIFACT. Strictly speaking, though, an element in the infoset does not belong to a type in the DFDL schema, so in general there's no need to prefix the infoset elements with namespaces from the DFDL schema.

I will follow up on this, but I wouldn't worry about it unless it's giving you issues. The xmlns:srv declaration probably should not even be in the infoset.

Claude
