On Fri, 2008-03-28 at 18:55 -0700, richard...@gmail.com wrote:
...
> However it looks like the 2.8 schema ditched the "custom"
> tags. ...
Yes, custom tags have been removed in 2.8.
> ...
> I think using the HUD elements in other XML documents will
> be far less valuable.
We're designing this standard for a more interoperable future where the
HMIS data standard is used in larger human services/disaster standards
frameworks, and the capability to incorporate all or part of the HMIS
Schema is key for this. Moreover, we need to be able to validate the
customized schema, which has been a critically missing feature in 2.7.
Just as HMIS imports AIRS types, it's hoped other frameworks may find
some or all components of HMIS useful. The capability to import really
isn't what's at issue here, though; it's the removal of the capacity for
custom tag expansion. Custom tags using ##any or ##other actually
preclude the ability to extend via importation, but not vice versa.
> Think about it, if a document adheres to the
> HUD standard then any vendor can process it, choosing to ignore or
> parse the custom elements as needed.
How will the receiving vendor validate the custom elements? The
receiving system may wish to ignore the custom elements, but likely
there is valuable data being put in the custom tags the customer wants
preserved and shouldn't be ignored by the receiving system. Conversely,
if a schema extension is used, the changes can be explicitly described
in advance (as a point of negotiation between vendors or what-have-you),
and enforced through validation.
> However with the HUD elements
> embedded in some other unknown XML it is much more difficult to
> process the XML as all bets are off as to where elements might be.
But the XML is not unknown. The XML is explicitly described in advance
by an extension schema. In other words, first the parties agree upon the
extension, then the XML flows between the parties and is validated by the
receiver.
> For example in HUD 2.7 I'm guaranteed the ServiceEvent element will be
> under a Need which is thus guaranteed to contain a NeedID.
Just to clarify, there is no NeedID in 2.7 (that's a 2.8 feature), but I
get your point. This same argument could be made against custom tags,
which are even more of a grab bag, since there wouldn't even exist a
controlling schema. You could perhaps use a prefixed, declared
namespace in the custom tag to provide a validation schema. However,
this would, as mentioned above, preclude type reuse/importation, so why
not go with the more flexible alternative? Also, importation is more
object-oriented, as the base classes (HMIS types) get inherited, so to
speak, into the extension schema which further specializes them.
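To make the inheritance analogy concrete, here is a rough sketch of what
an extension schema might look like. The namespace URIs, the file name
hmis-2.8.xsd, and the DataQuality element are all placeholders I made up
for illustration, not the real 2.8 identifiers. The Python wrapper only
checks that the schema text is well-formed and reports which base type
each derived type extends; it is not a validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical extension schema: the namespaces, file name and the
# DataQuality element are placeholders, not real HMIS 2.8 identifiers.
EXTENSION_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:hmis="http://example.org/hmis/2.8"
           xmlns:ext="http://example.org/coc-extension"
           targetNamespace="http://example.org/coc-extension"
           elementFormDefault="qualified">
  <xs:import namespace="http://example.org/hmis/2.8"
             schemaLocation="hmis-2.8.xsd"/>
  <xs:complexType name="ServiceEvent">
    <xs:complexContent>
      <xs:extension base="hmis:ServiceEvent">
        <xs:sequence>
          <xs:element name="DataQuality" type="xs:string" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""


def extended_types(xsd_text):
    """Map each derived complexType name to the base type it extends."""
    root = ET.fromstring(xsd_text)
    result = {}
    for ctype in root.iter(f"{{{XS}}}complexType"):
        ext = ctype.find(f"{{{XS}}}complexContent/{{{XS}}}extension")
        if ext is not None:
            result[ctype.get("name")] = ext.get("base")
    return result


print(extended_types(EXTENSION_XSD))
```

The derived type keeps everything the base type declares and appends its
own elements after them, which is the "specialization" I mean above.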
> In some
> other XML format that simply includes the HUD schema and re-uses it's
> complex types this might not be true. Thus if my processor depends on
> the NeedID to process the ServiceEvent then I can't use the data.
True. So you wouldn't agree to receiving an extended schema that
Likewise, one could put services unassociated to NeedIDs in a custom
tag, and you wouldn't have an easy way to catch it. I should disclose
that, in the latest revision of the 2.8, it was requested that Services
not require a parent need, since HUD doesn't specify needs in their data
standard (useful as they are for CoC performance measurement, etc.). So
one can have a standalone service, or have services linked to (no longer
under hierarchically) a need. An extension schema could enforce one or
the other between systems. The latest draft version is in the SVN
repository in the same place as always.
> Further this makes writing a processor far more complex as I'm no
> longer assured of the structure of the document making XSL processing
> infinitely harder and slower as doing a foreach() on the NEED in the
> above example is out of the question.
So the XSL processing should run against XML restricted by the extended
schema to avoid ambiguity. It could be stipulated that the extended
schema in no way removes or rearranges the original HMIS types, only
extends existing ones in place (and I should write this in as a
best-practice in the documentation). Extensions that meet these
criteria could be defined in the documentation as valid HMIS 2.8 XML as
well.
> I think a far better approach would be to simply change the custom
> tags to specify the ##other namespace. IE:
##other and ##any directives preclude the type from being importable,
which is a problem for extensibility.
> ...
> Thus the contents of the custom tags are forced to be from a different
> namespace, avoiding the situation where people modify the HUD schema
> to enforce structure inside the custom tag. Further, processing of
> the custom tag is made simple as a vendor can simply process only
> elements which adhere to a namespace which is known to them. You
> still don't lose the ability to re-use the HUD types in other schemas
> if need be,
Yes, you do lose the ability if you use ##other. I had originally
discussed this problem with various dedicated XML practitioners, and
thereby arrived at using the extension method for customization.
> but the HUD schema itself stays extensible, and the
> extensions remain generically processable by all.
>
> Anyway, a dollar short and a day late, but my 2 cents none-the-less.
Richard, please let me know what you think about my responses. Maybe we
can arrive at a solution or just an understanding of how to proceed by
keeping this thread going. Any other ideas from the group? Thanks!
--
Eric Jahn
Alexandria Consulting LLC
3126 8th Ave. N
St. Petersburg, FL 33713
941.321.1466
Richard,
I was just going to suggest that...extend the 2.8 schema and tack on
whatever you wish. The trick there would be getting the receiver to
accept it, but yes, definitely an option technically. If someone wishes
to do this and contribute it to the community, we could hopefully post
it as an addendum/additional resource to the standard on the hmis.info
site.
BTW, for everyone on this list, here's another resource I've been
looking at as to how the NIEM folks handle extensions, from less than a
year ago:
http://www.niem.gov/Techniques_for_Building_and_Extending_NIEM.pdf
-Eric
On Sat, 2008-04-05 at 11:22 -0700, richard...@gmail.com wrote:
> Without a standard set of elements with an extensible "structure"
> being put forth, it seems the HUD XML standard will only have
> extremely limited usefulness,
That's why we made the structure truly extensible in version 2.8.
Custom elements as you advocate for are not Schema extensions, but
rather a Schema relaxation. 2.7 Custom tags were just arbitrary XML
instance (not Schema) extensions, not capable of being validated by the
2.7 Schema. The 2.8 extensibility mechanism is the same technique used
by many major standards bodies, including NIEM and OASIS, so we'll be in
good company.
> Even this use will be marginal because the scope of the
> HUD standard data elements is narrow for obvious reasons.
So extend the 2.8 Schema with additional elements.
> ... One of the reasons the project is a disaster
> because the HUD schema was modified, not extended but modified to
> cover the needs of Minnesota.
This was because the 2.7 Schema was not capable of being extended, which was
one reason we made 2.8 extensible. We had to modify it to have a
validated data format. Custom elements would relax enforcement of a
standard unless an additional form of validation were then subsequently
used on the actual XML custom tags.
> We can not stop people from modifying
> the schema, but with the proper schema design we can certainly
> encourage them to extend it cleanly instead. I don't see how 2.8
> improves that situation in any way.
It improves the situation by allowing the Schema to be extended in any
way you like, and with standard XML Schema validation capabilities.
Your suggestion in the last email was a great example of how it could be
used: extend the 2.8 types with Custom tags in the specific places a
Continuum agrees they should exist. Or, extend any field with a
restriction or additional elements, such as data quality, etc. It's up
to you. 2.8 lets you add in Custom tags, but as I mentioned last email,
that's going to be a tricky negotiating point between the sender and
receiver, because you now have to agree what XML can be encapsulated
within the Custom tag, and how strictly (if at all) that XML will then
be validated. It would be easier to not use Custom tags and just
explicitly state what will be extended and how, using Schema importation
extension. Why send the data if you can't specify the format and can't
simply validate it because the Schema validation is turned off? It
sounds like a mess waiting to happen. And we would still have the
problem of 2.7 style Custom tags prohibiting importation of types, so
the HUD HMIS Schema can't be utilized in other data standards, existing
or future.
> ...but also data structure, and it all
> must be extensible.
Exactly, but Custom tags can't be extended because they use the ##any
directive. 2.8 is so extensible, you can go back and extend it to add
Custom tags. The choice to do so is yours with 2.8, so you can have the
capability you envision with 2.8. But doing it the way you are
mentioning blocks what I am arguing is a better methodology. A standard
which states (via Custom tags) that there is no standard to be enforced
in a certain section, is no standard at all.
> Imagine if Minnesota's customizations to the HUD
> schema were in their own namespace and confined to the custom tags.
> If that was so then the HUD standard data elements in a Minnesota XML
> file could be processed by anyone who can process a 2.7 instance.
> Can the same be said for 2.8?
No, 2.8 can't, and it's no loss. If certain recipients can't process XML
conforming to the extension a jurisdiction designs, they can just
request to receive vanilla 2.8, which should be a relatively simple task
to generate, if the sender can already generate a 2.8 extension.
But I don't think this scenario will occur often, since the recipients
and sender will first come to a negotiated understanding of the
jurisdiction's format. Receivers can still choose to ignore bits of XML
which don't apply to their situation, but at least they can easily
validate against the jurisdiction's extension Schema.
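As a rough illustration of ignoring what doesn't apply, a receiver could
keep only the elements in the base namespace and skip anything declared
by the extension. The namespace URIs and element names below are
placeholders of mine, and the instance layout is simplified; a real
instance would follow whatever the agreed extension Schema declares.

```python
import xml.etree.ElementTree as ET

HMIS_NS = "http://example.org/hmis/2.8"      # hypothetical, not the real URI
EXT_NS = "http://example.org/coc-extension"  # hypothetical extension namespace

# Simplified instance: base elements plus one element added by the extension.
INSTANCE = """\
<hmis:Client xmlns:hmis="http://example.org/hmis/2.8"
             xmlns:ext="http://example.org/coc-extension">
  <hmis:ServiceEvent>
    <hmis:ServiceEventID>42</hmis:ServiceEventID>
    <ext:DataQuality>verified</ext:DataQuality>
  </hmis:ServiceEvent>
  <hmis:ServiceEvent>
    <hmis:ServiceEventID>43</hmis:ServiceEventID>
  </hmis:ServiceEvent>
</hmis:Client>
"""


def base_fields(event):
    """Return only the children that live in the base (HMIS) namespace."""
    prefix = "{" + HMIS_NS + "}"
    return {child.tag[len(prefix):]: child.text
            for child in event if child.tag.startswith(prefix)}


root = ET.fromstring(INSTANCE)
events = [base_fields(e) for e in root.iter("{" + HMIS_NS + "}ServiceEvent")]
print(events)
```

Because the extension's elements sit in their own declared namespace,
filtering on the namespace is enough to recover the base content.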
The 2.8 Schema allows far more customization than 2.7, not just
additional elements in certain limited places. Any State/jurisdiction
could have an extension schema that defines its standard based on the
HUD HMIS 2.8 Schema, and all in that jurisdiction could validate against
Extend 2.8 with some Custom tags and use that if you really want to go
the path you mention.
Another problem with Custom tags is that they only allow extension via
additional elements. You can't modify other things with Custom tags,
like adding attributes to existing elements or further restricting valid
data, which a Continuum may need to do.
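For example, an extension schema could narrow a base type to a local
list of permitted values, something a Custom tag has no way to express.
This is only a sketch: the namespaces, the hmis:ServiceCode base type and
the two codes are invented for illustration, and the Python helper just
reads the enumeration back out of the schema text rather than doing real
Schema validation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical restriction: type names, namespaces and codes are invented.
RESTRICTION_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:hmis="http://example.org/hmis/2.8"
           targetNamespace="http://example.org/coc-extension">
  <xs:import namespace="http://example.org/hmis/2.8"
             schemaLocation="hmis-2.8.xsd"/>
  <xs:simpleType name="LocalServiceCode">
    <xs:restriction base="hmis:ServiceCode">
      <xs:enumeration value="SHELTER"/>
      <xs:enumeration value="MEALS"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def allowed_values(xsd_text, type_name):
    """List the enumeration values declared for the named simpleType."""
    root = ET.fromstring(xsd_text)
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(XS + "enumeration")]
    return []


print(allowed_values(RESTRICTION_XSD, "LocalServiceCode"))
```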
> How does 2.8 help Minnesota with a path
> forward towards their goal of using HUD XML for data exchange between
> transactional systems in the least vendor-specific way possible?
It gives them a way to extend (not modify) the 2.8 Schema without
modifying the base HUD HMIS format, then validate the resulting XML,
which 2.7 didn't allow because of its structure and lack of namespace.
HMIS Data Integration listserv in general:
Well, Richard's and my positions are pretty clear. Any other viewpoints
or insights which might further the discussion? -Eric