When the W3C Schema Working Group reports out with a replacement for
DTDs, that will probably add a considerable amount of support for other
datatypes. I'll be slightly surprised if that becomes a Recommendation
before year's end. Some parsers are likely to start supporting
working-draft versions of the schema language before then, but it's
unlikely to become universal until some time after the standard becomes
official.
-----------------------------------------------------------------
Joe Kesselman, kes...@us.ibm.com
Unless stated otherwise, all opinions are solely those of the author
There's also IDREF, IDREFS, ENTITY, ENTITIES and NOTATION, all of
which are quite useful. NMTOKEN and NMTOKENS also have their uses.
IDREF and IDREFS are especially valuable for in-document links that
are verified by the parser. I use them heavily in the source document
for my XML tools list to ensure that all links are correct.
Other than that I agree with you.
--Lars M.
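
For readers who have not seen this kind of check, here is a minimal
sketch of what a validating parser does for ID, IDREF and IDREFS
attributes. Python's standard xml.etree.ElementTree does not validate,
so the check is done by hand. The attribute names (id, ref, refs) and
the sample document are assumptions made for illustration only; a real
parser takes the attribute types from the DTD.

```python
import xml.etree.ElementTree as ET


def find_id_problems(xml_text, id_attr="id", idref_attrs=("ref",), idrefs_attrs=("refs",)):
    """Return (duplicate IDs, dangling references) for one document.

    The attribute names are assumptions; a validating parser would
    learn which attributes are ID/IDREF/IDREFS from the DTD instead.
    """
    root = ET.fromstring(xml_text)

    # First pass: collect every ID and note any that occur twice.
    seen, duplicates = set(), set()
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is not None:
            if value in seen:
                duplicates.add(value)
            seen.add(value)

    # Second pass: every IDREF (and each token of an IDREFS) must match an ID.
    dangling = []
    for elem in root.iter():
        for name in idref_attrs:
            value = elem.get(name)
            if value is not None and value not in seen:
                dangling.append((elem.tag, name, value))
        for name in idrefs_attrs:
            for token in (elem.get(name) or "").split():
                if token not in seen:
                    dangling.append((elem.tag, name, token))
    return sorted(duplicates), dangling


# Hypothetical sample document.
IDREF_SAMPLE = """<tools>
  <vendor id="V_ibm"/>
  <product id="P_xml4j" ref="V_ibm"/>
  <product id="P_other" ref="V_missing" refs="P_xml4j P_gone"/>
</tools>"""

duplicate_ids, dangling_refs = find_id_problems(IDREF_SAMPLE)
print(duplicate_ids)
print(dangling_refs)
```

With a DTD and a validating parser none of this code is needed, which
is the point made above: the link check comes for free.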
Jany.
Good question. As the intent was to make XML simpler than SGML I
assume NAME/NAMES/NUMBER/NUMBERS/NUTOKEN/NUTOKENS were left out as a
part of this simplification.
However, NAME/NAMES/NUMBER/NUMBERS are rather useful and would be very
easy to implement in a parser (especially compared to a lot of the
entity horrors), so I don't really agree with this decision.
Leaving out NUTOKEN/NUTOKENS is OK, although if you add everything but
them you might as well include them for completeness. I doubt most
parser writers would notice the extra effort required.
| if it is possible to find an explanation for this and
This note from the annotated XML specification may offer some hints:
<URL:http://www.xml.com/axml/notes/AttrsBoring.html>
| if this is likely to be changed in some future specification.
I must confess I have no idea, but I'm not even sure there will be any
more XML DTD specifications. XML schemas should be able to do what you
want, though it seems rather self-contradictory to me to leave this
functionality out of DTDs and then add it (and more) to schemas
afterwards.
--Lars M.
Lars Marius Garshol wrote:
> * Joseph Kesselman
> |
> | In the current definition of XML all data is strings. You can
> | constrain attributes to an enumerated set, or force them to be
> | unique by making them type ID, but that's about it.
>
> There's also IDREF, IDREFS, ENTITY, ENTITIES and NOTATION, all of
> which are quite useful. NMTOKEN and NMTOKENS also have their uses.
>
> Especially IDREF and IDREFS are valuable for in-document links that
> are verified by the parser. I use that heavily in the source document
> for my XML tools list to ensure that all links are correct.
>
So do I. It would be nice to be able to say that an ID (and IDREF(S))
belongs to a particular namespace and thus use that to partition and
constrain the IDs and references.
| Is there a standard or at least common way to describe that an IDREF
| attribute should only reference elements of a certain type?
What I use is a prefix convention. For example, for my XML tools list
all vendor IDs start with V_ and all product IDs with P_. I haven't
bothered to write any code that actually verifies that I follow the
convention, partly because I've never had any problems with it.
With more than one author extra validation software might be a good
idea.
--Lars M.
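
The extra validation mentioned above could look roughly like the
following. This is only a sketch under stated assumptions: the element
names (vendor, product), the attribute names (id, vendor) and the
sample document are hypothetical, and only the V_ and P_ prefixes come
from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical conventions: element type -> prefix its ID must carry.
ID_PREFIXES = {"vendor": "V_", "product": "P_"}
# Hypothetical: (element type, reference attribute) -> prefix of the referenced ID.
REF_PREFIXES = {("product", "vendor"): "V_"}


def check_prefix_convention(xml_text, id_attr="id"):
    """Return messages for IDs or references that break the prefix convention."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        ident = elem.get(id_attr)
        prefix = ID_PREFIXES.get(elem.tag)
        if ident is not None and prefix is not None and not ident.startswith(prefix):
            problems.append(f"<{elem.tag}> has ID {ident!r}, expected prefix {prefix!r}")
        for (tag, attr), ref_prefix in REF_PREFIXES.items():
            if elem.tag != tag:
                continue
            # Treat the value like IDREFS: a whitespace-separated list of names.
            for ref in (elem.get(attr) or "").split():
                if not ref.startswith(ref_prefix):
                    problems.append(
                        f"<{elem.tag} {attr}={ref!r}> should point at an ID starting with {ref_prefix!r}"
                    )
    return problems


# Hypothetical sample with one bad ID and one reference to the wrong kind of element.
PREFIX_SAMPLE = """<tools>
  <vendor id="V_acme"/>
  <vendor id="acme2"/>
  <product id="P_parser" vendor="V_acme"/>
  <product id="P_editor" vendor="P_parser"/>
</tools>"""

prefix_problems = check_prefix_convention(PREFIX_SAMPLE)
for message in prefix_problems:
    print(message)
```

A check like this only looks at the spelling of the IDs. It says
nothing about whether the referenced element exists, which is still
the validating parser's job.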
For now, the answer seems to be "no; write application code". That _may_
be dealt with systematically when DTDs are phased out in favor of the
Schema language now in development.
> What I really miss for IDREF/IDREFs is a way to further constrain the
> reference. Is there a standard or at least common way to describe that an
> IDREF attribute should only reference elements of a certain type?
>
You could use architectural forms and the General Architecture
described in Annex A.5 of the HyTime standard. Section A.5.5
describes how the type of the referent is constrained with an ireftype
attribute.
See
<URL:http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.5.5.html>
--
Lennart Staflin <lenn...@infotek.no> /*/
STEP Infotek AS, Gjerdrums vei 12, N-0486 Oslo, Norway, http://www.infotek.no/
Will I run into a lot of problems (or none at all) if I try to use such
architectural extensions with XML?
Which validating XML parsers (if any) respect (standard) architectures?
What other extensions in the form of standardized architectures are available?
I think the last question is a bit wide, and I don't expect anyone to answer it
completely. A few links would be helpful.
Thanks in advance, Oliver
They aren't extensions as such, but more of a layer on top of XML.
Architectures simply use existing constructs (processing instructions
and attributes) and then a separate layer on top of the parser does
the transformations.
| Which validating XML parsers (if any) respect (standard)
| architectures?
Do you mean which existing parsers implement them? At the moment only
SP does, but there are two implementations that build on top of SAX as
well:
<URL:http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html#SC_ArchForm>
| What other extensions in the form of standardized architectures are
| available?
Well, in a sense architectures are just normal DTDs, the difference is
just the way the documents are produced.
Some applications are defined as architectures though, and some
examples would be HyTime (ISO 10744) and Topic Maps (ISO 13250).
--Lars M.
> * Oliver Meyer
> | Which validating XML parsers (if any) respect (standard)
> | architectures?
>
> Do you mean which existing parsers implement them? At the moment only
> SP does, but there are two implementations that build on top of SAX as
> well:
>
> <URL:http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html#SC_ArchForm>
>
A Tutorial Introduction to SGML Architectures
<URL:http://www.isogen.com/papers/archintro.html#div-1-2> says: "In
other words, an SGML architecture is nothing more than a bag of rules for
documents, with some of the rules definable using DTD syntax and the rest of
the rules defined using some other mechanism or mechanisms."
The rest of the text simplifies SGML architectures to DTDs. As I understand
them, architectures are more than just DTDs. E.g. clause A.5.5 of the
General Architecture in the HyTime standard
<URL:http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.5.5.html>
constrains documents further in that the type of the target must match the
value of ireftype. That constraint is not expressed in the DTD.
So what I mean is: which parsers enforce which rules from (standard)
architectures that are not expressed (or expressible) in a DTD?
> | What other extensions in the form of standardized architectures are
> | available?
>
> Well, in a sense architectures are just normal DTDs, the difference is
> just the way the documents are produced.
Is this really true? Compare my quote from above.
> Some applications are defined as architectures though, and some
> examples would be HyTime (ISO 10744) and Topic Maps (ISO 13250).
I'll have a further look at these.
Bye, Oliver
>The rest of the text simplifies SGML architectures to DTDs. As I understand
>them, architectures are more than just DTDs. E.g. clause A.5.5 of the
>General Architecture in the HyTime standard
><URL:http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.5.5.html>
>constrains documents further in that the type of the target must match the
>value of ireftype. That constraint is not expressed in the DTD.
Actually, it is. You use (in your DTD) an ireftype attribute definition based on
the form at the end of the clause. As with any attribute, your application (not
the parser) must check the semantics: in this case, the target type constraints.
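
To make "your application must check the semantics" concrete, here is
a rough sketch of such an application-level check. It rests on several
assumptions that are not taken from the standard text: the ireftype
value is read as whitespace-separated pairs of attribute name and
element type name, #ANY accepts any target, and model groups and #ALL
are not handled. The element names and the id and linkend attributes
are hypothetical. In a real DTD ireftype would normally be declared
with a fixed or default value; it is written out in the instance here
so that the example does not depend on a DTD being read.

```python
import xml.etree.ElementTree as ET


def check_referent_types(xml_text, id_attr="id"):
    """Check that each reference points at an element of the type named in ireftype.

    Simplified reading (an assumption): ireftype holds pairs
    "attname elementtype", and "#ANY" accepts any target.
    """
    root = ET.fromstring(xml_text)
    by_id = {e.get(id_attr): e for e in root.iter() if e.get(id_attr) is not None}

    errors = []
    for elem in root.iter():
        spec = elem.get("ireftype")
        if spec is None:
            continue
        tokens = spec.split()
        # Walk the value two tokens at a time: attribute name, wanted element type.
        for attname, wanted in zip(tokens[0::2], tokens[1::2]):
            for ref in (elem.get(attname) or "").split():
                target = by_id.get(ref)
                if target is None:
                    errors.append(f"{elem.tag}/@{attname}: no element with ID {ref!r}")
                elif wanted != "#ANY" and target.tag != wanted:
                    errors.append(
                        f"{elem.tag}/@{attname}: {ref!r} is a <{target.tag}>, expected <{wanted}>"
                    )
    return errors


# Hypothetical sample: the second xref points at a figure instead of a chapter.
IREFTYPE_SAMPLE = """<doc>
  <chapter id="c1"/>
  <figure id="f1"/>
  <xref linkend="c1" ireftype="linkend chapter"/>
  <xref linkend="f1" ireftype="linkend chapter"/>
</doc>"""

ireftype_errors = check_referent_types(IREFTYPE_SAMPLE)
for message in ireftype_errors:
    print(message)
```

Both xref elements pass ordinary IDREF validation, since both targets
exist; only this extra layer notices that the second one points at the
wrong element type.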
The advantage of a standardized architecture (like the General Architecture) is
that of standards in general: it is more likely that common tools will be
available for processing them.
--
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
13075 Paramount Court * Saratoga CA 95070 * USA
International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
Prentice-Hall Series Editor * Definitive XML * Open Information Management
--
> On Fri, 12 Mar 1999 16:43:10 +0100, Oliver Meyer
> <ome...@i3.informatik.rwth-aachen.de> wrote:
>
> >The rest of the text simplifies SGML architectures to DTDs. As I understand
> >them, architectures are more than just DTDs. E.g. clause A.5.5 of the
> >General Architecture in the HyTime standard
> ><URL:http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.5.5.html>
> >constrains documents further in that the type of the target must match the
> >value of ireftype. That constraint is not expressed in the DTD.
>
> Actually, it is. You use (in your DTD) an ireftype attribute definition based on
> the form at the end of the clause. As with any attribute, your application (not
> the parser) must check the semantics: in this case, the target type constraints.
A DTD (aka Document type definition) can be expressed by the document type
declaration + semantics, conventions, etc. (The SGML Handbook, p. 126 lines 22--29
also [4.108] (DTD)). I didn't know that when I wrote the above. When I wrote DTD
above I only meant the formal part expressible in SGML (that is the document type
declaration), and I further constrain myself to an XML document type declaration.
I expect a validating processor to check a document against the declaration and
the constraints enforced by the XML spec. A validating processor will not complain
about a document that uses immediate referent type control and violates the
constraints imposed on the document. That's what I meant by "is not expressed in the
DTD." The additional constraints are only checked by an architecture-specific
validating processor.
> The advantage of a standardized architecture (like the General Architecture) is
> that of standards in general: it is more likely that common tools will be
> available for processing them.
I cannot agree more about the usefulness of standards.
Do these tools (a validating processor) exist for the General Architecture?
Bye, Oliver Meyer
>> Actually, it is. You use (in your DTD) an ireftype attribute definition based on
>> the form at the end of the clause. As with any attribute, your application (not
>> the parser) must check the semantics: in this case, the target type constraints.
>
>A DTD (aka Document type definition) can be expressed by the document type
>declaration + semantics, conventions, etc. (The SGML Handbook, p. 126 lines 22--29
>also [4.108] (DTD)). I didn't know that when I wrote the above. When I wrote DTD
>above I only meant the formal part expressible in SGML (that is the document type
>declaration), and I further constrain myself to an XML document type declaration.
The "document type declaration" includes the markup declarations in its internal
and external declaration subsets. Therefore, all element type and attribute
definition list declarations are part of "the document type declaration" as far
as your quote from The SGML Handbook goes.
Of course it would be best if the parser could enforce ireftype, but then we
would need an "ireftype" markup declaration or parameter. That idea is being
considered for the revision of SGML.