Using XML document with SGML parser

Dominique Besagni

Mar 11, 1999
Hi,

A valid XML document is supposed to be a valid
SGML document. But the following document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Module [
<!ELEMENT Module - O EMPTY >
]>
<Module />

produces this error message with nsgmls:

nsgmls:<OSFD>0:7:9:E: character data is not allowed here

So, is there a trick to make an EMPTY element valid
in both cases?

Dominique Besagni <bes...@inist.fr>

Lars Marius Garshol

Mar 11, 1999

* Dominique Besagni

|
| A valid XML document is supposed to be a valid SGML document.

And so it is.

| <?xml version="1.0" encoding="UTF-8"?>

(Just FYI: in this case you don't need the XML declaration.)

| <!DOCTYPE Module [
| <!ELEMENT Module - O EMPTY >

XML doesn't use the omitted tag minimization parameter '- O', so this
is in fact not a valid XML document.

| produces this error message with nsgmls:
|
| nsgmls:<OSFD>0:7:9:E: character data is not allowed here

Hmmm. This suggests to me that nsgmls is parsing this file as if it
were SGML. See <URL:http://www.jclark.com/sp/xml.htm> for instructions
on how to make it parse it as XML.

| So, is there a trick to make an EMPTY element valid in both cases?

No, it should be straightforward.

--Lars M.
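
The point about '- O' can be checked from the XML side. Below is a minimal Python sketch, assuming only the standard-library xml.etree.ElementTree module. That parser is non-validating, so it only shows that the declaration syntax is rejected, not that the instance is invalid. The helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The document from the original post, with SGML-style '- O' minimization.
SGML_STYLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Module [
<!ELEMENT Module - O EMPTY >
]>
<Module />
"""

# The same document with the minimization parameter removed, as XML requires.
XML_STYLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Module [
<!ELEMENT Module EMPTY >
]>
<Module />
"""


def is_well_formed(document: bytes) -> bool:
    """Return True if the XML parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True
```

Called on the two constants, the helper should accept XML_STYLE and reject SGML_STYLE, since an XML element declaration has no place for the '- O' tokens.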

Oliver Meyer

Mar 12, 1999
If every XML document is a valid SGML document, why do I have to tell the
parser that a specific document is XML? It should be able to parse it as
SGML, shouldn't it?

Bye, Oliver

Lars Marius Garshol wrote:

> * Dominique Besagni

[ deleted sample 'XML' document ]

Dominique Besagni

Mar 12, 1999
[Lars Marius Garshol]

> | A valid XML document is supposed to be a valid SGML document.
>
> And so it is.
>
> | <?xml version="1.0" encoding="UTF-8"?>
>
> (Just FYI: in this case you don't need the XML declaration.)
>
> | <!DOCTYPE Module [
> | <!ELEMENT Module - O EMPTY >
>
> XML doesn't use the omitted tag minimization parameter '- O', so this
> is in fact not a valid XML document.

Thank you for your answer, but I am afraid I was not clear enough in
explaining my problem, which is: if I have a valid XML document (DTD
+ instance), how can I use that same instance with an SGML DTD so that
it is considered a valid SGML document by a parser or an application
that does not work with XML?
That is why there is a tag minimization parameter in my
example.

Dominique Besagni <bes...@inist.fr>

David Carlisle

Mar 12, 1999, to Dominique Besagni

> how can I use that same instance with a SGML DTD so it
> is considered a valid SGML document by a parser or an application
> that does not work with XML ?
> Which explains why there is a tag minimization parameter in my
> example.

You need to use a SGML declaration that sets up your SGML parser
for XML (including _not_ having the - O minimisation information
in the element declaration).

The SGML declaration for XML normally goes under the name xml.dcl.
You may have it already; if not, you can find a copy in James Clark's
distributions at the URL already mentioned.

Note that you need to get a version of xml.dcl that matches your parser.
There is a `WWW' extension to SGML that makes it a bit easier to specify
the syntax changes needed to get the xml <foo/> empty element syntax.
So if you get a recent xml.dcl that makes use of these new declaration
forms you need a new(ish) sgml parser. Alternatively you should be able
to find an older declaration.

Specifically, if your copy of xml.dcl starts with

<!SGML -- SGML Declaration for valid XML documents --
"ISO 8879:1986 (WWW)"


then you need an SGML parser that understands "ISO 8879:1986 (WWW)".


David
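
The check described above (does this xml.dcl need a parser that knows the WWW extension?) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the regular expression assumes the declaration opens with "<!SGML", optional "--" comments, and then the quoted identifier, as in the excerpt quoted above.

```python
import re
from typing import Optional

# "<!SGML", then any number of "-- ... --" comments, then the quoted
# literal that names the version of the standard.
_SGML_DECL_START = re.compile(r'<!SGML\s+(?:--.*?--\s*)*"([^"]*)"', re.DOTALL)


def declared_standard(declaration_text: str) -> Optional[str]:
    """Return the quoted standard identifier, or None if none is found."""
    match = _SGML_DECL_START.search(declaration_text)
    return match.group(1) if match else None


def needs_web_sgml_parser(declaration_text: str) -> bool:
    """True if the declaration asks for the WWW extension."""
    return declared_standard(declaration_text) == "ISO 8879:1986 (WWW)"
```

Reading xml.dcl into a string and passing it to needs_web_sgml_parser tells you which of the two cases you are in before you try the file with an older parser.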

Lars Marius Garshol

Mar 12, 1999

* Oliver Meyer

|
| If every XML document is a valid SGML document, why do I have to
| tell the parser that a specific document is XML? It should be able
| to parse it as SGML, shouldn't it?

Well, SGML is a much more customizable language than XML is by virtue
of having another level of declarations. HTML is completely fixed, XML
allows you to choose element types and attributes and SGML goes even
further by allowing you to tweak the syntax of the language (using the
SGML declaration).

This means stuff like:

- the limits on the lengths of identifiers
- the document character set (you can in fact define your own
  character set by referring to code points in known ones)
- lexical structure (identifiers, tokens, DTD keywords, etc.)
- minimization features
- and so on

So to allow an SGML parser to parse XML as an XML parser would parse
it, it needs an SGML declaration that tells it what is and isn't
allowed. Otherwise it would allow lots of stuff that is allowed in
SGML, but not in XML.

The reason you don't normally have to specify an SGML declaration is
that parsers use the so-called reference concrete syntax by default.

--Lars M.

Oliver Meyer

Mar 12, 1999
How are these constraints specified in SGML? Is there some kind of header I
can include in my XML document to make it an SGML document every(?) SGML
parser will accept and validate?

I tried to use an XML DTD with FrameMaker+SGML and it violates, for example,
the NAMELEN constraint, as the Reference Quantity Set defines NAMELEN as 8
(according to the SGML Handbook by Charles F. Goldfarb).

I might try to build my own SGML declaration to apply to XML documents and
feed it into FrameMaker, but this will take _a lot of_ time and be
error-prone. I hope someone has already done that job and can point me
to it.


Lars Marius Garshol wrote:

Thank you Lars, for this complete and informative answer.
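
The NAMELEN clash mentioned above is easy to spot in advance. Here is a rough Python sketch that lists element names longer than the limit; the function name and the sample DTD are invented for illustration, and it only looks at simple "<!ELEMENT name ...>" declarations (no attribute names, no parameter entities).

```python
import re
from typing import List

# NAMELEN in the reference quantity set, as cited in the post above.
REFERENCE_NAMELEN = 8

# Captures the name that follows "<!ELEMENT" in a simple declaration.
_ELEMENT_NAME = re.compile(r"<!ELEMENT\s+([^\s>]+)")


def names_exceeding_namelen(dtd_text: str,
                            namelen: int = REFERENCE_NAMELEN) -> List[str]:
    """Return declared element names longer than namelen, in document order."""
    return [name for name in _ELEMENT_NAME.findall(dtd_text)
            if len(name) > namelen]


# Hypothetical DTD fragment, not taken from the thread.
SAMPLE_DTD = """
<!ELEMENT Module (Description, Parameter*)>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Parameter EMPTY>
"""

print(names_exceeding_namelen(SAMPLE_DTD))
```

Any name this reports would be rejected by a parser running with the default quantities, which is one reason an SGML declaration that raises NAMELEN is needed for typical XML DTDs.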


Oliver Meyer

Mar 12, 1999
Oliver Meyer wrote:

> How are these constraints specified in SGML? Is there some kind of header I
> can include in my XML document to make it an SGML document every(?) SGML
> parser will accept and validate?

After reading the rest of this thread, I found the answer: xml.dcl, or
an excerpt from http://www.ornl.gov/sgml/wg8/document/1955.htm

Oliver


Charles F. Goldfarb

Mar 14, 1999
On 12 Mar 1999 10:42:41 +0000, David Carlisle <dav...@nag.co.uk> wrote:

>Specifically if your copy of xml.dcl starts off
>
><!SGML -- SGML Declaration for valid XML documents --
> "ISO 8879:1986 (WWW)"
>
>
>then you need an SGML parser that understands "ISO 8879:1986 (WWW)"

BTW, the final spec for "ISO 8879:1986 (WWW)" (aka "Web SGML") can be found at
"http://www.SGMLsource.com/8879rev/n0029.htm". It is now an ISO-approved
addition to the SGML standard.

--
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
13075 Paramount Court * Saratoga CA 95070 * USA
International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
Prentice-Hall Series Editor * Definitive XML * Open Information Management
--

rou...@gatwick.geco-prakla.slb.com

Apr 8, 1999
In article <yg4lnh3...@openmath.nag.co.uk>,

David Carlisle <dav...@nag.co.uk> wrote:
>
> > how can I use that same instance with a SGML DTD so it
> > is considered a valid SGML document by a parser or an application
> > that does not work with XML ?
> > Which explains why there is a tag minimization parameter in my
> > example.
>
> You need to use a SGML declaration that sets up your SGML parser
> for XML (including _not_ having the - O minimisation information
> in the element declaration).
>
> The SGML declaration for XML normally goes under the name xml.dcl.
> You may have it already; if not, you can find a copy in James Clark's
> distributions at the URL already mentioned.

How do you reference this XML declaration?
How do you make the link between your XML file or your XML DTD and this
declaration file?
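
One possible way to make that link, offered as an assumption to check against the SP documentation rather than a confirmed answer from this thread: as far as I know, nsgmls treats the files named on its command line as one concatenated document entity, so naming xml.dcl before the document puts the SGML declaration in front of it. A small Python sketch that builds such a command (the function names and the file names are placeholders):

```python
import subprocess
from typing import List


def nsgmls_command(declaration: str, document: str) -> List[str]:
    """Build an nsgmls command line with the SGML declaration listed first.

    Assumption: nsgmls concatenates its file arguments, so the declaration
    ends up in front of the document. -s suppresses the normal output so
    that only error messages are printed.
    """
    return ["nsgmls", "-s", declaration, document]


def run_nsgmls(declaration: str, document: str) -> int:
    """Run the command and return its exit status (requires nsgmls on PATH)."""
    return subprocess.run(nsgmls_command(declaration, document)).returncode
```

For example, nsgmls_command("xml.dcl", "doc.xml") gives the list to pass to a shell or to subprocess. Nothing in the XML file or the DTD has to change; the declaration is supplied from outside.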
