
validate xml with schema using sax gets a: "Valid documents must have a <!DOCTYPE declaration."


Ray Tayek

Nov 30, 2003, 1:01:10 AM
hi, fooling around with xmlspy (which seems pretty broken when *doing*
xslt's). trying to validate in java (1.4) using code from
http://cermics.enpc.fr/doc/java/j2eetutorial-1.4/doc/JAXPSAX13.html
(click on the Echo10.java link). i get an error saying that a doctype
decl is required (see below). i get the same error whether or not i turn
on name space awareness. spy says this file is well formed and valid.
and my java code that transforms it with the same .xslt works as
expected (all the files are at http://tayek.com/~ray/spy1/). the spy
starts the xml doc that i am trying to validate with:

<?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSPY v2004 rel. 3
(http://www.xmlspy.com)-->
<inputDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="H:\java\projects\spy1\spy\inputDocument.xsd">
<special>Text</special>
<header>
<inputFieldName>String</inputFieldName>

while the .xsd file starts with:

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSPY v2004 rel. 3 U (http://www.xmlspy.com) by Ray
Tayek (Freightgate) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="inputDocument">
<xs:annotation>
<xs:documentation>input csv converted to xml</xs:documentation>
</xs:annotation>

what i would like to do is to validate the xml doc against its schema
before applying the transform (in java). i have not yet tried the dom
example at
http://cermics.enpc.fr/doc/java/j2eetutorial-1.4/doc/JAXPDOM9.html as i
thought the sax would be a bit more lightweight.

i just started using spy (i had java code that does the transform just
fine). but i used spy to generate the new .xsd and sample xml. so maybe
he put some proprietary junk in there or something?

afaict, doctype is for dtd's, but sax is complaining about a missing
doctype.

anybody got a clue?

thanks


output from Echo10.java when trying to validate the xml doc:

LOCATOR
SYS ID: file:H:/java/projects/spy1/spy/inputDocument.xml
START DOCUMENT
<?xml version='1.0' encoding='UTF-8'?>** Warning, line 4, uri
file:H:/java/projects/spy1/spy/inputDocument.xml
Valid documents must have a <!DOCTYPE declaration.
** Parsing error, line 4, uri
file:H:/java/projects/spy1/spy/inputDocument.xml
Element type "inputDocument" is not declared.
org.xml.sax.SAXParseException: Element type "inputDocument" is not declared.
    at org.apache.crimson.parser.Parser2.error(Parser2.java:3160)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1322)
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:500)
    at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:281)
    at V.main(V.java:31)

Andrew Thompson

Nov 30, 2003, 1:52:48 AM
"Ray Tayek" <rtay...@spam.comcast.net> wrote in message
news:GIfyb.360196$HS4.3010339@attbi_s01...

> hi, fooling around with xmlspy (which seems pretty broken when *doing*
> xslt's). trying to validate in java (1.4) using code from
> http://cermics.enpc.fr/doc/java/j2eetutorial-1.4/doc/JAXPSAX13.html
> (click on the Echo10.java link). i get an error saying that a doctype
> decl is required (see below). i get the same error whether or not i turn

Did you lose your shift key, Ray?
...


> Valid documents must have a <!DOCTYPE declaration.

Try this as your 1st line..
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

And for reference, an URL would be _so_ much easier.

--
Andrew Thompson
* http://www.PhySci.org/ PhySci software suite
* http://www.1point1C.org/ 1.1C - Superluminal!
* http://www.AThompson.info/andrew/ personal site


Ray Tayek

Nov 30, 2003, 2:27:56 AM
Andrew Thompson wrote:

> "Ray Tayek" <rtay...@spam.comcast.net> wrote in message
> news:GIfyb.360196$HS4.3010339@attbi_s01...
>
>>

>>... trying to validate in java (1.4) using code from


>>http://cermics.enpc.fr/doc/java/j2eetutorial-1.4/doc/JAXPSAX13.html
>>(click on the Echo10.java link). i get an error saying that a doctype
>>decl is required (see below). i get the same error whether or not i turn
>
>
> Did you lose your shift key, Ray?
> ...

no, it's just a *very* old habbit.


>
>> Valid documents must have a <!DOCTYPE declaration.
>
>
> Try this as your 1st line..
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>

spy says these are still valid and this works fine and i just get a few
warnings:

..Declared encoding "UTF-8" does not match actual one "Cp1252"; this
might not be an error.
Using original entity definition for "&quot;".
Using original entity definition for "&amp;".
Using original entity definition for "&lt;".
Using original entity definition for "&gt;".
Using original entity definition for "&apos;".
Time: 6.21

now i can try to write the xslt that will write that xslt :)

but i really do want to produce xml, not html or xhtml (i think) as this
will be processed only by an xslt.

> And for reference, an URL would be _so_ much easier.

i assume you mean:
xsi:noNamespaceSchemaLocation="H:\java\projects\spy1\spy\inputDocument.xsd"?

if so, i agree as spy complains when i move stuff around. what would the
url for that look like?

thanks for your assistance!
---
ray tayek http://tayek.com/ actively seeking mentoring or telecommuting work
vice chair orange county java users group http://www.ocjug.org/
hate spam? http://samspade.org/ssw/

Ray Tayek

Nov 30, 2003, 2:38:44 AM
Andrew Thompson wrote:
...

> > Try this as your 1st line..
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>

is there something similar for my .xsd's? it would be nice to do some
validation on them?

thanks

Andrew Thompson

Nov 30, 2003, 2:40:44 AM
"Ray Tayek" <rtay...@spam.comcast.net> wrote in message
news:0_gyb.366660$Fm2.366085@attbi_s04...
> Andrew Thompson wrote:
...

> > Did you lose your shift key, Ray?
> > ...
>
> no, it's just a *very* old habbit.

Hobbits are very old, whereas habits are just ingrained. ;-)

> >> Valid documents must have a <!DOCTYPE declaration.
> >
> >
> > Try this as your 1st line..
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> >
>
> spy says these are still valid and this works fine and i just get a few
> warnings:

...


> but i really do want to produce xml, not html or xhtml (i think) as this
> will be processed only by an xslt.

(shrugs vaguely) I was really just guessing anyway.

The only thing I knew was that valid web pages
have a DOCTYPE, I chose that one at random,
from the list the validator supplies in its report
page, viz.
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.1point1c.org%2F&charset=%
28detect+automatically%29&doctype=%28detect+automatically%29

> > And for reference, an URL would be _so_ much easier.
>
> i assume you mean:
>
xsi:noNamespaceSchemaLocation="H:\java\projects\spy1\spy\inputDocument.xsd"?

No! I meant an URL on the net where we could
see your page break. It is much easier than transcribing
errors to Usenet posts.

In an ideal world you would put a regular index.html
file that links to the URL in question, the source files etc.

Ray Tayek

Nov 30, 2003, 2:59:26 AM
Andrew Thompson wrote:
> "Ray Tayek" <rtay...@spam.comcast.net> wrote in message
> news:0_gyb.366660$Fm2.366085@attbi_s04...
>
>>Andrew Thompson wrote:
>
...
>
>>but i really do want to produce xml, not html or xhtml (i think) as this
>> will be processed only by an xslt.
>
>
> (shrugs vaguely) I was really just guessing anyway.

i found a list at: http://www.w3.org/QA/2002/04/valid-dtd-list.html#full
- yours seems to be the best fit.
>
> ...
> http://validator.w3.org/check?uri=http%3A%2F%2Fwww.1point1c.org%2F&charset=%
> 28detect+automatically%29&doctype=%28detect+automatically%29
>

nice site, thanks for the link


>
>>>And for reference, an URL would be _so_ much easier.
>>

>>i assume you mean: ...
>>
> No! I meant an URL on the net...

oh, sure, here: http://tayek.com/~ray/spy1/

thanks

Andrew Thompson

Nov 30, 2003, 3:22:59 AM
"Ray Tayek" <rtay...@spam.comcast.net> wrote in message
news:yrhyb.167099$Dw6.655869@attbi_s02...
...
> >..I meant an URL on the net...

>
> oh, sure, here: http://tayek.com/~ray/spy1/

Aha, yep, _that's_ what I'm talking about!

Now hopefully the experts will jump in and answer
the more technical aspects of your question. :-)

Ray Tayek

Nov 30, 2003, 3:44:27 AM
Andrew Thompson wrote:
> "Ray Tayek" <rtay...@spam.comcast.net> wrote in message
> news:yrhyb.167099$Dw6.655869@attbi_s02...
> ...
>
>>>..I meant an URL on the net...
>>
>>oh, sure, here: http://tayek.com/~ray/spy1/
>
>
> Aha, yep, _that's_ what I'm talking about!
>
> Now hopefully the experts will jump in and answer
> the more technical aspects of your question. :-)
>

that would be cool.

unfortunately, i lied. the "success" that i quoted last time was because
i was running the wrong program. i needed to run the validator (not the
testCase). so it still thinks the inputDocument is not defined. there is
no mention of the .xsd, so maybe it's a typo somewhere (sigh). i will
put the validation calls in the test case so i don't forget them again.

thanks.

LOCATOR
SYS ID: file:H:/java/projects/spy1/spy/inputDocument.xml
START DOCUMENT
<?xml version='1.0' encoding='UTF-8'?>** Warning, line 25, uri
http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
Using original entity definition for "&quot;".
** Warning, line 26, uri http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
Using original entity definition for "&amp;".
** Warning, line 27, uri http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
Using original entity definition for "&lt;".
** Warning, line 28, uri http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
Using original entity definition for "&gt;".
** Warning, line 29, uri http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
Using original entity definition for "&apos;".
** Parsing error, line 5, uri
file:H:/java/projects/spy1/spy/inputDocument.xml
Element type "inputDocument" is not declared.
org.xml.sax.SAXParseException: Element type "inputDocument" is not declared.
    at org.apache.crimson.parser.Parser2.error(Parser2.java:3160)
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java:1322)
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:500)
    at org.apache.crimson.parser.Parser2.parse(Parser2.java:305)
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:442)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:345)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:281)
    at V.main(V.java:31)


--

Chris Smith

Nov 30, 2003, 6:39:57 PM
Ray,

First of all, the suggestion to specify an HTML doctype was not a
particularly good one (sorry, Andrew!). Not only is your document not
HTML at all (and thus you're lying to the parser), but the XHTML DTD's
entity files redeclare XML's predefined entities anyway, hence the
warnings you're getting when trying to use it! You want to remove the DTD declaration, and then
fix the problem that's causing the parser to expect a DTD declaration.

Furthermore, the equally important (if not as obvious) issue is that the
validator is probably not validating against the XML schema anyway!
Again, you need to fix the real problem.
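
To illustrate why a DOCTYPE gets demanded in the first place, here is a minimal sketch (not the V.java from the thread, which isn't shown). It assumes that calling setValidating(true) with no schema language set asks the parser for DTD validation only. The class name DtdOnlyValidation and the helper problems() are made up for illustration, and the exact wording of the complaints depends on which parser JAXP hands out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class DtdOnlyValidation {

    /** Parses the document with validation on but no schema language set,
     *  and returns whatever warnings and errors the parser reports. */
    static List<String> problems(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // on its own, this requests DTD validation
        SAXParser parser = factory.newSAXParser();

        final List<String> found = new ArrayList<String>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override public void warning(SAXParseException e) {
                found.add("warning: " + e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                found.add("error: " + e.getMessage());
            }
        });
        return found;
    }

    public static void main(String[] args) throws Exception {
        // No DOCTYPE and no schema configured: expect DTD-related complaints.
        String xml = "<inputDocument><special>Text</special></inputDocument>";
        for (String p : problems(xml)) {
            System.out.println(p);
        }
    }
}
```

If this prints DTD-related complaints for a document that has no DOCTYPE, the parser is doing DTD validation and never looks at the .xsd.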

The real problem appears to be (just a guess here, since I don't have
much information) that you're using a SAX parser that doesn't know about
XML schema, and/or that you're not configuring it properly to use schema
anyway. You didn't read that web page you pointed out very well,
because you grabbed the code which simply sets the parser to
"validating" from the very beginning of the page, and you didn't do ANY
of the things that are mentioned later in the page, about setting the
parser to be namespace-aware (which mystically appears in your code, but
as the predicate of an "if (false)" statement), setting the
schemaLanguage property, and so forth.

I'd suggest you follow the instructions from the page of the Java
Tutorial first, and if that doesn't work, look into the possibility that
you have a parser configured that's not schema-aware at all. It is
possible to change the parser that's used by JAXP.
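
A rough sketch of the setup described above, assuming a schema-aware parser is what JAXP returns: namespace awareness and validation are turned on, then the schemaLanguage property is set before schemaSource. The two property URIs and the W3C schema URI are the standard JAXP ones; the class name SchemaValidation, the helper validate() and the tiny SAMPLE_XSD are invented for illustration and are not the inputDocument.xsd from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class SchemaValidation {
    static final String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String JAXP_SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // A tiny stand-in schema (assumption, not the poster's real .xsd).
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='inputDocument'><xs:complexType><xs:sequence>"
      + "<xs:element name='special' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Validates xml against xsd and returns the reported validation errors. */
    static List<String> validate(String xml, String xsd) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // schema validation needs namespaces
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();

        // Throws SAXNotRecognizedException if the parser is not schema-aware.
        parser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
        // The schema language has to be set before the schema source.
        parser.setProperty(JAXP_SCHEMA_SOURCE,
            new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));

        final List<String> errors = new ArrayList<String>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override public void error(SAXParseException e) {
                errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate(
            "<inputDocument><special>Text</special></inputDocument>", SAMPLE_XSD));
        System.out.println(validate(
            "<inputDocument><bogus/></inputDocument>", SAMPLE_XSD));
    }
}
```

With a schema-aware parser, the first call should report nothing and the second should report at least one error. A parser that is not schema-aware would be expected to reject the first setProperty call instead.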

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation

Ray Tayek

Dec 1, 2003, 4:11:54 AM
Chris Smith wrote:
>... Not only is your document not HTML at all ...
> ... You want to remove the DTD declaration, and then
> fix the problem that's causing the parser to expect a DTD declaration.
>
ok

> Furthermore, the equally important (if not as obvious) issue is that the
> validator is probably not validating against the XML schema anyway!
> Again, you need to fix the real problem.
>
> The real problem appears to be (just a guess here, since I don't have
> much information) that you're using a SAX parser that doesn't know about
> XML schema, and/or that you're not configuring it properly to use schema
> anyway.

i will check this.

> You didn't read that web page you pointed out very well,
> because you grabbed the code which simply sets the parser to
> "validating" from the very beginning of the page, and you didn't do ANY
> of the things that are mentioned later in the page, about setting the
> parser to be namespace-aware (which mystically appears in your code, but
> as the predicate of an "if (false)" statement), setting the
> schemaLanguage property, and so forth.

was trying it both ways to see if it made any difference. i grabbed
echo10 and sorta assumed it would do the right thing. it is not. and
doing a: saxParser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
throws a not recognized (sigh).


>
> I'd suggest you follow the instructions from the page of the Java
> Tutorial first, and if that doesn't work, look into the possibility that
> you have a parser configured that's not schema-aware at all. It is
> possible to change the parser that's used by JAXP.
>

i was just grabbing whatever the factory made.

Sudsy

Dec 1, 2003, 8:24:38 AM
Ray Tayek wrote:
<snip>

> was trying it both ways to see if it made any difference. i grabbed
> echo10 and sorta assumed it would do the right thing. it is not. and
> doing a: saxParser.setProperty(JAXP_SCHEMA_LANGUAGE,
> W3C_XML_SCHEMA);
> throws a not recognized (sigh).
<snip>

> ray tayek http://tayek.com/ actively seeking mentoring or telecommuting
> work
> vice chair orange county java users group http://www.ocjug.org/
> hate spam? http://samspade.org/ssw/

You might want to check out Oracle's xdk if you're trying to validate
against an XML Schema. I've got some sample code available for the
asking.

Chris Smith

Dec 1, 2003, 11:27:56 AM
Ray Tayek wrote:
> was trying it both ways to see if it made any difference. i grabbed
> echo10 and sorta assumed it would do the right thing. it is not. and
> doing a: saxParser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
> throws a not recognized (sigh).

If I'm interpreting the above correctly, you're saying that the
setProperty call throws a SAXNotRecognizedException. If that's the
case, then you definitely have a parser that does not support XML
schema. You can obtain a parser that does validate against schema, and
then use it for the JAXP parser. For example, Xerces2-J is such a
parser (http://xml.apache.org/xerces2-j/index.html).

The method of informing JAXP to use a different SAX parser is described
in the API docs for SAXParserFactory.newInstance.
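
As a hedged sketch of how one might check this: the snippet below prints which factory class JAXP picked and probes whether the resulting parser accepts the schemaLanguage property. The class name ParserProbe and the helper supportsXmlSchema() are made up for illustration. The system property javax.xml.parsers.SAXParserFactory is the standard JAXP lookup key, and org.apache.xerces.jaxp.SAXParserFactoryImpl is the Xerces2-J factory class, which only works if the Xerces jar is on the classpath.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical helper class, for illustration only.
class ParserProbe {
    static final String JAXP_SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    /** Returns true if the parser JAXP hands out accepts the schemaLanguage property. */
    static boolean supportsXmlSchema() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        try {
            parser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            return true;
        } catch (SAXNotRecognizedException e) {
            return false; // the parser does not know the property at all
        } catch (SAXNotSupportedException e) {
            return false; // the parser knows it but rejects this value
        }
    }

    public static void main(String[] args) throws Exception {
        // To select another parser, start the JVM with something like:
        //   -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl
        System.out.println("factory: "
            + SAXParserFactory.newInstance().getClass().getName());
        System.out.println("schema-aware: " + supportsXmlSchema());
    }
}
```

A factory name starting with org.apache.crimson together with "schema-aware: false" would match the symptoms in this thread.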

Ray Tayek

Dec 1, 2003, 12:06:13 PM
Sudsy wrote:
> Ray Tayek wrote:
> <snip>
>
>> was trying it both ways to see if it made any difference. i grabbed
>> echo10 and sorta assumed it would do the right thing. it is not. and
>> doing a: saxParser.setProperty(JAXP_SCHEMA_LANGUAGE,
>> W3C_XML_SCHEMA);
>> throws a not recognized (sigh).
>
...

>
>
> You might want to check out Oracle's xdk if you're trying to validate
> against an XML Schema. I've got some sample code available for the
> asking.
>

i'll check it out.

thanks
---

Ray Tayek

Dec 2, 2003, 9:17:51 AM
Chris Smith wrote:

> Ray Tayek wrote:
>
>>was trying it both ways to see if it made any difference. i grabbed
>>echo10 and sorta assumed it would do the right thing. it is not. and
>>doing a: saxParser.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
>>throws a not recognized (sigh).
>
>
> If I'm interpreting the above correctly, you're saying that the
> setProperty call throws a SAXNotRecognizedException. If that's the
> case, then you definitely have a parser that does not support XML
> schema. You can obtain a parser that does validate against schema, and
> then use it for the JAXP parser. For example, Xerces2-J is such a
> parser (http://xml.apache.org/xerces2-j/index.html).
>
> The method of informing JAXP to use a different SAX parser is described
> in the API docs for SAXParserFactory.newInstance.
>

using a xalan at work, i am getting further (a different error message :)
