How to ignore the <!DOCTYPE> part when parsing an XML file


Johann LEGAYE

Jun 14, 2012, 10:09:47 AM
to zorba...@googlegroups.com
Hello everyone,

I'm trying to parse XML files with code like this:

  parse-xml:parse("<from1>Jani</from1><from2>Jani</from2><from3>Jani</from3>",
     <opt:options>
       <opt:base-uri opt:value="urn:test"/>
       <opt:parse-external-parsed-entity/>
     </opt:options>
  )


But when the XML file starts with a <!DOCTYPE> declaration, I get an internal error:
dynamic error [err:FODC0006]: invalid content passed to parse-xml:parse(): loader parsing error: internal error

How can I ignore the <!DOCTYPE> declaration when parsing?

Thanks for your help,

Johann
William Candillon

Jun 14, 2012, 10:17:41 AM
to zorba...@googlegroups.com
Hello Johann,

Thanks for raising this issue.
At the moment, parse-xml:parse() with this option only accepts XML external parsed entities as input.
We're currently working on a fix for this bug.

Is there a workaround you can use in the meantime?

Kind regards,

William


Chris Hillery

Jun 14, 2012, 8:10:22 PM
to Zorba Users
Nicolae gave a slightly longer answer to this question on the original
zorba-users mailing list, which I will copy here for reference:

----
While the error you receive is not very useful (and we'll improve it),
the reason you are getting it is that the option
<opt:parse-external-parsed-entity/> requires the input to be a
well-formed external parsed entity, and an external parsed entity
cannot contain a DOCTYPE declaration.

But I agree that skipping such declarations would be useful, and we
will add an option to do exactly that. In the meantime, you can either
omit the <opt:parse-external-parsed-entity/> option or remove the
first line of the input before parsing. To do the latter, read the
file as text with the read-text() function from the File module (see
the documentation here:
http://www.zorba-xquery.com/html/modules/expath/file#read-text-1 ).
----
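
For reference, here is a rough sketch of the second workaround: strip the DOCTYPE from the text before handing it to parse-xml:parse(). It is not a tested recipe. The two import URIs are my assumption of the Zorba 2.x module and schema namespaces, the sample input and the $input and $stripped variable names are made up, and the regular expression only handles a simple DOCTYPE with no internal subset (no [ ... ] part).

```xquery
(: Assumed Zorba 2.x namespaces for the XML module and its options schema. :)
import module namespace parse-xml = "http://www.zorba-xquery.com/modules/xml";
import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";

(: Hypothetical sample input. In practice the text could come from
   file:read-text() in the File module mentioned above. :)
declare variable $input as xs:string :=
  '<!DOCTYPE note SYSTEM "note.dtd"><from1>Jani</from1><from2>Jani</from2>';

(: Remove a simple DOCTYPE declaration. This pattern stops at the first ">",
   so it does not cope with an internal subset. :)
declare variable $stripped as xs:string :=
  replace($input, "<!DOCTYPE[^>]*>", "");

(: Parse the remaining text as an external parsed entity, as in the
   original question. :)
parse-xml:parse($stripped,
  <opt:options>
    <opt:base-uri opt:value="urn:test"/>
    <opt:parse-external-parsed-entity/>
  </opt:options>
)
```

Using replace() rather than dropping the first line means the DOCTYPE does not have to sit on a line of its own, but a DOCTYPE with an internal subset would still need a more careful pattern.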

Ceej
aka Chris Hillery