Validating with nested schema and namespaces

Moritz Schepp

Jan 8, 2016, 8:37:12 AM
to nokogiri-talk
Hey guys,

I'm trying to validate this document with nokogiri against its XSD at http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd:

<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH
  xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
                      http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"
>
  <responseDate>2016-01-08T12:37:55Z</responseDate>
  <request verb="GetRecord">http://test.host/api/oai_pmh/kinds</request>
  <GetRecord>
    <record>
      <header>
        <identifier>12345</identifier>
        <datestamp>2016-01-08T12:37:55Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc
          xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
                              http://www.openarchives.org/OAI/2.0/oai_dc.xsd"
        >
          <dc:identifier>12345</dc:identifier>
          <dc:type>Type</dc:type>
          <dc:type>Kind</dc:type>
          <dc:type>EntityType</dc:type>
          <dc:description>some description</dc:description>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>

My first question is: can I tell nokogiri to figure out the XSD schema itself, or do I always have to provide it by manually creating a Nokogiri::XML::Schema?

But more importantly, the validation doesn't work. I use this code:

require 'httpclient'
require 'nokogiri'

# doc_string holds the XML document shown above
xsd_response = HTTPClient.new.get "http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"
xsd = Nokogiri::XML::Schema(xsd_response.body)
doc = Nokogiri::XML(doc_string)
xsd.validate(doc).each do |error|
  puts "#{error.line} :: #{error.message}"
end


and I keep getting:

23 :: Element '{http://www.openarchives.org/OAI/2.0/oai_dc/}dc': No matching global element declaration available, but demanded by the strict wildcard.

Doesn't this mean that nokogiri doesn't find the xmlns:dc definition on the oai_dc:dc element?

The validator at http://www.validome.org/xml/validate/ tells me that the doc is valid. Do you have an idea how I could validate it with nokogiri? This would be nice so I can integrate the validation into some unit tests.

Mike Dalessio

Jan 10, 2016, 11:43:33 PM
to nokogiri-talk
Hi Moritz,

Some context: Nokogiri has two different implementations: one is built on libxml2 (MRI/CRuby) and the other on Xerces (JRuby). This document and schema fail with the same error under both. Here's what JRuby's Nokogiri says:

 :: cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'oai_dc:dc'.

Also worth noting is that if you ask validome to validate the schema at http://www.validome.org/grammar/ there are multiple errors as well.

Googling for the libxml2 error message gives this result:


which is a question like yours about a schema that includes

<xs:any namespace="##other" />

which matches the `metadataType` definition in the OAI-PMH schema, and that appears to be what triggers the error. The reply to that post indicates that the cause may be that multiple schemas are necessary to fully validate the grammar.





Moritz Schepp

Jan 12, 2016, 7:10:25 PM
to nokogiri-talk
Hey Mike,

thanks a lot for taking the time and for the thorough explanation! With the suggested workaround, our tests validate nicely.

I'm surprised though that this is a limitation of libxml, because it sounds a bit like a "wontfix".

BTW, I just passed

to 

and they both produced no errors. How exactly did you get any errors?

But in any case, we have a working solution. Thank you again.