how do I debug a cryptic XML error?

Laws

Feb 15, 2022, 7:50:12 PM
to Clojure
So, I went to the government NVD website:


I downloaded the CPE Dictionary and unpacked it. It looks like standard XML. 

I copied the standard example from the Clojure XML documentation page and pointed it at the file:

 cpe-dictionary (-> "official-cpe-dictionary_v2.3.xml" io/resource io/file clj-xml/parse zip/xml-zip)

I get:

                                     clojure.xml/startparse-sax                            xml.clj:  76
                                                            ...
       jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke  DelegatingMethodAccessorImpl.java:  43
           jdk.internal.reflect.NativeMethodAccessorImpl.invoke      NativeMethodAccessorImpl.java:  77
          jdk.internal.reflect.NativeMethodAccessorImpl.invoke0       NativeMethodAccessorImpl.java
    com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse                 SAXParserImpl.java: 317
java.lang.IllegalArgumentException


I'm wondering, how do I figure out what is wrong here? I'm going to assume the government is offering reasonably standard XML, so where would the problem arise? How do I figure out a way around this? 
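
One way to narrow this down is to check what the input resolves to before it reaches the parser. The sketch below rests on an assumption about the likely cause, not a confirmed diagnosis: `io/resource` returns nil when the name is not on the classpath, `io/file` passes that nil through, and the parser then receives nil instead of a file. `describe-source` is a hypothetical helper name, not part of any library.

```clojure
(require '[clojure.java.io :as io])

(defn describe-source
  "Hypothetical helper: reports how a file name resolves, so the input
  can be checked before it is handed to an XML parser."
  [file-name]
  (let [res (io/resource file-name)   ; nil when the name is not on the classpath
        f   (io/file file-name)]      ; relative names resolve against the working directory
    {:on-classpath?  (some? res)
     :exists-in-cwd? (.exists f)
     :absolute-path  (.getAbsolutePath f)}))

;; A name that is not on the classpath yields nil all the way down the pipeline.
(def missing-resource (io/file (io/resource "no-such-file-xyz.xml")))

(println (describe-source "official-cpe-dictionary_v2.3.xml"))
```

If `:on-classpath?` comes back false, the original `->` pipeline was most likely passing nil to `parse`, and the XML itself may be fine.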

The CPE dictionary is 386 megabytes, so I can't share the whole file here, but when I run "head" on it, the beginning looks like this:


<?xml version='1.0' encoding='UTF-8'?>
<cpe-list xmlns:config="http://scap.nist.gov/schema/configuration/0.1" xmlns="http://cpe.mitre.org/dictionary/2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:scap-core="http://scap.nist.gov/schema/scap-core/0.3" xmlns:cpe-23="http://scap.nist.gov/schema/cpe-extension/2.3" xmlns:ns6="http://scap.nist.gov/schema/scap-core/0.1" xmlns:meta="http://scap.nist.gov/schema/cpe-dictionary-metadata/0.2" xsi:schemaLocation="http://scap.nist.gov/schema/cpe-extension/2.3 https://scap.nist.gov/schema/cpe/2.3/cpe-dictionary-extension_2.3.xsd http://cpe.mitre.org/dictionary/2.0 https://scap.nist.gov/schema/cpe/2.3/cpe-dictionary_2.3.xsd http://scap.nist.gov/schema/cpe-dictionary-metadata/0.2 https://scap.nist.gov/schema/cpe/2.1/cpe-dictionary-metadata_0.2.xsd http://scap.nist.gov/schema/scap-core/0.3 https://scap.nist.gov/schema/nvd/scap-core_0.3.xsd http://scap.nist.gov/schema/configuration/0.1 https://scap.nist.gov/schema/nvd/configuration_0.1.xsd http://scap.nist.gov/schema/scap-core/0.1 https://scap.nist.gov/schema/nvd/scap-core_0.1.xsd">
  <generator>
    <product_name>National Vulnerability Database (NVD)</product_name>
    <product_version>4.9</product_version>
    <schema_version>2.3</schema_version>
    <timestamp>2022-01-25T04:50:56.780Z</timestamp>
  </generator>
  <cpe-item name="cpe:/a:%240.99_kindle_books_project:%240.99_kindle_books:6::~~~android~~">
    <title xml:lang="en-US">$0.99 Kindle Books project $0.99 Kindle Books (aka com.kindle.books.for99) for android 6.0</title>

Laws

Feb 15, 2022, 10:44:13 PM
to Clojure

I changed the code a bit:

        cpe-dictionary (-> "official-cpe-dictionary_v2.3.xml"
                           (java.io.StringReader.)
                           (xml/parse))

        xmlzipper (clojure.zip/xml-zip cpe-dictionary)

Now I get this:

                                          clojure.data.xml/parse                   xml.clj:  84
                                          clojure.data.xml/parse                   xml.clj: 109
                                clojure.data.xml.tree/event-tree                  tree.clj:  70
                                             clojure.core/ffirst                  core.clj: 105
                                              clojure.core/first                  core.clj:  55
                                                             ...
                               clojure.data.xml.tree/seq-tree/fn                  tree.clj:  39
                                                clojure.core/seq                  core.clj: 139
                                                             ...
                          clojure.data.xml.jvm.parse/pull-seq/fn                 parse.clj:  78
com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next  XMLStreamReaderImpl.java: 652
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
                                     Message: Content is not allowed in prolog.
    location: #object[com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl$1 0x296bfddb "Line number = 1\nColumn number = 1\nSystem Id = null\nPublic Id = null\nLocation Uri= null\nCharacterOffset = 0\n"]
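
The [row,col]:[1,1] position is a useful hint. A `java.io.StringReader` reads the characters of the string it is given; it does not open a file with that name. So the parser is handed the text of the file name itself, which starts with `o` rather than `<`, and it gives up on the very first character. A minimal sketch using only core Clojure (no XML library involved; `file-name` and `reader-content` are illustrative names):

```clojure
(import '(java.io StringReader))

(def file-name "official-cpe-dictionary_v2.3.xml")

;; Read everything the StringReader produces. This is the input the
;; XML parser would have seen in the snippet above.
(def reader-content (slurp (StringReader. file-name)))

;; The reader's content is the file name itself, not the file's content.
(println (= reader-content file-name))
(println (first reader-content))
```

To read the file's content, the name has to go through something that opens the file, such as `java.io.FileInputStream` or `clojure.java.io/reader`.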




Laws

Feb 15, 2022, 11:10:12 PM
to Clojure
Okay, this seemed to fix the problem:


        cpe-dictionary (-> "official-cpe-dictionary_v2.3.xml"
                           (java.io.FileInputStream.)
                           (xml/parse))


        xmlzipper (clojure.zip/xml-zip cpe-dictionary)

        xmlnode (-> xmlzipper
                    zip/down
                    zip/right
                    zip/node)
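
For anyone who wants to try the zipper navigation without downloading the 386 MB file, here is a small sketch of the same `zip/down`, `zip/right`, `zip/node` walk. It makes two assumptions: it uses the built-in `clojure.xml` instead of `clojure.data.xml` so that it runs without extra dependencies, and `sample-xml` is a made-up, namespace-free stand-in for the start of the CPE dictionary.

```clojure
(require '[clojure.xml :as cxml]
         '[clojure.zip :as zip])
(import '(java.io ByteArrayInputStream))

;; Made-up miniature of the dictionary layout: a <generator> element
;; followed by one <cpe-item>.
(def sample-xml
  (str "<cpe-list>"
       "<generator><product_name>NVD</product_name></generator>"
       "<cpe-item name=\"example\"/>"
       "</cpe-list>"))

(def second-child
  ;; with-open closes the stream once parsing is done; clojure.xml/parse
  ;; builds the whole tree eagerly, so nothing is read after the close.
  (with-open [in (ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))]
    (-> (cxml/parse in)
        zip/xml-zip
        zip/down     ; first child of the root: <generator>
        zip/right    ; its right sibling: <cpe-item>
        zip/node)))

(println (:tag second-child))
```

With this layout, `zip/down` lands on `<generator>` and `zip/right` moves to the first `<cpe-item>`, which is what `xmlnode` should hold for the real file if its structure matches the `head` output shown earlier.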

