SAX Parser

Roy Mendelssohn - NOAA Federal

Aug 5, 2025, 5:21:43 PM
to 'Chris John - NOAA Affiliate' via ERDDAP
More recent versions of ERDDAP™ have introduced a SAX-based parser for the xml files (from the change log it looks like the parser was first added in ERDDAP™ version 2.25), as an alternative to the hand-rolled one that was in previous versions of ERDDAP™ and is still in the present version as an option. In the long run the SAX-based version is preferred, and there may come a time when it is the only option. The SAX parser is stricter than the original parser, so while your ERDDAP™ xml files may work now, they could throw errors when switched to the SAX parser.

In recent versions of ERDDAP™ the SAX parser is turned on in the setup.xml file by including:

<useSaxParser>true</useSaxParser>

I believe in the most recent versions this is “true” by default, so if you haven’t turned off the SAX parser and your xml files aren’t throwing any errors for versions 2.25 and later you are good to go.


What can you do to prepare if this isn’t the case? Attached is a Java program that uses essentially the same SAX parser, so it should produce similar results. It is important to use this program because I have found that regular xml validators don’t catch some of these problems, and even xmllint with the “sax” option does not find all of them. The SAX parser seems to stop on the first error, and I couldn’t find a way around that, so the attached code works in two steps. The first step just looks for two of the most common problems: a bare & rather than &amp;, and “--” inside comments. The second step runs the SAX parser until it fails. If and when the program makes it all the way through, you should be good to go. The program prints the line number and column of each error as well as the type of error.
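
To illustrate the two-step idea, here is a minimal sketch. It is not the attached EnhancedXMLValidatorNew.java: the class name SaxPrecheckSketch, the method names, and the regular expression are assumptions made for illustration. It also uses the JDK's default SAX parser, which may not be configured exactly the way ERDDAP™ configures its own, so treat its output as a hint only.

```java
import java.io.File;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of a two-step check; not the program attached to this post.
public class SaxPrecheckSketch {
    // Matches an '&' that does not begin a named entity or a character reference.
    private static final Pattern BARE_AMP = Pattern.compile(
        "&(?!(?:[A-Za-z_:][A-Za-z0-9_.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Step 1: report every bare '&' and every comment containing "--".
    public static List<String> precheck(String xml) {
        List<String> problems = new ArrayList<>();
        Matcher amp = BARE_AMP.matcher(xml);
        int line = 1;
        int i = 0;
        while (i < xml.length()) {
            if (xml.startsWith("<!--", i)) {
                int end = xml.indexOf("-->", i + 4);
                int stop = (end < 0) ? xml.length() : end;
                String body = xml.substring(i + 4, stop);
                if (body.contains("--") || body.endsWith("-")) {
                    problems.add("line " + line + ": \"--\" inside a comment");
                }
                int next = (end < 0) ? xml.length() : end + 3;
                line += countNewlines(xml, i, next);
                i = next;
            } else if (xml.startsWith("<![CDATA[", i)) {
                // '&' is legal inside CDATA, so skip the whole section.
                int end = xml.indexOf("]]>", i);
                int next = (end < 0) ? xml.length() : end + 3;
                line += countNewlines(xml, i, next);
                i = next;
            } else {
                char c = xml.charAt(i);
                if (c == '&' && amp.region(i, xml.length()).lookingAt()) {
                    problems.add("line " + line + ": bare '&' (use &amp;)");
                }
                if (c == '\n') line++;
                i++;
            }
        }
        return problems;
    }

    private static int countNewlines(String s, int from, int to) {
        int n = 0;
        for (int k = from; k < to; k++) if (s.charAt(k) == '\n') n++;
        return n;
    }

    // Step 2: run the JDK SAX parser; returns null if the input is well-formed,
    // otherwise a description of the first error (the parser stops there).
    public static String saxParse(InputSource source) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java SaxPrecheckSketch /path/to/your/xml/file");
            return;
        }
        File file = new File(args[0]);
        // ISO-8859-1 maps every byte to a char, which is enough to scan for ASCII markers.
        String text = new String(Files.readAllBytes(file.toPath()), StandardCharsets.ISO_8859_1);
        for (String p : precheck(text)) System.out.println("precheck: " + p);
        // Let the parser read the file itself so it honors the declared encoding.
        String err = saxParse(new InputSource(file.toURI().toString()));
        System.out.println(err == null ? "SAX parse: no errors" : "SAX parse: " + err);
    }
}
```

The first step reports all occurrences it finds in one pass, while the second step can only report the first failure, which is why fixing the precheck findings first tends to save repeated runs.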

If you save the attached file as EnhancedXMLValidatorNew.java, you can compile and run it with:

javac EnhancedXMLValidatorNew.java
java EnhancedXMLValidatorNew /path/to/your/xml/file

Also, if you know Java and can improve this program, please do. It works for me, so I thought I would share it. And yes, you can do something similar in other languages, but the reason for using Java is that it is what ERDDAP™ uses, so it should produce similar results.

-Roy

EnhancedXMLValidatorNew.java

Chris John - NOAA Affiliate

Aug 6, 2025, 9:49:57 AM
to Roy Mendelssohn - NOAA Federal, 'Chris John - NOAA Affiliate' via ERDDAP
Just a note: the SAX parser is only enabled in the current version if your setup.xml includes:

<useSaxParser>true</useSaxParser>

It does not default to enabled in the code.

There are some features only available with the SAX parser, so I do encourage you to migrate to it if you can.

Chris

--
Christopher John (he/him)
NOAA Appointed Technical Director of ERDDAP™
Computer and Information Systems Manager, TSPi


