We are glad to announce the 0.9 release of Crawler-Commons. See the CHANGES.txt file included with the release for the full list of changes. The main change is the removal of the DOM-based sitemap parser, as the SAX equivalent introduced in the previous version performs better and is more robust.
You might need to change your code to replace SiteMapParserSAX with SiteMapParser. The parser is now aware of namespaces and, by default, does not force the namespace to be the one recommended in the specification (http://www.sitemaps.org/schemas/sitemap/0.9), as variants can be found in the wild. You can set the behaviour using the method setStrictNamespace(boolean).
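To illustrate what strict versus lenient namespace handling means in practice, here is a minimal sketch built only on the JDK's SAX API. It is not the Crawler-Commons implementation and does not use its classes: the class name NamespaceAwareSitemapDemo, the method extractUrls and the example.com namespace variant are all hypothetical, chosen for illustration. The boolean flag plays a role similar to the one described for setStrictNamespace(boolean).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Crawler-Commons.
public class NamespaceAwareSitemapDemo {
    // Namespace recommended by the sitemaps.org specification.
    public static final String SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Collects the text of every <loc> element. When strictNamespace is true,
    // only <loc> elements in SITEMAP_NS are accepted; otherwise any namespace is.
    public static List<String> extractUrls(String xml, boolean strictNamespace) {
        List<String> urls = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside an accepted <loc>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                boolean namespaceOk = !strictNamespace || SITEMAP_NS.equals(uri);
                if ("loc".equals(localName) && namespaceOk) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("loc".equals(localName) && text != null) {
                    urls.add(text.toString().trim());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required to receive uri and localName
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse sitemap", e);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sitemap using a made-up, non-standard namespace (an assumption for this demo).
        String variant = "<urlset xmlns=\"http://example.com/schemas/sitemap/custom\">"
                + "<url><loc>https://example.com/</loc></url></urlset>";
        System.out.println(extractUrls(variant, false)); // lenient: URL is extracted
        System.out.println(extractUrls(variant, true));  // strict: nothing is extracted
    }
}
```

In lenient mode the sketch accepts the sitemap despite its non-standard namespace, while strict mode ignores it. This is the trade-off behind a lenient default: more sitemaps found in the wild are usable, at the cost of accepting documents that do not follow the specification exactly.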
As usual, version 0.9 contains numerous improvements and bug fixes, and all users are invited to upgrade.