About a year ago, I published a set of benchmarks covering CSS parsing with the library, combined with three different SAC parsers. The most interesting part was about the relative speed of the different SAC implementations: css4j, Batik, and Steadystate cssparser.
It seemed a good idea to revisit those benchmarks with newer versions of the library, so I ran them for 0.41 and 0.41.4 (under a different operating system: it was Windows 10 then, and it is Ubuntu 18.04 now).
With css4j 0.41, Batik 1.9 and cssparser 0.9.24:
# Run complete. Total time: 00:41:23
Benchmark                                       Mode  Cnt     Score   Error  Units
MyBenchmark.markParseCSSStyleSheet             thrpt  200   234,921 ± 1,721  ops/s
MyBenchmark.markParseCSSStyleSheetBatik        thrpt  200   429,943 ± 5,025  ops/s
MyBenchmark.markParseCSSStyleSheetSSParser     thrpt  200    62,109 ± 0,324  ops/s
MyBenchmark.markSACParseCSSStyleSheet          thrpt  200   405,453 ± 2,480  ops/s
MyBenchmark.markSACParseCSSStyleSheetBatik     thrpt  200  1033,349 ± 7,297  ops/s
MyBenchmark.markSACParseCSSStyleSheetSSParser  thrpt  200    73,461 ± 0,343  ops/s
The results for SAC (the last three lines) are consistent with those in last year's post, although css4j's NSAC implementation (fourth line) seems to be a bit faster. Now comes the output with css4j 0.41.4, Batik 1.9 and cssparser 0.9.26. Note that the benchmarks repeatedly parse the default HTML style sheet shipped with css4j, and that 9 rules in it were uncommented for 0.41.3 (see commit e28f1014):
# Run complete. Total time: 00:41:25
Benchmark                                       Mode  Cnt     Score   Error  Units
MyBenchmark.markParseCSSStyleSheet             thrpt  200   218,350 ± 2,407  ops/s
MyBenchmark.markParseCSSStyleSheetBatik        thrpt  200   757,871 ± 5,454  ops/s
MyBenchmark.markParseCSSStyleSheetSSParser     thrpt  200    56,070 ± 0,568  ops/s
MyBenchmark.markSACParseCSSStyleSheet          thrpt  200   376,131 ± 3,567  ops/s
MyBenchmark.markSACParseCSSStyleSheetBatik     thrpt  200  1094,449 ± 6,861  ops/s
MyBenchmark.markSACParseCSSStyleSheetSSParser  thrpt  200    58,470 ± 1,430  ops/s
That is a throughput decrease of about 7% for css4j's NSAC implementation (the only parser able to process the newly uncommented rules), while, interestingly, Batik's SAC parser seems to be about 6% faster. Cssparser, which now emits several error messages during the benchmark, is instead about 20% slower at the SAC level. These are two diverging ways of dealing with parse errors.
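For reference, the relative changes between the two runs can be recomputed from the SAC-level scores reported above. The following is only a small sketch: the class name `ThroughputChange` and the helper `percentChange` are made up for illustration and are not part of the benchmark project, and the calculation ignores the reported error margins.

```java
import java.util.Locale;

public class ThroughputChange {

    /** Relative change from {@code before} to {@code after}, in percent. */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // SAC-level scores (ops/s) copied from the two runs in this post.
        String[] parsers = {"css4j (NSAC)", "Batik", "cssparser"};
        double[] firstRun = {405.453, 1033.349, 73.461};
        double[] secondRun = {376.131, 1094.449, 58.470};

        for (int i = 0; i < parsers.length; i++) {
            double change = percentChange(firstRun[i], secondRun[i]);
            // Locale.ROOT forces a decimal point regardless of the system locale.
            System.out.println(String.format(Locale.ROOT, "%-13s %+.1f%%", parsers[i], change));
        }
    }
}
```

With these inputs the changes come out at roughly -7.2% for css4j, +5.9% for Batik and -20.4% for cssparser. Keep in mind that the style sheet and the cssparser version both differ between the two runs, so the percentages describe the runs as a whole, not one isolated cause.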
Perhaps a new CSS sample should be used for future benchmarks, one that all the tested parsers are able to parse, but the behaviour is interesting nonetheless. After all, real-world style sheets carry styles that neither Batik nor cssparser can process.