I've been doing some parsing research.
I've set up a jsperf that tries to test various ways of parsing input,
character by character, and acting accordingly.
http://jsperf.com/parsing-tests/4
This is a tricky test because it covers several things at once. It's
not a micro-benchmark of the kind that compares ++x with x++. Here's
what I've tried to cover:
"Get char from string", as string or number
  (these are tested on all comparison tests, where applicable)
  - str[index]
  - str.charAt(index)
  - str.charCodeAt(index)
  (these are only tested once, with the if-else, for completeness)
  - str.substring(index, index+1)
  - str.substr(index, 1)
  - str.slice(index, index+1)
  - implicit coercion string comparison
  - dynamic regex: new RegExp('.{index}(.)')[1] match and exec
  - regex per group
  - str.split('')
  - split foreach
  - foreach.apply(str, func)
  - str.replace
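For readers who don't want to open the jsperf page, here is a minimal sketch of what the more direct access methods above look like side by side. The sample input and index are made up for illustration, and the regex uses [\s\S] instead of . so that it also skips newlines; this is not the exact code from the test page.

```javascript
// Hypothetical sample input and position; any string and index would do.
var input = 'var x = 1;';
var index = 4;

// Direct access, as a string or as a number.
var byBracket = input[index];          // 'x'
var byCharAt = input.charAt(index);    // 'x'
var byCode = input.charCodeAt(index);  // 120

// Substring-style access: always yields a one-character string here.
var bySubstring = input.substring(index, index + 1);
var bySubstr = input.substr(index, 1);
var bySlice = input.slice(index, index + 1);

// Dynamic regex: skip `index` characters, then capture the next one.
// [\s\S] is an assumption made here so that newlines are matched too.
var dynamicRegex = new RegExp('^[\\s\\S]{' + index + '}([\\s\\S])');
var byExec = dynamicRegex.exec(input)[1];
var byMatch = input.match(dynamicRegex)[1];

// Split the whole input into single characters first, then index.
var bySplit = input.split('')[index];
```

Only charCodeAt gives back a number; every other variant produces a one-character string, which is why the comparison side of each test has to differ accordingly.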
"Is char part of a set?"
  - switch
  - if-else
  - object as hash
  - array as hash
  - in object
  - one big regex
  - array.indexOf
  - string.indexOf
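The membership strategies above can be sketched roughly as follows. The set used here (space, tab, line feed, carriage return) and all function names are assumptions picked for illustration; the actual test page may check a different set of characters.

```javascript
// Hypothetical set of char codes: space (32), tab (9), LF (10), CR (13).

function viaIfElse(c) {
  if (c === 32) return true;
  else if (c === 9) return true;
  else if (c === 10) return true;
  else if (c === 13) return true;
  else return false;
}

function viaSwitch(c) {
  switch (c) {
    case 32: case 9: case 10: case 13: return true;
    default: return false;
  }
}

// Object as hash: property lookup, or the `in` operator.
var objectHash = { 32: true, 9: true, 10: true, 13: true };
function viaObjectHash(c) { return objectHash[c] === true; }
function viaIn(c) { return c in objectHash; }

// Array as hash: a sparse array indexed by char code.
var arrayHash = [];
arrayHash[32] = arrayHash[9] = arrayHash[10] = arrayHash[13] = true;
function viaArrayHash(c) { return arrayHash[c] === true; }

// One regex for the whole set; this one takes the character, not the code.
var setRegex = /[ \t\n\r]/;
function viaRegex(ch) { return setRegex.test(ch); }

// indexOf on an array of codes, or on a string of characters.
var codeList = [32, 9, 10, 13];
function viaArrayIndexOf(c) { return codeList.indexOf(c) !== -1; }
var charList = ' \t\n\r';
function viaStringIndexOf(ch) { return charList.indexOf(ch) !== -1; }
```

Note that the regex and string.indexOf variants need the character as a string, while the others are written against the number from charCodeAt, so the "get char" choice and the "is in set" choice are not fully independent.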
The results are certainly interesting, though it's fairly safe to say
that the "manual if-else" case wins.
If you have any other relevant tests to add in either category, please
tell me and I'll add them. I want to be as complete as possible on
this one.
In case you didn't recognize the main gist of the test, it's to
determine the start of the next token in JS source (ex the identifiers
and punctuators).
Preliminary results show that the manual if-else case is the best way
to go for char-by-char parsing. You would expect switch to perform
similarly, but that really depends on the platform being tested. I
don't see a clear overall second place.
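To make the winning combination concrete, here is a minimal sketch of char-by-char scanning with charCodeAt and a manual if-else chain. The function name nextTokenStart and the choice to skip only four whitespace characters are assumptions for illustration; a real JS tokenizer has to deal with comments, more whitespace and line terminator code points, and so on.

```javascript
// Hypothetical helper: returns the index of the first character at or
// after `start` that is not one of the four skipped whitespace codes,
// or input.length if only whitespace remains.
function nextTokenStart(input, start) {
  var len = input.length;
  var i = start;
  while (i < len) {
    var c = input.charCodeAt(i);
    if (c === 32) ++i;        // space
    else if (c === 9) ++i;    // tab
    else if (c === 10) ++i;   // line feed
    else if (c === 13) ++i;   // carriage return
    else break;               // anything else starts the next token
  }
  return i;
}

var pos = nextTokenStart('a  \n b', 1); // 5, the index of 'b'
```

Ordering the branches so that the most frequent character comes first is one plausible reason an if-else chain can do well here, but that is a guess and not something the test measures directly.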
Oddities:
Safari on the iPad (2 and 3) as well as on Windows (5.1) prefers using
an array as a hash (by number). Kind of surprising, since that makes a
sparse array with length=0xfff that contains only four elements. Safari
on the Mac does not seem to have this bias.
IE9 performed pretty poorly (on an idle, decent machine). IE10 did
much better, although it clearly optimizes some cases far more than
others.
Firefox needs to work on switches.
JSperf needs bigger charts if there are many test cases :p
Any additions, comments and test runs are most welcome. Blog post coming soon.
Thanks for your help,
- peter