RegExp in Dart is extremely slow

Gary Taylor

Apr 17, 2015, 9:37:06 AM
to w...@dartlang.org
I'm converting a Java application that parses log files using various regex patterns.  Part of the application feeds a number of log messages into a series of regexes to see which messages match which regex.  In my test, I've got 68 messages feeding into 279 regexes.  Each message takes about 6 seconds to run through the "hasMatch" function for all the regexes.  6 seconds times 68 messages means it is taking over 6 minutes for this process.  This is not acceptable.  Here is the essential Dart code:

RegExp re = new RegExp(pattern);
bool matched = re.hasMatch(message); 
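
For anyone wanting to reproduce the measurement, a minimal sketch of the kind of loop described above might look like the following. The `patterns` and `messages` lists and the `matchingPatterns` helper are hypothetical stand-ins, not the original application's code. Timings depend on the machine and the Dart version, so none are shown.

```dart
// Hypothetical stand-ins for the real 279 patterns and 68 log messages.
final List<String> patterns = [r'ERROR \d+', r'^WARN', r'timeout after \d+ms'];
final List<String> messages = ['ERROR 42 disk full', 'INFO started'];

/// Returns the indexes of the patterns that match [message].
/// A new RegExp is built for every pattern on every call, as in the post.
List<int> matchingPatterns(String message) {
  final List<int> hits = <int>[];
  for (int i = 0; i < patterns.length; i++) {
    final RegExp re = new RegExp(patterns[i]);
    if (re.hasMatch(message)) {
      hits.add(i);
    }
  }
  return hits;
}

void main() {
  for (final String message in messages) {
    // Time one message against every pattern.
    final Stopwatch watch = new Stopwatch()..start();
    final List<int> hits = matchingPatterns(message);
    watch.stop();
    print('${watch.elapsedMilliseconds} ms, matched $hits: $message');
  }
}
```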

I decided to see how slow JavaScript was.  I replaced the Dart RegExp code with a call to a JavaScript function.  Each message now only takes about 0.06 seconds to run through the "test" function for all the regexes.  0.06 seconds times 68 messages means it is taking about 4 seconds for this process.  For an interactive program, this is acceptable.  Here is the essential Dart and JavaScript code:

Dart:
var matched = new JsObject(context['jsRegex']).callMethod('test', [message, pattern]); 

JavaScript:
var jsRegex = function() {
  this.test = function(message, pattern) {
    var re = new RegExp(pattern);
    var hasMatch = re.test(message);
    return hasMatch;
  };
}; 
 
If I'm not doing anything wrong, then the Dart team really needs to work on RegExp performance. 

Günter Zöchbauer

Apr 17, 2015, 10:03:49 AM
to w...@dartlang.org
Are you running the Dart code in a console or a browser app?

Gary Taylor

Apr 17, 2015, 10:08:36 AM
to w...@dartlang.org
It is a browser app.  I've been running my test code in Dartium, from Dart Editor v1.9.1.

Jacob Macdonald

Apr 17, 2015, 10:08:48 AM
to w...@dartlang.org
Which version of the SDK are you using? I haven't tested it myself, but the 1.9 release promises performance that is "up to 150x faster" than the previous implementation, which would bring it in line with the JavaScript performance you are seeing.

Gary Taylor

Apr 17, 2015, 10:55:45 AM
to w...@dartlang.org
I was running Dart Editor v1.9.1, which has dart-sdk version 1.9.1 in it.  When I choose "Run in Dartium" on my main html file, I assume it also uses dart-sdk v1.9.1.

BTW, I just updated to Dart Editor v1.9.3.  I'm still getting the same slow RegExp performance.  

Is it possible the new RegExp engine isn't really there yet?  Or could I have an old dart-sdk hiding somewhere?  I scanned my system and didn't find any old "dart-sdk" folders.

Another BTW, when I choose "Run as JavaScript", the performance is very close to my own JavaScript testing times.  This means for our production app, I can still use the Dart RegExp and I don't have to use the "jsRegex workaround".

Srdjan Mitrovic

Apr 17, 2015, 12:13:40 PM
to w...@dartlang.org
It would be very helpful if you could file a bug with a small benchmark that demonstrates the bad performance. We will then look at it ASAP.

Thanks,

- Srdjan 

Günter Zöchbauer

Apr 17, 2015, 2:33:38 PM
to w...@dartlang.org
This only matters during development and testing, because once the app is built to JS it uses the JS RegExp implementation anyway.

Gary Taylor

Apr 17, 2015, 3:53:18 PM
to w...@dartlang.org
I boiled it down to the bare minimum and opened bug issue 23249.

Seth Ladd

Apr 17, 2015, 3:55:58 PM
to web
On Fri, Apr 17, 2015 at 12:53 PM, Gary Taylor <gtay...@gmail.com> wrote:
> I boiled it down to the bare minimum and opened bug issue 23249.


Thank you very much for filing the issue!
 

Stephen Adams

Apr 17, 2015, 4:23:57 PM
to w...@dartlang.org
I can't read the zip file from my chromebook so I'm not sure what the real code is.
Don't do 'new RegExp' in the loop.  The regexp has a once-per-RegExp cost associated with making the regexp suitable for JavaScript. You are paying this 'compile' cost each time.



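// Build the RegExp once, outside the loop or method that uses it.
final RegExp re1 = new RegExp(pattern1);
...
method(message) {
  // Reuse the already-built RegExp for every message.
  bool matched = re1.hasMatch(message);
}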



Jacob Macdonald

Apr 17, 2015, 4:37:22 PM
to w...@dartlang.org
I tried not doing the new RegExp in the loop but got the same results.

Gary Taylor

Apr 18, 2015, 8:51:27 AM
to w...@dartlang.org
Looking at all the comments here and in the Dart Bug issue 23249, I restructured my test program to "pre-compile" all the pattern strings into a list of RegExp objects.  Then, I pass each message to the pre-compiled objects.  This time, the slow performance was ONLY on the FIRST message.  Subsequent messages went very, very fast.  I can live with this kind of performance, but running in JavaScript doesn't have this "first time slowness" problem.  It would be nice if Dart didn't either.
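
The restructuring described above could be sketched roughly as follows. The lists, the `compiled` variable, and the `matchingPatterns` helper are hypothetical names chosen for illustration, not the actual test program. Printing the elapsed time per message is one way to see whether only the first message is slow.

```dart
// Hypothetical stand-ins for the real patterns and log messages.
final List<String> patterns = [r'ERROR \d+', r'^WARN', r'timeout after \d+ms'];
final List<String> messages = [
  'ERROR 42 disk full',
  'WARN timeout after 30ms',
  'INFO started',
];

// Build every RegExp once, up front, instead of once per message.
final List<RegExp> compiled =
    patterns.map((String p) => new RegExp(p)).toList();

/// Returns the indexes of the pre-built RegExp objects that match [message].
List<int> matchingPatterns(String message) {
  final List<int> hits = <int>[];
  for (int i = 0; i < compiled.length; i++) {
    if (compiled[i].hasMatch(message)) {
      hits.add(i);
    }
  }
  return hits;
}

void main() {
  for (final String message in messages) {
    // Per-message timing makes any first-message slowness visible.
    final Stopwatch watch = new Stopwatch()..start();
    final List<int> hits = matchingPatterns(message);
    watch.stop();
    print('${watch.elapsedMicroseconds} us, matched $hits: $message');
  }
}
```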

Kenneth Endfinger

Apr 19, 2015, 11:35:13 PM
to w...@dartlang.org
It's a trade-off: a slower first match buys super fast subsequent matches. I would rather have the first one be slow than have all of them be slightly less optimized.