Regex speed


stpr...@gmail.com

Aug 19, 2014, 4:24:26 PM
to golan...@googlegroups.com
I am a complete newb to golang so please forgive the ignorance in my question if it's already been answered recently.

For performance reasons we are considering converting a large library of regex modules written in Perl to Go. Are there any updated benchmarks that compare regex speed in Perl vs. Go? From what I've read, Go was slower than Perl in this regard, but everything I can find seems to be at least a couple of years old (with the note that golang is new and regex hasn't been "optimized" yet).

Has regex performance in Go been optimized yet, and is it significantly better than Perl's? Thanks!

Andy Balholm

Aug 19, 2014, 5:47:00 PM
to stpr...@gmail.com, golan...@googlegroups.com
Regex performance in Go has improved somewhat compared to what it was, but I don’t think there has been a concentrated effort on it.

On many regular expressions, Perl’s regex engine is quite a bit faster. But some regexes take ridiculously long in Perl, and match quite quickly in Go. See http://swtch.com/~rsc/regexp/regexp1.html for details. The author of that article has written two regular expression libraries: RE2 (in C) and the Go regexp package. Both of them use the approach advocated in the article, but RE2 has many more fast paths and low-level performance tweaks.
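To make the "ridiculously long in Perl, quick in Go" case concrete, here is a minimal sketch of the pattern family used in the linked article: n optional a's followed by n required a's, matched against a string of n a's. The helper names (pathological, matchPathological) are made up for illustration, and the timings printed depend entirely on your machine, so none are quoted here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// pathological builds the pattern a?a?...aa...a (n optional a's followed by
// n required a's) and the input aa...a (n a's).
func pathological(n int) (pattern, input string) {
	return strings.Repeat("a?", n) + strings.Repeat("a", n), strings.Repeat("a", n)
}

// matchPathological compiles the anchored pattern with Go's regexp package
// and reports whether it matches the input.
func matchPathological(n int) bool {
	pattern, input := pathological(n)
	re := regexp.MustCompile("^" + pattern + "$")
	return re.MatchString(input)
}

func main() {
	for _, n := range []int{10, 20, 30} {
		start := time.Now()
		matched := matchPathological(n)
		fmt.Printf("n=%d matched=%v elapsed=%v\n", n, matched, time.Since(start))
	}
}
```

Running the equivalent pattern through a backtracking engine for growing n is one way to see whether your own Perl regexes are anywhere near this corner.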

If regular expressions are consuming most of your program's run time, switching from Perl to Go likely won’t help you much on performance, unless you are hitting the exponential-time corners of Perl’s regex algorithm. The solution is probably to figure out how to do what you’re doing without such heavy use of regular expressions. At one point I did a benchmark comparing the regexp package, RE2, and simple string functions like strings.Contains. I don’t remember the exact numbers, but it was something like: RE2 was 10 times as fast as the regexp package, and simple string functions were 10 times as fast as RE2. (Of course this was in Go; Perl is a slow enough language that it is often faster to send the work to the regex engine, which is written in C, than to do the work in Perl.) Rob Pike made some comments about the advantages of simple string manipulation over regexes at http://commandcenter.blogspot.com/2011/08/regular-expressions-in-lexing-and.html.
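A rough way to repeat that kind of comparison yourself, as a sketch only: the snippet below counts lines containing a fixed substring once with the regexp package and once with strings.Contains, and times both. The sample data, the ERROR marker, and the function names are all invented for the example. It is not the original benchmark, it leaves RE2 out, and the ratio you see will depend on the pattern, the input, and the Go version.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// errorRE looks for a fixed substring, so strings.Contains can do the same job.
var errorRE = regexp.MustCompile(`ERROR`)

// countRegexp counts the lines matched by errorRE.
func countRegexp(lines []string) int {
	n := 0
	for _, l := range lines {
		if errorRE.MatchString(l) {
			n++
		}
	}
	return n
}

// countContains counts the lines containing "ERROR" without any regexp.
func countContains(lines []string) int {
	n := 0
	for _, l := range lines {
		if strings.Contains(l, "ERROR") {
			n++
		}
	}
	return n
}

// sampleLines builds hypothetical log lines; every tenth one contains ERROR.
func sampleLines(n int) []string {
	lines := make([]string, n)
	for i := range lines {
		if i%10 == 0 {
			lines[i] = fmt.Sprintf("line %d: ERROR something failed", i)
		} else {
			lines[i] = fmt.Sprintf("line %d: all good", i)
		}
	}
	return lines
}

func main() {
	lines := sampleLines(200000)

	start := time.Now()
	a := countRegexp(lines)
	regexpTime := time.Since(start)

	start = time.Now()
	b := countContains(lines)
	containsTime := time.Since(start)

	fmt.Printf("regexp:           %d matches in %v\n", a, regexpTime)
	fmt.Printf("strings.Contains: %d matches in %v\n", b, containsTime)
}
```

For numbers you intend to rely on, the testing package's benchmark support (go test -bench) is a better tool than hand-rolled timing like this.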

Even if your user interface is based on regular expressions, there are various tricks you can do to avoid firing up the regex engine so often. One example is http://swtch.com/~rsc/regexp/regexp4.html.
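One simple trick in that spirit, sketched under assumptions: if you know a literal string that every match must contain, check for it with strings.Contains first and only run the regexp when it is present. This is a much simpler idea than the index described in the linked article, and the literal here is supplied by hand rather than derived from the pattern. The prefiltered type and newPrefiltered are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// prefiltered pairs a regexp with a literal that every match must contain.
// The caller is responsible for choosing a literal that really is required.
type prefiltered struct {
	re      *regexp.Regexp
	literal string
}

// newPrefiltered compiles pattern and records the required literal.
func newPrefiltered(pattern, literal string) prefiltered {
	return prefiltered{re: regexp.MustCompile(pattern), literal: literal}
}

// MatchString skips the regexp engine entirely when the literal is absent.
func (p prefiltered) MatchString(s string) bool {
	if !strings.Contains(s, p.literal) {
		return false
	}
	return p.re.MatchString(s)
}

func main() {
	p := newPrefiltered(`user=\w+ action=(login|logout)`, "action=")
	for _, s := range []string{
		"user=bob action=login",
		"user=bob action=delete",
		"nothing interesting here",
	} {
		fmt.Printf("%q -> %v\n", s, p.MatchString(s))
	}
}
```

Whether this pays off depends on how often the literal is absent; if nearly every input contains it, the extra scan is pure overhead.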



San

Aug 19, 2014, 5:53:15 PM
to golan...@googlegroups.com
It depends.
Regex in Go is (a lot) faster in some cases because it doesn't use a backtracking engine like traditional Perl regex.
For the same reason, Go's regex has fewer features than Perl's.
You may want to checkout
(Please note the (NOT SUPPORT) notice.)

I don't think you can convert lots of Perl regexes to Go. (I tried that, but you should try it yourself.)
One notable missing feature is lookaround, e.g. (?!something).
If you can't convert them, everything else is out of the question. But if you can, try it and see for yourself whether a backtracking engine or Go's approach is faster for your use case.
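A quick way to check a pattern before committing to a port is to see whether regexp.Compile accepts it. The sketch below does that for a negative lookahead and then shows one possible hand-written workaround for that specific case: find each "foo" and inspect the following text yourself. The patterns and function names are made-up examples, and real Perl regexes may need a very different rewrite.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

var fooRE = regexp.MustCompile(`foo`)

// fooNotFollowedByBar emulates the Perl pattern foo(?!bar) by finding each
// "foo" and checking by hand that the text after it does not start with "bar".
func fooNotFollowedByBar(s string) bool {
	for _, loc := range fooRE.FindAllStringIndex(s, -1) {
		if !strings.HasPrefix(s[loc[1]:], "bar") {
			return true
		}
	}
	return false
}

func main() {
	// The lookahead is rejected at compile time; the error is printed as-is.
	_, err := regexp.Compile(`foo(?!bar)`)
	fmt.Println("compile error:", err)

	for _, s := range []string{"foobar", "foobaz", "foobar foo"} {
		fmt.Printf("%q -> %v\n", s, fooNotFollowedByBar(s))
	}
}
```

Running compiles over the whole library of Perl patterns would give a first count of how many need rewriting.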

Andrew Gerrand

Aug 20, 2014, 9:30:41 AM
to stpr...@gmail.com, golang-nuts
Go's regexp package isn't as fast as PCRE. (Andy provided some good background as to why this is so.)

But if you need PCRE in Go, it's only a "go get" away: http://godoc.org/github.com/mathpl/golang-pkg-pcre/src/pkg/pcre

Andrew



voidlogic

Aug 20, 2014, 11:11:08 AM
to golan...@googlegroups.com, stpr...@gmail.com


On Wednesday, August 20, 2014 8:30:41 AM UTC-5, Andrew Gerrand wrote:
Go's regexp package isn't as fast as PCRE. (Andy provided some good background as to why this is so.)

But if you need PCRE in Go, it's only a "go get" away: http://godoc.org/github.com/mathpl/golang-pkg-pcre/src/pkg/pcre

Andrew



For what it's worth, this is the PCRE binding I see Go projects using most often: https://github.com/glenn-brown/golang-pkg-pcre
 

stpr...@gmail.com

Aug 20, 2014, 9:25:31 PM
to golan...@googlegroups.com
Another, related question. If I have, say, 200 regexes to process could I speed it up with golang? Each regex is independent of the other and right now I run them in perl, one after the other. Can I run these in parallel in golang?

San

Aug 20, 2014, 10:57:46 PM
to golan...@googlegroups.com
On Thursday, August 21, 2014 8:25:31 AM UTC+7, stpr...@gmail.com wrote:
Another, related question. If I have, say, 200 regexes to process could I speed it up with golang? Each regex is independent of the other and right now I run them in perl, one after the other. Can I run these in parallel in golang?
 
If you mean using goroutines to process the regexes concurrently, I don't think it will help that much.
Regex processing is CPU-bound, with no I/O blocking (so there is no waiting time for goroutine switching to fill).
At best you only gain some performance from spreading the work across the cores of a multi-core machine.
Also, even though goroutines are cheap, they're not free.
Just my 2c.
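For reference, a minimal sketch of what spreading independent regexes across cores could look like. It assumes the patterns are already compiled and that each one is matched against the same input; matchAll and compileAll are hypothetical helper names. A compiled *regexp.Regexp is documented as safe for concurrent use by multiple goroutines. Note that Go releases before 1.5 ran goroutines on a single core unless GOMAXPROCS was raised; since Go 1.5 it defaults to the number of CPUs.

```go
package main

import (
	"fmt"
	"regexp"
	"runtime"
	"sync"
)

// compileAll compiles every expression, panicking on an invalid one.
func compileAll(exprs []string) []*regexp.Regexp {
	res := make([]*regexp.Regexp, len(exprs))
	for i, e := range exprs {
		res[i] = regexp.MustCompile(e)
	}
	return res
}

// matchAll runs every compiled pattern against input using a fixed number of
// worker goroutines and reports, per pattern, whether it matched.
func matchAll(patterns []*regexp.Regexp, input string, workers int) []bool {
	if workers < 1 {
		workers = 1
	}
	results := make([]bool, len(patterns))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each job writes only to its own slot, so no lock is needed.
				results[i] = patterns[i].MatchString(input)
			}
		}()
	}

	// Hand out pattern indexes, then wait for the workers to drain them.
	for i := range patterns {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	patterns := compileAll([]string{`a+`, `z+`, `b.d`})
	results := matchAll(patterns, "aaa bcd", runtime.NumCPU())
	for i, ok := range results {
		fmt.Printf("%s -> %v\n", patterns[i], ok)
	}
}
```

The pool is capped at the number of workers passed in because the work is CPU-bound: more goroutines than cores adds scheduling cost without adding throughput. For short inputs, the channel hand-off may cost more than the matching itself, so measure before assuming a speedup.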

egon

Aug 21, 2014, 1:51:53 AM
to golan...@googlegroups.com


On Thursday, 21 August 2014 04:25:31 UTC+3, stpr...@gmail.com wrote:
Another, related question. If I have, say, 200 regexes to process could I speed it up with golang? Each regex is independent of the other and right now I run them in perl, one after the other. Can I run these in parallel in golang?


You might be better off explaining which regexes you are running. There might be better algorithms for detecting those patterns.

If you want speed, D currently has one of the fastest regex engines.

+ egon