Slow regex execution?


Olivia Nelson

Oct 10, 2017, 4:43:16 AM
to mozilla-rhino
It seems that Rhino processes the following JavaScript slowly compared to PCRE:

var path = '../../../../../../../../../../../../../../'
path.replace(/\/\.\//g, '').replace(/\/+/g, '/')

I'm wondering how regex replace is implemented in Rhino. Does it use Java's "Pattern" class?


Gregory Brail

Oct 10, 2017, 7:27:51 PM
to mozill...@googlegroups.com
Rhino generates Java bytecode for Regex instead of using the java.util package. (The code base is pretty old.)

I've wondered whether it'd be faster to use the java.util stuff but I don't think it's an easy job. But it'd be awesome if someone had time to take a look...
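For anyone who wants a java.util.regex baseline to compare against, here is a minimal sketch of the same two replacements using Pattern directly. The class name, method name, and iteration count are made up for illustration, and the timing loop is only a rough measurement, not a proper benchmark. Note that the sample path from the original post contains neither "/./" nor a run of slashes, so both replacements leave it unchanged; the cost being measured is the scan itself.

```java
import java.util.regex.Pattern;

public class PathNormalizeBaseline {
    // The same two patterns as the JavaScript snippet, compiled once.
    private static final Pattern DOT_SEGMENT = Pattern.compile("/\\./");
    private static final Pattern SLASH_RUN = Pattern.compile("/+");

    // Mirrors path.replace(/\/\.\//g, '').replace(/\/+/g, '/').
    static String normalize(String path) {
        String withoutDotSegments = DOT_SEGMENT.matcher(path).replaceAll("");
        return SLASH_RUN.matcher(withoutDotSegments).replaceAll("/");
    }

    public static void main(String[] args) {
        String path = "../../../../../../../../../../../../../../";
        int iterations = 100_000; // arbitrary count, chosen for illustration

        // Warm-up pass so the JIT has a chance to compile the hot path.
        for (int i = 0; i < iterations; i++) {
            normalize(path);
        }

        long start = System.nanoTime();
        String result = null;
        for (int i = 0; i < iterations; i++) {
            result = normalize(path);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("result:  " + result);
        System.out.println("elapsed: " + elapsedMs + " ms for " + iterations + " iterations");
    }
}
```

Running the original JavaScript the same number of times inside Rhino would give a number to compare against, with the caveat that the Rhino figure also includes interpreter and "replace" overhead, not just regex matching.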


--
You received this message because you are subscribed to the Google Groups "mozilla-rhino" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mozilla-rhino+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Olivia Nelson

Oct 10, 2017, 11:13:37 PM
to mozilla-rhino
Thanks, Gregory.

So where should I look, and why wouldn't it be an easy job?


Gregory Brail

Oct 11, 2017, 12:31:14 AM
to mozill...@googlegroups.com
Well, org.mozilla.javascript.regexp.NativeRegExp is the class that starts it all -- it implements the standard "RegExp" from section 21.2 of ECMAScript.

If it turns out that Java RegExp is compatible with the syntax defined in that section, then it might not be a big deal to replace the implementation of that one class with one that uses the built-in Java stuff. OTOH if it's different then we'd have to wade into the parser and code generator in that class and it might take some time.

But if the stuff built into Java is compatible, it'd be wonderful to simplify Rhino a bit!


Gregory Brail

Oct 11, 2017, 12:35:45 AM
to mozill...@googlegroups.com
Also, it might be worth doing some profiling to see whether the problem is the regexp engine rather than the "replace" implementation. Recently some contributors did some work to make it more efficient but it's still possible that there's more that we can do.

Marc Guillemot

Oct 11, 2017, 1:50:02 AM
to 'Gregory Brail' via mozilla-rhino
It is possible to use Java Regex in Rhino but you have to take care:
Java RegExp and JS RegExp differ.
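Two well-known examples of such differences, as a small sketch (the class and helper names are made up for illustration, and this is not a complete list): the JavaScript character class [^] matches any character but is rejected by java.util.regex, and the JavaScript replacement token $& (the whole match) is not accepted by Java, which uses $0 instead.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexDialectDifferences {
    // True if java.util.regex accepts the given pattern source.
    static boolean compilesInJava(String source) {
        try {
            Pattern.compile(source);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // True if Java accepts the replacement template for this regex and input.
    static boolean replacementWorksInJava(String regex, String input, String replacement) {
        try {
            Pattern.compile(regex).matcher(input).replaceAll(replacement);
            return true;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            // Java rejects unknown tokens after '$' and references to missing groups.
            return false;
        }
    }

    public static void main(String[] args) {
        // Valid in JavaScript (matches any character); Java sees an unclosed class.
        System.out.println("[^] compiles in Java: " + compilesInJava("[^]"));

        // In JavaScript "$&" inserts the whole match; Java does not know this token.
        System.out.println("\"[$&]\" accepted by Java: "
                + replacementWorksInJava("b", "abc", "[$&]"));

        // The Java spelling of the same replacement uses group 0.
        System.out.println("\"[$0]\" gives: "
                + Pattern.compile("b").matcher("abc").replaceAll("[$0]"));
    }
}
```

A proxy that delegates to java.util.regex therefore has to translate both the pattern source and the replacement template before handing them over, which is presumably part of the tweaking mentioned here.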

In HtmlUnit we are using Java based regex in our Rhino fork by using
ScriptRuntime.setRegExpProxy but this needs to be tweaked:

https://sourceforge.net/p/htmlunit/code/HEAD/tree/trunk/htmlunit/src/main/java/com/gargoylesoftware/htmlunit/javascript/regexp/HtmlUnitRegExpProxy.java

https://sourceforge.net/p/htmlunit/code/HEAD/tree/trunk/htmlunit/src/main/java/com/gargoylesoftware/htmlunit/javascript/HtmlUnitContextFactory.java
(line 285)

Cheers,
Marc.
--
HtmlUnit support & consulting from the source
Blog: http://mguillem.wordpress.com

