RegEx library for Declarative WebRequest API

Dominic Battre

May 17, 2012, 3:47:25 PM
to Chromium-dev
Hi.

I am working on the Declarative WebRequest API (current state), a Chrome extension API that allows configuring rules according to which network requests may be modified.

To support the rules of HTTPS Everywhere (and many other use cases), I need to implement regular expression support.

I see three options at the moment:

- use the RegEx engine of ICU (third_party/icu/public/i18n/unicode/regex.h)
- use the RegEx engine of V8
- use RE2

Using the V8 RegEx engine has the big advantage that it supports the RegEx syntax used by HTTPS Everywhere.

Using RE2 would have the big advantage that it guarantees runtime linear in the size of the input and can have fixed memory limits (and it doesn't expose a giant attack surface). When drewry and taviso did a security analysis, this engine won. This matters because the RegEx engine will be executed on user-supplied regular expressions.

What is your opinion?

Adding RE2 would add ~980 kB of source code and a 463 kB shared library (built on Mac according to this Makefile: http://code.google.com/p/re2/source/browse/Makefile).

For context, this is what I would like to support (along with more complex expressions):

var rule = {
  conditions: [
    new chrome.declarativeWebRequest.RequestMatcher({
      url: { hostSuffix: '.example.com', schemes: ['http'] } })
  ],
  actions: [
    new chrome.declarativeWebRequest.RedirectRequest(
      {regex: ['http://(.*)example.com/(.*)', 'https://$1example.com/$2']}
      // exact syntax remains to be determined.
    )
  ]};

chrome.declarativeWebRequest.onRequest.addRules([rule]);
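
For illustration only, the intended effect of such a regex redirect can be sketched with plain JavaScript string replacement. The helper name `applyRedirect` is hypothetical and not part of the proposed API. The sketch assumes the first array element is the pattern, the second is the substitution, and that the pattern must match the whole URL.

```javascript
// Hypothetical helper, NOT part of chrome.declarativeWebRequest.
// Assumes regexPair is [pattern, substitution] as in the rule above.
function applyRedirect(url, regexPair) {
  const [pattern, substitution] = regexPair;
  // Assumption: the pattern has to match the entire URL, so anchor it.
  const re = new RegExp('^' + pattern + '$');
  if (!re.test(url)) {
    return null;  // No match: the request would be left untouched.
  }
  // $1 and $2 in the substitution refer to the capture groups.
  return url.replace(re, substitution);
}

// Note: the unescaped '.' in 'example.com' matches any character.
const pair = ['http://(.*)example.com/(.*)', 'https://$1example.com/$2'];
console.log(applyRedirect('http://www.example.com/a/b?q=1', pair));
console.log(applyRedirect('ftp://www.example.com/', pair));
```

Whether the real API would anchor patterns this way, or use `$1`-style references at all, is still open, as noted in the rule above.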

Best regards,
Dominic

Eric Roman

May 17, 2012, 8:48:19 PM
to bat...@google.com, Chromium-dev
Allowing extensions to run arbitrary regular expressions in the browser process is scary.

Note that with V8's regular expression engine (irregexp), the runtime for evaluating patterns is unbounded.

This has been the cause of several renderer hangs in Chromium:


If the plan is to run regexps synchronously in the browser process then please be aware of the performance behaviors.
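
A minimal sketch of the kind of behavior meant here, using a well-known pathological pattern rather than one taken from any real extension. In a backtracking engine, nested quantifiers combined with an input that almost matches force the engine to try exponentially many ways of splitting the input. The function name `timeMatch` is made up for this example, and actual timings depend on the machine and engine version.

```javascript
// Classic pathological pattern: a quantified group that itself
// contains a quantifier.
const pathological = /^(a+)+$/;

function timeMatch(n) {
  // The trailing 'b' guarantees that the overall match fails, so a
  // backtracking engine explores every way to split the run of 'a's.
  const input = 'a'.repeat(n) + 'b';
  const start = Date.now();
  const matched = pathological.test(input);
  return { n: n, matched: matched, ms: Date.now() - start };
}

// Keep n small: each additional 'a' roughly doubles the work.
for (const n of [10, 15, 20]) {
  console.log(timeMatch(n));
}
```

Raising `n` by only a few characters quickly makes a single match take seconds or longer, which is why running such patterns synchronously is a concern.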

I don't know enough about regular expression engines to suggest a solution; it may be necessary to try and "validate" the pattern before handing it off.
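
As a rough idea of what such "validation" could look like, here is a deliberately crude heuristic that flags a quantified group containing another quantifier. The name `looksRisky` and the rule itself are assumptions made for illustration. The check is neither complete (it misses patterns such as `(a|aa)+`) nor precise (it also flags escaped characters like `\+` inside a group), so it should not be read as a real defense.

```javascript
// Hypothetical heuristic, NOT a complete safety check.
// Flags a parenthesized group (without nested parentheses) that
// contains '+' or '*' and is itself followed by '+', '*' or '{'.
function looksRisky(pattern) {
  const nestedQuantifier = /\([^()]*[+*][^()]*\)[+*{]/;
  return nestedQuantifier.test(pattern);
}

console.log(looksRisky('^(a+)+$'));                      // flagged
console.log(looksRisky('http://(.*)example.com/(.*)'));  // not flagged
```

An engine with guaranteed linear runtime would make this kind of pattern screening unnecessary, which is part of the appeal of that approach.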

--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Aaron Boodman

May 17, 2012, 8:52:35 PM
to ero...@chromium.org, bat...@google.com, Chromium-dev
The plan of record is to run the regular expressions in a renderer
using irregexp.

We were intrigued by re2 because it promises runtime linear with input
size, and security team reports that it is more robust, which might
make it appropriate to run in the browser. This could be a pretty
major perf win for the feature.

- a