Problems with ngx.re.match


Sparsh Gupta

May 16, 2012, 4:06:05 PM
to open...@googlegroups.com
I followed the instructions at
http://wiki.nginx.org/HttpLuaModule#Special_PCRE_Sequences and put my
regular expressions in a separate .lua file. The problem I am facing
now is that I read the regular expression pattern from a MySQL table,
so I have the pattern in a variable:

local pattern = row["regex"]

This pattern doesn't have the extra backslashes described at the link.
I am not a Lua coder, so I am finding it hard to add backslashes to
this variable.

What I tried was manually running two different ngx.re.match calls,
and this is what I found:

local abc = ngx.re.match("http://ankitjain.in",[[http(s?)\:\/\/ankitjain\.in(\/)?(\?.*|#.*)?$]],"i")
-- Matches, abc is not nil

local abc = ngx.re.match("http://ankitjain.in","http(s?)\\:\\\/\\/ankitjain\\.in(\\/)?(\\?.*|#.*)?$","i")
-- Does not match, abc is nil

local abc = ngx.re.match("http://ankitjain.in",row["regex"],"i")
-- Does not match, abc is nil


How can I work around this so that the regex matches correctly?

Thanks
Sparsh Gupta

agentzh

May 17, 2012, 12:01:04 AM
to open...@googlegroups.com
On Thu, May 17, 2012 at 4:06 AM, Sparsh Gupta <spars...@gmail.com> wrote:
> I followed the instructions at
> http://wiki.nginx.org/HttpLuaModule#Special_PCRE_Sequences and put my
> regular expressions in a separate .lua file. The problem I am facing
> now is that I read the regular expression pattern from a MySQL table,
> so I have the pattern in a variable:
>

All that syntactic sugar exists only to please the Lua source code
parser. If you already have the regex in a Lua variable, then you don't
need it at all.

> local pattern = row["regex"]
>
> This pattern doesn't have the extra backslashes described at the link.
> I am not a Lua coder, so I am finding it hard to add backslashes to
> this variable.
>

No, you don't need to escape backslashes in the data because you
already have them.

> What I tried doing is manually running two different ngx.re.match and
> what I found was:
>
> local abc = ngx.re.match("http://ankitjain.in",[[http(s?)\:\/\/ankitjain\.in(\/)?(\?.*|#.*)?$]],"i")
> -- Matches, abc is not nil
>
> local abc = ngx.re.match("http://ankitjain.in","http(s?)\\:\\\/\\/ankitjain\\.in(\\/)?(\\?.*|#.*)?$","i")
> -- Does not match, abc is nil
>

If you need to write down regexes literally in your Lua source, just
use the [[...]] quotes; do not struggle with escaping in "...". If you
already have the correct regex data in a Lua variable, forget about
[[...]], "...", and all those escaping rules.
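To make the point above concrete, here is a minimal plain-Lua sketch (it does not need nginx to run). It writes the regex from this thread once with [[...]] and once with "..." and checks that both produce exactly the same string at run time. The variable names are only for illustration. A value read from MySQL is in the same position as these run-time strings: whatever bytes are stored are what the regex engine sees.

```lua
-- The same regex written two ways in Lua source.
-- Long brackets: backslashes are taken literally, no doubling needed.
local long_bracket = [[http(s?)\:\/\/ankitjain\.in(\/)?(\?.*|#.*)?$]]

-- Double quotes: every backslash meant for the regex engine must be doubled.
local quoted = "http(s?)\\:\\/\\/ankitjain\\.in(\\/)?(\\?.*|#.*)?$"

-- Count the backslashes that actually end up in the string.
-- (In Lua patterns "%" is the escape character, so a lone backslash is literal.)
local _, backslashes = long_bracket:gsub("\\", "")

print(long_bracket == quoted)  -- both spellings yield one and the same string
print(backslashes)             -- number of backslashes the regex engine will see
```

If the string coming out of the database prints differently from these (for example with doubled backslashes), the stored data is what needs fixing, not the Lua code.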

> local abc = ngx.re.match("http://ankitjain.in",row["regex"],"i")
> -- Does not match, abc is nil
>

Try printing out the value of row["regex"] used here. What exactly is
in your row["regex"]?

Regards,
-agentzh

agentzh

May 17, 2012, 12:28:39 AM
to open...@googlegroups.com
On Thu, May 17, 2012 at 4:06 AM, Sparsh Gupta <spars...@gmail.com> wrote:
>
> local abc = ngx.re.match("http://ankitjain.in","http(s?)\\:\\\/\\/ankitjain\\.in(\\/)?(\\?.*|#.*)?$","i")
> -- Does not match, abc is nil
>

BTW, this sample won't match because you have a syntax error in this
Lua code snippet. That is, you should get the following error in your
error.log when running the code above:

[error] ... invalid escape sequence near '\"http(s?)\:\'

That is, you're escaping "/" with "\" but "\/" is not a valid escape
sequence in Lua at all (because "/" does not need to be escaped at
all). By fixing this issue, your example now matches on my side:

abc = ngx.re.match("http://ankitjain.in","http(s?)\\:\\/\\/ankitjain\\.in(\\/)?(\\?.*|#.*)?$","i")

BTW, you should use (?:...) instead of (...) in your regexes wherever
possible because the latter will introduce unnecessary subpattern
capturing which adds cost to the regex engine.
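As a sketch of that advice applied to the regex from this thread, every (...) group below has been turned into (?:...). The helper name url_matches is hypothetical, and the ngx.re.match call only runs inside OpenResty, so it is guarded to keep the snippet loadable in plain Lua as well.

```lua
-- Same regex as in the thread, but with non-capturing groups throughout.
local pattern = [[http(?:s?)\:\/\/ankitjain\.in(?:\/)?(?:\?.*|#.*)?$]]

-- Hypothetical helper: true if the URL matches, false otherwise.
-- Must be called from within OpenResty, where the ngx table exists.
local function url_matches(url)
    local m, err = ngx.re.match(url, pattern, "i")
    if err then
        ngx.log(ngx.ERR, "regex error: ", err)
        return false
    end
    return m ~= nil
end

-- Only exercise the helper when running under nginx.
if ngx then
    ngx.say(url_matches("http://ankitjain.in"))
end
```

Since nothing in the original code reads the captured groups, dropping them should not change which URLs match.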

> local abc = ngx.re.match("http://ankitjain.in",row["regex"],"i")
> -- Does not match, abc is nil

Why this one fails depends on the actual value of row["regex"] which I
cannot see from your mail. Please print its value out either with
ngx.say, or ngx.print, or ngx.log, or print.

The rule of thumb is: when in doubt, just print your regex string out
and see what it is.
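One possible way to do that printing is sketched below. The helper describe_pattern is hypothetical (it is not part of ngx_lua); it shows the length, the text, and the raw bytes of the string, so a doubled backslash cannot hide. Under nginx the result goes to error.log via ngx.log, otherwise to stdout via print.

```lua
-- Hypothetical debugging helper: describe exactly what is in a pattern string.
local function describe_pattern(p)
    local bytes = {}
    for i = 1, #p do
        bytes[#bytes + 1] = string.format("%02x", p:byte(i))
    end
    return string.format("len=%d text=%s hex=%s", #p, p, table.concat(bytes, " "))
end

-- Use ngx.log inside OpenResty, plain print elsewhere.
local log = ngx and function(msg) ngx.log(ngx.ERR, msg) end or print

-- A correctly stored pattern versus one with an extra level of escaping.
local good = [[a\.b]]     -- 4 bytes: a, backslash, dot, b
local doubled = [[a\\.b]] -- 5 bytes: a, backslash, backslash, dot, b

log(describe_pattern(good))
log(describe_pattern(doubled))
```

In the thread's situation you would pass row["regex"] to the helper instead of the two sample strings.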

Regards,
-agentzh

Sparsh Gupta

May 17, 2012, 3:25:02 AM
to open...@googlegroups.com
>> local abc = ngx.re.match("http://ankitjain.in",row["regex"],"i")
>> -- Does not match, abc is nil
>
> Why this one fails depends on the actual value of row["regex"] which I
> cannot see from your mail. Please print its value out either with
> ngx.say, or ngx.print, or ngx.log, or print.
>
> The rule of thumb is: when in doubt, just print your regex string out
> and see what it is.
>

I figured out that it was purely an escaping issue. The entries in the
MySQL DB had extra escaping, which worked fine with PHP's preg_replace /
preg_match but failed with Lua. I cleaned them up further, and now
it seems to be working as expected.

Thanks

> Regards,
> -agentzh

rv...@hotmail.com

Jun 16, 2014, 12:38:55 AM
to open...@googlegroups.com, spars...@gmail.com
Hello agentzh
For URLs with a '+' sign, I guess we need to add the escape character in ngx.re.match() because + is a special regex character.

However, when I escape it, Lua returns an error. In particular, Lua does not seem to like any occurrence of '\' in ngx.re.match. I have temporarily resolved the issue by using string.find, since the requirements are limited. However, I wonder if this is a known issue and whether there are any workarounds. I have not seen any so far.
Thanks

Yichun Zhang (agentzh)

Jun 16, 2014, 1:15:10 AM
to openresty
Hello!

On Sun, Jun 15, 2014 at 9:38 PM, rvsw wrote:
> Hello agentzh
> For URLs with a '+' sign, I guess we need to add the escape character in ngx.re.match() because + is a special regex character.
>

The following nginx configuration snippet works for me:

    location = /t {
        content_by_lua '
            local s = "a+b"
            local m = ngx.re.match(s, [[a\\+b]], "jo")
            if m then
                ngx.say("matched: ", m[0])
            else
                ngx.say("no match.")
            end
        ';
    }

Accessing /t gives the following response body:

matched: a+b

If you test the Lua code snippet in an external .lua file instead of
inlining it directly in nginx.conf, then you can just write

local m = ngx.re.match(s, [[a\+b]], "jo")

The documentation link mentioned in an earlier post in this thread
should have explained the complications here clearly enough:

https://github.com/openresty/lua-nginx-module#special-pcre-sequences
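If the backslash levels are the main source of trouble, there are also ways to match a literal '+' without writing any backslash. The sketch below shows the string.find workaround mentioned in the question (with the plain flag, so no pattern matching is done at all), and a regex that uses the character class [+], in which + is literal for PCRE. The variable names are only for illustration, and the ngx.re.match call is guarded because it is available only inside OpenResty.

```lua
local s = "a+b"

-- Plain substring search: the 4th argument turns pattern matching off.
local plain_start, plain_end = string.find(s, "a+b", 1, true)

-- Without the plain flag, "+" is a Lua pattern quantifier and this finds nothing.
local as_lua_pattern = string.find(s, "a+b")

-- Lua's own patterns escape with "%", not with a backslash.
local escaped_start = string.find(s, "a%+b")

-- For ngx.re.*, a character class avoids the backslash entirely, so the same
-- text works both inline in nginx.conf and in an external .lua file.
local class_regex = "a[+]b"
if ngx then
    local m = ngx.re.match(s, class_regex, "jo")
    ngx.say(m and m[0] or "no match.")
end
```

The plain search is enough when you only need a substring test; the [+] form keeps the full regex engine available.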

> However, when I escape it, Lua returns an error.

You'd better provide a minimal standalone example that can reproduce
the issue on our side. A vague natural-language description is seldom
helpful for troubleshooting, unfortunately.

BTW, you are required to subscribe to the list before posting;
otherwise your posts always require manual moderation. Also, English
posts are recommended to go to the openresty-en mailing list instead:
https://groups.google.com/group/openresty-en

Best regards,
-agentzh