Re: [pcre-dev] Best practices for PCRE2 JIT

Philip Hazel

May 11, 2022, 4:12:15 AM
to Ervin Hegedüs, pcre2-dev, pcre...@exim.org
Hello,

The pcre-dev group you sent this to is for PCRE1, which is at end-of-life; the group is likely to disappear at some point. For PCRE2, please use pcre...@googlegroups.com. I have added that address to this email. (Further discussion could perhaps remove pcre-dev.)

I don't know much about the internals of the JIT implementation in PCRE2. The JIT maintainer is on the pcre2-dev list, so should see your query.

Regards,
Philip


On Tue, 10 May 2022 at 20:13, Ervin Hegedüs via Pcre-dev <pcre...@exim.org> wrote:
Hi,

I'm a total beginner with PCRE2, and I'm currently "playing" with the code.

I reviewed the man pages and documentation. If I'm not mistaken, PCRE2
supports JIT (on most popular architectures, e.g. AMD64), but the user has
to turn it on.

The pcre2jit man page says:

*In some circumstances you may need to call additional functions. These are
described in the section entitled "Controlling the JIT stack" below.*

Is there any way to decide in which cases I need to use those functions?
The patterns are completely unknown to me in advance; they could be very
complex or very simple.

For now I'm trying this solution:

https://github.com/digitalwave/msc_retest/blob/pcre2support-jit/src/regex.cc#L161-L195

I chose 1MB because the man page says:

*A maximum stack size of 512KiB to 1MiB should be more than enough for any
pattern.*

Do I need to add the manual stack allocation? Or would it be enough to
simply call

pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);

so that, if the call fails, the match just won't use JIT, but the
application won't terminate?

Does calling

pcre2_jit_stack_create(1, 1024 * 1024, NULL);

allocate the whole 1MB of memory? I'm asking because I printed some
details about the compiled regex in that code, and it shows:

DETAILED INFORMATION:
=====================
    PCRE2_INFO_BACKREFMAX: 0
  PCRE2_INFO_CAPTURECOUNT: 0
    PCRE2_INFO_DEPTHLIMIT: 4294967295
       PCRE2_INFO_JITSIZE: 76976
     PCRE2_INFO_MINLENGTH: 3
    PCRE2_INFO_MATCHLIMIT: 4294967295
          PCRE2_INFO_SIZE: 20149

Does PCRE2_INFO_JITSIZE: 76976 mean it uses almost 77kB of the 1MB, while
the full 1MB has been allocated? So if I have 200 regexes, does that mean
the application needs an extra 200MB of memory?


Thanks for your help,



a.
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev

Ervin Hegedüs

May 11, 2022, 4:37:31 AM
to Philip Hazel, pcre2-dev
Hi Philip,

Sorry for my mistake, and thanks for the clarification.

Right, I won't send this mail to the new list.

I've joined that list - I hope I will get the answer.



Regards,


a.
