[Django] #30899: Lazily compile large regular expressions


Django

Oct 22, 2019, 1:19:54 PM
to django-...@googlegroups.com
#30899: Lazily compile large regular expressions
-------------------------------------------------+------------------------
Reporter: Adam (Chainz) Johnson | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------------------+------------------------
Inspired by [this article from Instagram](https://instagram-engineering.com/python-at-scale-strict-modules-c0bb9245c834).

Currently, Django lazily compiles some regular expressions, such as those in
validators:
https://github.com/django/django/blob/54ea290e5bbd19d87bd8dba807738eeeaf01a362/django/core/validators.py#L17

This defers the cost of compiling them from import time to first use.

There are other import-time `re.compile` calls throughout the codebase.
These could be migrated to lazy compilation to reduce import time further:

https://github.com/django/django/search?q=re.compile&unscoped_q=re.compile
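
To illustrate the idea, here is a minimal, standalone sketch of lazy regex compilation. It is not Django's `_lazy_re_compile`; the `LazyPattern` class and the `slug_re` name are hypothetical and exist only for this example. The wrapper stores the pattern source and calls `re.compile` the first time a pattern method is used.

```python
import re


class LazyPattern:
    """Hypothetical helper: defer re.compile() until the pattern is first used."""

    def __init__(self, regex, flags=0):
        self._regex = regex      # str or bytes pattern source
        self._flags = flags
        self._compiled = None    # filled in on first use

    def _get(self):
        # Compile on first use and cache the result.
        if self._compiled is None:
            self._compiled = re.compile(self._regex, self._flags)
        return self._compiled

    def __getattr__(self, name):
        # Only reached for attributes not found normally (match, search, sub, ...).
        # Private names are never forwarded, which avoids accidental recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._get(), name)


# Module-level definition: nothing is compiled at import time.
slug_re = LazyPattern(r"^[-a-zA-Z0-9_]+$")
assert slug_re._compiled is None

# The first use triggers compilation; later calls reuse the cached pattern.
assert slug_re.match("hello-world") is not None
assert slug_re._compiled is not None
```

The trade-off in this sketch is one extra attribute lookup per call in exchange for skipping compilation of patterns that a given process never uses.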

--
Ticket URL: <https://code.djangoproject.com/ticket/30899>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 23, 2019, 2:36:08 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:1>

Django

Oct 23, 2019, 5:03:15 PM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Hasan Ramezani):

* owner: nobody => Hasan Ramezani
* status: new => assigned


Comment:

I can work on this and prepare a patch.
Should I move the
[https://github.com/django/django/blob/54ea290e5bbd19d87bd8dba807738eeeaf01a362/django/core/validators.py#L17 _lazy_re_compile function] to
[https://github.com/django/django/blob/master/django/utils/text.py django.utils.text]?

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:2>

Django

Oct 25, 2019, 5:12:11 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam (Chainz) Johnson):

Yeah, django.utils.text sounds like a sensible location to me. Keeping it
private (starting with an underscore, undocumented) for now seems wise.

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:3>

Django

Oct 26, 2019, 11:14:02 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Hasan Ramezani):

* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/11977 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:4>

Django

Oct 29, 2019, 4:26:21 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:5>

Django

Oct 29, 2019, 4:56:00 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"c4cba148d8356596da80c4d93a96fb335e4b0b6b" c4cba148]:
{{{
#!CommitTicketReference repository=""
revision="c4cba148d8356596da80c4d93a96fb335e4b0b6b"
Refs #30899 -- Moved _lazy_re_compile() to the django.utils.regex_helper.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:6>

Django

Oct 29, 2019, 4:56:01 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"e3d0b4d5501c6d0bc39f035e4345e5bdfde12e41" e3d0b4d5]:
{{{
#!CommitTicketReference repository=""
revision="e3d0b4d5501c6d0bc39f035e4345e5bdfde12e41"
Fixed #30899 -- Lazily compiled import time regular expressions.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:8>

Django

Oct 29, 2019, 4:56:01 AM
to django-...@googlegroups.com
#30899: Lazily compile regular expressions.
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) Johnson | Owner: Hasan Ramezani
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"39a34d4bf94bc8325119bc23b64f3a041a85dd2d" 39a34d4]:
{{{
#!CommitTicketReference repository=""
revision="39a34d4bf94bc8325119bc23b64f3a041a85dd2d"
Refs #30899 -- Made _lazy_re_compile() support bytes.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30899#comment:7>
