[Django] #28977: Change Local Memory Cache to Use LRU

Django

Jan 1, 2018, 8:13:59 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-----------------------------------------------+------------------------
Reporter: Grant Jenks | Owner: nobody
Type: New feature | Status: new
Component: Core (Cache system) | Version: master
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------------+------------------------
The current local-memory cache (locmem) in Django uses a pseudo-random
culling strategy. Rather than evicting at random, the OrderedDict data type
can be used to implement an LRU eviction policy. functools.lru_cache already
uses this approach, and Python 3's OrderedDict.move_to_end and
OrderedDict.popitem methods ease the implementation.
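For context, the OrderedDict-based approach can be sketched as follows. This
is a minimal illustration, not the proposed patch; the class name and the
max_entries default are made up for the example:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch built on OrderedDict (illustrative only)."""

    def __init__(self, max_entries=300):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        try:
            value = self._data[key]
        except KeyError:
            return default
        # A hit marks the key as most recently used.
        self._data.move_to_end(key, last=True)
        return value

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key, last=True)
        if len(self._data) > self.max_entries:
            # Evict the least recently used entry.
            self._data.popitem(last=False)
```

functools.lru_cache applies the same idea to memoizing function calls; the
patch's job is essentially to graft this bookkeeping onto locmem's existing
expiry and culling hooks.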

I'm willing to work on a pull request that changes locmem to use an LRU
eviction strategy, but I wanted to first check whether it would be
accepted. I did some research to find a good reason for locmem's random
culling strategy but did not find one.

There's also a bit of prior art at
https://pypi.python.org/pypi/django-lrucache-backend.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Jan 2, 2018, 10:27:07 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  nobody
         Type:  New feature          |        Status:  new
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Someday/Maybe
    Has patch:  0                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* stage: Unreviewed => Someday/Maybe


Comment:

I'm not sure how to evaluate the pros and cons of that change. I guess the
idea should be raised on the DevelopersMailingList.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:1>

Django

Jan 3, 2018, 12:27:04 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Someday/Maybe
    Has patch:  0                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Grant Jenks):

* owner: nobody => Grant Jenks
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:2>

Django

Jan 4, 2018, 1:01:16 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Someday/Maybe
    Has patch:  0                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Grant Jenks):

I have created the example changes at
https://github.com/grantjenks/django/tree/ticket_28977 and the commit is
viewable at
https://github.com/grantjenks/django/commit/b06574f6713d4b7d367d7a11e0268fb62f5fd1d1
The only thing I remain unsure about is whether django.utils.synch.RWLock
needs to be deprecated.

I will ask for feedback on the developers mailing list next.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:3>

Django

Jan 8, 2018, 4:47:27 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Grant Jenks):

* has_patch: 0 => 1
* stage: Someday/Maybe => Accepted


Comment:

Pull request created at https://github.com/django/django/pull/9555

Django developers mailing list thread at
https://groups.google.com/forum/#!topic/django-developers/Gz2XqtoYmNk

Summarizing the two responses: Josh Smeaton (author of
django-lrucache-backend, the project cited in the initial post) was in
favor of providing a better default option, and Adam Johnson was also +1.
Josh and Adam were also interested in providing a way to disable cache key
validation. I'm +0 on that change so long as the default is to validate the
key, but I'd rather not make those changes myself.

I did some very simple benchmarking locally.

Here's the performance of cache.get for the current implementation:

{{{
$ python manage.py shell
Python 3.6.3 (default, Oct 5 2017, 22:47:21)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from django.core.cache import cache

In [2]: cache.set(b'foo', b'bar')

In [3]: %timeit cache.get(b'foo')
14.2 µs ± 109 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
}}}

And here's the performance of cache.get for the new implementation:

{{{
$ python manage.py shell
Python 3.6.3 (default, Oct 5 2017, 22:47:21)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from django.core.cache import cache

In [2]: cache.set(b'foo', b'bar')

In [3]: %timeit cache.get(b'foo')
6.29 µs ± 140 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
}}}

I also tried Josh's django-lrucache-backend implementation:

{{{
$ python manage.py shell
Python 3.6.3 (default, Oct 5 2017, 22:47:21)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [3]: from django.core.cache import caches

In [4]: cache = caches['local']

In [5]: cache.set(b'foo', b'bar')

In [6]: %timeit cache.get(b'foo')
10.1 µs ± 135 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
}}}
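For reference, the caches['local'] lookup above assumes a CACHES setting
roughly like the following sketch. The backend dotted path and options are
illustrative guesses on my part; consult the django-lrucache-backend README
for the exact values:

```python
# settings.py (sketch) -- 'default' is the stock locmem backend under test;
# 'local' points at the third-party LRU backend. The dotted path below is an
# assumption, not verified against the package.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    },
    'local': {
        'BACKEND': 'lrucache_backend.LRUObjectCache',
        'LOCATION': 'local',
        'OPTIONS': {'MAX_ENTRIES': 300},
    },
}
```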

It's not a great benchmark, but it's encouraging: the new implementation
appears to be faster than both the current implementation and
django-lrucache-backend. I haven't profiled, but I would guess that
collections.OrderedDict and threading.RLock, both of which these changes
introduce, are very fast.
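That guess is easy to sanity-check in isolation. The sketch below (mine,
not from the ticket) times the two primitives the change adds to every
cache hit; absolute numbers will vary by machine:

```python
import timeit
from collections import OrderedDict
from threading import RLock

od = OrderedDict((i, i) for i in range(1000))
lock = RLock()

def lru_style_get(key):
    # The two operations the patch adds to every cache hit:
    # an RLock acquire/release and an OrderedDict reordering.
    with lock:
        value = od[key]
        od.move_to_end(key)
        return value

per_call = timeit.timeit(lambda: lru_style_get(500), number=100_000) / 100_000
print(f'{per_call * 1e6:.3f}us per get')
```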

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:4>

Django

Jan 10, 2018, 11:04:59 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Grant Jenks):

Josh asked on django-developers whether I would run more extensive
benchmarks against the current changes. Here are the results for the locmem
backend using the python-diskcache benchmark harness. Note that these were
generated with only one process, as locmem does no cross-process
synchronization.

The current implementation benchmark results:

{{{
======  ======  =====  =========  =========  =========  =========  =========
Timings for locmem
----------------------------------------------------------------------------
Action  Count   Miss   Median     P90        P99        Max        Total
======  ======  =====  =========  =========  =========  =========  =========
get     88986   16526  12.159us   19.789us   28.849us   154.734us  1.266s
set     9030    0      14.067us   15.974us   30.994us   150.919us  134.134ms
delete  983     0      11.921us   13.828us   26.941us   32.663us   12.563ms
Total   98999                                                      1.413s
======  ======  =====  =========  =========  =========  =========  =========
}}}

And the new implementation benchmark results:

{{{
======  ======  =====  =========  =========  =========  =========  =========
Timings for locmem
----------------------------------------------------------------------------
Action  Count   Miss   Median     P90        P99        Max        Total
======  ======  =====  =========  =========  =========  =========  =========
get     88986   16526  5.007us    5.245us    14.067us   80.109us   449.330ms
set     9030    0      5.960us    7.153us    16.689us   111.103us  58.852ms
delete  983     0      4.053us    5.007us    8.821us    18.835us   4.200ms
Total   98999                                                      512.381ms
======  ======  =====  =========  =========  =========  =========  =========
}}}

Four timing percentiles are displayed: median (50th), 90th, 99th, and max
(100th). The results are better across the board, with a 50% speedup being
typical.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:5>

Django

Jan 10, 2018, 11:59:16 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Grant Jenks):

One more set of benchmarks. The results above were measured in a
single-process, single-thread environment, so the locks never contend with
one another for access. I thought it would be useful to observe performance
under more contention. My development machine has eight logical processors,
so here are the results in a single-process, eight-thread environment.

The current implementation:

{{{
======  ======  =====  =========  =========  =========  =========  =========
Timings for locmem
----------------------------------------------------------------------------
Action  Count   Miss   Median     P90        P99        Max        Total
======  ======  =====  =========  =========  =========  =========  =========
get     712648  85412  14.067us   863.075us  2.461ms    14.337ms   160.973s
set     71226   0      204.563us  238.180us  282.288us  7.419ms    12.356s
delete  8118    0      201.941us  235.081us  281.811us  410.080us  1.395s
Total   791992                                                     174.724s
======  ======  =====  =========  =========  =========  =========  =========
}}}

The new implementation:

{{{
======  ======  =====  =========  =========  =========  =========  =========
Timings for locmem
----------------------------------------------------------------------------
Action  Count   Miss   Median     P90        P99        Max        Total
======  ======  =====  =========  =========  =========  =========  =========
get     712637  84222  146.151us  165.224us  201.225us  9.591ms    106.855s
set     71241   0      149.012us  167.847us  204.802us  9.604ms    10.875s
delete  8114    0      144.958us  163.078us  195.026us  552.893us  1.200s
Total   791992                                                     118.930s
======  ======  =====  =========  =========  =========  =========  =========
}}}

Overall the results display less variation, but the median "get" time is
10x slower (14 microseconds vs. 146 microseconds). Despite that large
slowdown, the total time for "get" operations is ~33% faster (160 seconds
vs. 106 seconds). The "set" and "delete" operations have comparable
performance.

The LRU eviction policy turns every read into a write, which makes the
django.utils.synch.RWLock used in the current implementation obsolete. The
RWLock could give multiple readers concurrent access, but at additional
overhead. Hence, in the "no contention" scenario, threading.RLock carries
less overhead; in the "high contention" scenario, the RWLock's reader
concurrency gives reads an advantage at the median but a larger penalty at
the 99th percentile.

I think these tradeoffs are easy to accept. Note that the benchmark does
not measure a miss-rate penalty, and the LRU eviction policy will almost
certainly improve the miss rate compared with the current random-eviction
policy.
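That miss-rate intuition can be illustrated with a toy simulation (my own
sketch, not part of the ticket's benchmarks): replay the same skewed access
trace under LRU culling and under random culling, and compare hit rates.

```python
import random
from collections import OrderedDict

def hit_rate(trace, policy, capacity=100):
    """Replay a key trace under 'lru' or 'random' eviction; return hit rate."""
    rng = random.Random(1)  # used only to pick random-eviction victims
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            if policy == 'lru':
                cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                if policy == 'lru':
                    cache.popitem(last=False)  # evict least recently used
                else:
                    del cache[rng.choice(list(cache))]  # evict at random
            cache[key] = True
    return hits / len(trace)

# Skewed (Zipf-like) workload: a few hot keys dominate the requests.
rng = random.Random(0)
trace = [min(int(rng.paretovariate(1.2)), 1000) for _ in range(20_000)]
print('lru:', hit_rate(trace, 'lru'), 'random:', hit_rate(trace, 'random'))
```

On a skewed workload like this, LRU keeps the hot keys resident while
random culling occasionally evicts them, so LRU's hit rate should come out
at least as high.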

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:6>

Django

Jan 10, 2018, 1:08:25 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Adam (Chainz) Johnson):

* cc: Adam (Chainz) Johnson (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:7>

Django

Jan 11, 2018, 6:25:24 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Josh Smeaton):

* cc: josh.smeaton@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:8>

Django

Jan 15, 2018, 3:09:01 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Ready for checkin
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:9>

Django

Jan 15, 2018, 10:45:56 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  1
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* needs_better_patch: 0 => 1
* stage: Ready for checkin => Accepted


Comment:

Some test coverage is missing, as noted on the PR.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:10>

Django

Jan 17, 2018, 12:48:41 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Accepted
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  1
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Grant Jenks):

Pull request updated to exercise the cache.incr code path.

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:11>

Django

Jan 24, 2018, 3:55:06 AM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  assigned
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:
     Keywords:                       |  Triage Stage:  Ready for checkin
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:12>

Django

Jan 24, 2018, 12:26:45 PM
to django-...@googlegroups.com
#28977: Change Local Memory Cache to Use LRU
-------------------------------------+-------------------------------------
     Reporter:  Grant Jenks          |         Owner:  Grant Jenks
         Type:  New feature          |        Status:  closed
    Component:  Core (Cache system)  |       Version:  master
     Severity:  Normal               |    Resolution:  fixed
     Keywords:                       |  Triage Stage:  Ready for checkin
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |  UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"d38a3169a426516623929ff8c2b2c9703d801b75" d38a3169]:
{{{
#!CommitTicketReference repository=""
revision="d38a3169a426516623929ff8c2b2c9703d801b75"
Fixed #28977 -- Changed local-memory cache to use LRU culling.

LRU culling turns every read into a kind of write to the cache: cache keys
are moved to the first position in the OrderedDict when they are
retrieved.
The RWLock which permitted multiple readers while prioritizing a single
writer is obsolete since all accesses are now writes.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28977#comment:13>
