We all know Python loves its objects. It really loves strings. The other
day I profiled Python and checked malloc calls in the CGI script that runs
on our site most frequently. In a run of about 3 seconds it called
malloc over 66,000 times, half of them from inside stringobject.c. (Lists
were the next most frequently allocated objects, accounting for about 10,000
malloc calls.) I figured it was a place that could use a little help.

I wrote a possibly better (read: not yet evaluated for performance)
string object allocator last night that treats stringobjects of <= 128 bytes
(including the stringobject header) specially. The allocator rounds up
requests for small objects to the next higher power of two (<=32 -> 32, <=64
-> 64, and <=128 -> 128). No stringobject can occupy fewer than 20 bytes by
default (stringobject headers are normally 20 bytes). 16-kbyte blocks for
32-, 64-, and 128-byte stringobjects are then allocated as necessary, and a
free list is maintained, currently chained together through the ob_type
field.
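
To make the scheme above concrete, here is a minimal standalone sketch of that kind of allocator. It is not the modified stringobject.c: the names (size_class, refill, small_alloc, small_free) are made up for illustration, the free list is chained through the first word of each free chunk rather than through ob_type, and the caller passes the size back on free (a real stringobject could work it out from ob_size).

```c
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_BYTES 16384   /* 16 kbytes per block, as in the post */
#define NUM_CLASSES 3       /* 32-, 64-, and 128-byte chunks */

/* A free chunk keeps the link to the next free chunk in its first word.
   (Assumption for this sketch; the post chains through ob_type.) */
typedef struct chunk { struct chunk *next; } chunk;

static chunk *free_lists[NUM_CLASSES];

/* Map a request size to a class index: 0 -> 32, 1 -> 64, 2 -> 128.
   Returns -1 for sizes that go straight to malloc. */
static int size_class(size_t nbytes)
{
    if (nbytes <= 32) return 0;
    if (nbytes <= 64) return 1;
    if (nbytes <= 128) return 2;
    return -1;
}

/* Chunk size in bytes for a class index. */
static size_t class_bytes(int cls) { return (size_t)32 << cls; }

/* Carve one 16-kbyte block into equal chunks and push each one onto
   the free list for its class. Returns 0 on success, -1 on failure. */
static int refill(int cls)
{
    size_t sz = class_bytes(cls);
    char *block = malloc(BLOCK_BYTES);
    if (block == NULL) return -1;
    for (size_t off = 0; off + sz <= BLOCK_BYTES; off += sz) {
        chunk *c = (chunk *)(block + off);
        c->next = free_lists[cls];
        free_lists[cls] = c;
    }
    return 0;
}

/* Small requests pop a chunk off the free list; large ones use malloc. */
static void *small_alloc(size_t nbytes)
{
    int cls = size_class(nbytes);
    if (cls < 0) return malloc(nbytes);
    if (free_lists[cls] == NULL && refill(cls) < 0) return NULL;
    chunk *c = free_lists[cls];
    free_lists[cls] = c->next;
    return c;
}

/* Small chunks go back on their free list; large ones go back to free(). */
static void small_free(void *p, size_t nbytes)
{
    int cls = size_class(nbytes);
    if (cls < 0) { free(p); return; }
    chunk *c = p;
    c->next = free_lists[cls];
    free_lists[cls] = c;
}
```

With these sizes one 16-kbyte block holds 512 chunks of 32 bytes, 256 of 64, or 128 of 128, so a single malloc call can stand in for up to 512 small string allocations. Blocks are never handed back to malloc in this sketch.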

The net effect is to cut way back on the number of malloc calls. I doubt
that this by itself is a big performance win on my system, but malloc
adds some bookkeeping overhead of its own to every allocation, so while I
may be saving only a small amount of time, I'm also saving a small amount
of space, and it all adds up. (Tim Peters observed that ten 1% improvements
would probably net you a bigger gain than a single 10% improvement.) On
systems with less industrial-strength mallocs, cutting way back on malloc
calls may be a big win.

I then added some conditionally compiled counters to stringobject.c and a
Python-visible interface in stropmodule.c that reports the number of string
allocations and deallocations for each of the four size categories, returning
a list of each. On that 3-second or so CGI script I got:

              Alloc   Dealloc
    <= 32     30950     10824
    <= 64      6440      3170
    <= 128     1394      1076
    >  128     2025       707

This was measured right at the end, just before sys.stdout was closed by the
script. I figure most of the large allocations were probably related to
imports. I haven't yet had a chance to try this with my database server.
It uses small strings as dict keys *heavily*, however, so I expect the
numbers there to be even more skewed toward small strings.
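
For anyone wondering what I mean by conditionally compiled counters, the idea is roughly the following. This is a sketch rather than the code from my stringobject.c: STRING_ALLOC_STATS, category, COUNT_ALLOC and COUNT_DEALLOC are names invented for this example, and the Python-visible wrapper in stropmodule.c is left out.

```c
#include <stddef.h>

/* Hypothetical compile-time switch; remove this line (or build without
   defining it) to compile the counters out entirely. */
#define STRING_ALLOC_STATS 1

#define NUM_CATEGORIES 4    /* <= 32, <= 64, <= 128, > 128 */

/* Map an allocation size (header included) to one of the four rows
   reported in the table. */
static int category(size_t nbytes)
{
    if (nbytes <= 32) return 0;
    if (nbytes <= 64) return 1;
    if (nbytes <= 128) return 2;
    return 3;
}

#ifdef STRING_ALLOC_STATS
static long alloc_counts[NUM_CATEGORIES];
static long dealloc_counts[NUM_CATEGORIES];
/* Call these from the string allocation and deallocation paths. */
#define COUNT_ALLOC(n)   (alloc_counts[category(n)]++)
#define COUNT_DEALLOC(n) (dealloc_counts[category(n)]++)
#else
/* With the switch off, the macros expand to nothing useful and cost
   nothing at run time. */
#define COUNT_ALLOC(n)   ((void)0)
#define COUNT_DEALLOC(n) ((void)0)
#endif
```

The difference between the two arrays at any moment is the number of strings of each size still alive, which is how the table above should be read.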

I'll post a copy of the modified stringobject.c and stropmodule.c to my web
site:
http://www.automatrix.com/~skip/python/
I'm late for dinner, but I'll try to get them over there tonight or
tomorrow.

(By the way, there's nothing string-specific about this small object
allocator as it now exists. I was thinking of chaining the free list through
another field and setting the type field once, but that would make it
string-specific. With a little work the current version could be the basis
for a general small object allocator.)

Cheers,
--
Skip Montanaro | Python - it's not just for scripting anymore...
s...@calendar.com | http://www.python.org/
(518)372-5583 | Musi-Cal ------> http://concerts.calendar.com/