From: Sam Penrose [mailto:spen...@intersight.com]
Sent: Wednesday, July 17, 2002 5:09 PM
To: pytho...@python.org
Subject: pickle performance on larger objects
On a recent project we decided to use pickle for some quick-and-dirty
object persistence. The object in question is a list of 3,000
dictionaries whose keys and values are short (< 100 character)
strings--about 1.5 megs worth of character data in total. Loading this
object from a pickle using cPickle took so long we assumed something
was broken.
>
In fact, loading is just slow. A list of 10,000 identical dictionaries
whose keys and values are short strings takes many seconds to load on
modern hardware. Some details:
i.   A Python process which is loading a pickle will use a lot of RAM
     relative to the pickle's size on disk, roughly an order of
     magnitude more on Mac OS X.
ii.  Performance appears to scale linearly with changes in the size of
     the list or its dicts until you run out of RAM.
iii. Python pickle is only about 5x slower than cPickle as the list
     gets long, except that it uses more RAM and therefore hits a big
     RAM-to-disk-swap performance falloff sooner.
iv.  You *can* tell a Mac's performance by its MHz. An 800 MHz PIII
     running Windows is almost exactly twice as fast as a 400 MHz G4
     running Mac OS X, both executing the following code from the
     command line. With 25 items in the dictionaries and 10,000 dicts
     used, the former took just under a minute using cPickle, the
     latter two minutes.
v.   Generating a list of 3,000 heterogeneous dicts of 25 items (our
     real data) by reading in a 750k text file and splitting it up
     takes on the order of a second.
Sample run on 400 MHz G4, 448 megs of RAM:
>>> time_cPickle_Load()
dumping list of 10 dicts:
dumped: 0.00518298149109
loading list of 10 dicts:
loaded: 0.1170129776
dumping list of 100 dicts:
dumped: 0.0329120159149
loading list of 100 dicts:
loaded: 0.849031090736
dumping list of 1000 dicts:
dumped: 0.397919893265
loading list of 1000 dicts:
loaded: 8.18722295761
dumping list of 10000 dicts:
dumped: 4.42434895039
loading list of 10000 dicts:
loaded: 133.906162977
#---code follows----------------//
def makeDict(numItems=25):
    d = {}
    for i in range(numItems):
        k = 'key%s' % i
        v = 'value%s' % i
        d[k] = v
    return d

def time_cPickle_Load():
    import time
    now = time.time
    from cPickle import dump, load
    filename = 'deleteme.pkl'
    for i in (10, 100, 1000, 10000):
        data = [makeDict() for j in range(i)]
        output = open(filename, 'w')
        startDump = now()
        print "dumping list of %s dicts:" % i
        dump(data, output)
        print "dumped:", now() - startDump
        output.close()
        input = open(filename)
        startLoad = now()
        print "loading list of %s dicts:" % i
        x = load(input)
        print "loaded:", now() - startLoad
        input.close()
With bin = 1:
>>> time_cPickle_Load()
dumping list of 10 dicts:
dumped: 0.00358998775482
loading list of 10 dicts:
loaded: 0.00386297702789
dumping list of 100 dicts:
dumped: 0.0141079425812
loading list of 100 dicts:
loaded: 0.0700759887695
dumping list of 1000 dicts:
dumped: 0.188324928284 # ~ 2x speedup
loading list of 1000 dicts:
loaded: 0.379488945007 # ~ 20x speedup
dumping list of 10000 dicts:
dumped: 2.23442697525 # ~ 2x speedup
loading list of 10000 dicts:
loaded: 3.07298398018 # ~ 40x speedup
memory usage increases by about 20%, FWIW. For my particular use case
cPickle is still several (many ?) times slower than just recreating the
object by reading in a file. What implications this has for best
practices in persistence of larger objects I do not know, but I hope the
data point is of interest to others.
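For concreteness, "recreating the object by reading in a file" might look
like the sketch below. The flat-file layout here (one record per line,
tab-separated key=value fields) is an assumption for illustration; the
actual database-dump format is not shown in this thread:

```python
import os
import tempfile

def load_records(path):
    # Hypothetical flat-file layout: one record per line, fields as
    # tab-separated key=value pairs -- an assumption, not the real format.
    records = []
    with open(path) as f:
        for line in f:
            fields = line.rstrip('\n').split('\t')
            records.append(dict(field.split('=', 1) for field in fields))
    return records

# Round-trip a tiny sample to show the idea.
sample = 'key0=value0\tkey1=value1\nkey0=other0\tkey1=other1\n'
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write(sample)
records = load_records(path)
os.remove(path)
```

Because this does nothing but split strings and build dicts, it avoids
all of the stack-machine interpretation a pickle load performs.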
On Wednesday, July 17, 2002, at 02:13 PM, Geoffrey Talvola wrote:
> Try using binary pickle format (pass in 1 as an extra third argument to
> dump()). It's quite a bit faster both dumping and loading.
>
> - Geoff
What happens (besides losing cross-version portability) if you use
marshal instead of pickle?
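Since the data is nothing but built-in lists, dicts, and strings, it is
within marshal's reach. A minimal sketch of the substitution (keeping in
mind that the marshal format is tied to the Python version and does no
cycle detection):

```python
import marshal

# marshal only handles built-in types and its on-disk format can change
# between Python versions -- but a list of plain dicts of short strings
# is exactly the built-in-only case.
data = [dict(('key%s' % i, 'value%s' % i) for i in range(25))
        for _ in range(100)]

blob = marshal.dumps(data)
restored = marshal.loads(blob)
```

This trades portability for speed, which is the caveat raised above.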
If you are interested in exploring this further, contact me directly, and I
can tell you how to run some additional experiments. Then maybe we can
summarize for the list. If you want to install MachoPython, wxPython Mac,
and PythonCard on your system, you can just run the companies sample
directly with the -l command-line option to enable logging. Alternatively,
we can get the relevant data-centric parts of the flatfileDatabase module
and the companies data set onto your box and you can play with it in a
non-gui script.
ka
---
Kevin Altis
al...@semi-retired.com
http://www.pythoncard.org/
"Sam Penrose" <spen...@intersight.com> wrote in message
news:mailman.1026940226...@python.org...
> memory usage increases by about 20%, FWIW. For my particular use case
> cPickle is still several (many ?) times slower than just recreating
> the object by reading in a file.
Are you sure that the file doesn't have an associated .pyc? For then
you're testing marshal performance, and that's always going to kill
pickle. Just a thought.
Cheers,
M.
--
We've had a lot of problems going from glibc 2.0 to glibc 2.1.
People claim binary compatibility. Except for functions they
don't like. -- Peter Van Eynde, comp.lang.lisp
> For my particular use case cPickle is still several (many ?) times slower
> than just recreating the object by reading in a file. What implications
> this has for best practices in persistence of larger objects I do not
> know, but I hope the data point is of interest to others.
It also corresponds to what I saw while rewriting `rebox.el' into `rebox.py'.
`rebox.py' refills boxed comments in various programming languages, I
made it to produce a bigger example for Pymacs, and also to make `rebox'
itself more maintainable: I find Python easier to maintain than Lisp.
When used in batch to rebox a single comment, `rebox.py' spends almost
all of its time in initialisation, and I thought I could speed it up by
initialising once and saving the result into a pickle, meant to be
reloaded in later runs. Pickle loading was not significantly faster
than re-initialising afresh, so I dropped it in favour of simplicity.
--
François Pinard http://www.iro.umontreal.ca/~pinard
MWH's comment about marshal is worth keeping in mind. cPickle is
doing a lot of work that marshal isn't doing. Where marshal calls
fwrite() directly in most cases, cPickle wraps fwrite() in a C
function that brackets the call with Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS. All those extra C function calls certainly add
up when you've got so many objects.
It's also checking for cycles in the containers, which means it has to
do a lot of extra bookkeeping for each dict it finds. You can disable
that by setting the fast attribute on the pickler:
p = cPickle.Pickler(output)  # output is an open file
p.fast = 1
Jeremy
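On a current Python 3 interpreter the same fast-mode trick still works
(the `fast` attribute survives on pickle.Pickler, though it is documented
as deprecated). A sketch, not Jeremy's original code:

```python
import io
import pickle

data = [dict(('key%s' % i, 'value%s' % i) for i in range(25))
        for _ in range(100)]

buf = io.BytesIO()
p = pickle.Pickler(buf, 1)
# fast mode skips the memo, i.e. the cycle/shared-reference bookkeeping;
# it will recurse forever on cyclic data, so only use it on trees.
p.fast = 1
p.dump(data)
restored = pickle.loads(buf.getvalue())
```

The trade-off is that shared sub-objects are duplicated in the output
rather than referenced, so the pickle can grow if the data aliases
heavily.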
Michael Hudson asks:
> Are you sure that the file doesn't have an associated .pyc? For then
> you're testing marshal performance, and that's always going to kill
> pickle. Just a thought.
Sorry, I wasn't clear. The file in question is a database dump; the
object being created is a class instance with a few attributes,
including the list of dicts referred to in my original post.