Problem with shelve / nested lists

Michael Schmitt

Jul 13, 2002, 7:53:10 PM

Hello.
I'm trying to use a shelve as a disc based dictionary, but have problems
with nested lists. If I store a nested list as value, inner lists seem to
be immutable.

According to the library reference, shelves can store "essentially
arbitrary Python objects" and "recursive data types".

What is my misunderstanding?

Python 2.2:
>>> import shelve
>>> d1= shelve.open('dict.dat','c')
>>> d1['bla']= [1,2,[10,20]]
>>> d1['bla'][2]
[10, 20]
>>> d1['bla'][2].append(30)
>>> d1['bla'][2]
[10, 20]

Using a dictionary instead of the shelve lets me append to the inner list:
>>> d2= {}
>>> d2['bla']= [1,2,[10,20]]
>>> d2['bla'][2]
[10, 20]
>>> d2['bla'][2].append(30)
>>> d2['bla'][2]
[10, 20, 30]

Thanks for any hint,
Michael

Alex Martelli

Jul 14, 2002, 4:08:23 AM

Michael Schmitt wrote:

> Hello.
> I'm trying to use a shelve as a disc based dictionary, but have problems
> with nested lists. If I store a nested list as value, inner lists seem to
> be immutable.

It's a well-known but underdocumented defect of shelves whose
values are mutable objects -- mutations don't "take" unless you
take very specific steps to help. I pointed it out both in the
Python Cookbook (official launch in 2 weeks, at OSCON) and in
the Nutshell (target date October), but that doesn't help much
right now:-).

Easiest way to reproduce the problem:

>>> import shelve
>>> s=shelve.open('x.y')
>>> s['a']=range(3)
>>> s['a']
[0, 1, 2]
>>> s['a'].append(4)
>>> s['a']
[0, 1, 2]
>>>

Workaround:

>>> x = s['a']
>>> x.append(4)
>>> s['a'] = x
>>> s['a']
[0, 1, 2, 4]
>>>

I.e. you must arrange to assign the new value to s[thekey] --
it's only on assignment to its items that the shelve object
notices any changes.
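
The same fetch, mutate, assign-back sequence can be sketched as a standalone script for a current Python 3 interpreter. This is only an illustration, not part of the original session: the temporary directory and the names `path`, `lost` and `kept` are assumptions made for the example, and the shelf is opened with the default settings.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

with shelve.open(path) as s:      # default settings: no write-back cache
    s["a"] = [0, 1, 2]

    s["a"].append(4)              # mutates a temporary copy only
    lost = s["a"]                 # a fresh copy: the append is not there

    x = s["a"]                    # 1. fetch a copy
    x.append(4)                   # 2. mutate the copy
    s["a"] = x                    # 3. assign it back so it is pickled again
    kept = s["a"]

print(lost)   # the in-place append did not persist
print(kept)   # the assigned-back value did
```

Only step 3 touches the shelf's storage; steps 1 and 2 work on an ordinary in-memory list.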

Proposed fixes mooted for Python 2.3 are probably not going
to happen. Having s "cache" all the s[whatever] that are
being accessed (just in case they might be mutated) would, it
is felt, have enormous memory costs in the common case where
the objects are just being examined, not mutated.
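
A note for readers on later releases: the standard library's `shelve.open` did gain an optional `writeback` flag, which keeps every accessed entry in an in-memory cache and writes the cache back on `sync()` or `close()`. The sketch below assumes a current Python 3 interpreter; the path and the names `same_object` and `persisted` are made up for the example. The memory trade-off described above still applies, since every entry touched stays cached until the shelf is synced or closed.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "writeback_shelf")

with shelve.open(path, writeback=True) as s:
    s["bla"] = [1, 2, [10, 20]]
    s["bla"][2].append(30)               # mutates the cached object
    same_object = s["bla"] is s["bla"]   # repeated fetches hit the cache
    # Cached entries are written back when the shelf is synced or closed.

# Reopen without the cache to check what actually reached the file.
with shelve.open(path) as s:
    persisted = s["bla"]

print(same_object)
print(persisted)
```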

> According to the library reference, shelves can store "essentially
> arbitrary Python objects" and "recursive data types".

They do. They just don't give you exactly those objects when
you fetch s[x], but rather, copies (reconstructed from the
objects' pickles) -- and then promptly forget about whatever
they just returned (holding on to it might, it is felt, have
too-high memory costs). So, when you call mutating methods
on s[whatever], you're mutating a copy -- the shelve object
still holds on to the original pickle, and if and when you
ask for s[whatever] again you'll get another fresh copy.
Notice...:

>>> s['a'] is s['a']
0
>>>

each s['a'] has different object identity, so "is" is not
satisfied.
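
The copy semantics can be seen without a shelf at all, using `pickle` directly. This is a simplified model, assuming only that the shelf keeps one pickle per key and unpickles it on every fetch; the variable names are illustrative.

```python
import pickle

original = [0, 1, 2]
stored = pickle.dumps(original)    # roughly what the shelf keeps for a key

first = pickle.loads(stored)       # what one s['a'] would hand back
second = pickle.loads(stored)      # what the next s['a'] would hand back

first.append(4)                    # mutate one reconstructed copy

distinct = first is not second     # two separate objects
untouched = pickle.loads(stored)   # the stored pickle never changed

print(distinct)
print(first, second, untouched)
```

Mutating `first` cannot affect `stored`, which is why the change disappears on the next fetch.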

> What is my misunderstanding?

I think you expect sensible behavior; sensible behavior is
hard to supply here, and the fact that shelve's behavior is
NOT sensible is not clearly documented anywhere (at least
until my books come out:-).

I also think this situation is exceedingly unfortunate.

Python and its libraries have VERY few gotchas that are at
all comparable to this veritable trap laid out for the
perfectly reasonable unwary user expecting perfectly reasonable
behavior. This makes such traps stand out as _particularly_
troublesome when they do happen.

Unfortunately, I have no 'magic wand' fix that will both
make shelve's behavior reasonable AND still provide decent
performance in common use cases:-(.

Alex

Fredrik Lundh

Jul 14, 2002, 7:12:00 AM

Michael Schmitt wrote:

> I'm trying to use a shelve as a disc based dictionary, but have problems
> with nested lists. If I store a nested list as value, inner lists seem to
> be immutable.
>
> According to the library reference, shelves can store "essentially
> arbitrary Python objects" and "recursive data types".

the shelve only stores things when you write

d1['bla'] = value

and only fetches things when you use d1['bla']
in an expression.

if you write

d1['bla'][2].append(30)

you'll fetch a value from the shelve, modify the
return value, but you're not writing it back.

(with this in mind, this isn't that different from
asking why, say

file.read(20) + "hello"

doesn't write "hello" to the file)

to make this work as you want, think in read/write terms,
and make sure you always assign to the shelve when you've
modified a value, so it can write it back:

value = d1['bla']
value[2].append(30)
d1['bla'] = value
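
one way to avoid forgetting the final assignment is to wrap the read/modify/write cycle in a small helper. the sketch below assumes a current Python 3 interpreter; `updating` is a hypothetical helper written for this example, not something the shelve module provides, and the path is a scratch location.

```python
import contextlib
import os
import shelve
import tempfile


@contextlib.contextmanager
def updating(shelf, key):
    """Hypothetical helper: read shelf[key], let the caller modify it, write it back."""
    value = shelf[key]     # read: unpickle a copy
    yield value            # the caller modifies the copy
    shelf[key] = value     # write: pickle the modified copy back


# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "dict_shelf")

with shelve.open(path, "c") as d1:
    d1["bla"] = [1, 2, [10, 20]]
    with updating(d1, "bla") as value:
        value[2].append(30)
    result = d1["bla"]

print(result)
```

if the body of the `with updating(...)` block raises, the write step is skipped and the stored value stays as it was.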

</F>