New Ordered Dictionery to Criticise

Fuzzyman

unread,

Nov 28, 2005, 12:10:57 PM11/28/05

to

Sorry for this hurried message - I've done a new implementation of out
ordered dict. This comes out of the discussion on this newsgroup (see
blog entry for link to archive of discussion).

See the latest blog entry to get at it :
http://www.voidspace.org.uk/python/weblog/index.shtml

Criticism solicited (honestly) :-)

We (Nicola Larosa and I) haven't yet made any optimisations - but there
are two implementations to play with.

One allows you to access the keys attribute as if it was a sequence (as
well as a method).

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

"Martin v. Löwis"

unread,

Nov 28, 2005, 3:13:20 PM11/28/05

to

Fuzzyman wrote:
> Criticism solicited (honestly) :-)

A couple of minor points:
- I would drop 2.2 compatibility
- self=self isn't needed in the functions, because of
nested scopes
- popitem(self) can be rewritten as

def popitem(self):
try:
key = self._sequence.pop()
except IndexError:
# always try to give the same exception string as
# dict
raise KeyError("popitem(): dictionary is empty")
return key, self.pop(key)

- One problem with the FancyDict is that it allows
d.keys.append(100)

Regards,
Martin

Delaney, Timothy (Tim)

unread,

Nov 28, 2005, 5:28:49 PM11/28/05

to pytho...@python.org

"Martin v. Löwis" wrote:

> A couple of minor points:
> - I would drop 2.2 compatibility

Unfortunately, there are a *lot* of systems out there where only 2.2 is support (Red Hat 3.0 anyone?).

I know we'd like to be able to not support earlier versions (yes, I saw today's messages on the djgpp port) but I think where it's not too arduous it's worthwhile.

>From my quick scan, that's pop().

Tim Delaney

Christoph Zwerschke

unread,

Nov 28, 2005, 6:20:54 PM11/28/05

to

Fuzzyman wrote:
> Sorry for this hurried message - I've done a new implementation of out
> ordered dict. This comes out of the discussion on this newsgroup (see
> blog entry for link to archive of discussion).

Thanks. I'll try to check it out and put my oar in over the next
weekend. One thing I already noticed: str() and repr() both output the
OrderedDict as if it was an ordinary dictionary. For str() this may be
acceptable, but repr() should output "a valid Python expression that
could be used to recreate an object with the same value".

-- Christoph

Fuzzyman

unread,

Nov 29, 2005, 4:23:06 AM11/29/05

to

Martin v. Löwis wrote:
> Fuzzyman wrote:
> > Criticism solicited (honestly) :-)
>
> A couple of minor points:
> - I would drop 2.2 compatibility

There are a lot of cheap hosting accounts where Python 2.2 is all that
is available. I would only drop support if there is some *compelling*
reason to do so.

> - self=self isn't needed in the functions, because of
> nested scopes

Cool, I'll test this out.

> - popitem(self) can be rewritten as
>
> def popitem(self):
> try:
> key = self._sequence.pop()
> except IndexError:
> # always try to give the same exception string as
> # dict
> raise KeyError("popitem(): dictionary is empty")
> return key, self.pop(key)
>

Yup, that's nicer - thanks.

> - One problem with the FancyDict is that it allows
> d.keys.append(100)

Are you sure ?

No file given.
Alias System Command
------------------------------
cls cls
copy copy
ddir dir /ad /on
dir dir /on
echo echo
ldir dir /ad /on
ls dir /on
mkdir mkdir
ren ren
rmdir rmdir
------------------------------
Total number of aliases: 10
Movable Python
IPython Interactive Shell. See the manual for a list of features and
tips.
Ctrl-D to exit.

>>> from odict import FancyODict
>>> d = FancyODict()
>>> d
Out[3]:{}

>>> d.keys
----------------------> d.keys()
Out[4]:[]

>>> d.keys.append('anything')
---------------------------------------------------------------------------
exceptions.TypeError Traceback (most
recent call
last)

\configobj4\pythonutils\odict.py in append(self, item)
713 def __iadd__(self, other): raise TypeError('Can\'t add in
place to keys')
714 def __imul__(self, n): raise TypeError('Can\'t multiply
keys in place')
--> 715 def append(self, item): raise TypeError('Can\'t append
items to keys')
716 def insert(self, i, item): raise TypeError('Can\'t insert
items into keys')
717 def pop(self, i=-1): raise TypeError('Can\'t pop items from
keys')

TypeError: Can't append items to keys

>>>

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

>
> Regards,
> Martin

Fuzzyman

unread,

Nov 29, 2005, 4:27:02 AM11/29/05

to

Christoph Zwerschke wrote:
> Fuzzyman wrote:
> > Sorry for this hurried message - I've done a new implementation of out
> > ordered dict. This comes out of the discussion on this newsgroup (see
> > blog entry for link to archive of discussion).
>
> Thanks. I'll try to check it out and put my oar in over the next
> weekend.

Cool, thanks.

Someone has commented on my blog that passing lists to ``keys``,
``values``, and ``items`` to assign to them "smells funny".

They've suggested new methods ``setkeys``, ``setitems``, ``setvalues``
instead. Any opinions ?

> One thing I already noticed: str() and repr() both output the
> OrderedDict as if it was an ordinary dictionary. For str() this may be
> acceptable, but repr() should output "a valid Python expression that
> could be used to recreate an object with the same value".
>

Yup. If you look in the code, there is already a commented out version
of ``__repr__`` that does this.

*Unfortunately*, it means changing *all* the doctests - so it will have
to wait until one of us can be bothered. :-)

It will happen in the course of time I guess.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

> -- Christoph

"Martin v. Löwis"

unread,

Nov 29, 2005, 5:39:06 PM11/29/05

to

Fuzzyman wrote:
>>- One problem with the FancyDict is that it allows
>> d.keys.append(100)
>
>
> Are you sure ?

Not at all. This was from inspection only; I propably
misinterpreted the code.

Regards,
Martin

Fuzzyman

unread,

Dec 1, 2005, 6:38:37 AM12/1/05

to

Fuzzyman wrote:
> Sorry for this hurried message - I've done a new implementation of out
> ordered dict. This comes out of the discussion on this newsgroup (see
> blog entry for link to archive of discussion).
>
> See the latest blog entry to get at it :
> http://www.voidspace.org.uk/python/weblog/index.shtml
>

Hello all,

I've just done a new "beta 2" version. It has a full version of
FancyODict with the custome "callable sequence objects" for keys,
values and items. They are almost completely covered by tests.

You can download the new(er) version from :
http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py

Full discussion of the remaining issues below, or at :
http://www.voidspace.org.uk/python/weblog/arch_d7_2005_11_26.shtml#e147

Progress on the updated implementation of dict continues. (I hestitate
to say *new* version, as it's just a heavy makeover for the old code -
which was basically sound).

``FancyODict`` is now a full implementation of an Ordered Dictionary,
with custom *callable sequence objects* for ``keys``, ``values``, and
``items``. These can be called like normal methods, but can also be
accessed directly as sequence objects. This includes assigning to,
indexing, and slicing - as well as all the other relevant sequence
methods. {sm;:-)}

I've also added an optional index to ``OrderedDict.popitem``.

I'm sure there are lots of ways this can be optimised for efficiency -
but the new objects have got pretty full test coverage.

You can download the new version (for testing) from `odict Beta 2
<http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py>`_

The following issues still remain :

* ``FancyOdict`` is a separate class from ``OrderedDict``.

Because this version is *undoubtably* less efficient than
OrderedDict, my current thinking is that I should leave them separate
(and have both available). Being able to operate on the
keys/values/items as sequences is for convenience only.

Anyone got a suggestion for a better name than ``FancyODict`` ?

* You can no longer access the key order directly. The old ``sequence``
attribute is depracated and will eventually go away.

You can currently alter the order (of keys, values *and* items) by
passing an iterable into those methods.

Someone has suggested that this "smells bad" - and it ought to be
done through separate `setkeys``, ``setvalues``, and ``setitems``
methods.

I'm *inclined* to agree, but I don't feel strongly about it. Anyone
else got any opinions ?

* ``repr`` ought to return a value that ``eval`` could use to turn back
into an OrderedDict.

I have actually done an implementation of this, but it would mean
that *all* the doctests need to be changed. I *will* do this at some
point.

* Slice assignment.

The semantics for slice assignment are fiddly.

For example, what do you do if in a slice assignment a key collides
with an existing key ?

My current implementation does what an ordinary dictionary does,
the new value overwrites the previous one. This means that the
dictionary can reduce in size as the assignment progresses. {sm;:?}

I think this is preferable to raising an error and preventing
assignment. It does prevent an optimisation whereby I calculate the
indexes of all the new items in advance.

It also means you can't rely on the index of a key from a slice
assignment, unless you know that there will be no key collisions.

In general I'm *against* preventing programmers from doing things,
so long as the documentation carries an appropriate warning.

An example will probably help illustrate this :

.. raw:: html

{+coloring}

d = OrderedDict()
d[1] = 1
d[2] = 2
d[3] = 3
d[4] = 4
d.keys()
[1, 2, 3]

# fetching every other key
# using an extended slice
# this actually returns an OrderedDict
d[::2]
{1: 1, 3: 3}

# we can assign to every other key
# using an ordered dict
d[::2] = OrderedDict([(2, 9), (4, 8)])
len(d) == 4
False

d
{2: 9, 4: 8}

"""
Because of the key collisions the length of
d has changed - it now only has two keys instead
of four.
"""

{-coloring}

Bengt Richter

unread,

Dec 2, 2005, 2:06:47 AM12/2/05

to

On 1 Dec 2005 03:38:37 -0800, "Fuzzyman" <fuzz...@gmail.com> wrote:

>
>Fuzzyman wrote:
>> Sorry for this hurried message - I've done a new implementation of out
>> ordered dict. This comes out of the discussion on this newsgroup (see
>> blog entry for link to archive of discussion).
>>
>> See the latest blog entry to get at it :
>> http://www.voidspace.org.uk/python/weblog/index.shtml
>>
>
>Hello all,
>
>I've just done a new "beta 2" version. It has a full version of
>FancyODict with the custome "callable sequence objects" for keys,
>values and items. They are almost completely covered by tests.
>
>You can download the new(er) version from :
>http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py
>

Ran my tests on it: this one may be of interest:
__________________________ entrypoint: test_popitem ___________________________

def test_popitem():
d = CandidateDict([(k,i*100) for i,k in enumerate('abcde')])
> assert d.popitem() == ('e', 400)

[C:\pywk\ut\test\test_odicts.py:228]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
...
...
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

def __getattr__(self, name):
"""
Implemented so that access to ``sequence`` raises a warning.

>>> d = OrderedDict()
>>> d.sequence
[]
"""
if name == 'sequence':
warn('use of sequence attribute is deprecated. Use keys method '
'instead.', DeprecationWarning)
# NOTE: Still (currently) returns a direct reference. Need to
# because code that uses sequence will expect to be able to
# mutate it in place.
return self._sequence
else:
# NOTE: strictly unnecessary, but it raises the appropriate error
> getattr(self, name)

[c:\pywk\clp\odictbeta2.py:309]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Recursion detected (same locals & position)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============= tests finished: 19 passed, 9 failed in 1.57 seconds =============

I'm playing this game, but I wonder how much of it really makes sense ;-)

>
> Someone has suggested that this "smells bad" - and it ought to be
>done through separate `setkeys``, ``setvalues``, and ``setitems``
>methods.
>
> I'm *inclined* to agree, but I don't feel strongly about it. Anyone
>else got any opinions ?

I don't see a big difference. What's important, if you ask me, is
whether you get an idea of what will happen when you pass the iterable as
an argument. OTTOMH that suggests that keys/values/items[slice] = iterable
have more expectation-generating precedent, and in fact perhaps should define
what is to happen if no other constraints are violated.

>
>* ``repr`` ought to return a value that ``eval`` could use to turn back
>into an OrderedDict.
>
> I have actually done an implementation of this, but it would mean
>that *all* the doctests need to be changed. I *will* do this at some
>point.

Point in favor of py.test's separation of test/tested I guess.

>
>* Slice assignment.
>
> The semantics for slice assignment are fiddly.
>
> For example, what do you do if in a slice assignment a key collides
>with an existing key ?

IMO, a slice defines the sliced-out as well as the untouched, and I think
it is an error to collide with the unouched part, since it requires either
modifying the "untouched" or ignoring incoming slice data.
Incoming can go by the rules of a plain dict constuctor, except preserving order,
and the rest should really be untouched IMO. BTW, I think new keys in the
incoming slice data should NOT wind up at the end of the overall dict, since
that violates the "untouched" part.

>
> My current implementation does what an ordinary dictionary does,
>the new value overwrites the previous one. This means that the
>dictionary can reduce in size as the assignment progresses. {sm;:?}

That part is ok, though I wonder if some might want to have a strict-replacement
mode for an odict, so slices can't change sizes.
BTW just thought of a name: itemdict -- to suggest the item-list-ordered semantics.

>
> I think this is preferable to raising an error and preventing
>assignment. It does prevent an optimisation whereby I calculate the
>indexes of all the new items in advance.

a strict_replacement=True kwyord arg to the descriptor could let you
do both.

>
> It also means you can't rely on the index of a key from a slice
>assignment, unless you know that there will be no key collisions.

Well, if you guarantee no name collisions in the source items by using
an ordered dict as source, and make collisions with the "untouched"
parts of a slice assignment an error, then you can rely on it.
IMO that would be a useful middle way.

>
> In general I'm *against* preventing programmers from doing things,
>so long as the documentation carries an appropriate warning.

Me too, in general ;-)

>
> An example will probably help illustrate this :
>
>.. raw:: html
>
> {+coloring}
>
> d = OrderedDict()
> d[1] = 1
> d[2] = 2
> d[3] = 3
> d[4] = 4
> d.keys()
> [1, 2, 3]

Where's the 4?
(I.e., why not just take your examples off an interactive screen? ;-)

>
> # fetching every other key
> # using an extended slice
> # this actually returns an OrderedDict
> d[::2]
> {1: 1, 3: 3}

OrderedDict([(1, 1), (3, 3)]) # in future, right?
(i.e. '%s(%s)' %( type(d).__name__. d.items()) # whatever the final name ;-)

>
> # we can assign to every other key
> # using an ordered dict

Why the restriction to ordered dict, other than as a convenient
way to pre-normalize the item list? Doesn't update accept a plain item list?

> d[::2] = OrderedDict([(2, 9), (4, 8)])

i.e., why not accept plain [(2, 9), (4, 8)] here?
> len(d) == 4
> False
BTW, this one I don't do yet. I reject aslice.step not in (None, 1)
I seemed a bit too weird. But I guess I could just let the programmer decide ;-)

>
> d
> {2: 9, 4: 8}
>
> """
> Because of the key collisions the length of
> d has changed - it now only has two keys instead
> of four.
> """

IMO the above collision is an error that should raise an exception,
because the collision is with a key outside the (albeit interlaced)
target slice, so you are effectively assigning outside the slice.

BTW, what about slices of keys and values? E.g. does

d.keys[:] = ['prefix_'+ key for key in d.keys()]

make sense? Note that it is different from d[:] = something.

Similarly d.values, thoug d.items[sliceinst] might be the same
as d[sliceinst] for assignment. d[sliceinst] returns an ordered dict though,
where as d.items[sliceinst] returns the same item list slice as
d.items()[sliceinst].

Regards,
Bengt Richter

Fuzzyman

unread,

Dec 2, 2005, 9:03:43 AM12/2/05

to

Hello Bengt,

Yep the line :

getattr(self, name)

is a bug.

I've replaced it now (not yet "published") with
object.__getattr__(self, name)
It could equally just be replaced with a raise AttributeError

You've picked up on another typo in the code, whichis odd because
``popitem`` is tested.

Oh well - works now.

[snip..]

> > You can currently alter the order (of keys, values *and* items) by
> >passing an iterable into those methods.
> I'm playing this game, but I wonder how much of it really makes sense ;-)
>

Can you explain what you mean ?

> >
> > Someone has suggested that this "smells bad" - and it ought to be
> >done through separate `setkeys``, ``setvalues``, and ``setitems``
> >methods.
> >
> > I'm *inclined* to agree, but I don't feel strongly about it. Anyone
> >else got any opinions ?
> I don't see a big difference. What's important, if you ask me, is
> whether you get an idea of what will happen when you pass the iterable as
> an argument. OTTOMH that suggests that keys/values/items[slice] = iterable
> have more expectation-generating precedent, and in fact perhaps should define
> what is to happen if no other constraints are violated.

Don't completely follow.

Are you saying that

d = OrderedDict(some_iterable)
d.keys[slice] = iterable

is more *understandable* ("expectation-generating precedent" WTF) ?

The alternative being : d.keys(iterable)

In which case I'm inclined to agree. *However* - having custom objects
for items/keys/values is undoubtably less efficient.

Therefore I'm supplying two classes OrderedDict and FancyODict (name
not yet fixed).

OrderedDict needs a way of changing the keys - so it needs either

d.keys(iterable)

*or*

d.setkeys(iterable)

Which is preferable, is my question.

I think you're saying you don't care :-)
In which case I will leave it as it presently is.

> >
> >* ``repr`` ought to return a value that ``eval`` could use to turn back
> >into an OrderedDict.
> >
> > I have actually done an implementation of this, but it would mean
> >that *all* the doctests need to be changed. I *will* do this at some
> >point.
> Point in favor of py.test's separation of test/tested I guess.
>

Probably. :-)

doctest is part of the standard library - which is a point in it's
favour.

> >
> >* Slice assignment.
> >
> > The semantics for slice assignment are fiddly.
> >
> > For example, what do you do if in a slice assignment a key collides
> >with an existing key ?
> IMO, a slice defines the sliced-out as well as the untouched, and I think
> it is an error to collide with the unouched part, since it requires either
> modifying the "untouched" or ignoring incoming slice data.
> Incoming can go by the rules of a plain dict constuctor, except preserving order,
> and the rest should really be untouched IMO. BTW, I think new keys in the
> incoming slice data should NOT wind up at the end of the overall dict, since
> that violates the "untouched" part.
>

"new keys in the incoming slice data should NOT wind up at the end of
the overall dict"

I'm not a 100% certain I understand you, and I possibly didn't explain
myself properly.

If you assign to a slice, and there is a key collision. The old
(pre-existing) key is removed and the new inserted at the expected
index - so it doesn't end up at the "end".

> >
> > My current implementation does what an ordinary dictionary does,
> >the new value overwrites the previous one. This means that the
> >dictionary can reduce in size as the assignment progresses. {sm;:?}
> That part is ok, though I wonder if some might want to have a strict-replacement
> mode for an odict, so slices can't change sizes.
> BTW just thought of a name: itemdict -- to suggest the item-list-ordered semantics.
> >
> > I think this is preferable to raising an error and preventing
> >assignment. It does prevent an optimisation whereby I calculate the
> >indexes of all the new items in advance.
> a strict_replacement=True kwyord arg to the descriptor could let you
> do both.
>

Then I have to implement *both* ways of doing it ! I'll consider it.

> >
> > It also means you can't rely on the index of a key from a slice
> >assignment, unless you know that there will be no key collisions.
> Well, if you guarantee no name collisions in the source items by using
> an ordered dict as source, and make collisions with the "untouched"
> parts of a slice assignment an error, then you can rely on it.
> IMO that would be a useful middle way.
>

I'll consider that as well. As I said I don't like raising errors when
I don't have to. If the programmer wants to guarantee there are no
collisions then he can check himself. :-)

> >
> > In general I'm *against* preventing programmers from doing things,
> >so long as the documentation carries an appropriate warning.
> Me too, in general ;-)

[snip..]

> > # we can assign to every other key
> > # using an ordered dict
> Why the restriction to ordered dict, other than as a convenient
> way to pre-normalize the item list? Doesn't update accept a plain item list?
>

Allow slice assignment of a list of tuples like update... hmmm...

Maybe if anyone asks for it "in the wild".

> > d[::2] = OrderedDict([(2, 9), (4, 8)])
> i.e., why not accept plain [(2, 9), (4, 8)] here?
> > len(d) == 4
> > False
> BTW, this one I don't do yet. I reject aslice.step not in (None, 1)
> I seemed a bit too weird. But I guess I could just let the programmer decide ;-)

I didn't see any reason to restrict it, seeing as for my (unoptimised)
implementation it doesn't make any difference.

> >
> > d
> > {2: 9, 4: 8}
> >
> > """
> > Because of the key collisions the length of
> > d has changed - it now only has two keys instead
> > of four.
> > """
> IMO the above collision is an error that should raise an exception,
> because the collision is with a key outside the (albeit interlaced)
> target slice, so you are effectively assigning outside the slice.
>

Yeah - maybe. Still not sure about this.

> BTW, what about slices of keys and values? E.g. does
>
> d.keys[:] = ['prefix_'+ key for key in d.keys()]
>
> make sense? Note that it is different from d[:] = something.
>

Yep - it allows you to change the keys in place. (Retaining the
position in the sequence)

> Similarly d.values, thoug d.items[sliceinst] might be the same
> as d[sliceinst] for assignment. d[sliceinst] returns an ordered dict though,
> where as d.items[sliceinst] returns the same item list slice as
> d.items()[sliceinst].
>

Yep. Lots to consider. Only a few minor tweaks and my implementation is
basically working enough to release.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

> Regards,
> Bengt Richter