Dictionary from list?

Jim Correia

Oct 18, 2001, 11:42:57 PM
Suppose I have a list with an even number of elements in the form

[key1, value1, key2, value2, key3, value3]

If I were writing Perl I could write

my %hash = @array;

and it would do the coercion for me, and build the hash appropriately.

I'm a python newbie. I know how to do it "by hand" with a loop, but is
there a built-in conversion operator that will let me do something
simply like the perl assignment?

Jim

Michael Hudson

Oct 19, 2001, 4:57:46 AM
Jim Correia <jim.c...@pobox.com> writes:

No. Write the loop.

(You could probably do something horrible with reduce, but don't).

I've never found myself needing to do this, but that may be because
I'm not used to having a convenient way of going from
[k1,v1,k2,v2,...] to a dict.
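
For readers following along, the loop being recommended is short. A minimal sketch, assuming a current Python 3 and a made-up helper name pairs_to_dict (neither comes from this thread):

```python
def pairs_to_dict(flat):
    """Build a dict from a flat [k1, v1, k2, v2, ...] sequence."""
    result = {}
    # Step through the sequence two items at a time: key, then value.
    for i in range(0, len(flat), 2):
        result[flat[i]] = flat[i + 1]
    return result


print(pairs_to_dict(["a", 1, "b", 2]))
```

As written, an odd-length input raises IndexError on the last key, which may or may not be the behaviour you want.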

Cheers,
M.

--
After a heavy night I travelled on, my face toward home - the comma
being by no means guaranteed. -- paraphrased from cam.misc

Ignacio Vazquez-Abrams

Oct 19, 2001, 5:48:16 AM
On Fri, 19 Oct 2001, Michael Hudson wrote:

> Jim Correia <jim.c...@pobox.com> writes:
>
> > Suppose I have a list with an even number of elements in the form
> >
> > [key1, value1, key2, value2, key3, value3]
> >
> > If I were writing Perl I could write
> >
> > my %hash = @array;
> >
> > and it would do the coercion for me, and build the hash appropriately.
> >
> > I'm a python newbie. I know how to do it "by hand" with a loop, but is
> > there a built-in conversion operator that will let me do something
> > simply like the perl assignment?
>
> No. Write the loop.
>
> (You could probably do something horrible with reduce, but don't).

No, I don't think that reduce() could help, although list comprehensions
might.

> I've never found myself needing to do this, but that may be because
> I'm not used to having a convenient way of going from
> [k1,v1,k2,v2,...] to a dict.

I don't know anything about Perl's internals (and not much about its
externals either, but that's a different story ;) ), but it may be that Perl's
hash is simply an array in disguise, whereas in Python mappings and sequences
are discrete objects.

--
Ignacio Vazquez-Abrams <ign...@openservices.net>

"As far as I can tell / It doesn't matter who you are /
If you can believe there's something worth fighting for."
- "Parade", Garbage


Michael Hudson

Oct 19, 2001, 6:26:03 AM
Ignacio Vazquez-Abrams <ign...@openservices.net> writes:

[perl's %hash = @array]


> > (You could probably do something horrible with reduce, but don't).
>
> No, I don't think that reduce() could help, although list comprehensions
> might.

OK, you asked for it <wink>:

/>> def beargh(d):
|..     unique = []
|..     def ouch(x,y):
|..         if x is unique:
|..             return y
|..         else:
|..             d[x] = y
|..             return unique
|..     return ouch
\__
->> d = {}
->> reduce(beargh(d), ['a', 1, 'b', 2])
[]
->> d
{'a': 1, 'b': 2}

you may have underestimated what I meant by horrible :-)

Cheers,
M.

--
Counting lines is probably a good idea if you want to print it out
and are short on paper, but I fail to see the purpose otherwise.
-- Erik Naggum, comp.lang.lisp

Ivan A. Vigasin

Oct 19, 2001, 8:56:34 AM
On Fri, 19 Oct 2001, Michael Hudson wrote:

> />> def beargh(d):
> |..     unique = []
> |..     def ouch(x,y):
> |..         if x is unique:
> |..             return y
> |..         else:
> |..             d[x] = y
> |..             return unique
> |..     return ouch
> \__
> ->> d = {}
> ->> reduce(beargh(d), ['a', 1, 'b', 2])
> []
> ->> d
> {'a': 1, 'b': 2}

Idea is great! It comes from functional programming?

Your code doesn't work in my Python 2.1 ...

t1.py:1: SyntaxWarning: local name 'd' in 'beargh' shadows use of 'd' as global in nested scope 'ouch'
  def beargh(d):
t1.py:1: SyntaxWarning: local name 'unique' in 'beargh' shadows use of 'unique' as global in nested scope 'ouch'
  def beargh(d):
Traceback (most recent call last):
  File "t1.py", line 12, in ?
    reduce(beargh(d), ['a', 1, 'b', 2])
  File "t1.py", line 4, in ouch
    if x is unique:
NameError: global name 'unique' is not defined


I slightly modified your code and got the following:

class beargh:
    def __init__(self,d):
        self.d = d
    def __call__(self,x,y):
        if x == []:
            return y
        else:
            self.d[x] = y
            return []

d = {}
reduce( beargh(d), ['a', 1, 'b', 2] )
print d

Regards, Ivan <v...@parallelgraphics.com>
ICQ: 22181170


Jim Correia

Oct 19, 2001, 9:17:54 AM
In article <u8ze8v...@python.net>, Michael Hudson <m...@python.net>
wrote:

> > I'm a python newbie. I know how to do it "by hand" with a loop, but is
> > there a built-in conversion operator that will let me do something
> > simply like the perl assignment?
>
> No. Write the loop.

That's unfortunate - a simple assignment would be better. In the
simplest case of my usage, the loop (while ultra short) could be 50% of
the code.

> I've never found myself needing to do this, but that may be because
> I'm not used to having a convenient way of going from
> [k1,v1,k2,v2,...] to a dict.

The particular situation is the python implementation is going to be a
command line script. The calling convention for this script is to pass
arguments on the command line in key value pairs. The perl
implementation looks like (in the simplest case)

my %args = @ARGV;

foreach(keys %args)
{
    print("$_: $args{$_}\n");
}

And it would be called from the command line as

perl myScript.pl name fred age 23 occupation "gravel worker"

I'd like to implement the script with the same calling conventions in
python, and have an "easy" (typing wise, not conceptually, but I guess
I'll have to write a reusable function) way to take the arguments on the
command line and turn them into a dictionary.

Jim

Michael Hudson

Oct 19, 2001, 9:18:10 AM
"Ivan A. Vigasin" <v...@ParallelGraphics.COM> writes:

> On Fri, 19 Oct 2001, Michael Hudson wrote:
>
> > />> def beargh(d):
> > |..     unique = []
> > |..     def ouch(x,y):
> > |..         if x is unique:
> > |..             return y
> > |..         else:
> > |..             d[x] = y
> > |..             return unique
> > |..     return ouch
> > \__
> > ->> d = {}
> > ->> reduce(beargh(d), ['a', 1, 'b', 2])
> > []
> > ->> d
> > {'a': 1, 'b': 2}
> Idea is great!

You mean that for real? I think it's foul. It abuses the
implementation details of reduce to do something that function was
never intended to do.

> It comes from functional programming?

No. There are side-effects all over the place!

> Your code doesn't work in my Python 2.1 ...
>
> t1.py:1: SyntaxWarning: local name 'd' in 'beargh' shadows use of 'd' as global in nested scope 'ouch'
>   def beargh(d):
> t1.py:1: SyntaxWarning: local name 'unique' in 'beargh' shadows use of 'unique' as global in nested scope 'ouch'
>   def beargh(d):
> Traceback (most recent call last):
>   File "t1.py", line 12, in ?
>     reduce(beargh(d), ['a', 1, 'b', 2])
>   File "t1.py", line 4, in ouch
>     if x is unique:
> NameError: global name 'unique' is not defined

You needed a "from __future__ import nested_scopes" in there
somewhere.

> I slightly modified your code and got the following:
>
> class beargh:
>     def __init__(self,d):
>         self.d = d
>     def __call__(self,x,y):
>         if x == []:
>             return y
>         else:
>             self.d[x] = y
>             return []
>
> d = {}
> reduce( beargh(d), ['a', 1, 'b', 2] )
> print d

Actually, that works almost as well as my code -- you get problems
with input like ['a', 1, [], 2], but you have problems with that
anyway as lists aren't hashable.
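
The difference between the two versions comes down to identity versus equality: one checks "is" against a single private list object, the other checks "==", so any empty list in the input looks like the marker. A small sketch, assuming a current Python 3 and made-up variable names:

```python
marker = []      # one particular list object, used only as a marker
lookalike = []   # a different object that happens to compare equal

print(lookalike == marker)   # equality compares contents
print(lookalike is marker)   # identity compares the objects themselves

# Either way, a list cannot be used as a dictionary key.
try:
    {}[lookalike] = 2
    key_accepted = True
except TypeError as exc:
    key_accepted = False
    print("cannot use a list as a key:", exc)
```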

Just in case anyone hasn't got the message yet:

THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.

Cheers,
M.

--
Now this is what I don't get. Nobody said absolutely anything
bad about anything. Yet it is always possible to just pull
random flames out of ones ass.
-- http://www.advogato.org/person/vicious/diary.html?start=60

Rich Harkins

Oct 19, 2001, 9:38:38 AM
Here's one (albeit ugly) possibility using list comprehensions and a
sufficiently new Python:

# Setup variables
d={}
l=[n1,v1,n2,v2,n3,v3,...]

# Here's the money line...
[d.__setitem__(l[i],l[i+1]) for i in range(0,len(l),2)]

It is trading one type of loop for another but it will work. I did some
performance analysis and it does turn out that using a list comprehension
here is slower than using a loop, not to mention less readable to new
Pythonists. I recommend the loop form myself, it's easy to read and it's
fast enough to do the job.
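
With a Python whose dict constructor accepts an iterable of (key, value) pairs (true of current Python 3; that is an assumption relative to the versions discussed in this thread), slicing plus zip() gives an alternative with no side effects. A sketch with made-up sample data:

```python
flat = ["n1", "v1", "n2", "v2", "n3", "v3"]

# flat[::2] picks the keys, flat[1::2] picks the values;
# zip() pairs them up and dict() consumes the pairs.
d = dict(zip(flat[::2], flat[1::2]))
print(d)
```

Note that zip() stops at the shorter input, so a trailing key with no value is dropped silently rather than reported.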

Rich



Andrei Kulakov

Oct 19, 2001, 11:50:33 AM

You know, this sort of thing comes up often - X is a built-in feature in
perl, why do I have to do it manually in python? I think this is a
justified design decision - there's simply too many built-in things perl
does. You can't remember all of them and if you look at someone else's
perl program you have to scratch your head a lot. Python is small, lean
and clean, and that's one of the best things about it.

- Andrei

--
Cymbaline: intelligent learning mp3 player - python, linux, console.
get it at: cy.silmarill.org

Erik Max Francis

Oct 19, 2001, 2:03:52 PM
Andrei Kulakov wrote:

> You know, this sort of thing comes up often - X is a built-in feature in
> perl, why do I have to do it manually in python? I think this is a
> justified design decision - there's simply too many built-in things perl
> does. You can't remember all of them and if you look at someone else's
> perl program you have to scratch your head a lot. Python is small, lean
> and clean, and that's one of the best things about it.

In particular, what this is really revealing is that a Perl associative
array is really represented as a list, with each successive pair
representing a key and a value. (When writing list and associative
array literals, you see the same thing.)

This goes back to Perl's weakly-typed approach to everything. Since
Python is not weakly typed, emulating this behavior is not only not
convenient, but is not even advisable.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, US / 37 20 N 121 53 W / ICQ16063900 / &tSftDotIotE
/ \ When one is in love, a cliff becomes a meadow.
\__/ (an Ethiopian saying)
7 sisters productions / http://www.7sisters.com/
Web design for the future.

Quinn Dunkan

Oct 19, 2001, 4:56:51 PM
On Fri, 19 Oct 2001 13:17:54 GMT, Jim Correia <cor...@barebones.com> wrote:
>In article <u8ze8v...@python.net>, Michael Hudson <m...@python.net>
>wrote:
>
>> > I'm a python newbie. I know how to do it "by hand" with a loop, but is
>> > there a built-in conversion operator that will let me do something
>> > simply like the perl assignment?
>>
>> No. Write the loop.
>
>That's unfortunate - a simple assignment would be better. In the
>simplest case of my usage, the loop (while ultra short) could be 50% of
>the code.

This doesn't make any sense to me. If a two line loop is 50% of your program,
then your program is four lines. If you're going to play meaningless
line-counting games at least come up with some more impressive numbers :)

And assignment that uses a heuristic to convert an array into a hash is not
really "simple".

>> I've never found myself needing to do this, but that may be because
>> I'm not used to having a convenient way of going from
>> [k1,v1,k2,v2,...] to a dict.
>
>The particular situation is the python implementation is going to be a
>command line script. The calling convention for this script is to pass
>arguments on the command line in key value pairs. The perl
>implementation looks like (in the simplest case)
>
>my %args = @ARGV;
>
>foreach(keys %args)
>{
> print("$_: $args{$_}\n");
>}
>
>And it would be called from the command line as
>
>perl myScript.pl name fred age 23 occupation "gravel worker"

import sys
d = {}
assert (len(sys.argv)-1) % 2 == 0, \
    'argv must consist of pairs, key "%s" has no value' % sys.argv[-1]
for i in range(1, len(sys.argv), 2):
    d[sys.argv[i]] = sys.argv[i+1]

for k, v in d.items():
    print k + ': ' + v

Note that the python version actually checks to make sure the input makes
sense. There's no way to know how the perl version reacts to bad input except
by testing it or reading the documentation. I suspect perl will silently give
the last key a nil value, but once again there's no way to know for sure by
just looking at it. Hopefully you and everyone who reads your code has all
the little details in the camel book memorized.

This is a good demonstration of why many people prefer the explicit python
approach. What if you want bad input to report an error? What if you want it
to not be an error, but you want the default value to be something other than
nil? What if you start off wanting it to be an error, but later decide it
should give a default value? If you wrote this in perl using assignment,
you'd have to write a function and then track down all those assignments (have
fun checking every assignment in a large program) and replace them with a
function call.

Learning a new language involves more than learning the syntax and libraries.
Consider python an opportunity to gain another perspective on the practice of
programming.

>I'd like to implement the script with the same calling conventions in
>python, and have an "easy" (typing wise, not conceptually, but I guess
>I'll have to write a reusable function) way to take the arguments on the
>command line and turn them into a dictionary.

A two line loop is pretty easy. If you want to do this a lot then yes, you
should define a function, in which case 'd = dictconv(a)' is the same number
of lines as '%d = @a;'. If you want to quibble, then yes, it's 7 characters
longer, but consider that you don't have to type '%@;' which reduces the
difference to 4 characters. '%@' involves pressing shift twice, so that
further reduces it to a 2 character difference. You could then name the
function dconv and type 1 *less* character! Isn't python great?
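
A sketch of what such a function might look like, with the bad-input policy kept in one place so it can be changed later. This assumes a current Python 3; the name dictconv is the one used above, while the default parameter and the _MISSING marker are hypothetical additions for illustration:

```python
_MISSING = object()  # private marker meaning "no default supplied"


def dictconv(flat, default=_MISSING):
    """Convert [k1, v1, k2, v2, ...] to a dict.

    With no default, an odd-length input raises ValueError.
    With a default, a trailing key without a value gets that default.
    """
    items = list(flat)
    if len(items) % 2:
        if default is _MISSING:
            raise ValueError("key %r has no value" % (items[-1],))
        items.append(default)
    # Even positions are keys, odd positions are values.
    return dict(zip(items[::2], items[1::2]))


print(dictconv(["name", "fred", "age", "23"]))
print(dictconv(["name", "fred", "age"], default=None))
```

Switching from "error" to "default value" is then a change to the call (or to the function), not a hunt through every assignment.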

Tim Hammerquist

Oct 19, 2001, 6:16:54 PM
It seems to me that Erik Max Francis <m...@alcyone.com> said:
[ snip ]

> In particular, what this is really revealing is that a Perl associative
> array is really represented as a list, with each successive pair
> representing a key and a value.

That's incorrect, or at least deceptively vague. Perl hashes may be
_assigned_ a list which is parsed and initialized using the algorithm
you described. This can easily be implemented in Python by simply
subclassing and providing a constructor to this effect.

When passed unadorned to the print function, they are also interpolated
as lists are. This, also, can be implemented using the __repr__ method
in the subclass mentioned above.

Internally, however, Perl hashes and Python dictionaries are treated
very similarly. It is merely their interface that differs. Is this not
the advantage of OOP?

I'm not by any means recommending subclassing UserDict to create a
PerlDict or some such. It's merely to demonstrate how simple it is to
change Python's dicts to behave as Perl hashes, and vice versa.
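
A rough sketch of what such a subclass could look like, assuming a current Python 3 and its collections.UserDict. The class name PerlDict is borrowed from the paragraph above, and the flatten() method is purely illustrative; this is a demonstration, not a recommendation:

```python
from collections import UserDict


class PerlDict(UserDict):
    """Dict that can be built from a flat [k1, v1, k2, v2, ...] list."""

    def __init__(self, flat=()):
        super().__init__()
        items = list(flat)
        if len(items) % 2:
            raise ValueError("odd number of elements in flat list")
        # Consume the list pairwise: even positions are keys.
        for i in range(0, len(items), 2):
            self[items[i]] = items[i + 1]

    def flatten(self):
        """Return the contents as a flat [k1, v1, k2, v2, ...] list."""
        flat = []
        for key, value in self.items():
            flat.extend((key, value))
        return flat


h = PerlDict(["name", "fred", "age", 23])
print(h["name"], h.flatten())
```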

> This goes back to Perl's weakly-typed approach to everything. Since
> Python is not weakly typed, emulating this behavior is not only not
> convenient, but is not even advisable.

This has nothing to do with weak-typing, strong-typing, or touch-typing. It
has to do with a language's data-type interface.

Were you perhaps thinking of Tcl?

Tim
--
The biggest problem with communication is the illusion that it has occurred.

Jim Correia

Oct 19, 2001, 9:03:36 PM
In article <slrn9t0il...@sill.silmarill.org>,
Andrei Kulakov <si...@optonline.net> wrote:

> You know, this sort of thing comes up often - X is a built-in feature in
> perl, why do I have to do it manually in python? I think this is a
> justified design decision - there's simply too many built-in things perl
> does. You can't remember all of them and if you look at someone else's
> perl program you have to scratch your head a lot. Python is small, lean
> and clean, and that's one of the best things about it.

I wasn't insisting that there must be one in python, just asking if there
was :-)

Jim Correia

Oct 19, 2001, 9:04:43 PM
In article <3BD06B08...@alcyone.com>,

Erik Max Francis <m...@alcyone.com> wrote:

> In particular, what this is really revealing is that a Perl associative
> array is really represented as a list, with each successive pair
> representing a key and a value. (When writing list and associative
> array literals, you see the same thing.)

It reveals no such thing. What it reveals is that there is a magic
coercion that happens at assign time.

Jim Correia

Oct 19, 2001, 9:14:46 PM
In article <slrn9t14si...@cruzeiro.ugcs.caltech.edu>,
qu...@cruzeiro.ugcs.caltech.edu (Quinn Dunkan) wrote:

> On Fri, 19 Oct 2001 13:17:54 GMT, Jim Correia <cor...@barebones.com> wrote:
> >In article <u8ze8v...@python.net>, Michael Hudson <m...@python.net>
> >wrote:
> >
> >> > I'm a python newbie. I know how to do it "by hand" with a loop, but is
> >> > there a built-in conversion operator that will let me do something
> >> > simply like the perl assignment?
> >>
> >> No. Write the loop.
> >
> >That's unfortunate - a simple assignment would be better. In the
> >simplest case of my usage, the loop (while ultra short) could be 50% of
> >the code.
>
> This doesn't make any sense to me. If a two line loop is 50% of your program,
> then your program is four lines. If you're going to play meaningless
> line-counting games at least come up with some more impressive numbers :)

It was a stupid argument.

> And assignment that uses a heuristic to convert an array into a hash is not
> really "simple".

I'm used to the perl syntax for this operation, and it would be easier
to type. That is all.

> import sys
> d = {}
> assert (len(sys.argv)-1) % 2 == 0, \
>     'argv must consist of pairs, key "%s" has no value' % sys.argv[-1]
> for i in range(1, len(sys.argv), 2):
>     d[sys.argv[i]] = sys.argv[i+1]
>
> for k, v in d.items():
>     print k + ': ' + v

That's what I ended up doing.

> Note that the python version actually checks to make sure the input makes
> sense. There's no way to know how the perl version reacts to bad input except
> by testing it or reading the documentation. I suspect perl will silently give
> the last key a nil value, but once again there's no way to know for sure by
> just looking at it.

It fails at runtime with an "odd number of elements in hash list" error.

> Hopefully you and everyone who reads your code has all
> the little details in the camel book memorized.

I was just asking if there was a built in coercion available.
(Coercions aren't unheard of in scripting languages.) This wasn't a "my
language is better than your language" war. I don't have particularly
strong feelings about perl or python one way or the other (I do most of
my work in compiled languages - C mostly) but use the right tool, or
sometimes the convenient tool, for the job.

> This is a good demonstration of why many people prefer the explicit python
> approach. What if you want bad input to report an error?

There is nothing stopping you from checking and doing that in perl
either.

> What if you want it to not be an error, but you want the default
> value to be something other than nil? What if you start off wanting
> it to be an error, but later decide it should give a default value?

I don't see anything stopping you from doing this in perl (or anything
else for that matter).

> If you wrote this in perl using assignment, you'd have to write a
> function and then track down all those assignments (have fun checking
> every assignment in a large program) and replace them with a function
> call.

But you are ignoring the initial constraints :-). In this particular
situation the arguments are passed to the script as key/value pairs.
The conversion is done at the top of "main" and the dictionary is used
throughout the rest of the program.

> Learning a new language involves more than learning the syntax and libraries.
> Consider python an opportunity to gain another perspective on the practice of
> programming.

No need to be condescending. I've got plenty of "practice" and
experience programming as well as shipping large, complex, high quality
products to customers. I didn't ask the question so you can start a
pissing contest of credentials.

> A two line loop is pretty easy. If you want to do this a lot then yes, you
> should define a function, in which case 'd = dictconv(a)' is the same number
> of lines as '%d = @a;'. If you want to quibble, then yes, it's 7 characters
> longer, but consider that you don't have to type '%@;' which reduces the
> difference to 4 characters.

I was looking for the perl conversion because it was easier to type, and
I didn't have to carry around an extra conversion function (either in an
external file or by cutting and pasting into otherwise single, one file,
portable scripts). That's why a language intrinsic was desired. Since
there isn't one, the two line loop will have to suffice.

Martien Verbruggen

Oct 19, 2001, 9:28:55 PM
On Fri, 19 Oct 2001 11:03:52 -0700,

Erik Max Francis <m...@alcyone.com> wrote:
> Andrei Kulakov wrote:
>
>> You know, this sort of thing comes up often - X is a built-in feature in
>> perl, why do I have to do it manually in python? I think this is a
>> justified design decision - there's simply too many built-in things perl
>> does. You can't remember all of them and if you look at someone else's
>> perl program you have to scratch your head a lot. Python is small, lean
>> and clean, and that's one of the best things about it.
>
> In particular, what this is really revealing is that a Perl associative
> array is really represented as a list, with each successive pair
> representing a key and a value. (When writing list and associative
> array literals, you see the same thing.)

Euhmmm... No, I don't agree.

What this reveals is that in Perl lists, arrays and hashes
are totally different things, but they can be assigned to each other
(with some restrictions). A list is not a Perl data type, it's something
that exists internally in Perl and gets produced by subroutine calls,
and certain operations. Even that thing with the parentheses and
the commas isn't really a list; it produces a list at runtime.

To say that Perl hashes are implemented as lists really misses the point
of what a hash is and what a list is. If hashes were lists, then they
wouldn't be any faster than an array, and their existence would be
totally superfluous. The last time I read the bits of perl's sources
that pertain to hashes, they were still implemented as a hash :)

> This goes back to Perl's weakly-typed approach to everything. Since
> Python is not weakly typed, emulating this behavior is not only not
> convenient, but is not even advisable.

I don't see why this is relevant. Python is strongly typed, yes, but
that doesn't mean you _could_ in no circumstance make tuples or lists
and hashes assignment-compatible, even if only in one direction. C is
strongly typed, but it allows many assignments with automatic conversions
between variables with differing (but assignment-compatible!) types.

That Perl is weakly typed is beside the point. That Perl has
well-defined automatic conversion behaviour between (almost) all of its
internal types is closer to the point.

That during the design of python some decisions were made to have strict
boundaries between these things, and not allow assignments to work from
one type to another is probably more the issue.

Martien
--
|
Martien Verbruggen | Begin at the beginning and go on till
| you come to the end; then stop.
|

Tim Peters

Oct 20, 2001, 2:42:55 PM
[Michael Hudson, on making a dict out of a [key, value, key, value, ...]
list]
> ...

> I've never found myself needing to do this, but that may be because
> I'm not used to having a convenient way of going from
> [k1,v1,k2,v2,...] to a dict.

It's quite handy in Perl! Picture parsing a simple textual database with
key + value lines (like, e.g., mail headers). In Python you can use
re.findall() to parse them all up in one gulp, but then you're left with a
flat [k,v,...] list. In Perl you assign the list to a hash, and you're
done. Now run this backwards: since this is so convenient in Perl, Perl
programs often create textual mini-databases in this format, and before you
know it "Python sucks for everything I do" <wink>. OTOH, in Python I often
maintain textual mini-databases by writing repr(some_dict) to a file, then
reconstruct the dict blazingly fast later via
eval(open('that_file').read()) -- and before you know it Perl sucks for
everything I do.
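
A sketch of that repr()/eval round trip, assuming a current Python 3. It swaps in ast.literal_eval for eval(), which is a substitution and not what the paragraph above describes; the scratch file and sample data are made up:

```python
import ast
import os
import tempfile

some_dict = {"name": "fred", "age": 23}

# Write the repr() of the dict to a scratch file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(repr(some_dict))
    path = f.name

# Read it back. ast.literal_eval only accepts Python literals, so unlike
# eval() it will not execute arbitrary code found in the file.
with open(path) as f:
    restored = ast.literal_eval(f.read())
os.remove(path)

print(restored == some_dict)
```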

dictionary() is a constructor in 2.2, and I spent a lot of time worrying
about what kind of arguments it should take. I believe a flat [k,v,...]
list would have been the most useful thing to accept; but it had to at least
accept a mapping object, and that's all it accepts (for now). Everyone's
first thought seems to be that dictionary() should accept a list of (key,
value) tuples -- but there are few core functions that produce such a list
(dict.items(), zip() and some cases of map() are all that pop to mind), so I
had a hard time picturing a good use for that (yet another way to spell
dict.copy() is not a good use).

lost-in-consequences-ly y'rs - tim


Marcin 'Qrczak' Kowalczyk

Oct 20, 2001, 3:43:50 PM
Sat, 20 Oct 2001 14:42:55 -0400, Tim Peters <tim...@home.com> writes:

> It's quite handy in Perl! Picture parsing a simple textual database with
> key + value lines (like, e.g., mail headers). In Python you can use
> re.findall() to parse them all up in one gulp, but then you're left with a
> flat [k,v,...] list.

findall(pattern, string)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

It seems that it will return [(k,v),(k,v),...], no?
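
A quick check of that reading, as a sketch assuming a current Python 3 (where dict() also accepts a list of pairs, which is not the case for the dictionary() constructor as described in this thread). The header text and pattern are made up:

```python
import re

text = "From: tim\nSubject: Dictionary from list?\n"

# Two groups in the pattern, so findall() returns a list of 2-tuples.
pairs = re.findall(r"^([\w-]+): (.*)$", text, re.MULTILINE)
print(pairs)

# A list of (key, value) tuples converts directly.
headers = dict(pairs)
print(headers["Subject"])
```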

> dictionary() is a constructor in 2.2, and I spent a lot of time worrying
> about what kind of arguments it should take. I believe a flat [k,v,...]
> list would have been the most useful thing to accept;

I believe in [(k,v),(k,v),...]. This representation of dictionaries
is much easier to iterate over, and much easier to generate with list
comprehensions.

And it's the inverse of d.items().

And IMHO it's more elegant: the list is homogeneous. It fits statically
typed languages too, unlike [k,v,k,v,...] (it's used in Haskell for
example). I'm using it in my little dynamically typed language.
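
To illustrate the two points about iteration and generation, a sketch assuming a current Python 3, where dict() accepts a list of pairs (the sample data is made up):

```python
d = {"a": 1, "b": 2}

# items() produces (key, value) pairs; dict() turns them back.
pairs = list(d.items())
round_trip = dict(pairs)

# Pairs are also easy to generate with a list comprehension.
squares = dict([(n, n * n) for n in range(4)])

# And easy to iterate over, one homogeneous element at a time.
for key, value in pairs:
    print(key, value)
```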

> Everyone's first thought seems to be that dictionary() should accept
> a list of (key, value) tuples -- but there are few core functions
> that produce such a list

Is there any function which produces [k,v,k,v,...]?

--
__("< Marcin Kowalczyk * qrc...@knm.org.pl http://qrczak.ids.net.pl/
\__/
^^ SYGNATURA ZASTĘPCZA
QRCZAK

Andrew Dalke

Oct 20, 2001, 8:34:30 PM
Uncle Tim:

>but there are few core functions that produce such a list
>(dict.items(), zip() and some cases of map() are all that pop to mind), so I
>had a hard time picturing a good use for that

Not a core function, but

args, files = getopt.getopt(...)
args = dictionary(args)

would be handy.
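
Spelled out as a sketch, assuming a current Python 3 where the constructor is named dict() and accepts a list of pairs (the thread calls it dictionary(), and the option string and arguments here are made up):

```python
import getopt

argv = ["-v", "-o", "out.txt", "input1", "input2"]

# getopt returns a list of (option, value) pairs plus the leftover args.
opts, files = getopt.getopt(argv, "vo:")

# Flags without an argument carry an empty string as their value.
options = dict(opts)
print(options, files)
```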

Andrew

Ivan A. Vigasin

Oct 19, 2001, 10:20:10 AM
On Fri, 19 Oct 2001, Michael Hudson wrote:

> > Idea is great!
> You mean that for real? I think it's foul. It abuses the
> implementation details of reduce to do something that function was
> never intended to do.

I mean that I never thought reduce could be used this way.

> > It comes from functional programming?
> No. There are side-effects all over the place!

You're right. There are side-effects, but people who have never heard of FP would just use a loop in such a case.

> Just in case anyone hasn't got the message yet:
>
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.
> THE CODE I POSTED WAS DISGUSTING. DON'T EMULATE IT.

I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.
I see.

Martin von Loewis

Oct 21, 2001, 8:35:25 AM
Jim Correia <jim.c...@pobox.com> writes:

> I was just asking if there was a built in coercion available.
> (Coercions aren't unheard of in scripting languages.)

And indeed, Python does have conversions. Since variables are untyped,
it never does them on assignment. Also, this particular conversion is
not supported in the standard library. If there were a frequent
need for it, it would probably be added.

Note that the list->dictionary conversion is not as obvious as you
think. Given

[(1,2),(3,4)]

would you like to get

a) {1:2, 3:4}

or

b) {(1,2) : (3,4)}

Your answer is probably b), but many Python people would rather expect
a) (since it would be the reverse of the .items() operation).
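
Both readings are easy to write down, which is part of why the choice is not obvious. A sketch assuming a current Python 3, where dict() accepts a list of pairs:

```python
items = [(1, 2), (3, 4)]

# Reading a): each element is already a (key, value) pair.
as_pairs = dict(items)

# Reading b): elements alternate key, value, key, value, ...
as_flat = dict(zip(items[::2], items[1::2]))

print(as_pairs)
print(as_flat)
```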

Regards,
Martin

Martin von Loewis

Oct 21, 2001, 8:39:13 AM
Jim Correia <jim.c...@pobox.com> writes:

> I wasn't insisting that there must be one in python, just asking if there
> was :-)

I think this thread would have died quite early if you did just
that... Instead, after being told that this conversion is not
supported as a built-in, you responded

"This is quite unfortunate."

People then read this as insisting.

Regards,
Martin

Magnus Lie Hetland

Oct 21, 2001, 2:34:20 PM
"Martin von Loewis" <loe...@informatik.hu-berlin.de> wrote in message
news:j4d73hj...@informatik.hu-berlin.de...
> Jim Correia <jim.c...@pobox.com> writes:
[snip]

> Note that the list->dictionary conversion is not as obvious as you
> think. Given
>
> [(1,2),(3,4)]
>
> would you like to get
>
> a) {1:2, 3:4}
>
> or
>
> b) {(1,2) : (3,4)}
>
> Your answer is probably b), but many Python people would rather expect
> a) (since it would be the reverse of the .items() operation).

Why would one want b?! (I must have missed some posts here...)

- Magnus, who thinks a is the obvious choice.

> Regards,
> Martin

--
Magnus Lie Hetland
http://hetland.org

Jim Correia

Oct 21, 2001, 7:48:57 PM
In article <j4adylj...@informatik.hu-berlin.de>,

It was probably a poor choice of words. It was late and I was in a
rush. It should have said I would have preferred that there was a built
in conversion rather than rolling my own. The problem, however, has
long since been solved.

Thanks.

Jim

Quinn Dunkan

Oct 22, 2001, 3:19:46 AM
On Sat, 20 Oct 2001 01:14:46 GMT, Jim Correia <jim.c...@pobox.com> wrote:
>> Hopefully you and everyone who reads your code has all
>> the little details in the camel book memorized.
>
>I was just asking if there was a built in coercion available.
>(Coercions aren't unheard of in scripting languages.) This wasn't a my
>language is better than your language war. I don't have particularly
>strong feelings about perl or python one way or the other (I do most of
>my work in compiled languages - C mostly) but use the right tool, or
>sometimes the convenient tool, for the job.

Sorry if I came across as a jerk. I read more into your post than you
intended.

Dynamic languages generally can't convert at assignment because there are no
declarations to tell them what to convert to (perl's @$% are effectively type
declarations). Python's '=' has a different philosophy than a C++-ish
operator= anyway.

The only way to coerce things is to explicitly call the coerce() function on
them. Python's built-in operators do this for numeric types, but that's about
it. You can also give objects a __coerce__ method. But it really doesn't do
the sort of thing I think you mean when you say "coerce".

I think a list->dict converter would be a perfectly reasonable addition to the
library, but it's not there to my knowledge.

There is a built-in function in mxTools, but it expects a list of tuples and
it's not in the stdlib anyway.

Tim Peters

Oct 22, 2001, 3:49:53 AM
[Andrew Dalke]

> Not a core function, but
>
> args, files = getopt.getopt(...)
> args = dictionary(args)
>
> would be handy.

And Marcin Kowalczyk had some good abstract arguments -- but a concrete
example somebody would actually use does more for me <wink>.

So what should dictionary(x) do?

If x is of a mapping type, it currently (2.2b1) returns a dict with the same
(key, value) pairs as x. "Is a mapping type" == is an instance of
dictionary (including subclasses of dictionary), or, failing that, responds
to x.keys() without raising AttributeError.

If we try x.keys() and do see AttributeError, then what? It's not 2.2-ish
to insist on a list -- any iterable object should work. So we try to
iterate over x. Now it gets harder: what kinds of objects are we prepared
to see when iterating x? Insisting on a 2-tuple (or subclass of tuple) is
most efficient. More general is to treat these as iterable objects in their
own right, and either take the first two objects iteration produces, or
insist that an attempt to generate a third object raise StopIteration. The
latter is more Pythonic, because it matches what "the obvious" loop does:

    d = {}
    for k, v in x:
        d[k] = v

In 2.2, x can be any iterable object there, and so can the objects produced
by iterating *over* x. Extreme example of the latter:

>>> f = file('f1', 'w')
>>> f.write('a\nb\n')
>>> f.close()
>>> g = file('f2', 'w')
>>> g.write('c\nd\n')
>>> g.close()
>>> h = file('f3', 'w')
>>> h.write('1\n2\n3\n')
>>> h.close()
>>> for k, v in map(file, ('f1', 'f2', 'f3')):
...     print k, v
...
a
b

c
d

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: too many values to unpack
>>>

That is, we can't unpack a file with 3 lines into a two-vrbl assignment
target. Is 2.2 great or what <0.9 wink>?!
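The same point can be sketched without touching the filesystem. This assumes Python 3 syntax, and build() is just a hypothetical name wrapping the obvious loop:

```python
def build(x):
    """The 'obvious' loop: unpack every element of x into exactly two names."""
    d = {}
    for k, v in x:
        d[k] = v
    return d

# Inner items can be any iterable that yields exactly two objects:
# a tuple, a list, a two-character string, or a generator.
mixed = [(1, 'one'), [2, 'two'], 'ab', (n for n in (3, 'three'))]
print(build(mixed))  # {1: 'one', 2: 'two', 'a': 'b', 3: 'three'}

# An item that yields three objects fails, like the three-line file above.
try:
    build([(1, 2, 3)])
except ValueError as exc:
    print('ValueError:', exc)
```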

OK, I convinced myself that's the only explainable thing to be done,
although I grate at the gross inefficiencies; but I can special-case 2-tuples
and 2-lists for speed, so that's OK.

Next question: Should this really work by provoking the argument with
various protocols and letting exceptions steer it? Or should you be
explicit that you're passing an iterable object producing iterable
objects producing 2 objects each?

dictionary() currently takes a single optional argument, named 'mapping':

>>> dictionary(mapping={1: 2})
{1: 2}
>>> dictionary({1: 2}) # same thing
{1: 2}
>>>

Should the new behavior require use of a differently-named keyword argument?
My guess is yes. What will prevent this from getting implemented is a
3-year rancorous debate about the name of the new argument <0.7 wink>.

Note that the obvious workalike loop:

    d = {}
    for k, v in x:
        d[k] = v

deals with duplicate k values by silently overwriting all but the last
association seen.
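A small illustration of that overwriting, assuming Python 3. The build_strict() helper is hypothetical and only shows what refusing duplicates would look like; nothing in the thread proposes it.

```python
pairs = [('a', 1), ('b', 2), ('a', 3)]

d = {}
for k, v in pairs:
    d[k] = v          # a later 'a' silently replaces the earlier one

print(d)              # {'a': 3, 'b': 2}
print(dict(pairs))    # same result from the Python 3 builtin


def build_strict(x):
    """Hypothetical variant that refuses duplicate keys instead."""
    d = {}
    for k, v in x:
        if k in d:
            raise ValueError('duplicate key: %r' % (k,))
        d[k] = v
    return d
```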

"tough-luck"-comes-to-mind-ly y'rs - tim


Anthony Baxter

Oct 22, 2001, 10:43:54 AM

>>> "Tim Peters" wrote

> Should the new behavior require use of a differently-named keyword argument?
> My guess is yes. What will prevent this from getting implemented is a
> 3-year rancorous debate about the name of the new argument <0.7 wink>.

I think "Reginald" is the only obvious name for the argument that we
can all agree on.

Anthony

--
Anthony Baxter <ant...@interlink.com.au>
It's never too late to have a happy childhood.


Martin von Loewis

Oct 22, 2001, 12:27:23 PM
"Magnus Lie Hetland" <m...@idi.ntnu.no> writes:

> > b) {(1,2) : (3,4)}
> >
> > Your answer is probably b), but many Python people would rather expect
> > a) (since it would be the reverse of the .items() operation).
>
> Why would one want b?! (I must have missed some posts here...)

Because the original poster wanted that

['k1','v1','k2','v2']

is converted to

{'k1':'v1', 'k2':'v2'}
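For reference, the flat-list conversion the original poster wanted is short to write by hand. A minimal sketch, assuming Python 3; dict_from_flat is a hypothetical helper name, not something in the standard library:

```python
def dict_from_flat(flat):
    """Hypothetical helper: [k1, v1, k2, v2, ...] -> {k1: v1, k2: v2, ...}."""
    flat = list(flat)
    if len(flat) % 2:
        raise ValueError('flat list needs an even number of elements')
    # Even positions are keys, odd positions are values.
    return dict(zip(flat[0::2], flat[1::2]))


print(dict_from_flat(['k1', 'v1', 'k2', 'v2']))  # {'k1': 'v1', 'k2': 'v2'}
```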

Regards,
Martin

Andrew Dalke

Oct 22, 2001, 12:44:50 PM
Tim:

> [Andrew Dalke]
> Not a core function, but
...

> would be handy.
>And Marcin Kowalczyk had some good abstract arguments -- but a concrete
>example somebody would actually use does more for me <wink>.

>So what should dictionary(x) do?

Oh, don't get me wrong. I'm one of those anchors trying to slow
the development of new core features in Python. (New libraries
is a different thing.) If dictionary(list) stays an error you
won't hear any complaints from me.

But I'm also one of those people who likes details -- (forest?
tree? I like the BARK! :) so thought it would be helpful
to point out other functions that returned a list of 2-ples.

>Note that the obvious workalike loop:
>
>d = {}
>for k, v in x:
> d[k] = v

Yep. Works for me.

Andrew
da...@dalkescientific.com


Chris Barker

Oct 22, 2001, 1:24:14 PM
Tim Peters wrote:

> So what should dictionary(x) do?

Personally, I would find:

dict = dictionary(sequence_of_keys, sequence_of_values)

most valuable to me because I have found the need for it enough times
that I have written my own function to do it, and have used it a lot.

If it were:

dict = dictionary(sequence_of_keys_value_pairs)

I could do:

dict = dictionary(zip(sequence_of_keys, sequence_of_values))

So that would be my second choice.

All I can say about:

dictionary([k,v,k,v,k,v]) is

YECH! (IMHO, of course)
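A sketch of the zip() route, assuming Python 3, where the pair-accepting constructor is spelled dict. The result is bound to a name other than dict so the builtin is not shadowed.

```python
keys = ['red', 'green', 'blue']
values = [1, 2, 3]

colors = dict(zip(keys, values))
print(colors)  # {'red': 1, 'green': 2, 'blue': 3}

# zip() stops at the shorter input, so a length mismatch is silent.
short = dict(zip(keys, [1, 2]))
print(short)   # {'red': 1, 'green': 2}
```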

-Chris

--
Christopher Barker,
Ph.D.
ChrisH...@home.net --- --- ---
http://members.home.net/barkerlohmann ---@@ -----@@ -----@@
------@@@ ------@@@ ------@@@
Oil Spill Modeling ------ @ ------ @ ------ @
Water Resources Engineering ------- --------- --------
Coastal and Fluvial Hydrodynamics --------------------------------------
------------------------------------------------------------------------

James_...@i2.com

Oct 22, 2001, 1:29:56 PM

Tim Peters wrote:
>dictionary() is a constructor in 2.2, and I spent a lot of time worrying
>about what kind of arguments it should take. I believe a flat [k,v,...]
>list would have been the most useful thing to accept; but it had to at least
>accept a mapping object, and that's all it accepts (for now). Everyone's
>first thought seems to be that dictionary() should accept a list of (key,
>value) tuples -- but there are few core functions that produce such a list
>(dict.items(), zip() and some cases of map() are all that pop to mind), so I
>had a hard time picturing a good use for that (yet another way to spell
>dict.copy() is not a good use).

Since there was a stalemate on the pros and cons of overloading/confusing
the dictionary constructor ...
isn't this a reasonable place to take advantage of the new 2.2 class
methods and provide:

dictionary.newFromList([k,v,...])

(or some other shorter/more-appealing method name of preference)? I think
class methods provide a nice way to introduce alternative "constructors"
without having to add special modules for the sole purpose of serving as a
home for module-based constructor-esque functions and without having to add
more builtin functions, etc.

(Guilty as charged of more Smalltalk-hardwiring-of-the-brain thinking ;-)

Jim

Steve Holden

Oct 22, 2001, 7:17:31 PM
<James_...@i2.com> wrote ...
Erm, wouldn't it make more sense (since there isn't a dictionary to start
with) to implement a "toDict()" method of lists rather than a dictionary
constructor?

Just a thought.

regards
Steve
--
http://www.holdenweb.com/

Tim Peters

Oct 23, 2001, 1:10:31 AM
[Andrew Dalke]

> Oh, don't get me wrong. I'm one of those anchors trying to slow
> the development of new core features in Python. (New libraries
> is a different thing.) If dictionary(list) stays an error you
> won't hear any complaints from me.

I appreciate that. However, the builtin dictionary() cow has already
escaped the 2.2 barn, so the question now is whether we let it roam the
Python pasture with two broken legs, or put a spiffy sequence saddle on it
so you can gallop on it in comfort into the mooooonlight.

> But I'm also one of those people who likes details -- (forest?
> tree? I like the BARK! :)

grow-up-it's-a-cow-not-a-dog-ly y'rs - tim


Tim Peters

Oct 23, 2001, 1:21:04 AM
[Steve Holden]

> Erm, wouldn't it make more sense (since there isn't a dictionary to start
> with) to implement a "toDict()" method of lists rather than a dictionary
> constructor?

There's already a dictionary() constructor in 2.2, and that's not going to
change (it would be too odd if every scalar and container builtin type had a
builtin constructor except for dicts). One question is what it should do;
all it can do in 2.2b1 is basically silly, amounting to yet another way to
spell "shallow-copy the argument".

Sticking a toDict() method on lists isn't appealing regardless because
"iterable object" is the natural decomposition in 2.2 (for example, a tuple
of two-element lists is just as good of an input argument as a list of
two-element tuples, ditto a generator producing a sequence of address-book
objects that supply an iterator over their name and address fields, and so
on -- "a list" just isn't special here).


Andrew Dalke

Oct 23, 2001, 2:40:26 AM
>grow-up-it's-a-cow-not-a-dog-ly y'rs - tim

Moof!

Andrew
da...@dalkescientific.com

Steve Holden

Oct 23, 2001, 6:58:31 AM
"Tim Peters" <tim...@home.com> wrote ...

> [Steve Holden]
> > Erm, wouldn't it make more sense (since there isn't a dictionary to start
> > with) to implement a "toDict()" method of lists rather than a dictionary
> > constructor?
>
> There's already a dictionary() constructor in 2.2, and that's not going to
> change (it would be too odd if every scalar and container builtin type had a
> builtin constructor except for dicts). One question is what it should do;
> all it can do in 2.2b1 is basically silly, amounting to yet another way to
> spell "shallow-copy the argument".
>
Well, I certainly agree that

TypeError: argument must be of a mapping type

isn't impressive. If it has to be a mapping already then making a dictionary
out of it doesn't really help a lot (though I suppose it might speed up
access for mappings with complex implementations behind them).

> Sticking a toDict() method on lists isn't appealing regardless because
> "iterable object" is the natural decomposition in 2.2 (for example, a tuple
> of two-element lists is just as good of an input argument as a list of
> two-element tuples, ditto a generator producing a sequence of address-book
> objects that supply an iterator over their name and address fields, and so
> on -- "a list" just isn't special here).
>

Sorry, must have been "Not Thinking in Java" for a second there. Given the
rich variety of choices which *could* be made here, I don't see an argument
that any *specific* choice would be less silly than any other. Certainly not
really any argument for Perl's particular coercion.

Russell E. Owen

Oct 23, 2001, 12:32:46 PM
Thank you for a most mooving posting -- funniest I've read in a
programming newsgroup in a long time.

For what it's worth, I'm bullish on Chris Barker's suggestion of some
way of converting list of keys, list of values to a dictionary. Like
Chris, I've written my own and use it a fair bit.

I have no opinion about converting the two proposed flavors of single
lists to dicts (i.e. [key1, value1, key2, value2...] vs. [(key1,
value1), (key2, value2)...]; they both sound useful, are both easy to
implement via user-written functions, and if either is implemented as a
built-in, then folks are likely to have a cow about the missing one.

Regards,

-- Russell


In article <mailman.100381388...@python.org>,

news...@md5.ca

Oct 23, 2001, 12:50:06 PM
Tim Peters <tim...@home.com> wrote:
> It's quite handy in Perl! Picture parsing a simple textual database with
> key + value lines (like, e.g., mail headers). In Python you can use
> re.findall() to parse them all up in one gulp, but then you're left with a
> flat [k,v,...] list. In Perl you assign the list to a hash, and you're
> done. Now run this backwards: since this is so convenient in Perl, Perl
> programs often create textual mini-databases in this format, and before you
> know it "Python sucks for everything I do" <wink>. OTOH, in Python I often
> maintain textual mini-databases by writing repr(some_dict) to a file, then
> reconstruct the dict blazingly fast later via
> eval(open('that_file').read()) -- and before you know it Perl sucks for
> everything I do.

There's a reason for that. Often you have to take third-party text
databases and modify them on a daily basis. Perl is the champion for quick
and dirty tasks. I am presently learning Python and am pleased with the
structure it gives my programs, but displeased with the lack of richness
in information flow. It's like going from bar to bar and finding that the
pint glasses in the new bar are half the size of those in the previous
place, at the same price. (I realize a pint is a unit of measure, so this
is absurd, but that's how I feel.) Perl is irreplaceable, but for some jobs
that I would once have done entirely in Perl, I now see other languages.
Like Python.

> dictionary() is a constructor in 2.2, and I spent a lot of time worrying
> about what kind of arguments it should take. I believe a flat [k,v,...]

Aha! That's a Perlism. I don't think Guido wants to submit to the order of
Perl.

regards as python newbie,
p.

--
Research causes cancer in rats.
110461387
http://gpg.md5.ca
http://perlpimp.com

Terry Reedy

Oct 23, 2001, 12:53:05 PM

"Tim Peters" <tim...@home.com> wrote in message
news:mailman.100381388...@python.org...

> I appreciate that. However, the builtin dictionary() cow has already
> escaped the 2.2 barn, so the question now is whether we let it roam the
> Python pasture with two broken legs, or put a spiffy sequence saddle on it
> so you can gallop on it in comfort into the mooooonlight.

Fact: a Python dictionary literal is a sequence of key:value pairs of
literals, with the first becoming a hashable object, and only such a
sequence.

Observation: ':' is analogous to '( , )', which could have been the
syntax chosen (though I'm glad it wasn't).

Therefore:

Proposed rule 1: the dictionary() constructor should accept a sequence
of pairs of objects, with the first being keyable (hashable). In
particular, it should invert dict.items.

Note: by type, 'pair' might mean duple only; by interface, it would
mean an object reporting a length of two and yielding objects with
indexes 0 and 1.

Comment: once pair is defined, this rule gives a uniform target for
conversion from other formats, including those generated by other
software systems. I currently vote for the latter.

Proposed rule 2: dictionary() should reject any other sequence, just
as does the internal constructor-from-literals.

Paraphrase: conversions from the many other possible formats should be
handled externally from dictionary().

Opinion 3: Given the fact and comments above, these two rules should
be easy to understand and teach.

Terry J. Reedy

James_...@i2.com

Oct 23, 2001, 1:13:32 PM

Russell E. Owen wrote:
>For what it's worth, I'm bullish on Chris Barker's suggestion of some
>way of converting list of keys, list of values to a dictionary. Like
>Chris, I've written my own and use it a fair bit.
>
>I have no opinion about converting the two proposed flavors of single
>lists to dicts (i.e. [key1, value1, key2, value2...] vs. [(key1,
>value1), (key2, value2)...]; they both sound useful, are both easy to
>implement via user-written functions, and if either is implemented as a
>built-in, then folks are likely to have a cow about the missing one.

Well, not to beat a dead cow, but (or cow butt, if you prefer) ...

I say take advantage of class methods (new in 2.2) and use them as
alternative constructors (factory methods, if you will) instead of badly
overloading the nominal constructor, as in (picking your own favorite
names, of course):

d1 = dictionary.fromKeyValueSeq([key1, value1, key2, value2...])

d2 = dictionary.fromKeyValuePairs([(key1, value1), (key2, value2)...])

etc.

(i.e., use the well known "eat a cow and have a cow, too" design pattern).
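To make the class-method idea concrete, here is a rough sketch. It assumes Python 3 (dict as the base type, and the @classmethod decorator syntax that arrived after 2.2); FlexDict and both method names are hypothetical, chosen only to mirror the spellings above.

```python
class FlexDict(dict):
    """Hypothetical subclass; the method names are illustrative only."""

    @classmethod
    def from_key_value_seq(cls, flat):
        """Build from a flat [k1, v1, k2, v2, ...] sequence."""
        flat = list(flat)
        if len(flat) % 2:
            raise ValueError('expected an even number of elements')
        return cls(zip(flat[0::2], flat[1::2]))

    @classmethod
    def from_key_value_pairs(cls, pairs):
        """Build from an iterable of (key, value) pairs."""
        return cls(pairs)


d1 = FlexDict.from_key_value_seq(['a', 1, 'b', 2])
d2 = FlexDict.from_key_value_pairs([('a', 1), ('b', 2)])
print(d1 == d2 == {'a': 1, 'b': 2})  # True
```

Each input format gets its own named entry point, so the plain constructor never has to guess which format it was handed.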

Jim

Huaiyu Zhu

Oct 23, 2001, 3:49:37 PM
On Tue, 23 Oct 2001 01:21:04 -0400, Tim Peters <tim...@home.com> wrote:
>There's already a dictionary() constructor in 2.2, and that's not going to
>change (it would be too odd if every scalar and container builtin type had a
>builtin constructor except for dicts). One question is what it should do;
>all it can do in 2.2b1 is basically silly, amounting to yet another way to
>spell "shallow-copy the argument".

There have been several good arguments about why [(k,v), (k,v) ...] is
better than [k, v, k, v, ...]. Here are some more observations:

- I found Perl's hash initialization quite non-intuitive when I was using
Perl two years ago. Most text files put k,v on the same line, and most
regular expressions extract them at once. It takes a little bit of extra work
to put k,v pairs into a flat array.

- In Python the sequence-of-pairs structure is easily created by d.items(),
re.match.groups and possibly others. It is quite handy for sorting,
filtering and other processing. It is often necessary to put them into a
dictionary afterwards, and a constructor would be helpful.

- Perhaps a pair of functions flatten(x) and collect(x, n) would be useful
when dealing with flat sequences, even when it is not necessarily
associated with dictionaries.

- As to the question of what the constructor should do in borderline cases, is
there a specific reason that it should not behave exactly like this?

    def dict(x):
        d = {}
        for k, v in x: d[k] = v
        return d

There is no need to differentiate between tuples, lists, iterators, etc.
Any sequence whose every element is a sequence of length two should work.
Everything else should raise an exception.

- The form of ([k, k, ...], [v, v, ...]) is less frequently used (in my case
anyway) and can be easily converted using zip().
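The post does not spell out what flatten(x) and collect(x, n) would do, so the following is only one plausible reading, sketched in Python 3: flatten() undoes one level of pairing and collect() regroups a flat sequence into n-tuples.

```python
def flatten(pairs):
    """One reading of flatten(x): [(k, v), ...] -> [k, v, ...]."""
    return [item for pair in pairs for item in pair]


def collect(flat, n):
    """One reading of collect(x, n): group a flat sequence into n-tuples."""
    flat = list(flat)
    if len(flat) % n:
        raise ValueError('length is not a multiple of %d' % n)
    return [tuple(flat[i:i + n]) for i in range(0, len(flat), n)]


flat = flatten([('a', 1), ('b', 2)])
print(flat)                    # ['a', 1, 'b', 2]
print(collect(flat, 2))        # [('a', 1), ('b', 2)]
print(dict(collect(flat, 2)))  # {'a': 1, 'b': 2}
```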

I keep a file utils.py around that includes many little functions like
the above, and I find myself doing "from utils import dict" quite often.

Another function I use is items(x) returning x.items() if x is a dictionary,
or [(i,x[i]), ...] if x is a sequence. With dict(), items() and zip() I
could write many programs useful for both lists and dicts.
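A guess at what such an items() helper might look like in Python 3. The isinstance test is an assumption; the original could just as well have probed for an items() method.

```python
def items(x):
    """Return (key, value) pairs for a dict, (index, element) pairs otherwise."""
    if isinstance(x, dict):
        return list(x.items())
    return list(enumerate(x))


print(items({'a': 1}))      # [('a', 1)]
print(items(['p', 'q']))    # [(0, 'p'), (1, 'q')]

# The same code path then serves both kinds of container.
for container in ({'a': 1}, ['p', 'q']):
    print(dict(items(container)))
```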

Huaiyu

Rainer Deyke

Oct 23, 2001, 4:12:51 PM
"Huaiyu Zhu" <hua...@gauss.almadan.ibm.com> wrote in message
news:slrn9tbier...@gauss.almadan.ibm.com...

> There have been several good arguments about why [(k,v), (k,v) ...] is
> better than [k, v, k, v, ...]. Here are some more observations:

<snip arguments>

Add to that:

- When dealing with '[k, v, k, v, ...]' sequences, it is easy to
accidentally mess up the alignment and create a '[k, k, v, k, v, ...]'
sequence instead (with disastrous results).
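A small demonstration of that failure mode, assuming Python 3. pair_up() is a hypothetical helper and the sample data is made up.

```python
def pair_up(flat):
    """Pair even-position elements (keys) with odd-position ones (values)."""
    return dict(zip(flat[0::2], flat[1::2]))


flat = ['name', 'Ann', 'age', 30, 'city', 'Oslo']
# One stray leading element shifts every key into a value slot.
shifted = ['id'] + flat[:-1]

print(pair_up(flat))     # {'name': 'Ann', 'age': 30, 'city': 'Oslo'}
print(pair_up(shifted))  # {'id': 'name', 'Ann': 'age', 30: 'city'}

# With a sequence of pairs, a malformed element is reported instead.
try:
    dict([('name', 'Ann'), ('age',), ('city', 'Oslo')])
except ValueError as exc:
    print('ValueError:', exc)
```

The shifted list still has an even length, so nothing complains and the wrong dictionary comes back.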


--
Rainer Deyke (ro...@rainerdeyke.com)
Shareware computer games - http://rainerdeyke.com
"In ihren Reihen zu stehen heisst unter Feinden zu kaempfen" - Abigor


Barry A. Warsaw

Oct 23, 2001, 11:04:22 PM

>>>>> "REO" == Russell E Owen <ow...@astrono.junkwashington.emu> writes:

REO> Thank you for a most mooving posting -- funniest I've read in
REO> a programming newsgroup in a long time.

Yes, but udderly worthless.

http://barry.wooz.org/poems/milkme2.html-ly y'rs,
-Barry

Aahz Maruch

Oct 24, 2001, 6:18:11 AM
In article <mailman.1003771893...@python.org>,

<James_...@i2.com> wrote:
>
>Since there was a stalemate on the pros and cons of
>overloading/confusing the dictionary constructor ... isn't this a
>reasonable place to take advantage of the new 2.2 class methods and
>provide:
>
>dictionary.newFromList([k,v,...])

That's what subclassing is for:

class MyNewDict(dictionary):
    ....

d = MyNewDict()
d.fillFromList([k,v,...])
--
--- Aahz <*> (Copyright 2001 by aa...@pobox.com)

Hugs and backrubs -- I break Rule 6 http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

Sieg heil!

Marcin 'Qrczak' Kowalczyk

Oct 27, 2001, 12:21:19 PM
Tue, 23 Oct 2001 19:49:37 +0000 (UTC), Huaiyu Zhu <hua...@gauss.almadan.ibm.com> writes:

> There have been several good arguments about why [(k,v), (k,v) ...] is
> better than [k, v, k, v, ...]. Here are some more observations:

Here is why Perl does it its way: it didn't have nested data structures
at the time this was designed! You could only put scalars in a list,
i.e. strings or numbers. (You still can, but scalars now include
references to arrays.)

--
__("< Marcin Kowalczyk * qrc...@knm.org.pl http://qrczak.ids.net.pl/
\__/
^^ SYGNATURA ZASTĘPCZA
QRCZAK

Tim Peters

Oct 27, 2001, 12:53:43 PM
[James_...@i2.com]

> ...
> isn't this a reasonable place to take advantage of the new 2.2 class
> methods and provide:
>
> dictionary.newFromList([k,v,...])
>
> (or some other shorter/more-appealing method name of preference)?

I don't think so, simply because it would make dictionary unique. Python
already had str(), unicode(), int(), long(), float(), complex(), list() and
tuple() builtins, and adding dictionary() and file() to 2.2 was thus natural
(for Python).

> I think class methods provide a nice way to introduce alternative
> "constructors" without having to add special modules for the sole
> purpose of serving as a home for module-based constructor-esque
> functions

We're not adding any such modules.

> and without having to add more builtin functions, etc.

If the name "dictionary" was presumed available in

dictionary.newFromList()

above without first having to import some special module, then "dictionary"
was perforce a builtin name anyway, function or not. I know you focus on
the "function" part, but it's the "builtin" part that matters <wink>.

> (Guilty as charged of more Smalltalk-hardwiring-of-the-brain thinking ;-)

I've heard that Smalltalk's class-method constructors work very well there.
Python took a different approach long ago, and it also appears to work well.


Tim Peters

Oct 27, 2001, 1:16:43 PM
[Terry Reedy]

> Fact: a Python dictionary literal is a sequence of key:value pairs of
> literals, with the first becoming a hashable object, and only such a
> sequence.

Sounds more like a definition than a fact <wink>.

> Observation: ':' is analogous to '( , )', which could have been the
> syntax chosen (though I'm glad it wasn't).
>
> Therefore:
>
> Proposed rule 1: the dictionary() constructor should accept a sequence
> of pairs of objects, with the first being keyable (hashable). In
> particular, it should invert dict.items.

In current CVS, it does. In 2.2-speak, it accepts an iterable object
producing iterable objects producing exactly 2 objects. It also accepts a
mapping object (as it did in 2.2b1), and also accepts nothing (ditto -- it
returns {} then, much as list() returns [] and tuple() returns ()).

> Note: by type, 'pair' might mean duple only; by interface, it would
> mean an object reporting a length of two and yielding objects with
> indexes 0 and 1.

"By interface" is important, but the details there are off for 2.2 --
iterable objects don't have to support len or indexing. For example,

    class AddressBookEntry:
        # with .firstname, .lastname attributes
        ...
        def __iter__(self):
            return iter((self.firstname, self.lastname))

A sequence of AddressBookEntry instances is OK to pass to dictionary(),
despite that an AddressBookEntry defines neither __len__ nor __getitem__. A
generator yielding instances of AddressBookEntry is also fine; etc.
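Filling in the elided parts gives something runnable. This is a sketch assuming Python 3, where the constructor is spelled dict; the __init__ method and the sample names are assumptions added for illustration.

```python
class AddressBookEntry:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def __iter__(self):
        # Yields exactly two objects; no __len__ or __getitem__ defined.
        return iter((self.firstname, self.lastname))


entries = [AddressBookEntry('Ada', 'Lovelace'),
           AddressBookEntry('Alan', 'Turing')]

print(dict(entries))  # {'Ada': 'Lovelace', 'Alan': 'Turing'}


def gen():
    """A generator yielding entries works the same way as the list."""
    for e in entries:
        yield e


print(dict(gen()) == dict(entries))  # True
```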

> Comment: once pair is defined, this rule gives a uniform target for
> conversion from other formats, including those generated by other
> software systems. I currently vote for the latter.

I didn't understand "the latter" here, unless it's a vote for "other
software systems", in which case I'm keen to see the patch <wink>.

> Proposed rule 2: dictionary() should reject any other sequence, just
> as does the internal constructor-from-literals.

"Sequence" is a slippery word. dictionary() continues to accept a mapping
object too. Of course, "mapping object" is also a slippery phrase.

> Paraphrase: conversions from the many other possible formats should be
> handled externally from dictionary().

Indeed, we're going to cheerfully endure abuse for sticking to that.

> Opinion 3: Given the fact and comments above, these two rules should
> be easy to understand and teach.

Yup.


Terry Reedy

Oct 27, 2001, 4:07:26 PM

"Tim Peters" <tim...@home.com> wrote in message
news:mailman.1004203049...@python.org...

> [Terry Reedy]
> > Fact: a Python dictionary literal is a sequence of key:value pairs of
> > literals, with the first becoming a hashable object, and only such a
> > sequence.
>
> Sounds more like a definition than a fact <wink>.

You've touched on a subtle difference of role-related viewpoints
<return wink>. Implemented definitions create observable facts. For
you as a core coder, the above *is* a definition (mandated by your
partner and bought into by you) to be maintained as you revise the
code. For me as a user, it is a current observable fact which I
cannot change. For me as an advocate/supporter of the idea that the
behaviour of dictionary() with respect to sequences should parallel
that defined fact, the rhetorical point is that it is not merely 'a'
definition found in a book on some shelf but 'the' intended and
observed behavior that we users are currently familiar with and
hopefully comfortable with.

>>...


> > Proposed rule 1: the dictionary() constructor should accept a sequence
> > of pairs of objects, with the first being keyable (hashable). In
> > particular, it should invert dict.items.
>
> In current CVS, it does. In 2.2-speak, it accepts an iterable object
> producing iterable objects producing exactly 2 objects.

To me, 'exactly 2' implies that dictionary() calls pair.next() a third
time and objects if it succeeds (as I believe it should, see below).
Presently true?

> > Note: by type, 'pair' might mean duple only; by interface, it would
> > mean an object reporting a length of two and yielding objects with
> > indexes 0 and 1.
>
> "By interface" is important, but the details there are off for 2.2 --
> iterable objects don't have to support len or indexing.

Whoops. I didn't recurse on the generalization from fixed sequence to
iterator. But there is a point to the 'length of two' bit (again, see
below).

> For example,
>
>     class AddressBookEntry:
>         # with .firstname, .lastname attributes
>         ...
>         def __iter__(self):
>             return iter((self.firstname, self.lastname))
>
> A sequence of AddressBookEntry instances is OK to pass to dictionary(),
> despite that an AddressBookEntry defines neither __len__ nor __getitem__. A
> generator yielding instances of AddressBookEntry is also fine; etc.

Here's the 'below' twice referred to above: Would AddressBookEntry
instances still be OK if the last line were, for instance, changed to

    return iter((self.firstname, self.lastname, self.address))

My point about a length of exactly two is a) the observation that
silently ignoring items beyond two is not consistent with the
processing of literals ('d={1:2:3}', for example, is a SyntaxError)
and b) my feeling that doing so would as often be wrong, creating a
silent bug, as right.

> > Comment: once pair is defined, this rule gives a uniform target for
> > conversion from other formats, including those generated by other
> > software systems. I currently vote for the latter.
>
> I didn't understand "the latter" here, unless it's a vote for "other
> software systems", in which case I'm keen to see the patch <wink>.

Whoops again, this time in respect to wording. I was referring to a
broad interface definition of 'pair' (versus a narrow structural
definition). Since you have already implemented such, no patch is
needed.

My main point is that clearly defining the sequence part of the domain
of dictionary() and doing so by analogy with the domain of the literal
constructor gives a clear target for conversion programs from the
potentially unbounded set of other forms, which you and I seem to
agree should remain outside that domain.

> > Proposed rule 2: dictionary() should reject any other sequence, just
> > as does the internal constructor-from-literals.
>
> "Sequence" is a slippery word. dictionary() continues to accept a mapping
> object too. Of course, "mapping object" is also a slippery phrase.

I am only discussing dictionary(<sequence>) and have no opinions about
dictionary(<mapping>), nor information on what the alternatives might
be.

> > Paraphrase: conversions from the many other possible formats should be
> > handled externally from dictionary().
>
> Indeed, we're going to cheerfully endure abuse for sticking to that.

Relative to the alternative, so will I for advocating such.

> > Opinion 3: Given the fact and comments above, these two rules should
> > be easy to understand and teach.
>
> Yup.

Glad we agree ;<)

Terry J. Reedy

Greg Chapman

Oct 28, 2001, 12:39:05 PM
On Sat, 27 Oct 2001 13:16:43 -0400, "Tim Peters" <tim...@home.com> wrote:

>"Sequence" is a slippery word. dictionary() continues to accept a mapping
>object too. Of course, "mapping object" is also a slippery phrase.

On the subject of slippery mappings and the dictionary constructor, consider a
subclass of dictionary which overrides __getitem__ to transform the value stored
in the (inherited) dictionary structure to the "real" value stored in the
(logical) mapping defined by the dictionary subclass. (For example: a
dictionary subclass which supports a defined order of iteration by using nodes
in a linked list to store the keys and values, and then storing the keys and
nodes in the dictionary structure). In 2.2b1, if an instance of such a
subclass is passed to the constructor (or to the update method) of a normal
dictionary, the overridden __getitem__ is ignored because of the PyDict_Check
near the top of PyDict_Merge. I'd like to suggest that that check be changed to
PyDict_CheckExact (which apparently does not exist yet, but would be analogous
to PyString_CheckExact). This would shunt dictionary subclasses into the
generic mapping branch of PyDict_Merge, which would allow an overridden
__getitem__ to work.

Alternatively, the PyDict_Check could be supplemented by a check to see if
__getitem__ has been overridden (and if so, using the generic code). However,
this would not help a subclass which transforms its keys in some way (so that
PyMapping_Keys returns a different set of keys than that stored in the
dictionary structure). (I suppose the check could be extended to look for an
overridden keys method.)

---
Greg Chapman

Tim Peters

Oct 29, 2001, 12:33:50 AM
[Terry Reedy]
> ...

> To me, 'exactly 2' implies that dictionary() calls pair.next() a third
> time and objects if it succeeds (as I believe it should, see below).
> Presently true?

It means that if len(list(iterable_object)) != 2, you get an exception about
the length, while if len(list(iterable_object)) == 2 you do not. Exactly
how "exactly two" is determined is not defined, because over-defining error
cases constrains the implementation in pointless ways. Tickling pair.next()
a third time is one possible implementation. But, for example, if the
iterable object happens to be a tuple or list, it would be a *silly*
implementation to build an iterator at all, since tuples and lists store
their sizes explicitly, and "exactly 2" is directly determinable for them
(FYI, the current implementation does avoid building iterators for tuples
and lists; maybe it won't by the time 2.2final is released, but "exactly 2"
will remain the rule).

...


>> class AddressBookEntry:
>>     # with .firstname, .lastname attributes
>>     ...
>>     def __iter__(self):
>>         return iter((self.firstname, self.lastname))
>>
>> A sequence of AddressBookEntry instances is OK to pass to
>> dictionary(), despite that an AddressBookEntry defines neither __len__
>> nor __getitem__. A generator yielding instances of AddressBookEntry is
>> also fine; etc.

> Here's the 'below' twice referred to above: Would AddressBookEntry
> instances still be OK if the last line were, for instance, changed to
>
>     return iter((self.firstname, self.lastname, self.address))

No; in that case you'd get a ValueError exception complaining that the
number of elements produced by the iterator isn't exactly 2. Ditto if
changed to

    return iter([self.firstname])

etc.
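
For readers trying this today, here is a minimal sketch of the "exactly two" rule in modern Python 3, where the constructor is spelled dict() rather than the dictionary() of 2.2b1. The ThreeFieldEntry class, the sample names, and the entry_stream generator are hypothetical and exist only for illustration.

```python
class AddressBookEntry:
    """Iterable that yields exactly two items: firstname, then lastname."""

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def __iter__(self):
        return iter((self.firstname, self.lastname))


class ThreeFieldEntry(AddressBookEntry):
    """Hypothetical variant whose iterator yields three items."""

    def __iter__(self):
        return iter((self.firstname, self.lastname, "some address"))


# Each entry defines neither __len__ nor __getitem__, yet unpacks as a pair.
entries = [AddressBookEntry("Ada", "Lovelace"), AddressBookEntry("Alan", "Turing")]
book = dict(entries)


def entry_stream():
    # A generator yielding entries is accepted as well.
    yield AddressBookEntry("Grace", "Hopper")


from_generator = dict(entry_stream())

# An item that produces three elements is rejected rather than truncated.
try:
    dict([ThreeFieldEntry("Ada", "Lovelace")])
except ValueError as exc:
    error = exc
else:
    error = None
```

The error case raises ValueError instead of silently dropping the third element, which matches the rule described above.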

> My point about a length of exactly two is a) the observation that
> silently ignoring items beyond two is not consistent with the
> processing of literals ('d={1:2:3}', for example, is a SyntaxError)
> and b) my feeling that doing so would as often be wrong, creating a
> silent bug, as right.

Exactly so.

> ...
> Glad we agree ;<)

Indeed, it appears unstoppable <wink>.


Guido van Rossum

Oct 29, 2001, 2:58:42 AM
Subclassing list and dictionary etc. should be seen as an experimental
feature; I don't want to fully fix the semantics yet in all cases.

I think of these types as having two sets of interfaces: an internal,
fixed one (e.g. PyDict_SetItem) and an external, overridable one
(e.g. __setitem__). The external operations are defined in terms of
the internal ones. When you invoke an operation with a Python
notation (e.g. dict[key]), you invoke the external operation, and
hence overrides will have the expected effect. But many accesses from
inside the Python VM (e.g. dictionary use by exec) and from other
methods (e.g. dict.update) invoke the internal versions of various
operations, and hence won't see the overrides.
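
A rough illustration of the internal/external split, written for modern Python 3 (dict() instead of dictionary()) and reflecting behaviour observed in CPython; other implementations may differ. The Shouty class is a hypothetical example, not something from this thread.

```python
class Shouty(dict):
    """Hypothetical dict subclass that overrides only __getitem__."""

    def __getitem__(self, key):
        # Transform the stored value on the way out.
        return str(dict.__getitem__(self, key)).upper()


s = Shouty(a="x")

via_subscript = s["a"]   # external operation: the override runs
via_get = s.get("a")     # dict.get reads the stored value directly
copied = dict(s)         # the constructor copies the stored values

plain = {}
plain.update(s)          # dict.update also bypasses the override
```

Only the subscript notation goes through the override; the other three accesses see the value that is actually stored.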

IMO, fixing these internal accesses to always use the overrides would
remove much of the attractiveness of subclassing built-in types; I
believe it would slow things down too much given the current state of
the implementation.

If you want a vaguely dictionary-like object, write a class that
doesn't derive from dictionary but implements the mapping protocol; if
you want a base class that implements most of the operations already,
start with UserDict.
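
As a sketch of the first option in modern Python 3, a class that does not derive from dict but offers keys() and __getitem__ is handled by the generic mapping path of dict() and dict.update(), so its __getitem__ is honoured. The Doubling class is hypothetical and deliberately minimal.

```python
class Doubling:
    """Hypothetical mapping-like class; it does not subclass dict."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        # The "real" value differs from the stored one.
        return self._data[key] * 2


d = Doubling({"a": 1, "b": 2})

copied = dict(d)     # calls d.keys(), then d[key] for each key

target = {}
target.update(d)     # same generic path
```

Because the object is not a dict, there is no built-in representation for the constructor to read directly, so the transformed values are the ones copied.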

Subclassing a built-in type is appropriate when either (a) you want to
use it in a context where a genuine list/dictionary/file/etc. is required; or (b)
you want the speed advantage of the built-in type. In both cases you
have to live with some restrictions. Remapping the fundamental
accessors (like __getitem__) is probably not a good idea in either
case. Adding new state and behavior is fine.

We should document this more clearly and in more detail.

--Guido van Rossum (home page: http://www.python.org/~guido/)

Tim Peters

Oct 29, 2001, 2:01:28 AM
[Greg Chapman]

> On the subject of slippery mappings and the dictionary constructor,
> consider a subclass of dictionary which overrides __getitem__ to
> transform the value stored in the (inherited) dictionary structure to
> the "real" value stored in the (logical) mapping defined by the
> dictionary subclass. (For example: a dictionary subclass which supports
> a defined order of iteration by using nodes in a linked list to store
> the keys and values, and then storing the keys and nodes in the
> dictionary structure). In 2.2b1, if an instance of such a
> subclass is passed to the constructor (or to the update method)
> of a normal dictionary, the overridden __getitem__ is ignored because
> of the PyDict_Check near the top of PyDict_Merge.

Yes. Note that this isn't unique to dicts -- if, for example, you subclass
list, list(instance_of_that_list_subclass) doesn't look up list.__getitem__
either. They all work this way.

> I'd like to suggest that that check be changed to PyDict_CheckExact
> (which apparently does not exist yet, but would be analogous
> to PyString_CheckExact). This would shunt dictionary subclasses into the
> generic mapping branch of PyDict_Merge, which would allow an overridden
> __getitem__ to work.

I agree that it would. Whether it *should* is something you'll have to take
up with Guido. Subclassing builtins is a tricky business.

Note that you *can* fiddle your subclass to change what's stored, by
overriding __init__. A list example is clearer than a dict one simply
because less involved:

class L(list):
    def __init__(self, arg):
        list.__init__(self, arg)
        for i in range(len(arg)):
            self[i] = arg[i] * 100

    def __getitem__(self, i):
        return 666

x = L(range(3))
print x, list(x)

That prints [0, 100, 200] twice. The second one isn't what you want today
(or so I predict), but it's clear as mud (to me) what most people will want
most often. Is list() *defined* in terms of __getitem__? Not really, not
even under the covers (it's defined more in terms of the iterator protocol
now). What is dictionary() defined in terms of? It simply isn't spelled
out yet. With the dictionary() in current CVS, dictionary(subclass) won't
pay attention to a __getitem__ override, and dictionary(subclass.items())
won't either but will (trivially) pay attention to an .items() override.

Python usually resolves questions of this nature by picking the answer
that's easier to explain. Since subclassing builtins in Python can't
override the builtin representation, I bet Guido will say current 2.2b1
behavior is easier to explain. It's sure debatable, though.

> Alternatively, the PyDict_Check could be supplemented by a check to see
> if __getitem__ has been overridden (and if so, using the generic
> code). However, this would not help a subclass which transforms its
> keys in some way (so that PyMapping_Keys returns a different set of keys
> than that stored in the dictionary structure). (I suppose the check
> could be extended to look for an overridden keys method.)

That's the problem, isn't it? You can't guess what's going to happen
without studying the implementation code. Since subclassing builtins is
brand new, Python takes "shortcuts" *all over the place* internally (the
marvel to me isn't that you discovered this about dict.update(), but that
you didn't stumble over 100 others before it <wink>).

They weren't shortcuts before 2.2 -- they were just the obvious ways to
implement things. Now they look like shortcuts, "avoiding" thousands of
lines of fiddly new code to worry about "oops -- maybe this isn't a 'real'
str, list, tuple, dict, file, complex, int, long, float, let's do a
long-winded dance to see whether it's a subclass".

I expect it will take years to resolve all that. In the meantime, exactly
when the assorted magic methods get called for instances of subclasses of
builtins is going to be unclear and sometimes surprising. I also expect
changes to be driven by compelling use cases (for example, people have been
very vocal over the years about wanting to pass a dictionary substitute to
eval(), so it's no accident that case works for dict subclasses in 2.2b1).
And I also expect that some arguably good changes will never get made (e.g.,
because of the central role dicts play throughout Python's implementation,
anything catering to dict subclasses that slows "real dicts"-- even a
little --is going to be a difficult sell).

good-thing-python-never-gave-a-rip-about-theoretical-purity<wink>-ly
y'rs - tim


Greg Chapman

Oct 29, 2001, 2:35:06 PM
On Mon, 29 Oct 2001 02:58:42 -0500, Guido van Rossum <gu...@python.org> wrote:

>
>If you want a vaguely dictionary-like object, write a class that
>doesn't derive from dictionary but implements the mapping protocol; if
>you want a base class that implements most of the operations already,
>start with UserDict.
>
>Subclassing a built-in type is appropriate when either (a) you want to
>use it in a context where a genuine list/dictionary/file/etc. is required; or (b)
>you want the speed advantage of the built-in type. In both cases you
>have to live with some restrictions. Remapping the fundamental
>accessors (like __getitem__) is probably not a good idea in either
>case. Adding new state and behavior is fine.
>
>We should document this more clearly and in more detail.
>

In fact I have a UserDict-like class which implements the behavior I want, and
which I will continue to use. Given the new subclasses, it seemed logical to
try to convert this class into a dictionary subclass, since the intent is for it
to work identically to a dictionary except that it has the additional property
that the order of iteration is defined. My thinking was that as a subclass, I
could use it in situation (a), where a genuine dictionary is required. However,
I see now that this class is not a subclass of dictionary (since it breaks
fundamental aspects of its purported superclass), but an implementation of a
subtype of the mapping type implemented by dictionary. So the class can use a
dictionary as part of its implementation, but it is not itself a dictionary.
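
The composition idea can be sketched in modern Python 3 roughly as follows. This is a simplified, hypothetical OrderedMapping that keeps a plain list of keys instead of the linked-list nodes described earlier, and it is only meant to show the shape of "uses a dictionary, but is not one". Plain dicts have preserved insertion order since Python 3.7, so the ordering itself no longer needs such a class.

```python
from collections.abc import MutableMapping


class OrderedMapping(MutableMapping):
    """Hypothetical mapping with a defined (insertion) iteration order.

    It holds a dict and a key list rather than subclassing dict.
    """

    def __init__(self, pairs=()):
        self._values = {}
        self._order = []
        for key, value in pairs:
            self[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        if key not in self._values:
            self._order.append(key)
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]
        self._order.remove(key)  # O(n); linked-list nodes would avoid this

    def __iter__(self):
        return iter(self._order)

    def __len__(self):
        return len(self._values)


m = OrderedMapping([("b", 2), ("a", 1)])
m["c"] = 3
del m["b"]
```

MutableMapping supplies keys(), items(), update() and the rest from the five methods defined here, so dict(m) works through the generic mapping path.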

I agree that it would be very helpful to have some documentation of what parts
of the built-in types should be considered fundamental.

By the way, in thinking about situation (a), I was wondering how much of a
slow-down would be involved if PyDict_XXX calls were changed to PyMapping_XXX
(I'm not advocating that, I was just curious). Anyway, while doing so, I
happened to notice the following two macros in abstract.h:

#define PyMapping_DelItemString(O,K) PyDict_DelItemString((O),(K))
#define PyMapping_DelItem(O,K) PyDict_DelItem((O),(K))

I assume those are oversights? It seems to me they should delegate to
PyObject_DelItem (and PyObject_DelItemString, which will have to be added).

Anyway, thanks for your reply, and thanks for Python!

---
Greg Chapman

Guido van Rossum

Oct 31, 2001, 1:02:16 PM
Greg Chapman <glch...@earthlink.net> writes:

> By the way, in thinking about situation (a), I was wondering how much of a
> slow-down would be involved if PyDict_XXX calls were changed to PyMapping_XXX
> (I'm not advocating that, I was just curious).

I haven't measured it but am convinced that it would be an enormous
slowdown.

> Anyway, while doing so, I
> happened to notice the following two macros in abstract.h:
>
> #define PyMapping_DelItemString(O,K) PyDict_DelItemString((O),(K))
> #define PyMapping_DelItem(O,K) PyDict_DelItem((O),(K))
>
> I assume those are oversights? It seems to me they should delegate to
> PyObject_DelItem (and PyObject_DelItemString, which will have to be added).

Yes, these are oversights. I've submitted a SF bug report so they
will eventually be fixed.
