I'd like to do the following in more succinct code:
if k in b:
    a=b[k]
else:
    a={}
    b[k]=a
a['A']=1
In Perl it is just one line: $a=$b->{"A"} ||={}
Thanks,
Geoffrey
b.setdefault(k, a)
--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair
I am afraid it is not the same. b.setdefault(k, {}) will always create
an empty dict, even if k is in b, as the code below demonstrates.
b={}
def f(i):
    print "I am evaluated %d" % i
    return i
b.setdefault('A', f(1))
b.setdefault('A', f(2))
b
Define b as a defaultdict:
from collections import defaultdict
b = defaultdict(dict)
b[k]['A'] = 1
Carl Banks
I'm afraid you've asked a non sequitur:
euler 40% cat test.pl
$a=$b->{"A"} ||={} ;
print "$a\n" ;
$b->{"B"} = 0 ;
$a=$b->{"B"} ||={} ;
print "$a\n" ;
$b->{"X"} = 15 ;
$a=$b->{"X"} ||={} ;
print "$a\n" ;
euler 41% perl test.pl
HASH(0x92662a0)
HASH(0x926609c)
15
James
--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095
It is not supposed to be used this way.
$b is supposed to be a hash-table of hash-table. If a key exists in
$b, it points to another hash table. The $a=$b->{"A"} ||={} pattern is
useful when you want to add records to the double hash table.
For example, if you have a series of records in the format of (K1, K2,
V), and you want to add them to the double hash-table, you can do
$a=$b->{K1} ||={}
$a->{K2}=V
a = b.setdefault('A', {})
This combines two actions:
- Sets b['A'] to {} if it is not already defined
- Assigns b['A'] to a
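For what it's worth, here is a minimal sketch of that one-liner in Python 3 syntax (the key 'x' and the name a2 are just for illustration). It shows that the object returned is the one stored in b, so changes made through a are visible in b:

```python
b = {}

# First call: 'A' is missing, so the new {} is stored and returned.
a = b.setdefault('A', {})
a['x'] = 1

# Second call: 'A' exists, so the stored dict is returned and the
# freshly built {} argument is simply discarded.
a2 = b.setdefault('A', {})

print(a2 is a)   # True
print(b)         # {'A': {'x': 1}}
```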
More info on dict methods here:
No, this has already been proposed and discarded. The OP does NOT
want this, because it always generates an empty {} whether it is
needed or not. Not really a big hardship, but if the default value
were some expensive-to-construct container class, then you would be
creating one every time you wanted to reference a value, on the chance
that the key did not exist.
Carl Banks' post using defaultdict is the correct solution. The
raison d'etre for defaultdict, and the reason that it is the solution
to the OP's question, is that instead of creating a just-in-case
default value every time, the defaultdict itself is constructed with a
factory method of some kind (in practice, it appears that this factory
method is usually the list or dict class constructor). If a reference
to the defaultdict gives a not-yet-existing key, then the factory
method is called to construct the new value, that value is stored in
the dict with the given key, and the value is passed back to the
caller. No instances are created unless they are truly required for
initializing an entry for a never-before-seen key.
-- Paul
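The difference described above can be checked with a small sketch (Python 3 syntax). The factory make_value and the calls list are hypothetical names used only to count how often a default value gets built:

```python
from collections import defaultdict

calls = []

def make_value():
    # Hypothetical factory; records each call so we can count them.
    calls.append(1)
    return {}

b = defaultdict(make_value)

b['k']['A'] = 1    # 'k' is new: the factory runs once
b['k']['B'] = 2    # 'k' exists: the factory is not called
print(len(calls))  # 1

plain = {}
plain.setdefault('k', make_value())['A'] = 1
plain.setdefault('k', make_value())['B'] = 2  # argument built, then discarded
print(len(calls))  # 3
```

Both containers end up with the same contents; only the number of throwaway values differs.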
I think the most direct translation of the OP's example to Python
would be:
from collections import defaultdict
# setup b as special flavor of dict, that creates a
# new dict for not-yet-created keys
b = defaultdict(dict)
# actual python impl of $a=$b->{k}||={}
a = b[k]
# assign 1 to retrieved/created dict
a['A'] = 1
-- Paul
That's great, but the perl code you provided does not behave identically
to the python code you provided, so your requirements were not well stated.
(Does my annoyance with perl and its ugly syntax show here?)
I think this demonstrates the Python version of what you describe.
-- Paul
from collections import defaultdict
data = [
    ('A','B',1), ('A','C',2), ('A','D',3), ('B','A',4),
    ('B','B',5), ('B','C',6), ('B','D',7),
    ]
def defaultdictFactory():
    return defaultdict(dict)
table = defaultdict(defaultdictFactory)
for k1,k2,v in data:
    table[k1][k2] = v
for kk in sorted(table.keys()):
    print "-",kk
    for jj in sorted(table[kk].keys()):
        print " -",jj,table[kk][jj]
prints:
- A
 - B 1
 - C 2
 - D 3
- B
 - A 4
 - B 5
 - C 6
 - D 7
What is the best solution in Perl need not be the best solution in
Python. In Python you should just use a tuple as your dict key, i.e.
a[k1,k2] = v, unless you have some other constraints you're not telling
us.
HTH,
--
Carsten Haese
http://informixdb.sourceforge.net
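A minimal sketch of the tuple-key approach, in Python 3 syntax and with made-up sample records, in case it helps:

```python
data = [('A', 'B', 1), ('A', 'C', 2), ('B', 'A', 4)]

a = {}
for k1, k2, v in data:
    a[k1, k2] = v  # same as a[(k1, k2)] = v

print(a['A', 'C'])       # 2
print(('B', 'B') in a)   # False
```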
I use tuples this way all the time. It is indeed very neat. But it is
not a replacement for a double hash table. If I want to retrieve
information just by K1, it is not efficient to index on (K1, K2).
It looks like defaultdict is the solution for this kind of thing.
Thanks all for your help.
Define "efficient". As in typing? Lookup should be the same speed for
all keys because it's a hash table.
If you have to look up all values associated with k1 and any k2, you're right,
that's not efficient. That would fall under "other constraints you're not
telling us." I'm not a mind reader.
-Carsten
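To make the trade-off concrete, here is a rough sketch (Python 3 syntax, sample data invented for illustration) that fetches everything stored under one K1 from both layouts:

```python
from collections import defaultdict

data = [('A', 'B', 1), ('A', 'C', 2), ('B', 'A', 4)]

flat = {}
nested = defaultdict(dict)
for k1, k2, v in data:
    flat[k1, k2] = v
    nested[k1][k2] = v

# Flat layout: every key must be examined to find those starting with 'A'.
from_flat = {k2: v for (k1, k2), v in flat.items() if k1 == 'A'}

# Nested layout: one hash lookup returns the whole inner dict.
from_nested = nested['A']

print(from_flat == from_nested)  # True
```

Looking up a single (k1, k2) pair costs about the same either way; it is only the "all k2 for a given k1" query that forces a full scan of the flat dict.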
try:
    a = b[k]
except KeyError: #or except IndexError: if b is a list/tuple and not a dict
    a = {}
    b[k] = a
a['A'] = 1
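Indeed, try/except is typically faster than an "if/else" test when the
key is usually present, because the exception is only raised on a miss
(and raising one is comparatively expensive). As was mentioned earlier,
a neat solution in Perl may not be the best one in Python.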
Cheers,
Sébastien
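If anyone wants to measure this rather than take it on faith, a rough timeit sketch follows (Python 3 syntax). The function names and the loop count are arbitrary choices, and the numbers will vary by machine and Python version, so no expected timings are given:

```python
import timeit

def with_if(b, k):
    if k in b:
        a = b[k]
    else:
        a = {}
        b[k] = a
    a['A'] = 1

def with_try(b, k):
    try:
        a = b[k]
    except KeyError:
        a = {}
        b[k] = a
    a['A'] = 1

hit = {'k': {}}   # key already present: no exception is raised

t_if = timeit.timeit(lambda: with_if(hit, 'k'), number=100000)
t_try = timeit.timeit(lambda: with_try(hit, 'k'), number=100000)
# Miss case: a fresh dict on each call forces the KeyError path.
t_if_miss = timeit.timeit(lambda: with_if({}, 'k'), number=100000)
t_try_miss = timeit.timeit(lambda: with_try({}, 'k'), number=100000)

print("hit:  if=%.3fs try=%.3fs" % (t_if, t_try))
print("miss: if=%.3fs try=%.3fs" % (t_if_miss, t_try_miss))
```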
Wow. This solution is interesting. I'll try this. Thanks.
Yeah, I should have mentioned that I actually want to group the data
by K1 and then by K2.
When I made my response, it occurred to me that Python could be
improved (maybe) if one could overload dict.get() to use a factory,
like so:
b = {}
a = b.get(k,factory=dict)
a['A'] = 1
That's a slight improvement (maybe) over defaultdict since it would
still allow the same dict to have the membership check in other
places. I'm not so sure overloading get to let it modify the dict is
a good idea, though.
Actually, it'd probably be fairly uncontroversial to add a factory
keyword to dict.setdefault(). At least insofar as setdefault is
uncontroversial.
Carl Banks
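Until something like that exists, the idea can be approximated with a small helper. This is only a sketch: setdefault_factory is a hypothetical function, not part of dict's real API, and it is written in Python 3 syntax:

```python
def setdefault_factory(d, key, factory):
    """Hypothetical helper, not part of dict's real API."""
    try:
        return d[key]
    except KeyError:
        # The factory is called only when the key is really missing.
        value = d[key] = factory()
        return value

b = {}
a = setdefault_factory(b, 'k', dict)
a['A'] = 1

print(b)          # {'k': {'A': 1}}
print('x' in b)   # False
```

Because b stays a plain dict, membership checks and b[key] elsewhere still behave as usual.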
The python-dev thread that discussed the feature before implementation
(as so often is the case) exercised many of the possible design paths,
and would be a useful read. [Damn, now I have to
well-known-search-engine it]. Aah, right - I'd forgotten how long it
took to get it right. This would be a suitable starting-point - it's the
beginning of the /third/ round of discussion:
http://mail.python.org/pipermail/python-dev/2006-February/061485.html
Many alternatives were discussed, and my memory at this distance is that
Guido had good reasons for choosing the exact API he did for defaultdict.
regards
Steve
That is certainly true, but does it matter? You waste a very small
amount of time creating a dict you don't use.
$ python -m timeit '{}'
1000000 loops, best of 3: 0.247 usec per loop
On my machine 250 ns gets you a new dict...
--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
I think that if you truly want to emulate a Perl hash, you would
want this, which does the above recursively.
from collections import defaultdict
class hash(defaultdict):
    def __init__(self):
        defaultdict.__init__(self, hash)
D=hash()
D[1][2][3][4]=5
D[1][4][5]=6
print D
# File: autovivifying_dict.py
from collections import defaultdict
class hash(defaultdict):
    """ Used like a dict except sub-dicts automagically created as
    needed
    Based on: http://groups.google.com/group/comp.lang.python/msg/f334fbdafe4afa37
    >>> D=hash()
    >>> D[1][2][3][4]=5
    >>> D[1][4][5]=6
    >>> D
    hash({1: hash({2: hash({3: hash({4: 5})}), 4: hash({5: 6})})})
    >>> hash({1: hash({2: hash({3: hash({4: 5})}), 4: hash({7: 8})})})
    hash({1: hash({2: hash({3: hash({4: 5})}), 4: hash({7: 8})})})
    >>>
    """
    def __init__(self, *a, **b):
        defaultdict.__init__(self, hash, *a, **b)
    def __repr__(self):
        return "hash(%s)" % (repr(dict(self)),)
def _test():
    import doctest
    doctest.testmod()
if __name__ == "__main__":
    _test()
- Paddy.
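One thing to watch with any autovivifying dict: merely indexing a missing key creates it. A short sketch in Python 3 syntax, where tree is a hypothetical function-based equivalent of the recursive class above:

```python
from collections import defaultdict

def tree():
    # Same idea as the recursive class above, written as a function.
    return defaultdict(tree)

D = tree()
D[1][2] = 5

print('missing' in D)      # False: membership tests do not create keys
print(D.get('missing'))    # None: neither does .get()

D['missing']               # plain indexing does create an empty subtree
print('missing' in D)      # True
```

So use "in" or .get() when you only want to check for a key without adding it.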