Hi,
I have a dict() unique
like this {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
and I want to print it to a file without the brackets, commas and colons, in order to obtain something like this:
4 5 1
5 4 1
4 4 2
2 3 1
4 3 2
Any ideas?
Thanks in advance Giuseppe
>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>> and i want to print to a file without the brackets comas and semicolon in order to obtain something like this?
>> 4 5 1
>> 5 4 1
>> 4 4 2
>> 2 3 1
>> 4 3 2
> for key in dict:
>     print key[0], key[1], dict[key]
This might read more cleanly with tuple unpacking:
for (edge1, edge2), cost in d.iteritems():  # or .items()
    print edge1, edge2, cost
(I'm assuming that this is an edge/cost graph; use
appropriate names according to what they actually mean.)
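For readers on Python 3, where print is a function and .iteritems() no longer exists, the same tuple-unpacking loop might be sketched as below. The helper name write_pairs and the in-memory buffer are illustrative assumptions, not something from the original post.

```python
import io

d = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}


def write_pairs(mapping, stream):
    """Write one 'a b value' line per ((a, b), value) entry."""
    for (a, b), value in mapping.items():
        # print() separates its arguments with single spaces by default
        print(a, b, value, file=stream)


# An in-memory buffer stands in for a real file here (hypothetical choice).
buffer = io.StringIO()
write_pairs(d, buffer)
output_text = buffer.getvalue()
```

To write to disk instead, the same helper could be called as write_pairs(d, f) inside a "with open('out.txt', 'w') as f:" block.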
> On Aug 9, 2012 9:17 PM, <giuseppe.amatu...@gmail.com> wrote:
>> Hi,
>> I have a dict() unique
>> like this
>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>> and i want to print to a file without the brackets comas and semicolon in
>> order to obtain something like this?
>> 4 5 1
>> 5 4 1
>> 4 4 2
>> 2 3 1
>> 4 3 2
>> Any ideas?
>> Thanks in advance
> How's this?
> from __future__ import print_function
> output = open("out.txt", "w")
> for (a, b), c in d.items():
>     print(a, b, c, file=output)
On 08/09/2012 10:11 PM, giuseppe.amatu...@gmail.com wrote:
> Hi,
> I have a dict() unique
> like this
> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
> and i want to print to a file without the brackets comas and semicolon in order to obtain something like this?
> 4 5 1
> 5 4 1
> 4 4 2
> 2 3 1
> 4 3 2
> Any ideas?
> Thanks in advance
> Giuseppe
Boring explicit solution:
d = {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
with open("out.txt", "w") as fout:  # the file name is only a placeholder
    for key, val in d.items():
        v1, v2 = key
        fout.write("%d %d %d\n" % (v1, v2, val))
>> On 08/09/12 15:22, Roman Vashkevich wrote:
>>>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>>>> and i want to print to a file without the brackets comas and semicolon in order to obtain something like this?
>>>> 4 5 1
>>>> 5 4 1
>>>> 4 4 2
>>>> 2 3 1
>>>> 4 3 2
>>> for key in dict:
>>>     print key[0], key[1], dict[key]
>> This might read more cleanly with tuple unpacking:
>> for (edge1, edge2), cost in d.iteritems(): # or .items()
>>     print edge1, edge2, cost
>> (I'm making the assumption that this is a edge/cost graph...use
>> appropriate names according to what they actually mean)
>> -tkc
I'm impressed, the OP gives a dict with five entries and already we're optimising, a cunning plan if ever there was one. Hum, I think I'll start on the blast proof ferro-concrete bunker tonight just in case WWIII starts tomorrow.
> On 10.08.2012, at 0:35, Tim Chase wrote:
>> On 08/09/12 15:22, Roman Vashkevich wrote:
>>>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>>>> and i want to print to a file without the brackets comas and semicolon in order to obtain something like this?
>>>> 4 5 1
>>>> 5 4 1
>>>> 4 4 2
>>>> 2 3 1
>>>> 4 3 2
>>> for key in dict:
>>>     print key[0], key[1], dict[key]
>> This might read more cleanly with tuple unpacking:
>> for (edge1, edge2), cost in d.iteritems(): # or .items()
>>     print edge1, edge2, cost
>> (I'm making the assumption that this is a edge/cost graph...use
>> appropriate names according to what they actually mean)
That link doesn't actually discuss dict.{iter}items()
Both are O(N) because you have to touch each item in the dict--you
can't iterate over N entries in less than O(N) time. For small
data-sets, building the list and then iterating over it may be
faster; for larger data-sets, the cost of building the list
overshadows the (minor) overhead of a generator. Either way, the
iterate-and-fetch-the-associated-value of .items() & .iteritems()
can (should?) be optimized in Python's internals to the point I
wouldn't think twice about using the more readable version.
Actually, they are different.
Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
Dict uses hashing to get a value from the dict and this is why it's O(1).
> On 08/09/12 15:41, Roman Vashkevich wrote:
>> On 10.08.2012, at 0:35, Tim Chase wrote:
>>> On 08/09/12 15:22, Roman Vashkevich wrote:
>>>>> {(4, 5): 1, (5, 4): 1, (4, 4): 2, (2, 3): 1, (4, 3): 2}
>>>>> and i want to print to a file without the brackets comas and semicolon in order to obtain something like this?
>>>>> 4 5 1
>>>>> 5 4 1
>>>>> 4 4 2
>>>>> 2 3 1
>>>>> 4 3 2
>>>> for key in dict:
>>>>     print key[0], key[1], dict[key]
>>> This might read more cleanly with tuple unpacking:
>>> for (edge1, edge2), cost in d.iteritems(): # or .items()
>>>     print edge1, edge2, cost
>>> (I'm making the assumption that this is a edge/cost graph...use
>>> appropriate names according to what they actually mean)
> That link doesn't actually discuss dict.{iter}items()
> Both are O(N) because you have to touch each item in the dict--you
> can't iterate over N entries in less than O(N) time. For small
> data-sets, building the list and then iterating over it may be
> faster; for larger data-sets, the cost of building the list
> overshadows the (minor) overhead of a generator. Either way, the
> iterate-and-fetch-the-associated-value of .items() & .iteritems()
> can (should?) be optimized in Python's internals to the point I
> wouldn't think twice about using the more readable version.
In 3.x, .keys, .values, and .items are set-like read-only views specifically designed for iteration. So in 3.x they are THE way to do so for whichever alternative is appropriate. Iterating by keys and then looking up values instead of yielding the values at the same time is extra work.
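A small Python 3 sketch of what "set-like read-only views" means in practice; the sample data and variable names are made up for illustration.

```python
d = {(4, 5): 1, (5, 4): 1}

items = d.items()   # a view onto the dict, not a copied list
keys = d.keys()

d[(4, 4)] = 2       # views reflect later changes to the dict

n_items = len(items)                 # now counts three entries
has_key = (4, 4) in keys             # membership test on the keys view
common = keys & {(4, 5), (9, 9)}     # key views support set operations

# Iterating the items view yields key and value together,
# so no second lookup per key is needed.
rows = [(a, b, count) for (a, b), count in items]
```

The views cannot be used to modify the dict directly, which is the "read-only" part.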
> Actually, they are different.
> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
> Dict uses hashing to get a value from the dict and this is why it's O(1).
Sure, that's why
    for key in dict:
        print key[0], key[1], dict[key]
is probably slower than
    for (edge1, edge2), cost in d.iteritems(): # or .items()
        print edge1, edge2, cost
So, the latter is both faster and easier to read. Why are you arguing against it?
Also, please stop top-posting. It's impolite here, and makes it much harder to figure out who is saying what, in what order.
On Thu, Aug 9, 2012 at 2:34 PM, Roman Vashkevich <vashkevic...@gmail.com> wrote:
> Actually, they are different.
> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
> Dict uses hashing to get a value from the dict and this is why it's O(1).
Using "in" as an operator, such as "if key in dict" or "result = key
in dict", is O(1) as you say. Iterating over the dictionary requires
touching every item, and so is O(n), even though it also uses "in" in
the statement.
Here are a few quick timing tests I just ran with Python 2.6:
>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1))')
0.078683853332734088
>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(10))')
0.17451784110969015
>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(100))')
1.1708168159579486
>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1))')
0.14186911440299355
>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(10))')
0.33836512561802579
>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(100))')
2.2544262854249268
>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1))')
0.10009793211446549
>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(10))')
0.38825072496723578
>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(100))')
3.3020098061049339
As can be seen here (timeit runs the statement 1,000,000 times by
default, so these are totals), a 1-item dictionary iterated in 0.07
seconds, 10 items in 0.17 seconds, and 100 items in 1.17 seconds. That
is fairly close to linear, especially when considering the overhead of
a complete no-op.
Using iteritems, it appears to actually scale slightly better than
linear, though it is slower than just the plain iteration.
Doing a plain iteration, then looking up the keys to get the values
also appears to be linear, and is even slower than iteritems.
I realized, I should have done 10, 100, 1000 rather than 1, 10, 100
for better results, so here are the results for 1000 items. It still
maintains the same pattern:
>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1000))')
10.166595947685153
>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1000))')
19.922474218828711
>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1000))')
On Thu, Aug 9, 2012 at 2:49 PM, Chris Kaynor <ckay...@zindagigames.com> wrote:
> On Thu, Aug 9, 2012 at 2:34 PM, Roman Vashkevich <vashkevic...@gmail.com> wrote:
>> Actually, they are different.
>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>> Dict uses hashing to get a value from the dict and this is why it's O(1).
> Using "in" as an operator such as: "if key in dict" or "result = key
> in dict" is O(1) as you say. Iterating on the dictionary requires
> touching every item, and so is O(n), even though it also using "in" in
> the command.
> Here are a few quick timing tests I just ran with Python 2.6:
>>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1))')
> 0.078683853332734088
>>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(10))')
> 0.17451784110969015
>>>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(100))')
> 1.1708168159579486
>>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1))')
> 0.14186911440299355
>>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(10))')
> 0.33836512561802579
>>>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(100))')
> 2.2544262854249268
>>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1))')
> 0.10009793211446549
>>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(10))')
> 0.38825072496723578
>>>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(100))')
> 3.3020098061049339
> As can be seen here, a 1-item dictionary iterated in 0.07 seconds, 10
> items in 0.17 seconds, and 100 items in 1.17 seconds. That is fairly
> close to linear, especially when considering the overhead of a
> complete no-op
> Using iteritems, it appears to actually scale slightly better than
> linear, though it is slower than just the plain iteration.
> Doing a plain iteration, then looking up the keys to get the values
> also appears to be linear, and is even slower than iteritems.
Thanks a lot for the clarification.
Actually, my problem is this: given two raster datasets in GeoTIFF format, find
the unique pair combinations and count the number of observations;
the unique values in rast1 and count the number of observations;
the unique values in rast2 and count the number of observations.
I tried different solutions, and this one seems to me the fastest:
from __future__ import print_function  # needed for print(..., file=...) on Python 2
import numpy as np

# Rast00, Rast10 and the dst_file_* paths are assumed to be defined earlier.
# Maybe this masking operation can be included in the for loop.
mask = (Rast00 != 0) & (Rast10 != 0)
Rast00_mask = Rast00[mask]
Rast10_mask = Rast10[mask]
array2D = np.array(zip(Rast00_mask, Rast10_mask))

unique_u = dict()
unique_k1 = dict()
unique_k2 = dict()
for key1, key2 in array2D:
    row = tuple((key1, key2))
    if row in unique_u:
        unique_u[row] += 1
    else:
        unique_u[row] = 1
    if key1 in unique_k1:
        unique_k1[key1] += 1
    else:
        unique_k1[key1] = 1
    if key2 in unique_k2:
        unique_k2[key2] += 1
    else:
        unique_k2[key2] = 1

output = open(dst_file_rast0010, "w")
for (a, b), c in unique_u.items():
    print(a, b, c, file=output)
output.close()
output = open(dst_file_rast00, "w")
for a, b in unique_k1.items():
    print(a, b, file=output)
output.close()
output = open(dst_file_rast10, "w")
for a, b in unique_k2.items():
    print(a, b, file=output)
output.close()
What do you think? Is there a way to speed up the process?
Thanks
Giuseppe
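One possible simplification of the counting step, sketched with collections.Counter from the standard library. The two short lists are made-up stand-ins for the masked raster values (Rast00[mask] and Rast10[mask]); whether this is actually faster on real rasters would need to be measured.

```python
from collections import Counter

# Hypothetical sample data standing in for the masked raster values.
rast00_vals = [4, 5, 4, 2, 4, 4, 4]
rast10_vals = [5, 4, 4, 3, 3, 4, 3]

# Counter does the "if key in d: d[key] += 1 else: d[key] = 1" bookkeeping.
pair_counts = Counter(zip(rast00_vals, rast10_vals))
rast00_counts = Counter(rast00_vals)
rast10_counts = Counter(rast10_vals)

# Build the output lines in the "a b count" layout asked for earlier.
pair_lines = ["%d %d %d" % (a, b, n) for (a, b), n in pair_counts.items()]
```

With NumPy arrays, calling .tolist() on the masked arrays first gives plain Python ints as keys, which also keeps the printed output free of NumPy scalar types.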
> On 08/09/2012 05:34 PM, Roman Vashkevich wrote:
>> Actually, they are different.
>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>> Dict uses hashing to get a value from the dict and this is why it's O(1).
> Sure, that's why
>     for key in dict:
>         print key[0], key[1], dict[key]
> is probably slower than
>     for (edge1, edge2), cost in d.iteritems(): # or .items()
>         print edge1, edge2, cost
> So, the latter is both faster and easier to read. Why are you arguing against it?
> Also, please stop top-posting. It's impolite here, and makes it much harder to figure out who is saying what, in what order.
> --
> DaveA
I'm not arguing at all. Sorry if it sounded like I was arguing.
Thanks for notifying me of the way messages should be sent.
> Actually, they are different.
> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
> Dict uses hashing to get a value from the dict and this is why it's O(1).
Slightly off topic, but looking up a value in a dictionary is actually
O(n) for all other entries in the dict which suffer a hash collision
with the searched entry.
True, a sensible choice of hash function will reduce n to 1 in common
cases, but it becomes an important consideration for larger datasets.
> On 09/08/2012 22:34, Roman Vashkevich wrote:
>> Actually, they are different.
>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>> Dict uses hashing to get a value from the dict and this is why it's O(1).
> Sligtly off topic, but looking up a value in a dictionary is actually
> O(n) for all other entries in the dict which suffer a hash collision
> with the searched entry.
> True, a sensible choice of hash function will reduce n to 1 in common
> cases, but it becomes an important consideration for larger datasets.
> ~Andrew
I'm glad you're wrong for CPython's dictionaries. The only time the
lookup would degenerate to O(n) would be if the hash table had only one
slot. CPython sensibly increases the hash table size when it becomes
too small for efficiency.
Where have you seen dictionaries so poorly implemented?
On Thu, Aug 9, 2012 at 3:26 PM, Dave Angel <d...@davea.name> wrote:
> On 08/09/2012 06:03 PM, Andrew Cooper wrote:
>> On 09/08/2012 22:34, Roman Vashkevich wrote:
>>> Actually, they are different.
>>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>>> Dict uses hashing to get a value from the dict and this is why it's O(1).
>> Sligtly off topic, but looking up a value in a dictionary is actually
>> O(n) for all other entries in the dict which suffer a hash collision
>> with the searched entry.
>> True, a sensible choice of hash function will reduce n to 1 in common
>> cases, but it becomes an important consideration for larger datasets.
>> ~Andrew
> I'm glad you're wrong for CPython's dictionaries. The only time the
> lookup would degenerate to O[n] would be if the hash table had only one
> slot. CPython sensibly increases the hash table size when it becomes
> too small for efficiency.
> Where have you seen dictionaries so poorly implemented?
There are plenty of ways to make a pathological hash function that
will have that issue in CPython.
The very simple (and stupid):
class O(object):
    def __hash__(self):
        return 0
    def __eq__(self, other):  # I am aware this is the default equals method.
        return self is other
Start adding those to a dictionary to get O(n) lookups.
Any case where the hash values modulo the dictionary's hash table size
are constant will have similar results; powers of 2 are likely to
result in such behavior as well.
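A rough way to watch the degradation described above: count how often __eq__ is called during a single lookup when every key hashes to 0. The class name Colliding and the helper eq_calls_for_lookup are made up for this sketch, and exact call counts are a CPython implementation detail, so only the growth matters.

```python
class Colliding(object):
    """Key type whose instances all share one hash value (deliberately bad)."""
    eq_calls = 0

    def __hash__(self):
        return 0

    def __eq__(self, other):
        Colliding.eq_calls += 1   # record every comparison the dict performs
        return self is other


def eq_calls_for_lookup(n):
    """Build a dict of n colliding keys, then count __eq__ calls for one lookup."""
    keys = [Colliding() for _ in range(n)]
    table = dict.fromkeys(keys)
    Colliding.eq_calls = 0        # ignore comparisons made while inserting
    table[keys[-1]]               # look up the most recently inserted key
    return Colliding.eq_calls


calls_small = eq_calls_for_lookup(20)
calls_large = eq_calls_for_lookup(200)
```

With well-distributed hashes the same lookup would normally need no more than a handful of comparisons, regardless of the dict's size.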
On Fri, Aug 10, 2012 at 8:26 AM, Dave Angel <d...@davea.name> wrote:
> On 08/09/2012 06:03 PM, Andrew Cooper wrote:
>> O(n) for all other entries in the dict which suffer a hash collision
>> with the searched entry.
>> True, a sensible choice of hash function will reduce n to 1 in common
>> cases, but it becomes an important consideration for larger datasets.
> I'm glad you're wrong for CPython's dictionaries. The only time the
> lookup would degenerate to O[n] would be if the hash table had only one
> slot. CPython sensibly increases the hash table size when it becomes
> too small for efficiency.
> Where have you seen dictionaries so poorly implemented?
In vanilla CPython up to version (I think) 3.3, where it's possible to
DoS the hash generator. Hash collisions are always possible, just
ridiculously unlikely unless deliberately exploited.
(And yes, I know an option was added to older versions to randomize
the hashes there too. It's not active by default, so "vanilla CPython"
is still vulnerable.)
> On 08/09/2012 06:03 PM, Andrew Cooper wrote:
>> On 09/08/2012 22:34, Roman Vashkevich wrote:
>>> Actually, they are different.
>>> Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference.
>>> Dict uses hashing to get a value from the dict and this is why it's O(1).
>> Sligtly off topic, but looking up a value in a dictionary is actually
>> O(n) for all other entries in the dict which suffer a hash collision
>> with the searched entry.
>> True, a sensible choice of hash function will reduce n to 1 in common
>> cases, but it becomes an important consideration for larger datasets.
>> ~Andrew
> I'm glad you're wrong for CPython's dictionaries. The only time the
> lookup would degenerate to O[n] would be if the hash table had only one
> slot. CPython sensibly increases the hash table size when it becomes
> too small for efficiency.
> Where have you seen dictionaries so poorly implemented?
Different n, which I should have made more clear. I was using it for
consistency with O() notation. My statement was O(n) where n is the
number of hash collisions.
The hash algorithm (or several, depending on the implementation)
should be chosen specifically to reduce collisions, which aids
efficient space utilisation and lookup times, but any
implementation must allow for collisions. There are certainly runtime
methods of improving efficiency using amortized operations.
As for poor implementations,
class Foo(object):
    ...
    def __hash__(self):
        return 0
I seriously found that in some older code I had the misfortune of
reading. It didn't remain in that state for long.
<python.l...@tim.thechases.com> wrote:
> On 08/09/12 17:26, Dave Angel wrote:
>> On 08/09/2012 06:03 PM, Andrew Cooper wrote:
>> I'm glad you're wrong for CPython's dictionaries. The only time the
>> lookup would degenerate to O[n] would be if the hash table had only one
>> slot. CPython sensibly increases the hash table size when it becomes
>> too small for efficiency.
>> Where have you seen dictionaries so poorly implemented?
That's the same hash collision attack that I alluded to above, and it
strikes *many* language implementations. Most released a patch fairly
quickly and quietly (Pike, Lua, V8 (JavaScript/ECMAScript), PHP), but
CPython dared not, on account of various applications depending on
hash order (at least for tests). It's not (for once) an indictment of
PHP (maybe that should be an "inarrayment"?), it's a consequence of a
hashing algorithm that favored simplicity over cryptographic
qualities.
In article <ucXUr.1030527$2z2.380...@fx19.am4>,
Andrew Cooper <am...@cam.ac.uk> wrote:
> As for poor implementations,
> class Foo(object):
>     def __hash__(self):
>         return 0
> I seriously found that in some older code I had the misfortune of
> reading.
Python assumes you are a consenting adult. If you wish to engage in activities which are hazardous to your health, so be it. But then again, you could commit this particular stupidity just as easily in C++ or any other language which lets you define your own hash() function.
On Fri, Aug 10, 2012 at 9:05 AM, Roy Smith <r...@panix.com> wrote:
> Python assumes you are a consenting adult. If you wish to engage in
> activities which are hazardous to your health, so be it.