I face a peculiar problem when I try to debug my Perl script. The
script fetches values from an Excel sheet and prints them into
different files. When I run it under the debugger, I get this error:
"Bizarre copy of HASH in leave at excel_extract.pl line 114.
at excel_extract.pl line 114
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info."
I tried searching for help on the net. Couldn't find much information
about it. Seems like a peculiar error.
Does anyone know why this happens? It would be a lot of help if you
could shed some light on this :)
Warm Regards,
Artemis.
This is due to a bug in some C or XS code somewhere, probably in a CPAN
module. First upgrade to the latest version of everything, including
perl. If you still get the error, reduce it to the shortest program you
can, preferably under 20 lines, and post it as a bug on rt.perl.org or
just email the author of the most excel-specific module your program uses.
It may not be their fault but you can't be expected to do more unless you
know how.
--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/
FWIW, google finds a short method of generating this error.
This is perl, v5.8.8 built for i386-linux-thread-multi
perl -Te '@{%h}{x}'
Bizarre copy of HASH in leave at -e line 1.
--S
Elegant. Looks fixed in 5.10. Either change 27350 or 25808.
I should perhaps point out that this doesn't mean what you might think,
and that in 5.10 its meaning has also been fixed.
~% perl5.8.8 -le'%h = qw/a b/; %{"1/8"} = qw/a c/; print @{%h}{a}'
b
~% perl5.10.0 -le'%h = qw/a b/; %{"1/8"} = qw/a c/; print @{%h}{a}'
c
Perl used to allow you to treat a hash or array as a reference to
itself; this was a bug, and has now been (partly) fixed. The way the
expression now evaluates is
Evaluate %h in scalar context -> '1/8'
Evaluate @{'1/8'}{a} as a symbolic ref
which is why the above gives 'c'. But since 5.8 and earlier incorrectly
sliced %h rather than %{'1/8'}, you can't rely on this. (It would be
stupid behaviour to rely on, in any case, since the exact value of a
hash in scalar context has never been guaranteed. The only formal
statement in the docs is that the value will be true iff the hash has
any elements.)
Under 'use strict' you get 'Can't use string ("1/8") as a HASH
reference' with perls at least as far back as 5.6.1, so this won't be a
problem in any normal code. If you want to slice %h, the correct syntax
is simply
@h{a}
Ben
Having read all this inspired me to run a few tests, and I found
something odd regarding allocation:
$ perl5.8.8 -Mstrict -we 'my %h; @h{1..1} = (1..100); print "[",
scalar %h, "]\n";'
[1/8]
$ perl5.8.8 -Mstrict -we 'my %h; @h{1..2} = (1..100); print "[",
scalar %h, "]\n";'
[2/8]
$ perl5.8.8 -Mstrict -we 'my %h; @h{1..3} = (1..100); print "[",
scalar %h, "]\n";'
[3/8]
$ perl5.8.8 -Mstrict -we 'my %h; @h{1..4} = (1..100); print "[",
scalar %h, "]\n";'
[3/8]
$ perl5.8.8 -Mstrict -we 'my %h; @h{1..5} = (1..100); print "[",
scalar %h, "]\n";'
[4/8]
I get the same using 5.10.0, 5.8.2, and 5.8.0. 5.6.1, however, shows
the fourth line as [4/8], and the 5th as [5/8], which is what I would
have expected. It seems Perl 5.8.0 and above sometimes incorrectly returns
the number of used buckets, as in the fourth line, there are four
key-value pairs, but only 3 buckets.... how can this be?
--
szr
So the hashing algorithm in 5.6.1 is slightly better (on this particular
codeset). [No surprise for me; I suspect I know who optimized it. ;-]
With randomized hashing, 5 people with 8 possible birth-weekdays would
have a quite large chance of a collision, 1 - 8*7*6*5*4 / 5^8 = 80%
(birthday paradox). So it is not surprising that what you got is a
collision.
> which is what I would have expected. It seems Perl 5.8.0 and above
> sometimes incorrectly returns the number of used buckets, as in the
> fourth line, there are four key-value pairs, but only 3
> buckets.... how can this be?
Each bucket may keep an unlimited number of keys. [If you are lucky,
most buckets have only one key, and key lookup is quite quick.]
Hope this helps,
Ilya
> (birthday paradox). So it is not surprising that what you got is a
> collision.
[The answer is AFAIK correct, only the expression was wrong; it should
have been 1 - 8*7*6*5*4 / 8^5.]
Sorry,
Ilya
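For anyone who wants to check the 80% figure, here is a minimal C sketch of the corrected expression, 1 - (8*7*6*5*4)/8^5. The function name collision_probability is made up for this illustration, and the sketch assumes each key lands in a bucket uniformly and independently, which real hash functions only approximate.

```c
/* Probability that at least two of `keys` items share a bucket when each
 * item is placed in one of `buckets` slots uniformly and independently.
 * (Uniform, independent placement is an assumption of this sketch.) */
double collision_probability(int keys, int buckets)
{
    double all_distinct = 1.0;

    for (int i = 0; i < keys; i++) {
        /* the i-th key has to avoid the i buckets already taken */
        all_distinct *= (double)(buckets - i) / buckets;
    }
    return 1.0 - all_distinct;
}
```

Calling collision_probability(5, 8) gives 1 - 6720/32768, or roughly 0.795, which agrees with the 80% quoted above.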
Learn how hash tables work. A 'bucket' isn't a key, but the set of keys
whose hashes map to the same slot in the table; once the bucket has been
found, perl does a linear scan through all the keys in it looking for
one that matches.
Obviously, for efficiency, you want this final linear scan to be as
short as possible; this is why it is important to use a hash function
that distributes the keys evenly between the buckets.
Presumably the hash function was tweaked in 5.8, and two of the strings
'1'..'5' now end up in the same bucket; I would expect that this was
done to make some real-world set of keys distribute better, but I don't
know.
Ben
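The bucket idea described above can be sketched in a few lines of C. This is only an illustration, not perl's actual implementation: the names (struct table, table_insert, table_find, table_used_buckets) are hypothetical, the hash is a deliberately trivial byte sum, and the table is fixed at 8 buckets to match the [n/8] figures earlier in the thread.

```c
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8              /* power of two, so (hash & 7) picks a slot */

struct entry {
    const char   *key;
    int           value;
    struct entry *next;         /* next key that landed in the same bucket */
};

struct table {
    struct entry *bucket[NBUCKETS];
};

/* Toy hash for illustration only: the sum of the key's bytes. */
static unsigned toy_hash(const char *key)
{
    unsigned h = 0;

    while (*key)
        h += (unsigned char)*key++;
    return h;
}

/* Prepend a caller-owned entry to its bucket's chain.
 * (No duplicate-key check; this sketch assumes distinct keys.) */
void table_insert(struct table *t, struct entry *e)
{
    unsigned slot = toy_hash(e->key) & (NBUCKETS - 1);

    e->next = t->bucket[slot];
    t->bucket[slot] = e;
}

/* Pick the bucket from the hash, then scan its chain linearly. */
struct entry *table_find(const struct table *t, const char *key)
{
    unsigned slot = toy_hash(key) & (NBUCKETS - 1);

    for (struct entry *e = t->bucket[slot]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/* Number of non-empty buckets: the "used" half of a used/total figure. */
int table_used_buckets(const struct table *t)
{
    int used = 0;

    for (int i = 0; i < NBUCKETS; i++)
        if (t->bucket[i] != NULL)
            used++;
    return used;
}
```

With this toy hash the keys "a" and "i" share a bucket (97 and 105 have the same low three bits), so inserting "a", "i" and "b" leaves three keys in two used buckets, and every lookup still finds the right entry.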
Or maybe hv.h provides the clue:
/* hash a key */
...
The "hash seed" feature was added in Perl 5.8.1
to perturb the results to avoid "algorithmic
complexity attacks".
--
Charles DeRykus
Nope. Firstly, the hashing appears to have changed *before* 5.8.1;
secondly, as of 5.8.2 (IIRC) the random-hash-seed behaviour only kicks
in on hashes that are actually under attack.
Ben
Seems strange the new hashing behavior - at least in the example
mentioned - is less distributive than 5.6.1. That is, the same 2 keys
now hash the same and get bucketed together whereas with 5.6.1 they
didn't. It just seems very counter-intuitive
that 2 different, single character keys would hash
to the same value in any case if the algorithm
bore any resemblance to the classic one:
int i = key_length;
unsigned int hash = 0;
char *s = key;
while (i--) { hash = hash * 33 + *s++; }
--
Charles DeRykus
5.6 used exactly that; 5.8 changed it to
char *s_PeRlHaSh_tmp = str;
unsigned char *s_PeRlHaSh = (unsigned char *)s_PeRlHaSh_tmp;
I32 i_PeRlHaSh = len;
U32 hash_PeRlHaSh = 0;
while (i_PeRlHaSh--) {
    hash_PeRlHaSh += *s_PeRlHaSh++;
    hash_PeRlHaSh += (hash_PeRlHaSh << 10);
    hash_PeRlHaSh ^= (hash_PeRlHaSh >> 6);
}
hash_PeRlHaSh += (hash_PeRlHaSh << 3);
hash_PeRlHaSh ^= (hash_PeRlHaSh >> 11);
(hash) = (hash_PeRlHaSh + (hash_PeRlHaSh << 15));
which apparently performs better on real-world data.
Ben
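To see what the two loops quoted in this thread do with the keys '1'..'5', here is a minimal C sketch that wraps each one in a function and counts the distinct buckets hit. The function names are invented for the illustration, and two things are assumptions rather than facts taken from the perl source: that the table has 8 buckets, and that the bucket index is simply the low three bits of the hash. Both loops start from a hash of 0, as in the code quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* The "times 33" loop quoted in the thread (said to be what 5.6 used). */
uint32_t hash_times33(const char *key, size_t len)
{
    uint32_t hash = 0;

    while (len--)
        hash = hash * 33 + (unsigned char)*key++;
    return hash;
}

/* The 5.8 loop quoted in the thread, with the macro locals renamed. */
uint32_t hash_one_at_a_time(const char *key, size_t len)
{
    const unsigned char *s = (const unsigned char *)key;
    uint32_t hash = 0;

    while (len--) {
        hash += *s++;
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    hash += hash << 3;
    hash ^= hash >> 11;
    return hash + (hash << 15);
}

typedef uint32_t (*hash_fn)(const char *, size_t);

/* Count the distinct buckets hit by the one-character keys '1' up to
 * ('0' + nkeys), for nkeys in 1..9.
 * ASSUMPTION: 8 buckets, indexed by the low three bits of the hash. */
int used_buckets(hash_fn fn, int nkeys)
{
    unsigned seen = 0;          /* bit i set => bucket i is occupied */
    int used = 0;

    for (int k = 1; k <= nkeys; k++) {
        char key = (char)('0' + k);
        seen |= 1u << (fn(&key, 1) & 7u);
    }
    for (int b = 0; b < 8; b++)
        used += (int)((seen >> b) & 1u);
    return used;
}
```

Under those assumptions the times-33 loop spreads '1'..'5' over five different buckets, while the 5.8 loop puts '1' and '4' in the same bucket, giving 1, 2, 3, 3, 4 used buckets for one to five keys. That would account for the [3/8] and [4/8] reported earlier in the thread.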