Unable to debug Perl script

Artemis Fowl

Aug 21, 2008, 7:08:47 AM

Hello all,

I face a peculiar problem when I try to debug my perl script. This
script is used to fetch values from an excel sheet and print it into
different files. I needed to debug this script. When I do try to
debug, I get this error.

"Bizarre copy of HASH in leave at excel_extract.pl line 114.
at excel_extract.pl line 114
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info."

I tried searching for help on the net. Couldn't find much information
about it. Seems like a peculiar error.
Does anyone know why this happens? It would be a lot of help if you
could shine some light on this :)

Warm Regards,
Artemis.

Peter Scott

Aug 21, 2008, 8:14:36 AM

On Thu, 21 Aug 2008 04:08:47 -0700, Artemis Fowl wrote:

> I face a peculiar problem when I try to debug my perl script. This
> script is used to fetch values from an excel sheet and print it into
> different files. I needed to debug this script. When I do try to
> debug, I get this error.
>
> "Bizarre copy of HASH in leave at excel_extract.pl line 114.
> at excel_extract.pl line 114

> I tried searching for help on the net. Couldn't find much information
> about it. Seems like a peculiar error.

This is due to a bug in some C or XS code somewhere, probably in a CPAN
module. First upgrade to the latest version of everything, including
perl. If you still get the error, reduce it to the shortest program you
can, preferably under 20 lines, and post it as a bug on rt.perl.org or
just email the author of the most excel-specific module your program uses.
It may not be their fault but you can't be expected to do more unless you
know how.

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/

smallpond

Aug 21, 2008, 11:15:00 AM

FWIW, google finds a short method of generating this error.

This is perl, v5.8.8 built for i386-linux-thread-multi

perl -Te '@{%h}{x}'
Bizarre copy of HASH in leave at -e line 1.

--S

Peter Scott

Aug 22, 2008, 12:44:45 PM

On Thu, 21 Aug 2008 11:15:00 -0400, smallpond wrote:
> FWIW, google finds a short method of generating this error.
>
> This is perl, v5.8.8 built for i386-linux-thread-multi
>
> perl -Te '@{%h}{x}'
> Bizarre copy of HASH in leave at -e line 1.

Elegant. Looks fixed in 5.10. Either change 27350 or 25808.

Ben Morrow

Aug 22, 2008, 1:32:56 PM

Quoth Peter Scott <Pe...@PSDT.com>:

> On Thu, 21 Aug 2008 11:15:00 -0400, smallpond wrote:
> > FWIW, google finds a short method of generating this error.
> >
> > This is perl, v5.8.8 built for i386-linux-thread-multi
> >
> > perl -Te '@{%h}{x}'
> > Bizarre copy of HASH in leave at -e line 1.
>
> Elegant. Looks fixed in 5.10. Either change 27350 or 25808.

I should perhaps point out that this doesn't mean what you might think,
and that in 5.10 its meaning has also been fixed.

~% perl5.8.8 -le'%h = qw/a b/; %{"1/8"} = qw/a c/; print @{%h}{a}'
b
~% perl5.10.0 -le'%h = qw/a b/; %{"1/8"} = qw/a c/; print @{%h}{a}'
c

Perl used to allow you to treat a hash or array as a reference to
itself; this was a bug, and has now been (partly) fixed. The way the
expression now evaluates is

Evaluate %h in scalar context -> '1/8'
Evaluate @{'1/8'}{a} as a symbolic ref

which is why the above gives 'c'. But since 5.8 and earlier incorrectly
sliced %h rather than %{'1/8'}, you can't rely on this. (It would be
stupid behaviour to rely on, in any case, since the exact value of a
hash in scalar context has never been guaranteed. The only formal
statement in the docs is that the value will be true iff the hash has
any elements.)

Under 'use strict' you get 'Can't use string ("1/8") as a HASH
reference' with perls at least as far back as 5.6.1, so this won't be a
problem in any normal code. If you want to slice %h, the correct syntax
is simply

@h{a}

Ben

--
"If a book is worth reading when you are six, * b...@morrow.me.uk
it is worth reading when you are sixty." [C.S.Lewis]

szr

Aug 22, 2008, 2:11:23 PM

Having read all this inspired me to run a few tests, and I found
something odd regarding allocation:

$ perl5.8.8 -Mstrict -we 'my %h; @h{1..1} = (1..100); print "[",
scalar %h, "]\n";'
[1/8]

$ perl5.8.8 -Mstrict -we 'my %h; @h{1..2} = (1..100); print "[",
scalar %h, "]\n";'
[2/8]

$ perl5.8.8 -Mstrict -we 'my %h; @h{1..3} = (1..100); print "[",
scalar %h, "]\n";'
[3/8]

$ perl5.8.8 -Mstrict -we 'my %h; @h{1..4} = (1..100); print "[",
scalar %h, "]\n";'
[3/8]

$ perl5.8.8 -Mstrict -we 'my %h; @h{1..5} = (1..100); print "[",
scalar %h, "]\n";'
[4/8]

I get the same using 5.10.0, 5.8.2, and 5.8.0. 5.6.1, however, shows
the fourth line as [4/8], and the 5th as [5/8], which is what I would
have expected. It seems Perl 5.8.0 and above sometimes incorrectly return
the number of used buckets, as in the fourth line, there are four
key-value pairs, but only 3 buckets.... how can this be?

--
szr

Ilya Zakharevich

Aug 22, 2008, 3:21:54 PM

[A complimentary Cc of this posting was sent to
szr
<sz...@szromanMO.comVE>], who wrote in article <g8mvg...@news4.newsguy.com>:

> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..4} = (1..100); print "[",
> scalar %h, "]\n";'
> [3/8]
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..5} = (1..100); print "[",
> scalar %h, "]\n";'
> [4/8]
>
>
> I get the same using 5.10.0, 5.8.2, and 5.8.0. 5.6.1, however, shows
> the fourth line as [4/8], and the 5th as [5/8],

So the hashing algorithm in 5.6.1 is slightly better (on this particular
codeset). [No surprise for me; I suspect I know who optimized it. ;-]

With randomized hashing, 5 people with 8 possible birth-weekdays would
have a quite large chance of a collision, 1 - 8*7*6*5*4 / 5^8 = 80%
(birthday paradox). So it is not surprising that what you got is a
collision.

> which is what I would have expected. It seems Perl 5.8.0 and above
> sometimes incorrectly return the number of used buckets, as in the
> fourth line, there are four key-value pairs, but only 3
> buckets.... how can this be?

Each bucket may keep an unlimited number of keys. [If you are lucky,
most buckets have only one key, and key lookup is quite quick.]

Hope this helps,
Ilya

Ilya Zakharevich

Aug 22, 2008, 3:24:28 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Ilya Zakharevich
<nospam...@ilyaz.org>], who wrote in article <g8n3ki$1mc$1...@agate.berkeley.edu>:

> With randomized hashing, 5 people with 8 possible birth-weekdays would
> have a quite large chance of a collision, 1 - 8*7*6*5*4 / 5^8 = 80%

                                                            ^^^
                                                            8^5

> (birthday paradox). So it is not surprising that what you got is a
> collision.

[The answer is AFAIK correct, only the expression was wrong...]

Sorry,
Ilya

Ben Morrow

Aug 22, 2008, 3:23:14 PM

Quoth "szr" <sz...@szromanMO.comVE>:

>
> Having read all this inspired me to run a few tests, and I found
> something odd regarding allocation:
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..1} = (1..100); print "[",
> scalar %h, "]\n";'
> [1/8]
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..2} = (1..100); print "[",
> scalar %h, "]\n";'
> [2/8]
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..3} = (1..100); print "[",
> scalar %h, "]\n";'
> [3/8]
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..4} = (1..100); print "[",
> scalar %h, "]\n";'
> [3/8]
>
> $ perl5.8.8 -Mstrict -we 'my %h; @h{1..5} = (1..100); print "[",
> scalar %h, "]\n";'
> [4/8]
>
> I get the same using 5.10.0, 5.8.2, and 5.8.0. 5.6.1, however, shows
> the fourth line as [4/8], and the 5th as [5/8], which is what I would
> have expected. It seems Perl 5.8.0 and above sometimes incorrectly return
> the number of used buckets, as in the fourth line, there are four
> key-value pairs, but only 3 buckets.... how can this be?

Learn how hash tables work. A 'bucket' isn't a key, but the set of keys
whose hash values map to the same slot in the table; once it has found
the bucket, perl does a linear scan through all the keys in it looking
for one that matches.
Obviously, for efficiency, you want this final linear scan to be as
short as possible; this is why it is important to use a hash function
that distributes the keys evenly between the buckets.

Presumably the hash function was tweaked in 5.8, and two of the strings
'1'..'5' now end up in the same bucket; I would expect that this was
done to make some real-world set of keys distribute better, but I don't
know.

Ben

--
The Earth is degenerating these days. Bribery and corruption abound.
Children no longer mind their parents, every man wants to write a book,
and it is evident that the end of the world is fast approaching.
Assyrian stone tablet, c.2800 BC b...@morrow.me.uk

comp.lang.c++

Aug 23, 2008, 8:44:26 AM

On Aug 22, 12:23 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth "szr" <sz...@szromanMO.comVE>:
>
> ...

> Presumably the hash function was tweaked in 5.8, and two of the strings
> '1'..'5' now end up in the same bucket; I would expect that this was
> done to make some real-world set of keys distribute better, but I don't
> know.
>

Or maybe hv.h provides the clue:

/* hash a key */
...
The "hash seed" feature was added in Perl 5.8.1
to perturb the results to avoid "algorithmic
complexity attacks".

--
Charles DeRykus

Ben Morrow

Aug 23, 2008, 4:06:25 PM

Quoth "comp.lang.c++" <c...@blv-sam-01.ca.boeing.com>:

Nope. Firstly, the hashing appears to have changed *before* 5.8.1;
secondly, as of 5.8.2 (IIRC) the random-hash-seed behaviour only kicks
in on hashes that are actually under attack.

Ben

--
Outside of a dog, a book is a man's best friend.
Inside of a dog, it's too dark to read.
b...@morrow.me.uk Groucho Marx

comp.lang.c++

Aug 24, 2008, 7:09:31 PM

On Aug 23, 1:06 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth "comp.lang.c++" <c...@blv-sam-01.ca.boeing.com>:
>
>
>
> > On Aug 22, 12:23 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> > > Quoth "szr" <sz...@szromanMO.comVE>:
>
> > > ...
> > > Presumably the hash function was tweaked in 5.8, and two of the strings
> > > '1'..'5' now end up in the same bucket; I would expect that this was
> > > done to make some real-world set of keys distribute better, but I don't
> > > know.
>
> > Or maybe hv.h provides the clue:
>
> > /* hash a key */
> > ...
> > The "hash seed" feature was added in Perl 5.8.1
> > to perturb the results to avoid "algorithmic
> > complexity attacks".
>
> Nope. Firstly, the hashing appears to have changed *before* 5.8.1;
> secondly, as of 5.8.2 (IIRC) the random-hash-seed behaviour only kicks
> in on hashes that are actually under attack.
>

It seems strange that the new hashing behavior - at least in the example
mentioned - is less distributive than 5.6.1. That is, the same 2 keys
now hash the same and get bucketed together whereas with 5.6.1 they
didn't. It just seems very counter-intuitive that 2 different,
single-character keys would hash to the same value in any case if the
algorithm bore any resemblance to the classic one:

int i = key_length;
unsigned int hash = 0;
char *s = key;
while (i--) { hash = hash * 33 + *s++; }

--
Charles DeRykus

Ben Morrow

Aug 24, 2008, 8:31:42 PM

5.6 used exactly that; 5.8 changed it to

    char *s_PeRlHaSh_tmp = str;
    unsigned char *s_PeRlHaSh = (unsigned char *)s_PeRlHaSh_tmp;
    I32 i_PeRlHaSh = len;
    U32 hash_PeRlHaSh = 0;
    while (i_PeRlHaSh--) {
        hash_PeRlHaSh += *s_PeRlHaSh++;
        hash_PeRlHaSh += (hash_PeRlHaSh << 10);
        hash_PeRlHaSh ^= (hash_PeRlHaSh >> 6);
    }
    hash_PeRlHaSh += (hash_PeRlHaSh << 3);
    hash_PeRlHaSh ^= (hash_PeRlHaSh >> 11);
    (hash) = (hash_PeRlHaSh + (hash_PeRlHaSh << 15));

which apparently performs better on real-world data.

Ben

--
The cosmos, at best, is like a rubbish heap scattered at random.
Heraclitus
b...@morrow.me.uk