
scalar(%tied_hash)


Ilya Zakharevich

Dec 3, 2003, 2:24:05 AM
to Rafael Garcia-Suarez, Mailing list Perl5
On Wed, Dec 03, 2003 at 02:23:49AM +0100, Rafael Garcia-Suarez wrote:
> > > Don't. Using a tied hash in scalar context will probably be a fatal
> > > error with perl 5.10.

> > This would mean that most people will need to stop upgrading...
> > Any code which uses hashes in scalar context will need to be wrapped
> > in eval{} etc...

> That's mostly right. I just sent a message to P5P about this.
> (are you subscribed to P5P ?)

No.

Actually, the return value of scalar(%tied) should rather be something
like "23/-1" if 23 == keys %tied. This -1 is not 0, but conveys
"meaningless" better than the format "23/23" I proposed before.

Or just return "-1/-1"...

Thanks,
Ilya

David Nicol

Dec 3, 2003, 1:47:51 PM
to Ilya Zakharevich, Rafael Garcia-Suarez, Mailing list Perl5

how about

join '/', scalar(keys %tied), tied %tied

On Wed, 2003-12-03 at 01:24, Ilya Zakharevich wrote:
>
> Actually, the return value of scalar(%tied) should be better something
> like "23/-1" if 23 == keys %tied. This -1 is not 0, but conveys
> "meaningless" better than the format "23/23" I proposed before.
>
> Or just return "-1/-1"...
>
> Thanks,
> Ilya

--
david nicol "I'll be working, working; but if you come visit I'll
put down what I'm doing: my friends are important" -- David Byrne

Michael G Schwern

Dec 3, 2003, 2:12:40 PM
to david nicol, Ilya Zakharevich, Rafael Garcia-Suarez, Mailing list Perl5
On Wed, 2003-12-03 at 01:24, Ilya Zakharevich wrote:
> Actually, the return value of scalar(%tied) should be better something
> like "23/-1" if 23 == keys %tied. This -1 is not 0, but conveys
> "meaningless" better than the format "23/23" I proposed before.
>
> Or just return "-1/-1"...

Since I doubt anyone uses the literal scalar hash return value for anything
but optimizing Perl's hashing algorithm, it really doesn't matter what a tied
hash returns as long as it's:

A) true if there's keys.
B) false if there's no keys.
C) Matches \d+/\d+.
D) cheap.

Calculating the number of keys in a tied hash is not always cheap, so I'd
suggest we just drop any requirement that SCALAR has to report an accurate
number of keys.


--
Michael G Schwern sch...@pobox.com http://www.pobox.com/~schwern/
Perl_croak(aTHX_ "Believe me, you don't want to use \"-u\" on a Macintosh");
-- toke.c

David Nicol

Dec 3, 2003, 2:59:36 PM
to Michael G Schwern, Ilya Zakharevich, Rafael Garcia-Suarez, Mailing list Perl5

what's the rationale for C? C prevents us from finding out,
in the case of a tied hash, what it is tied to. Am I the only
one who imagines that that would be useful information?

On Wed, 2003-12-03 at 13:12, Michael G Schwern wrote:
> On Wed, 2003-12-03 at 01:24, Ilya Zakharevich wrote:
> > Actually, the return value of scalar(%tied) should be better something
> > like "23/-1" if 23 == keys %tied. This -1 is not 0, but conveys
> > "meaningless" better than the format "23/23" I proposed before.
> >
> > Or just return "-1/-1"...
>
> Since I doubt anyone uses the literal scalar hash return value for anything
> but optimizing Perl's hashing algorithm, it really doesn't matter what a tied
> hash returns as long as its:
>
> A) true if there's keys.
> B) false if there's no keys.
> C) Matches \d+/\d+.
> D) cheap.
>
> Calculating the number of keys in a tied hash is not always cheap, so I'd
> suggest we just drop any requirement that SCALAR has to report an accurate
> number of keys.
--

david nicol
Where the hell did I put my coffee?

Ilya Zakharevich

Dec 3, 2003, 3:52:24 PM
to david nicol, Michael G Schwern, Rafael Garcia-Suarez, Mailing list Perl5
On Wed, Dec 03, 2003 at 01:59:36PM -0600, david nicol wrote:
> what's the rationale for C? C prevents us from finding out,
> in the case of a tied hash, what it is tied to. Am I the only
> one who imagines that that would be useful information?

Why not use tied()?

> > A) true if there's keys.
> > B) false if there's no keys.

Both needed. But maybe also apply the same to components of \d+/\d+
notation?

> > C) Matches \d+/\d+.

Nice to have too. Then something like "01/01" may be an answer, right?

> > D) cheap.

Something like doing one "new" each() call

> > Calculating the number of keys in a tied hash is not always cheap, so I'd
> > suggest we just drop any requirement that SCALAR has to report an accurate
> > number of keys.

This is why I suggested "-1/-1"...

Yours,
Ilya

Michael G Schwern

Dec 4, 2003, 4:26:08 PM
to david nicol, Ilya Zakharevich, Rafael Garcia-Suarez, Mailing list Perl5
On Wed, Dec 03, 2003 at 01:59:36PM -0600, david nicol wrote:
> what's the rationale for C?

Because it emulates what a normal hash returns. Tied hashes are supposed
to look like regular hashes. It is an edge case, but there's no reason
to screw up someone that's doing

($keys, $buckets) = split '/', scalar %hash;

just because %hash is tied.

A and B dominate. C is a nice-to-have.


> C prevents us from finding out,
> in the case of a tied hash, what it is tied to.

We have tied() for that!


> Am I the only one who imagines that that would be useful information?

In the scalar return value of a tied hash? Think so. :)

...let me think it over while Cheese beats you with a baseball bat.

Michael G Schwern

Dec 4, 2003, 4:36:43 PM
to Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Wed, Dec 03, 2003 at 12:52:24PM -0800, Ilya Zakharevich wrote:
> > > A) true if there's keys.
> > > B) false if there's no keys.
>
> Both needed. But maybe also apply the same to components of \d+/\d+
> notation?

I don't quite understand. To clarify, I'm saying the solution should have
the features A, B, C and D. A, B and D being most important. C being
fairly minor since very few people actually parse the scalar return value of
a hash.


> > > C) Matches \d+/\d+.
>
> Nice to have too. Then something like "01/01" may be an answer, right?

1/1 would be fine when there are keys and 0/1 when there's not. Returning
the correct number of keys would be nice, but not a requirement. So long
as A and B are met.


> > > D) cheap.
>
> Something like doing one "new" each() call

Using each() to determine if there are keys will call NEXTKEY incrementing
the key counter, so this will go wrong:

    while(($k,$v) = each %tied) {
        print scalar %tied;
    }

You'll skip every other key. :(
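
The skipping effect can be reproduced with a plain hash. This is a minimal sketch, not how perl implements scalar(%hash): the inner each() merely stands in for a hypothetical emptiness probe built on each(), and `count_visited` is a name made up for the illustration.

```perl
use strict;
use warnings;

# Count how many keys a while/each loop visits when the loop body also
# advances the same iterator. The inner each() plays the role of a
# hypothetical probe; both calls share the hash's single iterator.
sub count_visited {
    my (%h) = @_;
    my $visited = 0;
    while (my ($k, $v) = each %h) {
        $visited++;
        my @probe = each %h;    # the probe consumes the next key
    }
    return $visited;
}

print count_visited(a => 1, b => 2, c => 3, d => 4), "\n";   # prints 2
```

Only even key counts are used here on purpose: with an odd count the probe would hit the end of the hash and restart the iteration, so the loop would not terminate.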

There's also this odd edge case. Consider a tied hash with one key.

print scalar %tied;
print scalar %tied;

The first line will call FIRSTKEY and then NEXTKEY getting the one key in
the hash causing a correct return value of true. The next call will call
NEXTKEY and since there are no more keys to list it will return false causing
an incorrect return value of false. :(

I don't think there's any way we can supply a default SCALAR method without
messing up the key counter or supplying the wrong value.


> > > Calculating the number of keys in a tied hash is not always cheap, so I'd
> > > suggest we just drop any requirement that SCALAR has to report an accurate
> > > number of keys.
>
> This is why I suggested "-1/-1"...

The only problem I have with that is it doesn't quite match \d+/\d+. This
isn't a big deal, but it would be nice to keep the same format. It does
have the advantage of indicating "this key/bucket value is bogus".

Stupid am I? Stupid like a fox!

Ilya Zakharevich

Dec 4, 2003, 4:55:43 PM
to Michael G Schwern, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 01:36:43PM -0800, Michael G Schwern wrote:
> > > > A) true if there's keys.
> > > > B) false if there's no keys.
> >
> > Both needed. But maybe also apply the same to components of \d+/\d+
> > notation?
>
> I don't quite understand. To clarify, I'm saying the solution should have
> the features A, B, C and D. A, B and D being most important. C being
> fairly minor since very few people actually parse the scalar return value of
> a hash.

I say that this may be also useful:

> > > > C) Matches \d+/\d+.

C') ... and the components are TRUE (so != 0 if under restriction \d+)

And I think \d+ for components is too restrictive. I think we should
also discuss the variant -1/-1, which conveys "something fishy" better
than 1/1.

> > Nice to have too. Then something like "01/01" may be an answer, right?
>
> 1/1 would be fine when there are keys and 0/1 when there's not. Returning
> the correct number of keys would be nice, but not a requirement. So long
> as A and B are met.

I propose 01 instead of 1 to allow distinguishing the "fake" case from
the real one...

> There's also this odd edge case. Consider a tied hash with one key.
>
> print scalar %tied;
> print scalar %tied;
>
> The first line will call FIRSTKEY and then NEXTKEY getting the one key in
> the hash causing a correct return value of true. The next call will call
> NEXTKEY and since there are no more keys to list it will return false causing
> an incorrect return value of false. :(

I do not see how the existence of a wrong implementation should ruin
the idea. ;-). Of course, it should be FIRSTKEY which is called.

> I don't think there's any way we can supply a default SCALAR method without
> messing up the key counter or supplying the wrong value.

Of course there is.

If there is a key counter, then the hash is not empty.

If there is no key counter, create one; if it is created, the hash
is not empty.

Hope this helps,
Ilya

Michael G Schwern

Dec 4, 2003, 5:19:47 PM
to Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 01:55:43PM -0800, Ilya Zakharevich wrote:
> > > > > C) Matches \d+/\d+.
>
> C') ... and the components are TRUE (so != 0 if under restriction \d+)

Well, no. Given a/b: a is only true if the hash has keys, false otherwise.
It doesn't really matter what the value of b is.


> And I think \d+ for components is too restrictive. I think we should
> also discuss the variant -1/-1, which conveys "something fishy" better
> than 1/1.

A bucket value of -1 would be acceptable to me. Like I said, I doubt
anyone's doing this:

my($keys,$buckets) = scalar(%hash) =~ m{(\d+)/(\d+)};

so it's not too important that \d+/\d+ be strictly upheld.
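
To see how the candidate return values from this thread would fare against code that does parse the string, here is a small sketch. `parse_hash_scalar` is a hypothetical helper written for this illustration, and the anchored pattern is an assumption about what a strict reading of criterion C would mean.

```perl
use strict;
use warnings;

# Split a "number/number" string of the kind discussed as criterion C.
# Returns an empty list when the string does not match \d+/\d+ exactly.
sub parse_hash_scalar {
    my ($str) = @_;
    return $str =~ m{\A(\d+)/(\d+)\z} ? ($1, $2) : ();
}

for my $candidate ('36/64', '1/1', '01/01', '-1/-1', '23/-1', '0') {
    my @parts = parse_hash_scalar($candidate);
    printf "%-6s %s\n", $candidate,
        @parts ? "parses as (@parts)" : "does not parse";
}
```

Under that reading, "01/01" still parses while "-1/-1" and "23/-1" do not, which is the trade-off being weighed above.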


> > > Nice to have too. Then something like "01/01" may be an answer, right?
> >
> > 1/1 would be fine when there are keys and 0/1 when there's not. Returning
> > the correct number of keys would be nice, but not a requirement. So long
> > as A and B are met.
>
> I propose 01 instead of 1 to allow distinguishing the "fake" case from
> the real one...

A bucket value of -1 handles that, but again, the format isn't terribly
important to me.


> Of course there is.
>
> If there is a key counter, then the hash is not empty.
>
> If there is no key counter, create one; if it is created, the hash
> is not empty.

Maybe I'm not up on how each() is implemented with tied hashes. I thought
there was no key counter on the HV and it's all handled inside FIRSTKEY
and NEXTKEY.

Could you clarify 'key counter'? Do you mean xpvhv->xhv_riter and
xpvhv->xhv_eiter?

So what you're saying is SCALAR would...

Check if xpvhv->xhv_eiter exists. If so, return a true value
because we're in the middle of an iteration which means there's
keys.

Else call FIRSTKEY. If it returns true, reset the hash iterator
and return true. Otherwise return false.

Ingenious!
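
The two steps above can be modelled in pure Perl. This is only a sketch under stated assumptions: `IterAwareHash` is a hypothetical class, its `iterating` flag stands in for the role xhv_eiter plays inside the HV, and it relies on a perl that calls a SCALAR tie method for a tied hash in scalar context.

```perl
use strict;
use warnings;

package IterAwareHash;   # hypothetical class; models the idea, not perl's C code

sub TIEHASH { my ($class) = @_; return bless { data => {}, iterating => 0 }, $class }
sub STORE   { my ($s, $k, $v) = @_; $s->{data}{$k} = $v }
sub FETCH   { my ($s, $k) = @_; return $s->{data}{$k} }
sub EXISTS  { my ($s, $k) = @_; return exists $s->{data}{$k} }
sub DELETE  { my ($s, $k) = @_; return delete $s->{data}{$k} }
sub CLEAR   { my ($s) = @_; %{ $s->{data} } = (); $s->{iterating} = 0 }

sub FIRSTKEY {
    my ($s) = @_;
    keys %{ $s->{data} };                    # reset the underlying iterator
    my $k = each %{ $s->{data} };
    $s->{iterating} = defined $k ? 1 : 0;    # an iteration is now in progress
    return $k;
}

sub NEXTKEY {
    my ($s) = @_;
    my $k = each %{ $s->{data} };
    $s->{iterating} = 0 unless defined $k;   # iteration finished
    return $k;
}

sub SCALAR {
    my ($s) = @_;
    return 1 if $s->{iterating};             # mid-iteration: assume keys exist
    my $first = $s->FIRSTKEY;                # otherwise probe with FIRSTKEY
    $s->{iterating} = 0;                     # then undo the probe's side effects
    keys %{ $s->{data} };
    return defined $first ? 1 : 0;
}

package main;

tie my %h, 'IterAwareHash';
print "empty:   ", (scalar(%h) ? "true" : "false"), "\n";
$h{a} = 1;
print "one key: ", (scalar(%h) ? "true" : "false"), "\n";
```

The model only consults the flag and never checks whether keys were deleted after the iteration started, so it trusts that "mid-iteration" implies "non-empty".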

I need a SHOWER a BURGER and some ROBOTS, STAT!
-- http://www.angryflower.com/allrigh.gif

Ronald J Kimball

Dec 4, 2003, 5:17:08 PM
to Perl 5 Porters
On Thu, Dec 04, 2003 at 01:36:43PM -0800, Michael G Schwern wrote:
> On Wed, Dec 03, 2003 at 12:52:24PM -0800, Ilya Zakharevich wrote:
> > > > C) Matches \d+/\d+.
> >
> > Nice to have too. Then something like "01/01" may be an answer, right?
>
> 1/1 would be fine when there are keys and 0/1 when there's not. Returning
> the correct number of keys would be nice, but not a requirement. So long
> as A and B are met.

Er, when there are no keys, you should return 0, of course. "0/1" isn't
false. :)
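
Perl's truth rules make the point concrete: only the empty string and the string "0" (and values that stringify to them) are false, so any "a/b" string is true. A short sketch, with `truth` as a made-up helper name:

```perl
use strict;
use warnings;

# Report how Perl treats a value in boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

for my $value ('0', '0/1', '1/1', '01/01', '-1/-1') {
    printf "%-7s is %s\n", "'$value'", truth($value);
}
```

Only '0' comes out false here, which is why an empty tied hash has to return 0 rather than "0/1" to satisfy criterion B.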


As to the other suggestion, a tied hash in a scalar context simply can't
call FIRSTKEY. Otherwise you've completely broken the transparency with
non-tied hashes!


Ronald

Michael G Schwern

Dec 4, 2003, 7:59:07 PM
to Ronald J Kimball, Perl 5 Porters
On Thu, Dec 04, 2003 at 05:17:08PM -0500, Ronald J Kimball wrote:
> Er, when there are no keys, you should return 0, of course. "0/1" isn't
> false. :)

Oh. Right.


> As to the other suggestion, a tied hash in a scalar context simply can't
> call FIRSTKEY. Otherwise you've completely broken the transparency with
> non-tied hashes!

I don't see why.

My breasts are arousing weapons.

Ilya Zakharevich

Dec 4, 2003, 10:13:30 PM
to Michael G Schwern, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 02:19:47PM -0800, Michael G Schwern wrote:
> On Thu, Dec 04, 2003 at 01:55:43PM -0800, Ilya Zakharevich wrote:
> > > > > > C) Matches \d+/\d+.
> >
> > C') ... and the components are TRUE (so != 0 if under restriction \d+)
>
> Well, no. Given a/b: a is only true if the hash has keys, false otherwise.
> It doesn't really matter what the value of b is.

Well, your argument is incompatible with your

A) true if there's keys.
B) false if there's no keys.

So if one gets a/b, which is automatically TRUE, there are keys; thus
a must be TRUE as well. It is good to have b TRUE too.

> > Of course there is.
> >
> > If there is a key counter, then the hash is not empty.
> >
> > If there is no key counter, create one; if it is created, the hash
> > is not empty.
>
> Maybe I'm not up on how each() is implemented with tied hashes. I thought
> there was no key counter on the HV and its all handled inside FIRSTKEY
> and NEXTKEY.

How would it know then that it should call FIRSTKEY?

> Could you clarify 'key counter'? Do you mean xpvhv->xhv_riter and
> xpvhv->eiter?

I do not remember; what I said was based on common sense, not an
implementation. The HV must know when iteration through
each/keys/values has been started, and when it is finished.

> So what you're saying is SCALAR would...
>
> Check if xpvhv->xhv_eiter exists. If so, return a true value
> because we're in the middle of an iteration which means there's
> keys.
>
> Else call FIRSTKEY. If it returns true, reset the hash iterator
> and return true. Otherwise return false.
>
> Ingenious!

If xpvhv->xhv_eiter has the necessary semantic, then yes, this is what
I meant...

Yours,
Ilya

Michael G Schwern

Dec 5, 2003, 1:04:35 AM
to Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 07:13:30PM -0800, Ilya Zakharevich wrote:
> > > C') ... and the components are TRUE (so != 0 if under restriction \d+)
> >
> > Well, no. Given a/b: a is only true if the hash has keys, false otherwise.
> > It doesn't really matter what the value of b is.
>
> Well, your argument is uncompatible with your
>
> A) true if there's keys.
> B) false if there's no keys.
>
> So if one gets a/b, which is automatically TRUE, there are keys; thus
> a must be TRUE as well. It is good to have B TRUE too.

I'd forgotten that a hash returns 0 in scalar context if there's no keys.


> > Maybe I'm not up on how each() is implemented with tied hashes. I thought
> > there was no key counter on the HV and its all handled inside FIRSTKEY
> > and NEXTKEY.
>
> How would it know then that it should call FIRSTKEY?

Right.

I knew right away that my pants and your inner child could be best friends.

Tassilo Von Parseval

Dec 5, 2003, 3:18:44 AM
to Michael G Schwern, Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 10:04:35PM -0800 Michael G Schwern wrote:
> On Thu, Dec 04, 2003 at 07:13:30PM -0800, Ilya Zakharevich wrote:
> > > > C') ... and the components are TRUE (so != 0 if under restriction \d+)
> > >
> > > Well, no. Given a/b: a is only true if the hash has keys, false otherwise.
> > > It doesn't really matter what the value of b is.
> >
> > Well, your argument is uncompatible with your
> >
> > A) true if there's keys.
> > B) false if there's no keys.
> >
> > So if one gets a/b, which is automatically TRUE, there are keys; thus
> > a must be TRUE as well. It is good to have B TRUE too.
>
> I'd forgotten that a hash returns 0 in scalar context if there's no keys.

When following this thread, I come to think that maybe my proposal in
<http://www.mail-archive.com/perl5-...@perl.org/msg72484.html>
wasn't such a bad idea. It transfers all these considerations done in
this thread onto the user by letting him craft his own SCALAR method for
tied hashes.

Tassilo
--
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval

Rafael Garcia-Suarez

Dec 5, 2003, 3:33:03 AM
to tassilo....@post.rwth-aachen.de, sch...@pobox.com, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org
Tassilo von Parseval wrote:
>
> When following this thread, I come to think that maybe my proposal in
> <http://www.mail-archive.com/perl5-...@perl.org/msg72484.html>
> wasn't such a bad idea. It transfers all these considerations done in
> this thread onto the user by letting him craft his own SCALAR method for
> tied hashes.

I like your proposal, esp. if you add a default Tie::Hash::SCALAR method.
But I've already mentioned this. I think that adding a new tie method
is worthwhile in this case ; and SCALAR is not the kind of thing that
everyone will want to override anyway.

Tassilo Von Parseval

Dec 5, 2003, 3:49:19 AM
to Rafael Garcia-Suarez, sch...@pobox.com, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org

Thanks for the support. I'll complete and polish up my previous patch a
little so that it could theoretically be applied to blead.

Michael G Schwern

Dec 5, 2003, 6:33:31 AM
to Rafael Garcia-Suarez, tassilo....@post.rwth-aachen.de, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org

Well wait a second. The algorithm I outlined using FIRSTKEY and the
internal hash iterator values looks like it can give a tied hash the
proper scalar value, no user defined method necessary. If we can get tied
hashes to behave correctly in scalar context by default, I don't think
SCALAR is worthwhile. Who's really going to want to redefine the key/bucket
value of a hash?

"A Masterpiece."
"Well, better than average, maybe."

Tassilo Von Parseval

Dec 5, 2003, 7:35:12 AM
to Michael G Schwern, Rafael Garcia-Suarez, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org
On Fri, Dec 05, 2003 at 03:33:31AM -0800 Michael G Schwern wrote:
> On Fri, Dec 05, 2003 at 09:33:03AM +0100, Rafael Garcia-Suarez wrote:
> > > When following this thread, I come to think that maybe my proposal in
> > > <http://www.mail-archive.com/perl5-...@perl.org/msg72484.html>
> > > wasn't such a bad idea. It transfers all these considerations done in
> > > this thread onto the user by letting him craft his own SCALAR method for
> > > tied hashes.
> >
> > I like your proposal, esp. if you add a default Tie::Hash::SCALAR method.
> > But I've already mentioned this. I think that adding a new tie method
> > is worthwhile in this case ; and SCALAR is not the kind of thing that
> > everyone will want to override anyway.
>
> Well wait a second. The algorithm I outlined using FIRSTKEY and the
> internal hash iterator values looks like it can give a tied hash the
> proper scalar value, no user defined method necessary. If we can get tied
> hashes to behave correctly in scalar context by default, I don't think
> SCALAR is worthwhile. Who's really going to want to redefine the key/bucket
> value of a hash?

But you can't really make them behave correctly because you don't know
what the user implementing a tied hash considers correct. If I
understand your algorithm correctly you first check for the existence
of xpvhv->xhv_eiter. If it exists, return true, otherwise trigger
FIRSTKEY and return its value.

Considering that tying is about changing the behaviour of a data-type, I
think this is too limiting. With the above, there is either only false
or true returned. Secondly, you might end up triggering a method (namely
FIRSTKEY) anyway, so why not just trigger SCALAR in the first place?
The amount of work necessary to implement the xhv_eiter/FIRSTKEY trick is
around the same as the SCALAR approach.

However (and now it gets sophisticated:-), we could do both. If SCALAR
does not exist, fall back to your method. Of course, this is the most
work implementation-wise. But it would be fully backwards-compatible.

Whatever the eventual solution will be, the one I dislike most is a
solution where a user cannot control the behaviour of the tied hash in
scalar context. This violates the principle behind tied variables.

Michael G Schwern

Dec 5, 2003, 4:26:17 PM
to Rafael Garcia-Suarez, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org
On Fri, Dec 05, 2003 at 01:35:12PM +0100, Tassilo von Parseval wrote:
> However (and now it gets sophisticated:-), we could do both. If SCALAR
> does not exist, fallback to your method. Of course, this is the most
> work implementation-wise. But it would be fully backwards-compatible.

If we are to have SCALAR, this is the only acceptable way to do it. Hashes
in scalar context are so trivial a feature that it should just DWIM without
user intervention.

Monkey tennis

Tassilo Von Parseval

Dec 5, 2003, 6:38:24 PM
to Michael G Schwern, Rafael Garcia-Suarez, nospam...@ilyaz.org, what...@davidnicol.com, perl5-...@perl.org
On Fri, Dec 05, 2003 at 01:26:17PM -0800 Michael G Schwern wrote:
> On Fri, Dec 05, 2003 at 01:35:12PM +0100, Tassilo von Parseval wrote:
> > However (and now it gets sophisticated:-), we could do both. If SCALAR
> > does not exist, fallback to your method. Of course, this is the most
> > work implementation-wise. But it would be fully backwards-compatible.
>
> If we are to have SCALAR, this is the only acceptable way to do it. Hashes
> in scalar context are so trivial a feature that it should just DWIM without
> user intervention.

Alright, there's something we both agree on. Tomorrow I'll send a
modified patch that incorporates your method as a fallback into mine.

Yitzchak Scott-Thoennes

Dec 6, 2003, 9:59:58 PM
to Michael G Schwern, Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Thu, Dec 04, 2003 at 02:19:47PM -0800, Michael G Schwern <sch...@pobox.com> wrote:
> So what you're saying is SCALAR would...
>
> Check if xpvhv->xhv_eiter exists. If so, return a true value
> because we're in the middle of an iteration which means there's
> keys.
>
> Else call FIRSTKEY. If it returns true, reset the hash iterator
> and return true. Otherwise return false.
>
> Ingenious!

It is ingenious. The only flaw is that clearing the hash or deleting
the last element will leave xhv_eiter set. Obviously hv_clear can zero
out xhv_eiter, but I don't think catching the delete case is possible,
and even if it were, it isn't desirable. Consider code like this:

    while (my ($k,$v) = each %h) {

        # do some stuff

        delete $h{$k};

        if (%h) {
            # if there are more keys to come, do some other stuff
        }
    }

If xhv_eiter is 0, we can call FIRSTKEY; otherwise we should just croak.

Ilya has pointed out that we could just document that scalar(%hash)
*may* perturb the iterator. If we go that route, I'd rather make it
*always* clear the iterator even for regular hashes (which would break
code like that above, but at least provide consistency).

Tassilo Von Parseval

Dec 7, 2003, 3:23:54 AM
to Yitzchak Scott-Thoennes, Michael G Schwern, Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Sat, Dec 06, 2003 at 06:59:58PM -0800 Yitzchak Scott-Thoennes wrote:

> On Thu, Dec 04, 2003 at 02:19:47PM -0800, Michael G Schwern <sch...@pobox.com> wrote:
> > So what you're saying is SCALAR would...
> >
> > Check if xpvhv->xhv_eiter exists. If so, return a true value
> > because we're in the middle of an iteration which means there's
> > keys.
> >
> > Else call FIRSTKEY. If it returns true, reset the hash iterator
> > and return true. Otherwise return false.
> >
> > Ingenious!
>
> It is ingenious. The only flaw is that clearing the hash or deleting
> the last element will leave xhv_eiter set. Obviously hv_clear can zero
> out xhv_eiter, but I don't think catching the delete case is possible,
> and even if it were, it isn't desirable. Consider code like this:

This was the thing that made me tear my hair out when doing the patch.
It took me a while to figure out that xhv_eiter was not reset on
clearing the hash. This is now done in magic_wipepack().

The problem which I didn't think of was deleting key/value pairs until
there are none left. This is currently _not_ handled by my patch.

Thinking about the delete problem, I think this could be solved.
Pseudocode follows:

    SV*
    Perl_magic_isempty(pTHX_ SV *sv, MAGIC *mg)
    {
        HE* oldhe = HvEITER((HV*)sv);
        if (hv_iternext((HV*)sv)) {
            HvEITER((HV*)sv) = oldhe;
            return &PL_sv_no;
        }
        return &PL_sv_yes;
    }

I think saving and restoring the iterator would solve it, no?

> Ilya has pointed out that we could just document that scalar(%hash)
> *may* perturb the iterator. If we go that route, I'd rather make it
> *always* clear the iterator even for regular hashes (which would break
> code like that above, but at least provide consistency).

If the above algorithm doesn't work then we would have to document that
the default scalar operation may return a true value when the hash is in
fact empty (namely after many deletes). The iterator itself is not
touched by my patch.

Yitzchak Scott-Thoennes

Dec 7, 2003, 4:10:53 AM
to Michael G Schwern, Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 09:23:54AM +0100, Tassilo von Parseval <tassilo....@post.rwth-aachen.de> wrote:
> Thinking about the delete problem, I think this could be solved.
> Pseudocode follows:
>
>     SV*
>     Perl_magic_isempty(pTHX_ SV *sv, MAGIC *mg)
>     {
>         HE* oldhe = HvEITER((HV*)sv);
>         if (hv_iternext((HV*)sv)) {
>             HvEITER((HV*)sv) = oldhe;
>             return &PL_sv_no;
>         }
>         return &PL_sv_yes;
>     }
>
> I think saving and restoring the iterator would solve it, no?

No. For tied hashes, HvEITER is basically only going to determine
whether FIRSTKEY or NEXTKEY will be called for each(). Which element
gets returned by NEXTKEY is up to the tying class, and you perturb
that by calling hv_iternext. I think the patch you came up with already
is as good as it gets.

> If the above algorithm doesn't work then we would have to document that
> the default scalar operation may return a true value when the hash is in
> fact empty (namely after many deletes). The iterator itself is not
> touched by my patch.

You have to avoid messing with both the hv's iterator and the tied object's
internal iterator. Calling FIRSTKEY if and only if the hv's iterator is
NULL, as you do, should be safe.

I'd rather croak than return bad data, but I can see the counterarguments.

Tassilo Parseval

Dec 7, 2003, 6:43:52 AM
to Michael G Schwern, Ilya Zakharevich, david nicol, Rafael Garcia-Suarez, Mailing list Perl5, Yitzchak Scott-Thoennes
> On Sun, Dec 07, 2003 at 09:23:54AM +0100, Tassilo von Parseval <tassilo....@post.rwth-aachen.de> wrote:
> > Thinking about the delete problem, I think this could be solved.
> > Pseudocode follows:
> >
> >     SV*
> >     Perl_magic_isempty(pTHX_ SV *sv, MAGIC *mg)
> >     {
> >         HE* oldhe = HvEITER((HV*)sv);
> >         if (hv_iternext((HV*)sv)) {
> >             HvEITER((HV*)sv) = oldhe;
> >             return &PL_sv_no;
> >         }
> >         return &PL_sv_yes;
> >     }
> >
> > I think saving and restoring the iterator would solve it, no?
>
> No. For tied hashes, HvEITER is basically only going to determine
> whether FIRSTKEY or NEXTKEY will be called for each(). Which element
> gets returned by NEXTKEY is up to the tieing class, and you perturb
> that by calling hv_iternext. I think the patch you came up with already
> is as good as it gets.

Aww, too bad!



> > If the above algorithm doesn't work then we would have to document that
> > the default scalar operation may return a true value when the hash is in
> > fact empty (namely after many deletes). The iterator itself is not
> > touched by my patch.
>
> You have to avoid messing with both the hv's iterator and the tied object's
> internal iterator. Calling FIRSTKEY if and only if the hv's iterator is
> NULL, as you do, should be safe.

Now it's clear. Since tied hashes can be implemented in an arbitrary way
perl can't ever know about the internal iterator.



> I'd rather croak than return bad data, but I can see the counterarguments.

Especially since we cannot detect the rotten case so we'd always have to
croak when scalar on tied hashes is detected. The best thing IMO is to
just document this problem and advise people to always define a SCALAR
method when the scalar value of their tied hashes is supposed to have
any meaning. Patch to perltie.pod to be expected a little later today.
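
As an illustration of that advice, here is a minimal sketch of a tie class that supplies its own SCALAR method. It assumes a perl that supports the SCALAR tie method; `KeyedStore` is a hypothetical class name, and the "1/1" placeholder is just one of the formats floated in this thread, not a recommendation.

```perl
use strict;
use warnings;

package KeyedStore;            # hypothetical tie class for illustration
use Tie::Hash;                 # this module also defines Tie::StdHash
use parent -norequire, 'Tie::StdHash';

# Called when the tied hash is evaluated in scalar context.
# Tie::StdHash objects are blessed hash refs that hold the data directly,
# so emptiness can be checked without touching any iterator state.
sub SCALAR {
    my ($self) = @_;
    return %$self ? '1/1' : 0;
}

package main;

tie my %h, 'KeyedStore';
%h = (a => 1, b => 2);

while (my ($k, $v) = each %h) {
    delete $h{$k};             # delete the pair each() just returned
}

print "empty after deleting every key\n" unless scalar(%h);
```

Because the answer comes from the class's own data rather than from iterator bookkeeping, the delete-until-empty case reports false as expected.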

Tassilo

Yitzchak Scott-Thoennes

Dec 7, 2003, 1:05:30 PM
to tassilo....@post.rwth-aachen.de, Mailing list Perl5
On Sun, Dec 07, 2003 at 12:43:52PM +0100, tassilo....@post.rwth-aachen.de wrote:
> > I'd rather croak than return bad data, but I can see the counterarguments.
>
> Especially since we cannot detect the rotten case so we'd always have to
> croak when scalar on tied hashes is detected. The best thing IMO is to
> just document this problem and advise people to always define a SCALAR
> method when the scalar value of their tied hashes is supposed to have
> any meaning. Patch to perltie.pod to be expected a little later today.

The rotten case is just when HvEITER is true for a tied hash.

Ilya Zakharevich

Dec 7, 2003, 3:21:06 PM
to Yitzchak Scott-Thoennes, Michael G Schwern, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 09:23:54AM +0100, Tassilo von Parseval wrote:
> The problem which I didn't think of was deleting key/value pairs until
> there are none left. This is currently _not_ handled by my patch.
>
> Thinking about the delete problem, I think this could be solved.
> Pseudocode follows:

[broken approach deleted]

> I think saving and restoring the iterator would solve it, no?

[Already answered in another message.]

Did you not note another approach I outlined? As documented, each()
is not supported when mixed with write-access to hash. *Enforce* it
for the combination each(%hash)+modify+scalar(%hash):

Keep a flag; set it to true on FIRSTKEY/NEXTKEY; set it to false on
write access. If the flag is set and hv_iter (sp?) is present, reset
the iterator in the beginning of scalar(%hash).

Hope this helps,
Ilya

Tassilo Von Parseval

Dec 7, 2003, 4:22:35 PM
to Ilya Zakharevich, Yitzchak Scott-Thoennes, Michael G Schwern, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 12:21:06PM -0800 Ilya Zakharevich wrote:

> On Sun, Dec 07, 2003 at 09:23:54AM +0100, Tassilo von Parseval wrote:
> > The problem which I didn't think of was deleting key/value pairs until
> > there are none left. This is currently _not_ handled by my patch.
> >
> > Thinking about the delete problem, I think this could be solved.
> > Pseudocode follows:
>
> [broken approach deleted]
>
> > I think saving and restoring the iterator would solve it, no?
>
> [Already answered in another message.]
>
> Did you not notice the other approach I outlined? As documented, each()
> is not supported when mixed with write access to the hash. *Enforce* it
> for the combination each(%hash)+modify+scalar(%hash):

To get this straight: the iterator should be reset manually when a write
access (deleting or adding a key/value pair) happens inside an
iteration. Right so far?

This would be too strict, since there is one exception to the
no-delete-while-eaching rule: namely, when you delete the key/value pair
that was just returned by each().

> Keep a flag; set it to true on FIRSTKEY/NEXTKEY; set it to false on
> write access. If the flag is set and hv_iter (sp?) is present, reset
> the iterator at the beginning of scalar(%hash).

This would break with the example given in 'perldoc -f each', I think:

    while (($key, $value) = each %hash) {
        print $key, "\n";
        delete $hash{$key};   # This is safe
    }

Appears as though the closer one looks at scalared tied hashes, the more
pathological it gets. :-/
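
The exception quoted from 'perldoc -f each' can be tried on an ordinary (untied) hash with a small self-contained script. The variable names and the number of keys are illustrative assumptions. The script only shows the documented safe case for plain hashes, and says nothing about how a tie class's FIRSTKEY/NEXTKEY/DELETE methods cope with the same pattern.

```perl
use strict;
use warnings;

# Ten arbitrary keys; the values are irrelevant here.
my %hash = map { $_ => 1 } 1 .. 10;
my $seen = 0;

while (my ($key, $value) = each %hash) {
    $seen++;
    # Deleting the pair most recently returned by each() is the
    # documented exception to the no-modify-while-iterating rule.
    delete $hash{$key};
}

print "visited $seen keys, ", scalar(keys %hash), " left\n";
```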

David Nicol

Dec 7, 2003, 8:01:04 PM
to tassilo....@post.rwth-aachen.de, Ilya Zakharevich, Yitzchak Scott-Thoennes, Michael G Schwern, Rafael Garcia-Suarez, Mailing list Perl5

currently scalar %tied-hash returns "0" for all tied hashes.

What is so bad about this?

To get meaningful information requires access to tie-class internals

we seem to have consensus that SCALAR would be the method name

let's just document the current behavior

and support SCALAR in future releases (when it exists)

Yitzchak Scott-Thoennes

Dec 7, 2003, 8:10:35 PM
to david nicol, tassilo....@post.rwth-aachen.de, Ilya Zakharevich, Michael G Schwern, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 07:01:04PM -0600, david nicol <what...@davidnicol.com> wrote:
>
> currently scalar %tied-hash returns "0" for all tied hashes.

That's a common misconception.
$ perl -MTie::Hash -we'%h = 0..99; tie %h, "Tie::StdHash"; print scalar(%h)'
36/64



> What is so bad about this?

For starters, it's caused a number of bug reports.

> To get meaningful information requires access to tie-class internals
>
> we seem to have consensus that SCALAR would be the method name
>
> let's just document the current behavior
>
> and support SCALAR in future releases (when it exists)

But Tassilo has done a great job coming up with a fallback, and it's
been applied. Why get rid of it now?

Michael G Schwern

Dec 7, 2003, 8:14:15 PM
to david nicol, tassilo....@post.rwth-aachen.de, Ilya Zakharevich, Yitzchak Scott-Thoennes, Michael G Schwern, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 07:01:04PM -0600, david nicol wrote:
> currently scalar %tied-hash returns "0" for all tied hashes.
>
> What is so bad about this?

Because it breaks the hash API. Where have you been?


> To get meaningful information requires access to tie-class internals

Tassilo already has most of an implementation that can resolve the scalar
problem without making assumptions about the user's tie implementation.
He and Ilya are hashing out the edge cases.


> we seem to have consensus that SCALAR would be the method name
>
> let's just document the current behavior
>
> and support SCALAR in future releases (when it exists)

Tassilo already has a patch in for this. Keep up in back!

"The method employed I would gladly explain,
While I have it so clear in my head,
If I had but the time and you had but the brain--
But much yet remains to be said."
-- "Hunting of the Snark", Lewis Carroll

Ilya Zakharevich

Dec 8, 2003, 1:58:45 AM
to Yitzchak Scott-Thoennes, Michael G Schwern, david nicol, Rafael Garcia-Suarez, Mailing list Perl5
On Sun, Dec 07, 2003 at 10:22:35PM +0100, Tassilo von Parseval wrote:
> To get this straight, the iterator should manually be reset when write
> access (delete or adding a key/value pair) happens while being inside an
> iteration. Right so far?
>
> This would be too strict, since there is one exception to the
> no-delete-while-eaching rule: namely, when you delete the key/value pair
> that was just returned by each().

I did

perldoc perl

and can't find *any* place which would explain what a hash is and how
to deal with it. Well, there is one such place (perldata) which could
document what a hash is, but it looks like its topic is somewhat
different (after a quick glance I have no idea what questions this
document is supposed to answer).

So: if such an exception exists, I did not take it into consideration.
Back to the drawing board...

Sorry,
Ilya

