Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[ID 20020611.002] Bug when testing array returned by sort operator

1 view
Skip to first unread message

da...@whistlingcat.com

unread,
Jun 11, 2002, 4:43:14 AM6/11/02
to perl5-...@perl.org
This is a bug report for perl from dme...@whistlingcat.com,
generated with the help of perlbug 1.33 running under perl v5.6.1.


-----------------------------------------------------------------
[Please enter your report here]

I believe I have discovered a bug in Perl 5.6.1, which I first noticed when
upgrading Net::DNS from version 0.14 to version 0.22. The problem does not
appear to be in the Net::DNS module itself, but rather in the Perl core,
as I have been able to develop a standalone test case. The bug seems to be
associated with the sort operator when sorting an array which has been
returned from another operator and which has not been assigned to another
variable. The test case follows:

#!/usr/bin/perl

use strict;

{
my @foo = (
{ type => 'MX', preference => 100 },
{ type => 'MX', preference => 5 },
{ type => 'MX', preference => 10 }
);

if (Sub1(\@foo)) {
print "Test 1 OK\n";
}
else {
print "Test 1 FAILED\n";
}

my @tmp = Sub1(\@foo);
if (@tmp) {
print "Test 2 OK\n";
}

if (Sub2(\@foo)) {
print "Test 3 OK\n";
}

@tmp = Sub2(\@foo);
if (@tmp) {
print "Test 4 OK\n";
}
}

sub Sub1
{
my $array = shift;

return sort { $a->{preference} <=> $b->{preference} }
grep { $_->{type} eq 'MX' } @$array;
}

sub Sub2
{
my $array = shift;

my @items = grep { $_->{type} eq 'MX' } @$array;
my @sortedItems = sort { $a->{preference} <=> $b->{preference} } @items;
return @sortedItems;
}

Both of the subroutines in this script take an arrayref and extract some items
from the underlying array into another array, and then return a sorted version
of the intermediate array. The four tests should print OK, as each invocation
of either Sub1 or Sub2 should return a three-element array. However, on running
this script, the following output is obtained:

Test 1 FAILED
Test 2 OK
Test 3 OK
Test 4 OK


[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=high
---
Site configuration information for perl v5.6.1:

Configured by bhcompile at Wed Feb 20 15:00:12 EST 2002.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
Platform:
osname=linux, osvers=2.4.17-0.13smp, archname=i386-linux
uname='linux daffy.perf.redhat.com 2.4.17-0.13smp #1 smp fri feb 1 10:30:48 est 2002 i686 unknown '
config_args='-des -Doptimize=-O2 -march=i386 -mcpu=i686 -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dcccdlflags=-fPIC -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Uusethreads -Uuseithreads -Uuselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Di_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Dlocincpth='
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=undef usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
Compiler:
cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
optimize='-O2 -march=i386 -mcpu=i686',
cppflags='-fno-strict-aliasing -I/usr/local/include'
ccversion='', gccversion='2.96 20000731 (Red Hat Linux 7.1 2.96-98)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
alignbytes=4, usemymalloc=n, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -ldl -lm -lc -lcrypt -lutil
perllibs=-lnsl -ldl -lm -lc -lcrypt -lutil
libc=/lib/libc-2.2.4.so, so=so, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:

---
@INC for perl v5.6.1:
/usr/lib/perl5/5.6.1/i386-linux
/usr/lib/perl5/5.6.1
/usr/lib/perl5/site_perl/5.6.1/i386-linux
/usr/lib/perl5/site_perl/5.6.1
/usr/lib/perl5/site_perl/5.6.0/i386-linux
/usr/lib/perl5/site_perl/5.6.0
/usr/lib/perl5/site_perl
.

---
Environment for perl v5.6.1:
HOME=/home/david
LANG=en_US
LANGUAGE (unset)
LC_COLLATE=C
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/bin:/usr/bin:/usr/bin/X11:/usr/local/bin:/usr/games:/opt/gnome/bin:/usr/X11R6/bin:/var/qmail/bin:/usr/sbin:/home/david/bin:/home/david/bin
PERL_BADLANG (unset)
SHELL=/bin/bash

Abigail

unread,
Jun 11, 2002, 5:03:46 AM6/11/02
to da...@whistlingcat.com, perl5-...@perl.org
>
> Both of the subroutines in this script take an arrayref and extract some items
> from the underlying array into another array, and then return a sorted version
> of the intermediate array. The four tests should print OK, as each invocation
> of either Sub1 or Sub2 should return a three-element array.

No, it should not. Expressions in scalar context don't return lists,
they return scalars. 'sort' in scalar context doesn't actually do
anything, and returns undefined.

Your first invokation of Sub1 is in scalar context, hence the sort is
in scalar context.

Abigail

Michael G Schwern

unread,
Jun 11, 2002, 3:20:30 PM6/11/02
to Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Tue, Jun 11, 2002 at 02:03:46AM -0700, Abigail wrote:
> > Both of the subroutines in this script take an arrayref and extract some items
> > from the underlying array into another array, and then return a sorted version
> > of the intermediate array. The four tests should print OK, as each invocation
> > of either Sub1 or Sub2 should return a three-element array.
>
> No, it should not. Expressions in scalar context don't return lists,
> they return scalars. 'sort' in scalar context doesn't actually do
> anything, and returns undefined.

If this is good behavior, which I don't think it is, it isn't documented.

The more logical thing for sort in scalar context to do is skip the sort and
just return the number of elements it got passed in, just like you'd have
gotten if the sort wasn't there at all. This is much less surprising.

So if this returns 3:

$ perl -wle '@a = qw(c b a); print scalar @a'
3

why not this:

$ perl -wle '@a = qw(c b a); print scalar sort @a'
Use of uninitialized value in print at -e line 1.


More importantly, the user can't tell the difference if the sort is returned
from a subroutine, which is what bit David. You have to be careful about
how you use sort() in a subroutine

# dangerous
sub foo { my @a = qw(c b a); return sort @a; }

# safe
sub foo { my @a = qw(c b a); @a = sort @a; return @a; }

and there's really no reason it should be so.


The fix should be simple. pp_sort.c does this:

if (gimme != G_ARRAY) {
SP = MARK;
RETPUSHUNDEF;
}

and it's a small matter of changing that RETURNPUSHUNDEF with something that
returns the number of elements in the given list. I'll poke at it, but it
would make me happy if someone could beat me to it. :)


This is very old functionality, going back to at least 5.004 and probably
back further. However, since it is undocumented, not useful and actively
confusing and counter-intuitive, it would be best to change it to something
which at least DWIM.


--
This sig file temporarily out of order.

Arthur Bergman

unread,
Jun 11, 2002, 3:23:44 PM6/11/02
to Michael G Schwern, Abigail, da...@whistlingcat.com, perl5-...@perl.org

On tisdag, juni 11, 2002, at 09:20 , Michael G Schwern wrote:

If so, one could make the case that it should sort aswell.

my @foo = qw(1 2 3);
sub foo { sort @foo}
scalar foo();

?

Arthur

Ronald J Kimball

unread,
Jun 11, 2002, 4:31:32 PM6/11/02
to Michael G Schwern, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Tue, Jun 11, 2002 at 03:20:30PM -0400, Michael G Schwern wrote:
> On Tue, Jun 11, 2002 at 02:03:46AM -0700, Abigail wrote:
> > > Both of the subroutines in this script take an arrayref and extract some items
> > > from the underlying array into another array, and then return a sorted version
> > > of the intermediate array. The four tests should print OK, as each invocation
> > > of either Sub1 or Sub2 should return a three-element array.
> >
> > No, it should not. Expressions in scalar context don't return lists,
> > they return scalars. 'sort' in scalar context doesn't actually do
> > anything, and returns undefined.
>
> If this is good behavior, which I don't think it is, it isn't documented.
>
> The more logical thing for sort in scalar context to do is skip the sort and
> just return the number of elements it got passed in, just like you'd have
> gotten if the sort wasn't there at all. This is much less surprising.

I disagree with this suggestion. First, returning the number of elements
is not necessarily what would happen if the sort weren't there at all:

@x = sort('dog', 'cat', 'fish');
$x = ('dog', 'cat', 'fish');

Second, if one wants the behavior that occurs without the sort, then I
think one should omit the sort explicitly.

It would be just as logical, I think, for sort in scalar context to sort
and return the first element, or sort and return the last element, or sort
in place, or sort and return the number of comparisons made...

In this case, given all these possibilities, it's difficult to choose which
is actually the best option. Therefore, I think the appropriate solution
is to do nothing (except document this behavior and maybe add a warning
when sort is called in scalar context).

>
> So if this returns 3:
>
> $ perl -wle '@a = qw(c b a); print scalar @a'
> 3
>
> why not this:
>
> $ perl -wle '@a = qw(c b a); print scalar sort @a'
> Use of uninitialized value in print at -e line 1.
>
>
> More importantly, the user can't tell the difference if the sort is returned
> from a subroutine, which is what bit David. You have to be careful about
> how you use sort() in a subroutine
>
> # dangerous
> sub foo { my @a = qw(c b a); return sort @a; }
>
> # safe
> sub foo { my @a = qw(c b a); @a = sort @a; return @a; }
>
> and there's really no reason it should be so.

I don't think there's any reason it shouldn't be so. You have to be
careful about how you use any context-sensitive expression in a subroutine,
such as split(), map(), grep(), and subroutine calls.

We do have wantarray(), after all.


> The fix should be simple. pp_sort.c does this:
>
> if (gimme != G_ARRAY) {
> SP = MARK;
> RETPUSHUNDEF;
> }
>
> and it's a small matter of changing that RETURNPUSHUNDEF with something that
> returns the number of elements in the given list. I'll poke at it, but it
> would make me happy if someone could beat me to it. :)
>
>
> This is very old functionality, going back to at least 5.004 and probably
> back further. However, since it is undocumented, not useful and actively
> confusing and counter-intuitive, it would be best to change it to something
> which at least DWIM.

I'm just not sure returning the number of elements is useful or what most
people mean.

Ronald

Rafael Garcia-Suarez

unread,
Jun 11, 2002, 4:56:53 PM6/11/02
to Ronald J Kimball, Michael G Schwern, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On 2002.06.11 22:31 Ronald J Kimball wrote:
> I disagree with this suggestion. First, returning the number of elements
> is not necessarily what would happen if the sort weren't there at all:
>
> @x = sort('dog', 'cat', 'fish');
> $x = ('dog', 'cat', 'fish');

Note also that sort in scalar context warns with 5.8.0.

Michael G Schwern

unread,
Jun 11, 2002, 6:46:38 PM6/11/02
to Ronald J Kimball, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Tue, Jun 11, 2002 at 04:31:32PM -0400, Ronald J Kimball wrote:
> I disagree with this suggestion. First, returning the number of elements
> is not necessarily what would happen if the sort weren't there at all:
>
> @x = sort('dog', 'cat', 'fish');
> $x = ('dog', 'cat', 'fish');

Why would you sort a constant list? Is there any reasonable way to get this
effect?


> Second, if one wants the behavior that occurs without the sort, then I
> think one should omit the sort explicitly.

You have a subroutine which returns a sorted list. A user wants to know how
many elements are in that list. The obvious thing to do is:

sub foo { ... return sort @list }

# and elsewhere
print scalar foo();

and that doesn't DWIM. This is a problem. Removing the sort just causes
another problem: sure you can get the number of elements just fine, but now
foo() doesn't sort the list!


> It would be just as logical, I think, for sort in scalar context to sort
> and return the first element, or sort and return the last element, or sort
> in place, or sort and return the number of comparisons made...

I find these options to be dubious. Having the ability to do any of the
above somehow is nice, but making the leap that sort in scalar context
should do any of them... I don't see it.

sort() returning undef solves no problems, it's just there because we can't
agree. It in fact creates a surprise. Changing sort to return the number of
elements removes that surprise. Changing it to any of the above just adds
another neat, unexpected feature.


> > More importantly, the user can't tell the difference if the sort is returned
> > from a subroutine, which is what bit David. You have to be careful about
> > how you use sort() in a subroutine
> >
> > # dangerous
> > sub foo { my @a = qw(c b a); return sort @a; }
> >
> > # safe
> > sub foo { my @a = qw(c b a); @a = sort @a; return @a; }
> >
> > and there's really no reason it should be so.
>
> I don't think there's any reason it shouldn't be so. You have to be
> careful about how you use any context-sensitive expression in a subroutine,
> such as split(), map(), grep(), and subroutine calls.

What suprise am I missing here?

$ perl -wle 'sub foo { return map { $_ } ("dog", "cat", "fish") } $x = foo(); print $x'
3

$ perl -wle 'sub foo { return grep { $_ } ("dog", "cat", "fish") } $x = foo(); print $x'
3

$ perl -wle 'sub foo { return split / /, "dog cat fish" } $x = foo(); print $x'
3

$ perl -wle 'sub foo { return sort("dog", "cat", "fish") } $x = foo(); print $x'


Use of uninitialized value in print at -e line 1.


o/` One of these things is not like the other o/` :)

Michael G Schwern

unread,
Jun 11, 2002, 6:48:48 PM6/11/02
to Rafael Garcia-Suarez, Ronald J Kimball, Abigail, da...@whistlingcat.com, perl5-...@perl.org

It does for these simple cases:

$ perl5.8.0 -wle 'my $x = sort("dog", "cat", "fish")'
Useless use of sort in scalar context at -e line 1.

$ perl5.8.0 -wle 'my $x = 42 if sort("dog", "cat", "fish")'
Useless use of sort in scalar context at -e line 1.


but not for the problem case:

$ perl5.8.0 -wle 'sub foo { return sort("dog", "cat", "fish") } $x = foo(); print $x'


Use of uninitialized value in print at -e line 1.

Ronald J Kimball

unread,
Jun 11, 2002, 11:23:39 PM6/11/02
to Michael G Schwern, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Tue, Jun 11, 2002 at 06:46:38PM -0400, Michael G Schwern wrote:
> On Tue, Jun 11, 2002 at 04:31:32PM -0400, Ronald J Kimball wrote:
> > I disagree with this suggestion. First, returning the number of elements
> > is not necessarily what would happen if the sort weren't there at all:
> >
> > @x = sort('dog', 'cat', 'fish');
> > $x = ('dog', 'cat', 'fish');
>
> Why would you sort a constant list? Is there any reasonable way to get this
> effect?

Why would you call sort in a scalar context?

How about this, then:

@x = sort($m, $n, $o);
$x = ($m, $n, $o);

> > Second, if one wants the behavior that occurs without the sort, then I
> > think one should omit the sort explicitly.
>
> You have a subroutine which returns a sorted list. A user wants to know how
> many elements are in that list. The obvious thing to do is:
>
> sub foo { ... return sort @list }
>
> # and elsewhere
> print scalar foo();
>
> and that doesn't DWIM. This is a problem. Removing the sort just causes
> another problem: sure you can get the number of elements just fine, but now
> foo() doesn't sort the list!

This problem is properly solved by rewriting the subroutine to use
wantarray(), not by changing the behavior of sort. e.g.

sub foo { return wantarray ? sort @list : scalar(@list) }
# the use scalar() is redundant but useful for clarity


> > It would be just as logical, I think, for sort in scalar context to sort
> > and return the first element, or sort and return the last element, or sort
> > in place, or sort and return the number of comparisons made...
>
> I find these options to be dubious. Having the ability to do any of the
> above somehow is nice, but making the leap that sort in scalar context
> should do any of them... I don't see it.

Returning the first element seems like an obvious behavior, and other
people have, in fact, suggested it before. It would certainly be much more
useful than returning the number of elements in the list.


> sort() returning undef solves no problems, it's just there because we can't
> agree. It in fact creates a surprise. Changing sort to return the number of
> elements removes that surprise. Changing it to any of the above just adds
> another neat, unexpected feature.

We can't agree, and therefore changing the sort to return the number of
elements only removes the surprise for people who agree with your position.
:)

I don't think that having sort() return the number of elements solves a
problem. The problem is that calling sort in scalar context is a mistake.
The solution is to notify the user that a mistake has been made so that
they can correct it.


> > > More importantly, the user can't tell the difference if the sort is returned
> > > from a subroutine, which is what bit David. You have to be careful about
> > > how you use sort() in a subroutine
> > >
> > > # dangerous
> > > sub foo { my @a = qw(c b a); return sort @a; }
> > >
> > > # safe
> > > sub foo { my @a = qw(c b a); @a = sort @a; return @a; }
> > >
> > > and there's really no reason it should be so.
> >
> > I don't think there's any reason it shouldn't be so. You have to be
> > careful about how you use any context-sensitive expression in a subroutine,
> > such as split(), map(), grep(), and subroutine calls.
>
> What suprise am I missing here?
>
> $ perl -wle 'sub foo { return map { $_ } ("dog", "cat", "fish") } $x = foo(); print $x'
> 3
>
> $ perl -wle 'sub foo { return grep { $_ } ("dog", "cat", "fish") } $x = foo(); print $x'
> 3
>
> $ perl -wle 'sub foo { return split / /, "dog cat fish" } $x = foo(); print $x'
> 3
>
> $ perl -wle 'sub foo { return sort("dog", "cat", "fish") } $x = foo(); print $x'
> Use of uninitialized value in print at -e line 1.
>
>
> o/` One of these things is not like the other o/` :)

The behavior changes based on the context of the expression. If you don't
pay attention to the context, you will not get the behavior you expect.

Also note that the number returned grep, map, and split in scalar context
is often not the same as the number of elements in the argument list, as
opposed to your suggestion for sort.

Ronald

Michael G Schwern

unread,
Jun 12, 2002, 12:13:03 AM6/12/02
to Ronald J Kimball, Abigail, da...@whistlingcat.com, perl5-...@perl.org
I was going to do another point-by-point reply, but I think I've got the
crux of the problem here.

There's three distict issues, and I'm only worried about #3.


Issue #1: sort in boolean context.

if( sort @array ) { ... }

this is pretty straight-forward. It's useless, and 5.8.0 warns you as such.
Good.


Issue #2: sort in simple scalar context.

my $x = sort @array;

this could potentially do many things. Return undef, return the number of
elements, return the first element, the last element, etc... Currently, it
returns undef and warns you. That's an area ripe for discussion, but not
now.


Issue #3: sort in "variable" context.

sub foo { ... return sort @array }

The current behavior of this subroutine is surprising. Unlike similar
operators (map, grep, split, etc...), this will not work when foo() is used
in scalar context.

sub foo { ... return @array }
sub foo { ... return @array[1..4] }
sub foo { ... return map { ... } @array }
sub foo { ... return grep { ... } @array }
sub foo { ... return split / /, $var }

Those all will. And that's the surprise for the author and caller of foo().
No similar Perl function requires that you manually do the wantarray()
logic, why would you expect to have to do it with sort?

Now, is this a common problem? Here's a little straw poll of all the
modules I have installed which contain /return sort/.

CPAN
Apache::MP3::Sorted
Apache::MP3
ExtUtils::Installed
Debian::Defoma::Id
Net::LDAP::ASN
DBI
Xmms
Bio::DB::GFF::Feature
Bio::SeqFeature::Gene::Transcript
Bio::Tree::Node


$ perl -MDBI -le 'printf "You have %d DBI drivers\n", scalar DBI->available_drivers'
You have 0 DBI drivers

That's funny. I could have sworn I'd installed some DBD modules.


$ perl -MExtUtils::Installed -le '$i = ExtUtils::Installed->new; printf "I have %d modules installed\n", scalar $i->modules'
I have 0 modules installed

AHHH! Where'd all my modules go!!!! ;)

If Alan Burlison, Lincoln Stein, Andreas Koenig, Graham Barr and Tim Bunce
all get tripped by this, I think it's a problem.

Graham Barr

unread,
Jun 12, 2002, 5:09:41 AM6/12/02
to Michael G Schwern, Ronald J Kimball, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Wed, Jun 12, 2002 at 12:13:03AM -0400, Michael G Schwern wrote:
> I was going to do another point-by-point reply, but I think I've got the
> crux of the problem here.

If you ask a 3 year old "whats 1 + 1" and they reply 3, you say no its "2"
they are surprised. Does that mean that 2 is the wrong answer ? No, it
just means they need to learn.

> There's three distict issues, and I'm only worried about #3.
>
>
> Issue #1: sort in boolean context.
>
> if( sort @array ) { ... }

Why would you do that ?

> this is pretty straight-forward. It's useless, and 5.8.0 warns you as such.
> Good.
>
>
> Issue #2: sort in simple scalar context.
>
> my $x = sort @array;
>
> this could potentially do many things. Return undef, return the number of
> elements, return the first element, the last element, etc... Currently, it
> returns undef and warns you. That's an area ripe for discussion, but not
> now.
>
>
> Issue #3: sort in "variable" context.
>
> sub foo { ... return sort @array }
>
> The current behavior of this subroutine is surprising.

I fail to see why this should be surprising if the return value of foo()
is documented properly.

> Unlike similar
> operators (map, grep, split, etc...), this will not work when foo() is used
> in scalar context.

Define "work", what you really mean is that it will not do what YOU want
it to do.

>
> sub foo { ... return @array }
> sub foo { ... return @array[1..4] }
> sub foo { ... return map { ... } @array }
> sub foo { ... return grep { ... } @array }
> sub foo { ... return split / /, $var }
>
> Those all will. And that's the surprise for the author and caller of foo().

No, thats YOUR surprise. If you read the docs properly you will know what it would
return.

> No similar Perl function requires that you manually do the wantarray()
> logic, why would you expect to have to do it with sort?
>
> Now, is this a common problem? Here's a little straw poll of all the
> modules I have installed which contain /return sort/.
>
> CPAN
> Apache::MP3::Sorted
> Apache::MP3
> ExtUtils::Installed
> Debian::Defoma::Id
> Net::LDAP::ASN

Your pattern needs fixing as Net::LDAP::ASN does not call sort at all.

> DBI
> Xmms
> Bio::DB::GFF::Feature
> Bio::SeqFeature::Gene::Transcript
> Bio::Tree::Node
>
>
> $ perl -MDBI -le 'printf "You have %d DBI drivers\n", scalar DBI->available_drivers'
> You have 0 DBI drivers

One could just as easily say that that was a bug in available_drivers, but then if it is
not documented to work in a scalar context, why should it.

> That's funny. I could have sworn I'd installed some DBD modules.
>
>
> $ perl -MExtUtils::Installed -le '$i = ExtUtils::Installed->new; printf "I have %d modules installed\n", scalar $i->modules'
> I have 0 modules installed
>
> AHHH! Where'd all my modules go!!!! ;)
>
> If Alan Burlison, Lincoln Stein, Andreas Koenig, Graham Barr and Tim Bunce
> all get tripped by this, I think it's a problem.

Not me ? So please don't use my name as justifaction for a bad premise

Graham.

H.Merijn Brand

unread,
Jun 12, 2002, 5:28:22 AM6/12/02
to Graham Barr, Perl 5 Porters
On Wed 12 Jun 2002 11:09, Graham Barr <gb...@pobox.com> wrote:
> On Wed, Jun 12, 2002 at 12:13:03AM -0400, Michael G Schwern wrote:
> > I was going to do another point-by-point reply, but I think I've got the
> > crux of the problem here.
>
> If you ask a 3 year old "whats 1 + 1" and they reply 3, you say no its "2"
> they are surprised. Does that mean that 2 is the wrong answer ? No, it
> just means they need to learn.
>
> > There's three distict issues, and I'm only worried about #3.
> >
> >
> > Issue #1: sort in boolean context.
> >
> > if( sort @array ) { ... }
>
> Why would you do that ?

Undeliberately, as shown in Schwerns previous example

unless (DBI->available_drivers) die "No DBI drivers available";

> > this is pretty
straight-forward. It's useless, and 5.8.0 warns you as such.
> > Good.

--
H.Merijn Brand Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using perl-5.6.1, 5.8.0 & 632 on HP-UX 10.20 & 11.00, AIX 4.2, AIX 4.3,
WinNT 4, Win2K pro & WinCE 2.11. Smoking perl CORE: smo...@perl.org
http://archives.develooper.com/daily...@perl.org/ per...@perl.org
send smoke reports to: smokers...@perl.org, QA: http://qa.perl.org


Nick Ing-Simmons

unread,
Jun 12, 2002, 12:03:11 PM6/12/02
to sch...@pobox.com, da...@whistlingcat.com, perl5-...@perl.org, Abigail
Michael G Schwern <sch...@pobox.com> writes:
>On Tue, Jun 11, 2002 at 02:03:46AM -0700, Abigail wrote:
>> > Both of the subroutines in this script take an arrayref and extract some items
>> > from the underlying array into another array, and then return a sorted version
>> > of the intermediate array. The four tests should print OK, as each invocation
>> > of either Sub1 or Sub2 should return a three-element array.
>>
>> No, it should not. Expressions in scalar context don't return lists,
>> they return scalars. 'sort' in scalar context doesn't actually do
>> anything, and returns undefined.
>
>If this is good behavior, which I don't think it is, it isn't documented.
>
>The more logical thing for sort in scalar context to do is skip the sort and
>just return the number of elements it got passed in, just like you'd have
>gotten if the sort wasn't there at all. This is much less surprising.

It is supposed to be surprising. Most people expect

$foo = sort ...;

to return the largest/smallest entry so returning undef surprises
everyone ... a warning "sort in scalar context" would be fine too.

>
>This is very old functionality, going back to at least 5.004 and probably
>back further. However, since it is undocumented, not useful and actively
>confusing and counter-intuitive, it would be best to change it to something
>which at least DWIM.

There was considerable heat on "What It Meant" last time around.
Just returning the length seems like least interesting returning
one of the extreams would give it value but needs real code behind
it...

--
Nick Ing-Simmons
http://www.ni-s.u-net.com/

Michael G Schwern

unread,
Jun 12, 2002, 12:26:11 PM6/12/02
to Graham Barr, Ronald J Kimball, Abigail, da...@whistlingcat.com, perl5-...@perl.org
On Wed, Jun 12, 2002 at 10:09:41AM +0100, Graham Barr wrote:
> > Issue #1: sort in boolean context.
> >
> > if( sort @array ) { ... }
>
> Why would you do that ?

You wouldn't, that's why we warn about it. Like I said, not what I'm
worried about, just trying to clarify the issue by eliminating this case
from discussion. Perl's current behavior is ok.

> > this is pretty straight-forward. It's useless, and 5.8.0 warns you as such.
> > Good.

<snip>

> > Issue #3: sort in "variable" context.
> >
> > sub foo { ... return sort @array }
> >
> > The current behavior of this subroutine is surprising.
>
> I fail to see why this should be surprising if the return value of foo()
> is documented properly.

Because the author of foo() likely doesn't even know this behavior exists.
There's no warning, it's not documented and there's nothing else in Perl
that behaves like sort().

I'm quite certain both DBI->available_drivers and
ExtUtils::Installed->modules weren't intended to return undef in scalar
context. I've already hacked around it inside ExtUtils::Installed->modules.


> > Net::LDAP::ASN
>
> Your pattern needs fixing as Net::LDAP::ASN does not call sort at all.

You're right, I should have looked deeper into it.


> > $ perl -MDBI -le 'printf "You have %d DBI drivers\n", scalar DBI->available_drivers'
> > You have 0 DBI drivers
>
> One could just as easily say that that was a bug in available_drivers, but
> then if it is not documented to work in a scalar context, why should it.

Explicitly documenting in every module that each function which returns an
array also returns the number of elements in that array in scalar context is
redundant information, like documenting "This function returns a number
which can also be used as a string". That's expected, it's how Perl works
and how we expect well-behaved functions to work. You document the
exceptions.

Tim didn't sit down one day and think, "Hmm, I really should prevent anyone
from using DBI->available_drivers in scalar context."

Michael G Schwern

unread,
Jun 12, 2002, 12:36:21 PM6/12/02
to Nick Ing-Simmons, da...@whistlingcat.com, perl5-...@perl.org, Abigail
On Wed, Jun 12, 2002 at 05:03:11PM +0100, Nick Ing-Simmons wrote:
> >The more logical thing for sort in scalar context to do is skip the sort and
> >just return the number of elements it got passed in, just like you'd have
> >gotten if the sort wasn't there at all. This is much less surprising.
>
> It is supposed to be surprising. Most people expect
>
> $foo = sort ...;
>
> to return the largest/smallest entry so returning undef surprises
> everyone ... a warning "sort in scalar context" would be fine too.

We already have such a warning, but it only works for simple cases.

$ perl5.8.0 -wle 'my $foo = sort qw(1 2 3); print $foo'


Useless use of sort in scalar context at -e line 1.

Use of uninitialized value in print at -e line 1.

$ perl5.8.0 -wle 'sub foo { return sort @_ } my $foo = foo(qw(1 2 3)); print $foo'


Use of uninitialized value in print at -e line 1.

It's the second case which is the surprise. The problem being that when you
write "$foo = sort @array" it's obviously suspicious. But when you write
"return sort @array" it's so natural most people wouldn't think twice.

You certainly wouldn't expect to have to write it the "right" way

sub foo { return wantarray ? sort @_ : @_ }

would you?

So that's why I think having sort return the number of elements in scalar
context, even if it is the least interesting option. It makes returning a
sorted list more natural.

You can even leave the compile-time warning on for the simple case.

Larry Wall

unread,
Jun 12, 2002, 12:44:18 PM6/12/02
to Nick Ing-Simmons, sch...@pobox.com, da...@whistlingcat.com, perl5-...@perl.org, Abigail
On Wed, 12 Jun 2002, Nick Ing-Simmons wrote:
> There was considerable heat on "What It Meant" last time around.
> Just returning the length seems like least interesting returning
> one of the extreams would give it value but needs real code behind
> it...

I think returning an extreme ought to be explicit. Perl 6 breaks
the C comma operator, and never returns the last element of a list
implicitly. You have to use [-1] for that.

Larry

John P. Linderman

unread,
Jun 12, 2002, 12:55:51 PM6/12/02
to Larry Wall, Nick Ing-Simmons, sch...@pobox.com, da...@whistlingcat.com, perl5-...@perl.org, Abigail

Two votes (at least) for explicitness. Quicksort and merge sort
(and Abigail's favorite, heapsort) can save a lot of work if they
know that only the N largest or smallest elements are of interest.
So, tell them. -- jpl

Nick Ing-Simmons

unread,
Jun 12, 2002, 2:39:24 PM6/12/02
to sch...@pobox.com, Graham Barr, Abigail, da...@whistlingcat.com, perl5-...@perl.org, Ronald J Kimball
Michael G Schwern <sch...@pobox.com> writes:
>
>Because the author of foo() likely doesn't even know this behavior exists.
>There's no warning, it's not documented and there's nothing else in Perl
>that behaves like sort().

Presumably we could warn if sort was child of a 'return' or last statement
of a sub ?

Nick Ing-Simmons

unread,
Jun 12, 2002, 2:57:24 PM6/12/02
to j...@research.att.com, Abigail, Nick Ing-Simmons, da...@whistlingcat.com, sch...@pobox.com, perl5-...@perl.org, Larry Wall
John P. Linderman <j...@research.att.com> writes:
>Larry said:
>> On Wed, 12 Jun 2002, Nick Ing-Simmons wrote:
>> > There was considerable heat on "What It Meant" last time around.
>> > Just returning the length seems like least interesting returning
>> > one of the extreams would give it value but needs real code behind
>> > it...
>>
>> I think returning an extreme ought to be explicit. Perl 6 breaks
>> the C comma operator, and never returns the last element of a list
>> implicitly. You have to use [-1] for that.
>
>Two votes (at least) for explicitness.

But Larry's vote carries near-infinite weight...

>Quicksort and merge sort
>(and Abigail's favorite, heapsort) can save a lot of work if they
>know that only the N largest or smallest elements are of interest.
>So, tell them. -- jpl

But spotting

my ($n) = sort ...
or
my ($p,q) = (sort ..)[-2,-1];

Is much harder to do than the scalar/list context.

But consider me convinced by the explicit argument - we
presumably should invent max() and min() if that is what we mean.

Michael G Schwern

unread,
Jun 12, 2002, 3:25:01 PM6/12/02
to Nick Ing-Simmons, j...@research.att.com, Abigail, Nick Ing-Simmons, da...@whistlingcat.com, perl5-...@perl.org, Larry Wall
On Wed, Jun 12, 2002 at 07:57:24PM +0100, Nick Ing-Simmons wrote:
> But consider me convinced by the explicit argument - we
> presumably should invent max() and min() if that is what we mean.

use List::Util qw(max min maxstr minstr);

Though it is unfortunate that max/min aren't generic. ie.

my $max = max { your sort criteria } @array;

but I think my nit-picking account is overdrawn. :)

Benjamin Goldberg

unread,
Jun 12, 2002, 8:17:08 PM6/12/02
to Nick Ing-Simmons, j...@research.att.com, Abigail, Nick Ing-Simmons, da...@whistlingcat.com, sch...@pobox.com, perl5-...@perl.org, Larry Wall
Nick Ing-Simmons wrote:
>
> John P. Linderman <j...@research.att.com> writes:
> >Larry said:
> >> On Wed, 12 Jun 2002, Nick Ing-Simmons wrote:
> >> > There was considerable heat on "What It Meant" last time around.
> >> > Just returning the length seems like least interesting returning
> >> > one of the extreams would give it value but needs real code behind
> >> > it...
> >>
> >> I think returning an extreme ought to be explicit. Perl 6 breaks
> >> the C comma operator, and never returns the last element of a list
> >> implicitly. You have to use [-1] for that.
> >
> >Two votes (at least) for explicitness.
>
> But Larry's vote carries near-infinite weight...
>
> >Quicksort and merge sort
> >(and Abigail's favorite, heapsort) can save a lot of work if they
> >know that only the N largest or smallest elements are of interest.
> >So, tell them. -- jpl
>
> But spotting
>
> my ($n) = sort ...
> or
> my ($p,q) = (sort ..)[-2,-1];
>
> Is much harder to do than the scalar/list context.

Isn't Want::want() able to do this? Or at least some of it?

> But consider me convinced by the explicit argument - we
> presumably should invent max() and min() if that is what we mean.

Surely you mean, invent max_n and min_n, which would return the n
largest or n smallest elements.

my ($p, $q) = max_n { $a <=> $b } 2, @list;

--
tr/`4/ /d, print "@{[map --$| ? ucfirst lc : lc, split]},\n" for
pack 'u', pack 'H*', 'ab5cf4021bafd28972030972b00a218eb9720000';

John P. Linderman

unread,
Jun 13, 2002, 8:35:17 AM6/13/02
to perl5-...@perl.org
Benjamin Goldberg chimed in:

That's what I had in mind. But we want to be careful about the interface.
One the one hand, there are worthwhile savings to be had if we don't
guarantee the order in which the elements are returned. For example,
in a qsort-like implementation, having completed the initial partitioning
with l low elements, e equal elements, and g greater elements, if we are
after the m minimal elements, if l <= m, there's no need to continue
sorting the low elements. One the other hand, if the user *does*
want the elements sorted, it would be wasteful not to continue
with the sorting on the spot, and a Very Bad Thing to force the user
to restate the comparison algorithm to impose the order.

(A qsort-like implementation runs the risk of the same quadratic
behavior seen in sorting, and two minutes' thought hasn't turned
up a way to get any advantage from *not* ordering the elements
using mergesort, so maybe the order ought to be guaranteed.
But, like sort(), stability may or may not matter to the end user,
so I favor letting the users explicitly state what matters,
rather than ham-stringing implementations. Pragma?) -- jpl

Nick Ing-Simmons

unread,
Jun 13, 2002, 8:37:11 AM6/13/02
to gol...@earthlink.net, Nick Ing-Simmons, sch...@pobox.com, da...@whistlingcat.com, Nick Ing-Simmons, perl5-...@perl.org, Larry Wall, Abigail, j...@research.att.com
Benjamin Goldberg <gol...@earthlink.net> writes:
>> But consider me convinced by the explicit argument - we
>> presumably should invent max() and min() if that is what we mean.
>
>Surely you mean, invent max_n and min_n, which would return the n
>largest or n smallest elements.
>
>my ($p, $q) = max_n { $a <=> $b } 2, @list;

I don't want that - it is too messy.

Let perl6 solve the general problem.

Graham Barr

unread,
Jun 13, 2002, 10:34:38 AM6/13/02
to Michael G Schwern, Nick Ing-Simmons, j...@research.att.com, Abigail, Nick Ing-Simmons, da...@whistlingcat.com, perl5-...@perl.org, Larry Wall

I agree its unfortunate, but the reason is because prototypes
don't allow me to make the code block optional and I did not
want the code block for the case of min/max numeric

Graham.

0 new messages