Trying to avoid passing params to subs through globals

Justin C

unread,

Dec 10, 2009, 7:56:58 AM12/10/09

to

Further to me recent problem of unblessed references, I'm re-writing my
code, putting more of it into subroutines so that the main thread is
more obvious. I'm finding that where I was using global variables there
appear to be a lot of variables that need passing to some sub-routines.

I'm thinking of creating a few hashes of references to these variables,
and then passing the whole hash (or a reference to it) to the
subroutine.

my %excel = (
"worksheet" => \$ws,
"workboot" => \$wb,
"format_1" => \$format_1,
"format_2" => \$format_2,
);

my %document = (
"pdf" => \$pdf,
"table" => \$table,
);

Firstly, if these are defined near the begining of the program they look
just like globals to me! What's the difference? OK, so I can just
declare "my %excel =();" and then have a subroutine(s) populate it on an
ad-hoc basis, that, I suppose, looks less global.

To avoid a whole bunch of vars being global I can see that I'm going to
have to nest a lot of sub-routines so that I can keep scope minimal.

The other thing is, if $ws, for example, was returned as a reference (I
think I really need to see where I should return references, and where I
should return the actual thing... if it's an object return the thing, if
it's scalar, array or hash return a ref? maybe that's too crude) I
surely shouldn't be creating a reference to it (in the hash
assignement), so it should be "worksheet" => $ws" instead of "\$ws",
shouldn't it?

Sorry if the above sounds a bit random, it's hitting the keyboard as
it's coming to mind, and I shouldn't really be doing this now, I'm
supposed to be working... though the work I'm supposed to be doing is
getting this program to do what we want - it's a little chicken and egg
here at the moment.

Justin.

--
Justin C, by the sea.

Tad McClellan

unread,

Dec 10, 2009, 9:51:15 AM12/10/09

to

Justin C <justi...@purestblue.com> wrote:

> I'm thinking of creating a few hashes of references to these variables,
> and then passing the whole hash (or a reference to it) to the
> subroutine.
>
> my %excel = (
> "worksheet" => \$ws,
> "workboot" => \$wb,
> "format_1" => \$format_1,
> "format_2" => \$format_2,
> );

You have 5 variables to keep track of there.

Consider using a hash _instead_ of the individual variables:

my %excel = (
ws => function_that_returns_worksheet_object(),
wb => function_that_returns_workbook_object(),
...
);

Then you would have only one variable (the hash) to keep track of.

> The other thing is, if $ws, for example, was returned as a reference (I
> think I really need to see where I should return references, and where I
> should return the actual thing...

If it is "small" return the actual thing, returning a reference instead
might be a good idea for performance reasons (rather than maintainablility
reasons) if the thing is "large".

> if it's an object return the thing, if
> it's scalar, array or hash return a ref?

Scalar references are not needed very often. At this point you
should try pretending that they do not even exist.

Note that you cannot return an array or a hash "thing", you can only
return a list or a scalar.

> surely shouldn't be creating a reference to it (in the hash
> assignement), so it should be "worksheet" => $ws" instead of "\$ws",
> shouldn't it?

yes, but then there would be no need for the $ws variable at all.

whereever you have "$ws" in your code, replace it with "$excel{worksheet}".

--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"

Randal L. Schwartz

unread,

Dec 10, 2009, 2:17:27 PM12/10/09

to

>>>>> "Justin" == Justin C <justi...@purestblue.com> writes:

Justin> Further to me recent problem of unblessed references, I'm re-writing my
Justin> code, putting more of it into subroutines so that the main thread is
Justin> more obvious. I'm finding that where I was using global variables there
Justin> appear to be a lot of variables that need passing to some sub-routines.

Sounds like related state correlated with related behavior.

That's called an "object".

Might wanna look into that. :)

If you need, I can recommend a good book (or two :).

print "Just another Perl hacker,"; # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion

Ben Morrow

unread,

Dec 10, 2009, 5:15:59 PM12/10/09

to

Quoth Justin C <justi...@purestblue.com>:

>
> Further to me recent problem of unblessed references, I'm re-writing my
> code, putting more of it into subroutines so that the main thread is
> more obvious. I'm finding that where I was using global variables there
> appear to be a lot of variables that need passing to some sub-routines.
>
> I'm thinking of creating a few hashes of references to these variables,
> and then passing the whole hash (or a reference to it) to the
> subroutine.
>
> my %excel = (
> "worksheet" => \$ws,
> "workboot" => \$wb,
> "format_1" => \$format_1,
> "format_2" => \$format_2,
> );

As Tad said, at this point the extra variables aren't doing anything for
you. Just use the hash entries, so instead of

my $ws;

create_worksheet();
populate_worksheet();

sub create_worksheet {
# this is just an example, of course
$ws = Worksheet->new;
}

sub populate_worksheet {
$ws->add_cell("A1", "foo");
}

you might have

my %excel;

create_worksheet(\%excel);
populate_worksheet(\%excel);

sub create_worksheet {
my ($excel) = @_;
$excel{ws} = Worksheet->new;
}

sub populate_worksheet {
my ($excel) = @_;
$excel{ws}->add_cell("A1", "foo");
}

This is still less clear than it might be, since you're passing %excel
by reference and then modifying it. It's better than before, since at
least you can see that's what's going on, but it's not clear from the
call that create_worksheet adds an entry but populate_worksheet modifies
a property of an existing entry. The 'correct' way to write this would
be

my $ws = create_worksheet();
populate_worksheet($ws);

sub create_worksheet {
return Worksheet->new;
}

sub populate_worksheet {
my ($ws) = @_;
$ws->add_cell("A1", "foo");
}

The important difference here is that each sub is only passed the data
it needs to do its job (and is passed *all* that data), so you can see
from the call what is going on.

> my %document = (
> "pdf" => \$pdf,
> "table" => \$table,
> );
>
> Firstly, if these are defined near the begining of the program they look
> just like globals to me! What's the difference? OK, so I can just
> declare "my %excel =();" and then have a subroutine(s) populate it on an
> ad-hoc basis, that, I suppose, looks less global.

One way to enforce discipline (until you get into the habit of doing
things right) is to put the 'main body' of your script into a 'sub
main', and then have a call to 'main()' as the only statement outside a
subroutine. That way you won't find yourself declaring variables a
globals when they could actually just be locals in 'main'.

> To avoid a whole bunch of vars being global I can see that I'm going to
> have to nest a lot of sub-routines so that I can keep scope minimal.

I'm not sure what you mean by 'nest subroutines'. If you mean putting
one named sub inside another named sub, like this

sub foo {
sub bar { ... }
}

then this is always a mistake (anon subs are a different matter, but I
don't think you're talking about that). As a general rule, you should be
passing subs the data they need to work on as parameters, and passing
the results back as the return value.

> The other thing is, if $ws, for example, was returned as a reference (I
> think I really need to see where I should return references, and where I
> should return the actual thing... if it's an object return the thing, if
> it's scalar, array or hash return a ref? maybe that's too crude)

Objects are refs already, remember, so returning an object *is*
returning a ref. (Strictly speaking, what you get back from a
constructor like ->new isn't the object itself, but a reference to the
object.)

Returning an explicit ref to a local variable like $ws is
rather rare. About the only usual case is when you've built a complex
data structure and you return a ref to avoid needing to take it apart
into a list and put it back together again in the caller.

> I
> surely shouldn't be creating a reference to it (in the hash
> assignement), so it should be "worksheet" => $ws" instead of "\$ws",
> shouldn't it?

Err... it's not clear why you think this might be necessary, but
populating a hash with refs-to-refs-to-objects would be a little
peculiar.

> Sorry if the above sounds a bit random, it's hitting the keyboard as
> it's coming to mind, and I shouldn't really be doing this now, I'm
> supposed to be working... though the work I'm supposed to be doing is
> getting this program to do what we want - it's a little chicken and egg
> here at the moment.

The compromise between 'learning to program better, which will save time
in the future' and 'solving the immediate problem in front of me right
now' not always easy to find. In your case I would say that working on
the fundamentals is worth it, at this point.

Ben

Tad McClellan

unread,

Dec 10, 2009, 5:38:19 PM12/10/09

to

Ben Morrow <b...@morrow.me.uk> wrote:

> Just use the hash entries, so instead of
>
> my $ws;
>
> create_worksheet();
> populate_worksheet();
>
> sub create_worksheet {
> # this is just an example, of course
> $ws = Worksheet->new;
> }
>
> sub populate_worksheet {
> $ws->add_cell("A1", "foo");
> }
>
> you might have
>
> my %excel;
>
> create_worksheet(\%excel);
> populate_worksheet(\%excel);
>
> sub create_worksheet {
> my ($excel) = @_;
> $excel{ws} = Worksheet->new;

Oops!

We are modifying %excel and not using the $excel parameter for anything...

$excel->{ws} = Worksheet->new;
^^
^^ I'm pretty sure this is what Ben meant to write...

> }

Which leads to one of _my_ style points: don't use similar-looking names.

If we had followed that style rule, and made the same mistake:

my $excel; # a "long" form of the name in this big scope

create_worksheet(\%excel);
populate_worksheet(\%excel);

sub create_worksheet {
my ($xl) = @_; # a "short" form of the name in this little scope
$xl{ws} = Worksheet->new;
}

then perl would have found the mistake for us (assuming "use strict").

Ben Morrow

unread,

Dec 10, 2009, 6:45:22 PM12/10/09

to

Quoth Tad McClellan <ta...@seesig.invalid>:

> Ben Morrow <b...@morrow.me.uk> wrote:
>
> > you might have
> >
> > my %excel;
> >
> > create_worksheet(\%excel);
> > populate_worksheet(\%excel);
> >
> > sub create_worksheet {
> > my ($excel) = @_;
> > $excel{ws} = Worksheet->new;
>
>
> Oops!
>
> We are modifying %excel and not using the $excel parameter for anything...
>
> $excel->{ws} = Worksheet->new;
> ^^
> ^^ I'm pretty sure this is what Ben meant to write...

Yes. Thank you.

> > }
>
> Which leads to one of _my_ style points: don't use similar-looking names.
>
> If we had followed that style rule, and made the same mistake:
>
> my $excel; # a "long" form of the name in this big scope
>
> create_worksheet(\%excel);
> populate_worksheet(\%excel);
>
> sub create_worksheet {
> my ($xl) = @_; # a "short" form of the name in this little scope
> $xl{ws} = Worksheet->new;
> }
>
> then perl would have found the mistake for us (assuming "use strict").

I tend to use names with initial caps for file globals, where they are
necessary, for exactly this reason. It's worth pointing out at this
point that had %excel been minimally-scoped to start with 'strict' would
also have caught the error:

main();

sub main {
my %excel;
create_worksheet(\%excel);
}

sub create_worksheet {
my ($excel) = @_;

$excel{ws} = ...; # %excel is not in scope here
}

Of course, a block will do just as well as a sub, if you're certain it
will only ever be called once.

Ben

Justin C

unread,

Dec 13, 2009, 6:43:58 PM12/13/09

to

In article <slrnhi22gu...@tadbox.sbcglobal.net>, Tad McClellan wrote:
> Justin C <justi...@purestblue.com> wrote:
>
>> I'm thinking of creating a few hashes of references to these variables,
>> and then passing the whole hash (or a reference to it) to the
>> subroutine.
>>
>> my %excel = (
>> "worksheet" => \$ws,
>> "workboot" => \$wb,
>> "format_1" => \$format_1,
>> "format_2" => \$format_2,
>> );
>
>
> You have 5 variables to keep track of there.
>
> Consider using a hash _instead_ of the individual variables:
>
> my %excel = (
> ws => function_that_returns_worksheet_object(),
> wb => function_that_returns_workbook_object(),
> ...
> );

This is *very* interesting. Very, very interesting. Concrete examples using code that I'm using makes *so* much difference to my understanding.

> Then you would have only one variable (the hash) to keep track of.
>
>> The other thing is, if $ws, for example, was returned as a reference (I
>> think I really need to see where I should return references, and where I
>> should return the actual thing...
>
>
> If it is "small" return the actual thing, returning a reference instead
> might be a good idea for performance reasons (rather than maintainablility
> reasons) if the thing is "large".
>
>
>> if it's an object return the thing, if
>> it's scalar, array or hash return a ref?
>
>
> Scalar references are not needed very often. At this point you
> should try pretending that they do not even exist.
>
> Note that you cannot return an array or a hash "thing", you can only
> return a list or a scalar.

I *know* this yet having it actually stated (several times) helps to
hammer it home. If I pass a hash to a sub (function, must get used to
terminology - it's called a function because it performs 'some
function') the function receives a list in @_, if I want to use it as a
hash within the function I have to get it back into a hash - that's why
I should pass-by-reference, that way it's still a hash. (I know you
don't need telling the above, it just re-enforces it in my mind).

Passing a function only what it needs, rather than %excel I suppose I
should use:
populate_worksheet($excel{worksheet}, $data{$current_line});

(I'm thinking aloud, please excuse me). Just pass the "bits" that I'm
going to use.

I can see a complete re-write of the current project coming up. I have
to say, Perl is getting interesting again. It wasn't getting boring, but
I'm getting that "I want to know more" bug again.

Justin C

unread,

Dec 13, 2009, 7:03:49 PM12/13/09

to

In article <vh88v6-...@osiris.mauzo.dyndns.org>, Ben Morrow wrote:
>
> Quoth Justin C <justi...@purestblue.com>:

[snip]

>
> One way to enforce discipline (until you get into the habit of doing
> things right) is to put the 'main body' of your script into a 'sub
> main', and then have a call to 'main()' as the only statement outside a
> subroutine. That way you won't find yourself declaring variables a
> globals when they could actually just be locals in 'main'.

That's it, make my Perl look like C.

>> To avoid a whole bunch of vars being global I can see that I'm going to
>> have to nest a lot of sub-routines so that I can keep scope minimal.
>
> I'm not sure what you mean by 'nest subroutines'. If you mean putting
> one named sub inside another named sub, like this
>
> sub foo {
> sub bar { ... }
> }

No, I mean (using your suggestion):

main();

sub main {
eeni();
}
sub eeni {
meeni();
}
sub meeni {
myni();
}
sub myni {
mo();
}
sub mo {
...;
}

> As a general rule, you should be
> passing subs the data they need to work on as parameters, and passing
> the results back as the return value.

I'm starting to get the idea. It's gonna make a helluva mess of what I'm
working on when I get to work tomorrow!

>> The other thing is, if $ws, for example, was returned as a reference (I
>> think I really need to see where I should return references, and where I
>> should return the actual thing... if it's an object return the thing, if
>> it's scalar, array or hash return a ref? maybe that's too crude)
>
> Objects are refs already, remember, so returning an object *is*
> returning a ref. (Strictly speaking, what you get back from a
> constructor like ->new isn't the object itself, but a reference to the
> object.)

Objects are refs, don't supply a reference to a reference to function
(doh!).

I bet that in about two weeks I'm going to look back at threads like
this and think that I've been a complete idiot.

> Returning an explicit ref to a local variable like $ws is
> rather rare. About the only usual case is when you've built a complex
> data structure and you return a ref to avoid needing to take it apart
> into a list and put it back together again in the caller.

Nope, you've lost me - please don't explain now, I'm sure it won't help.
Baby steps only for the hard of understanding.

>> I
>> surely shouldn't be creating a reference to it (in the hash
>> assignement), so it should be "worksheet" => $ws" instead of "\$ws",
>> shouldn't it?
>
> Err... it's not clear why you think this might be necessary, but
> populating a hash with refs-to-refs-to-objects would be a little
> peculiar.

Putting it like that did make me laugh out loud. I think that it must be
time to re-read perldoc perlref.

>> Sorry if the above sounds a bit random, it's hitting the keyboard as
>> it's coming to mind, and I shouldn't really be doing this now, I'm
>> supposed to be working... though the work I'm supposed to be doing is
>> getting this program to do what we want - it's a little chicken and egg
>> here at the moment.
>
> The compromise between 'learning to program better, which will save time
> in the future' and 'solving the immediate problem in front of me right
> now' not always easy to find. In your case I would say that working on
> the fundamentals is worth it, at this point.

Definitely. I feel close to bridging one or two major gaps in my
understanding. Thank you for the reply.

Justin C

unread,

Dec 13, 2009, 6:23:02 PM12/13/09

to

In article <868wda6...@blue.stonehenge.com>, Randal L. Schwartz wrote:
>>>>>> "Justin" == Justin C <justi...@purestblue.com> writes:
>
>Justin> Further to me recent problem of unblessed references, I'm re-writing my
>Justin> code, putting more of it into subroutines so that the main thread is
>Justin> more obvious. I'm finding that where I was using global variables there
>Justin> appear to be a lot of variables that need passing to some sub-routines.
>
> Sounds like related state correlated with related behavior.
>
> That's called an "object".
>
> Might wanna look into that. :)
>
> If you need, I can recommend a good book (or two :).

I bought it already, stop bugging me! ... Though I'm finding it harder
work than The Llama - but that may be because I don't have as much time
to sink into it as I did The Llama.

I think I'm finding the concepts more difficult to grasp, the exercises
are easy enough. It seems I've been "doing" some of the stuff for a
while (following examples in modules), but not understanding how it's
been working, or what's going on. The book[1] is making me think about
what it is that I've been doing - and how I can use those techniques in
my own code. Interesting stuff. I hope that my Christmas break will
allow more time for study.

> print "Just another Perl hacker,"; # the original

Don't they all say that? ;-) One day I want to submit a useful module to
CPAN, and on that day I'll treat myself to a JAPH T shirt from Think
Geek in celebration.

Justin.

1. For those who don't know what I'm talking about, it's The Alpaca
<URL:http://oreilly.com/catalog/9780596102067/>

Sir Robert Burbridge

unread,

Dec 14, 2009, 9:45:00 AM12/14/09

to

On 12/13/2009 07:03 PM, Justin C wrote:
> In article<vh88v6-...@osiris.mauzo.dyndns.org>, Ben Morrow wrote:
>> Quoth Justin C<justi...@purestblue.com>:

>>> To avoid a whole bunch of vars being global I can see that I'm going to
>>> have to nest a lot of sub-routines so that I can keep scope minimal.
>> I'm not sure what you mean by 'nest subroutines'. If you mean putting
>> one named sub inside another named sub, like this
>>
>> sub foo {
>> sub bar { ... }
>> }
>
> No, I mean (using your suggestion):
>
> main();
>
> sub main {
> eeni();
> }
> sub eeni {
> meeni();
> }

Although, I've actually found this a very helpful paradigm:

sub do_something {
...
my $handler = sub {
...
};

if ($cond_a) {
return $handler->(do_a());
} elsif ($cond_b) {
return $handler->(do_b());
} else {
return $handler->(do_c());
}
}

-Sir

Uri Guttman

unread,

Dec 14, 2009, 1:46:14 PM12/14/09

to

>>>>> "SRB" == Sir Robert Burbridge <rbur...@cisco.com> writes:

SRB> Although, I've actually found this a very helpful paradigm:

actually that is a poor paradigm.

SRB> sub do_something {
SRB> ...
SRB> my $handler = sub {
SRB> ...
SRB> };

SRB> if ($cond_a) {
SRB> return $handler->(do_a());
SRB> } elsif ($cond_b) {

why the else (of the if) when you just returned? make it cleaner by just
falling through to a plain if.

SRB> return $handler->(do_b());
SRB> } else {
SRB> return $handler->(do_c());
SRB> }
SRB> }

better yet, use a dispatch table. i hate elsif's and try to never use
them. they are a marker for poor logic. hell, i eschew else as well. in
over 10k lines in one project i had about 3 elsif's and 11 else's. and
the code is very readable. it is just done with very clean logic and
code flow.

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------

Sir Robert Burbridge

unread,

Dec 14, 2009, 3:47:39 PM12/14/09

to

Definitely =) Conditionals were mostly just fodder for the sake of the
anonymous sub (which was the major point). Mostly I use this kind of
thing for failure routines:
if (my $result = try_something()) {
handle_success($result);
} elsif (my $result = try_fallback()) {
handle_succesS($result);
}
and so on ... I don't really have a good paradigm for that kind of thing
that would involve a dispatch table though (is there one?).

-Sir

Uri Guttman

unread,

Dec 14, 2009, 4:04:27 PM12/14/09

to

>>>>> "SRB" == Sir Robert Burbridge <rbur...@cisco.com> writes:

SRB> On 12/14/2009 01:46 PM, Uri Guttman wrote:
SRB> if ($cond_a) {
SRB> return $handler->(do_a());
SRB> } elsif ($cond_b) {
>>
>> why the else (of the if) when you just returned? make it cleaner by just
>> falling through to a plain if.
>>
SRB> return $handler->(do_b());
SRB> } else {
SRB> return $handler->(do_c());
SRB> }
SRB> }
>>
>> better yet, use a dispatch table. i hate elsif's and try to never use
>> them. they are a marker for poor logic. hell, i eschew else as well. in
>> over 10k lines in one project i had about 3 elsif's and 11 else's. and
>> the code is very readable. it is just done with very clean logic and
>> code flow.

SRB> Definitely =) Conditionals were mostly just fodder for the sake of
SRB> the anonymous sub (which was the major point). Mostly I use this kind
SRB> of thing for failure routines:
SRB> if (my $result = try_something()) {
SRB> handle_success($result);
SRB> } elsif (my $result = try_fallback()) {
SRB> handle_succesS($result);
SRB> }
SRB> and so on ... I don't really have a good paradigm for that kind of
SRB> thing that would involve a dispatch table though (is there one?).

you can still drop the elsif and make them if's if the handlers are done
and you need to exit. that alone makes for a better coding style.

as for that try/handle stuff, if it is as regular as you show, then a
simple table/loop will do it better:

my @try_subs = ( \&try_foo, \&try_bar, ... ) ;

foreach my $try ( @try_subs ) {

my $result = $try->() ;
next unless $result ;

handle_success( $result ) ;
last ;
}

nary an else or elsif in the code!

thanx,

Sir Robert Burbridge

unread,

Dec 14, 2009, 4:11:09 PM12/14/09

to

> as for that try/handle stuff, if it is as regular as you show, then a
> simple table/loop will do it better:
>
> my @try_subs = ( \&try_foo, \&try_bar, ... ) ;
>
> foreach my $try ( @try_subs ) {
>
> my $result = $try->() ;
> next unless $result ;
>
> handle_success( $result ) ;
> last ;
> }
>
> nary an else or elsif in the code!
>
> thanx,
>
> uri
>

Hey, that's a pretty good paradigm. Thanks =)

-Sir

Jochen Lehmeier

unread,

Dec 14, 2009, 4:18:51 PM12/14/09

to

On Mon, 14 Dec 2009 19:46:14 +0100, Uri Guttman <u...@stemsystems.com>
wrote:

> in over 10k lines in one project i had about 3 elsif's and 11 else's.

I understand your trouble with elsif (well, at least I can think of cases
where elsif may not be optimal).

But "else"? How could "else" possibly be a sign of bad logic?

Jochen Lehmeier

unread,

Dec 14, 2009, 4:40:34 PM12/14/09

to

On Mon, 14 Dec 2009 22:04:27 +0100, Uri Guttman <u...@stemsystems.com>
wrote:

> SRB> if (my $result = try_something()) {

> SRB> handle_success($result);
> SRB> } elsif (my $result = try_fallback()) {
> SRB> handle_succesS($result);
> SRB> }

> my @try_subs = ( \&try_foo, \&try_bar, ... ) ;

>
> foreach my $try ( @try_subs ) {
>
> my $result = $try->() ;
> next unless $result ;
>
> handle_success( $result ) ;
> last ;
> }
>
> nary an else or elsif in the code!

I guess if we would work in a team, we would regularly have heated
discussions on things like this. Replacing a standard, efficient, easy to
parse (mentally), absolutely normal coding style with a rather convoluted
one, just for the abstract goal of avoiding "else" and "elsif" (*)...
well, I have no useful, public-compatible comment really.

Besids, your pattern breaks down as soon as your functions take different
arguments, or your handle_success() differ from each other, though I guess
you could work around that with closures too, if you wanted.

Note, I do not recall that the OP mentioned lots and lots of elsif's -
only 2 or maybe 3. For larger numbers of cases, and especially when
re-used, this pattern might be kind of useful, but not for simple cases
like this, and not just for getting rid of els(e|if).

Aside from that,

my $result = try_something() || try_fallback();
handle_success($result) if $result;

would be the Perl-like variant I'd use (or straigtforward "elsif", just
like SRB), which scales perfectly to large numbers of try_()'s if you
insert line breaks in strategical places.

Uri Guttman

unread,

Dec 14, 2009, 4:51:18 PM12/14/09

to

>>>>> "JL" == Jochen Lehmeier <OJZGSR...@spammotel.com> writes:

JL> On Mon, 14 Dec 2009 19:46:14 +0100, Uri Guttman <u...@stemsystems.com>
JL> wrote:

>> in over 10k lines in one project i had about 3 elsif's and 11 else's.

JL> I understand your trouble with elsif (well, at least I can think of
JL> cases where elsif may not be optimal).

JL> But "else"? How could "else" possibly be a sign of bad logic?

not a sign of bad logic but of inefficient logic in perl.

my favorite counter example is this style which is all too common:

you enter a sub and have a validity check for something and then you do
the work if the data is ok:

if ( valid() ) {

do lots of code here
}
else {
return -1 ;
}

return $result ;

vs my style:

return -1 unless valid() ;

lots of code
return $result ;

look ma, no else!! no else is needed!! it is shorter, no extra block
entries, no extra indent, no wasted pixels for the {} chars!

you do your checking and get out immediately if bad. then the main good
work is done inline and not in an indented block. perl allows and
encourages this style with statement modifiers and good flow control
ops.

there are many variations on that style but they all eschew else
statements. you just don't need them that often in good perl. most of
the time elses are used for error cases. other times a conditional
expression is better (that is another commonly abused thing, else's
assigning to the same var as the then clause). if you show me code with
lots of else's i can usually rewrite it better without them.

Jochen Lehmeier

unread,

Dec 14, 2009, 5:44:50 PM12/14/09

to

On Mon, 14 Dec 2009 22:51:18 +0100, Uri Guttman <u...@stemsystems.com>
wrote:

> not a sign of bad logic but of inefficient logic in perl.

Define "bad", define "inefficient". Try to avoid the term "logic" please,
it is much too big in this context, c/f http://en.wikipedia.org/wiki/Logic.

> you enter a sub and have a validity check for something and then you do
> the work if the data is ok:
>
> if ( valid() ) {
>
> do lots of code here
> }
> else {
> return -1 ;
> }
>
> return $result ;
>
> vs my style:
>
> return -1 unless valid() ;
>
> lots of code
> return $result ;

> look ma, no else!! no else is needed!!

Well, this is completely different code from what we talked about before,
and you didn't mention my remarks on the original topic at all (Logical
Fallacy: Irrelevant Conclusion). Of course your second code is shorter
than the longer code up there (and I've written enough fragements closer
to your second than your first), but I still fail to see how this is so
specifically *because* there is no "else" (Logical Fallacies: Affirming
the Consequent, Denying the Antecedent, and maybe Non Sequitur). And while
I am with you in that your second code is better than the first, you still
have to convince me that your second code is "better" than (just to name
an example)

my $result=-1;
if (valid())
{

lots of code
}
return $result;

You seem to be very focused on "else". I still fail to see why. There are
many ways to write code with or without "else", all with different
lengths, your measurement of code simply seems to consist of the number of
bytes and number of "else"s.

> you do your checking and get out immediately if bad. then the main good
> work is done inline and not in an indented block. perl allows and
> encourages this style with statement modifiers and good flow control
> ops.

Yes, yes, but go back to the original topic please - very different stuff
back there (Logical Fallacy: Fallacy of Many Questions or Loaded Question)

> there are many variations on that style but they all eschew else

> statements. you just don't need them that often in good perl.[...]

> if you show me code with
> lots of else's i can usually rewrite it better without them.

For your definition of "better".

The reason I reply here and try to argue with you is that you make "else"
out to be totally bad and evil, like someone saying "goto is evil" (the
latter being true ;-) ). I would hate for some Perl newcomer to grow up
with FUD or hearsay like this. Saying "a rather short than long coding
style, getting out of subs ASAP" is perfectly fine with me. But saying
"avoid else at all costs" is not.

You are correct in saying that there is the long-winded coding style of
people who are not familiar with the usual Perl idioms which can be used
to make Perl code short. But there is nothing at all specifically pointing
to "else" in that reasoning. And it is absolutely not true that in all
circumstances the shorter code is "better".

Besides, consider: you suggest to work a lot with "return" or "last"
mingled in between other statements. This is one thing which surely leads
to shorter code, but also leads very quickly to code which is much harder
to maintain, since it gets harder to see the flow of execution at a
glance. I.e., if I have a sub that does *not* contain any "return" or
"last", then I can be absolutely sure that the flow of execution will, at
some point, arrive at the end of the method; and I can be absolutely sure
that I can add code there that will be executed after everything else in
that sub (for example, something which modifies, logs or validates the
return code). With frequent "return" and "last", this gets harder. (Yes, I
am purposefully ignoring "die" here).

There are is also the Theory of Computation, which, for example, concerns
itself with proofs of correctness. In those areas, if/else is often *much*
easier to work with than return/last.

Check out LISP, just to broaden the horizon. The language does not even
have the concept of "return" or "last" at all, and heavily uses "else".
But it's still *very* logical, indeed.

PS: Sorry for the OT Logic stuff, folks. ;-)

Uri Guttman

unread,

Dec 14, 2009, 6:01:14 PM12/14/09

to

>>>>> "JL" == Jochen Lehmeier <OJZGSR...@spammotel.com> writes:

JL> On Mon, 14 Dec 2009 22:51:18 +0100, Uri Guttman <u...@stemsystems.com>
JL> wrote:

>> not a sign of bad logic but of inefficient logic in perl.

JL> Define "bad", define "inefficient". Try to avoid the term "logic"
JL> please, it is much too big in this context, c/f
JL> http://en.wikipedia.org/wiki/Logic.

i use logic as in the decisions made in flow control. those are all
logical choices. it is a fine term for this.

>> you enter a sub and have a validity check for something and then you do
>> the work if the data is ok:
>>
>> if ( valid() ) {
>>
>> do lots of code here
>> }
>> else {
>> return -1 ;
>> }
>>
>> return $result ;
>>
>> vs my style:
>>
>> return -1 unless valid() ;
>>
>> lots of code
>> return $result ;

>> look ma, no else!! no else is needed!!

JL> Well, this is completely different code from what we talked about
JL> before, and you didn't mention my remarks on the original topic at
JL> all (Logical Fallacy: Irrelevant Conclusion). Of course your second
JL> code is shorter than the longer code up there (and I've written
JL> enough fragements closer to your second than your first), but I still
JL> fail to see how this is so specifically *because* there is no "else"
JL> (Logical Fallacies: Affirming the Consequent, Denying the Antecedent,
JL> and maybe Non Sequitur). And while I am with you in that your second
JL> code is better than the first, you still have to convince me that
JL> your second code is "better" than (just to name an example)

it is faster as in no block entry which costs you in perl.

it is shorter so it is easier to read on a screen

it is simpler as you don't need to check inside blocks for what is
happening and where.

easier to follow the MAIN logic which doing the work as you already
handled the easy error exit case. no worries about what ELSE should be
done as there is no else clause.

one less indent makes it easier to read and you can use longer lines
with fewer wrapped lines. that also makes the code shorter and easier to read.

is that enough to qualify it as BETTER code? 5 good reasons with others
i can scrounge up if needed.

JL> my $result=-1;
JL> if (valid())
JL> {
JL> lots of code
JL> }
JL> return $result;

JL> You seem to be very focused on "else". I still fail to see why. There
JL> are many ways to write code with or without "else", all with
JL> different lengths, your measurement of code simply seems to consist
JL> of the number of bytes and number of "else"s.

i don't focus on else. i just easily code away from it. it takes zero
effort for me to do it and it generates better code by the above
reasons. it is the coding style i teach for perl and it usually wins
over most who learn it.

>> you do your checking and get out immediately if bad. then the main good
>> work is done inline and not in an indented block. perl allows and
>> encourages this style with statement modifiers and good flow control
>> ops.

JL> Yes, yes, but go back to the original topic please - very different
JL> stuff back there (Logical Fallacy: Fallacy of Many Questions or
JL> Loaded Question)

huh?? we are talking a minimal change in style to lower the else count
and make better code. there is no loaded question nor fallacy here. it
is a proven style improvement. simple and effective.

>> there are many variations on that style but they all eschew else
>> statements. you just don't need them that often in good perl.[...]
>> if you show me code with
>> lots of else's i can usually rewrite it better without them.

JL> For your definition of "better".

and i get paid to judge code so my word has backing on this. i rate code
for recruitment and for code review projects. too many elses can always
be improved. it is just one of the many things i see which can be
improved. hell, my code can be improved. please do so. my cpan id is
URI. go for it and publish it here. if you show me better code, i will
use it gladly. but you have to also prove it to be better. i know i can
back up my code against weak attacks.

JL> The reason I reply here and try to argue with you is that you make
JL> "else" out to be totally bad and evil, like someone saying "goto
JL> is evil" (the latter being true ;-) ). I would hate for some Perl
JL> newcomer to grow up with FUD or hearsay like this. Saying "a
JL> rather short than long coding style, getting out of subs ASAP" is
JL> perfectly fine with me. But saying "avoid else at all costs" is
JL> not.

not evil, just not needed as often as it is commonly used. it is needed
but just rarely. i don't go out of my way to force no elses. i just can
do it naturally in perl. in other langs it wouldn't be as easy. this is
a perl style point, not a general thing against elses like with no gotos.

JL> You are correct in saying that there is the long-winded coding style
JL> of people who are not familiar with the usual Perl idioms which can
JL> be used to make Perl code short. But there is nothing at all
JL> specifically pointing to "else" in that reasoning. And it is
JL> absolutely not true that in all circumstances the shorter code is
JL> "better".

shorter code is generally better code given that it isn't forced into
obsfucation. there is a limit but so much code is long winded and can be
easily improved by shortening it.

JL> Besides, consider: you suggest to work a lot with "return" or "last"
JL> mingled in between other statements. This is one thing which surely
JL> leads to shorter code, but also leads very quickly to code which is
JL> much harder to maintain, since it gets harder to see the flow of
JL> execution at a glance. I.e., if I have a sub that does *not* contain
JL> any "return" or "last", then I can be absolutely sure that the flow
JL> of execution will, at some point, arrive at the end of the method;
JL> and I can be absolutely sure that I can add code there that will be
JL> executed after everything else in that sub (for example, something
JL> which modifies, logs or validates the return code). With frequent
JL> "return" and "last", this gets harder. (Yes, I am purposefully
JL> ignoring "die" here).

then your code is too complex. i keep things short and sweet and flow
control is everything in doing that.

JL> There are is also the Theory of Computation, which, for example,
JL> concerns itself with proofs of correctness. In those areas, if/else
JL> is often *much* easier to work with than return/last.

who cares about that when working in the real world?

JL> Check out LISP, just to broaden the horizon. The language does not
JL> even have the concept of "return" or "last" at all, and heavily uses
JL> "else". But it's still *very* logical, indeed.

lisp sucks donkey balls. and i know from plenty of experience in
it. lisp is for computers. perl is for people. your choice!

David Harmon

unread,

Dec 15, 2009, 1:28:08 AM12/15/09

to

On Mon, 14 Dec 2009 16:04:27 -0500 in comp.lang.perl.misc, "Uri Guttman"
<u...@StemSystems.com> wrote,

> my @try_subs = ( \&try_foo, \&try_bar, ... ) ;
>
> foreach my $try ( @try_subs ) {
>
> my $result = $try->() ;
> next unless $result ;
>
> handle_success( $result ) ;
> last ;
> }
>
>nary an else or elsif in the code!

my $result = try_foo()
|| try_bar()
|| ... ;

handle_success( $result ) if $result;

Uri Guttman

unread,

Dec 15, 2009, 3:19:11 AM12/15/09

to

>>>>> "DH" == David Harmon <sou...@netcom.com> writes:

DH> On Mon, 14 Dec 2009 16:04:27 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> <u...@StemSystems.com> wrote,

>> my @try_subs = ( \&try_foo, \&try_bar, ... ) ;
>>
>> foreach my $try ( @try_subs ) {
>>
>> my $result = $try->() ;
>> next unless $result ;
>>
>> handle_success( $result ) ;
>> last ;
>> }
>>
>> nary an else or elsif in the code!

DH> my $result = try_foo()
DH> || try_bar()
DH> || ... ;

DH> handle_success( $result ) if $result;

and how would you handle undef values? or a list return there? undef can
be checked with // in 5.10 but lists can't be handled with ||. the loop
version can easily be modified to handle list returns.

also if you need to pass args to those subs it gets redundant to see
them in each call and noisy as well. the loop version keeps the calling
code simple and extendable while keeping the list of calls separate
where you can easily see and edit it.

this assumes a decent sized list of calls, not just 2 or 3 of them.

Eric Pozharski

unread,

Dec 15, 2009, 3:46:19 AM12/15/09

to

On 2009-12-14, Uri Guttman <u...@StemSystems.com> wrote:
*SKIP*

> it is faster as in no block entry which costs you in perl.

While I totaly agree on everything skipped I beg to differ on this one:

#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw{ cmpthese timethese };

use feature 'switch';
my( $x, $y );

cmpthese timethese -5, {
code00 => sub { if( ++$x % 2 ) { $y++ } elsif( not $x % 2 ) { $y++ }; },
code01 => sub { if( ++$x % 2 ) { $y++ }; if( not $x % 2 ) { $y++ }; },
code02 => sub { $y++ if ++$x % 2; $y++ if not $x % 2; },
code03 => sub { ++$x % 2 and $y++; (not $x % 2) and $y++; },
code04 => sub { given( ++$x % 2 ) { when( 1 ) { $y++ } when( 0 ) { $y++ } }; },
code05 => sub { if( ++$x % 2 ) { my $z = 0 } elsif( not $x % 2 ) { my $z = 1 }; },
};

__END__
Benchmark: running code00, code01, code02, code03, code04, code05 for at least 5 CPU seconds...
code00: 6 wallclock secs ( 5.31 usr + 0.00 sys = 5.31 CPU) @ 3023788.32/s (n=16056316)
code01: 6 wallclock secs ( 5.02 usr + 0.01 sys = 5.03 CPU) @ 2340337.77/s (n=11771899)
code02: 3 wallclock secs ( 5.04 usr + 0.00 sys = 5.04 CPU) @ 2320529.37/s (n=11695468)
code03: 5 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @ 2176516.17/s (n=11579066)
code04: 4 wallclock secs ( 5.07 usr + 0.00 sys = 5.07 CPU) @ 1295559.17/s (n=6568485)
code05: 4 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @ 2562540.23/s (n=13632714)
Rate code04 code03 code02 code01 code05 code00
code04 1295559/s -- -40% -44% -45% -49% -57%
code03 2176516/s 68% -- -6% -7% -15% -28%
code02 2320529/s 79% 7% -- -1% -9% -23%
code01 2340338/s 81% 8% 1% -- -9% -23%
code05 2562540/s 98% 18% 10% 9% -- -15%
code00 3023788/s 133% 39% 30% 29% 18% --

5.10.1 isn't the first perl that gives me a strong feeling that perl
somewhat shortcuts I<elsif>s.

*SKIP*

> lisp sucks donkey balls. and i know from plenty of experience in
> it. lisp is for computers. perl is for people. your choice!

I don't wish you lack in this fight -- you don't need it.

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

David Harmon

unread,

Dec 15, 2009, 5:32:48 PM12/15/09

to

On Tue, 15 Dec 2009 03:19:11 -0500 in comp.lang.perl.misc, "Uri Guttman"
<u...@StemSystems.com> wrote,

>>>>>> "DH" == David Harmon <sou...@netcom.com> writes:
>
> DH> my $result = try_foo()
> DH> || try_bar()
> DH> || ... ;
>
> DH> handle_success( $result ) if $result;
>
>and how would you handle undef values? or a list return there? undef can
>be checked with // in 5.10 but lists can't be handled with ||. the loop
>version can easily be modified to handle list returns.

There is no indication that any of those things are needed. You are
moving the goalpost. Let the subroutines return 0 on failure. Or, the
above can _more_ easily be modified using the appropriate expression if
one of them cannot be altered.

>also if you need to pass args to those subs it gets redundant to see
>them in each call and noisy as well. the loop version keeps the calling
>code simple and extendable while keeping the list of calls separate
>where you can easily see and edit it.

If the subs need args then most likely they need different args.
How are you going to accommodate that in the loop version?

>this assumes a decent sized list of calls, not just 2 or 3 of them.

My list of calls is no uglier than your array list for any size.

But mainly it is straightforward and easily understood code, eliminating
all that loopyness for a task that is not much like a loop to begin
with. Consider it as a response to the elseif version, as yours was,
and then maybe you will like it more.

Uri Guttman

unread,

Dec 15, 2009, 6:59:11 PM12/15/09

to

>>>>> "DH" == David Harmon <sou...@netcom.com> writes:

DH> On Tue, 15 Dec 2009 03:19:11 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> <u...@StemSystems.com> wrote,

>>>>>>> "DH" == David Harmon <sou...@netcom.com> writes:
>>
DH> my $result = try_foo()
DH> || try_bar()
DH> || ... ;
>>
DH> handle_success( $result ) if $result;
>>
>> and how would you handle undef values? or a list return there? undef can
>> be checked with // in 5.10 but lists can't be handled with ||. the loop
>> version can easily be modified to handle list returns.

DH> There is no indication that any of those things are needed. You are
DH> moving the goalpost. Let the subroutines return 0 on failure. Or, the
DH> above can _more_ easily be modified using the appropriate expression if
DH> one of them cannot be altered.

i have seen too many subs which return undef on fail. in older perls
that is problematic with ||.

>> also if you need to pass args to those subs it gets redundant to see
>> them in each call and noisy as well. the loop version keeps the calling
>> code simple and extendable while keeping the list of calls separate
>> where you can easily see and edit it.

DH> If the subs need args then most likely they need different args.
DH> How are you going to accommodate that in the loop version?

a hash ref and a table of hashes (probably along with the sub
call). this is called table driven code and i have very nice working
examples of it in the test code in Sort::Maker on cpan. you can create
tables of args, expected return values, golden sort code, etc. it
removes tons of redundant code and makes it much easier to make
variations than if you did a standard if/else tree.

>> this assumes a decent sized list of calls, not just 2 or 3 of them.

DH> My list of calls is no uglier than your array list for any size.

for some definition of ugly. i will take mine. my perl eyes are very
experienced and used by many others for business.

DH> But mainly it is straightforward and easily understood code, eliminating
DH> all that loopyness for a task that is not much like a loop to begin
DH> with. Consider it as a response to the elseif version, as yours was,
DH> and then maybe you will like it more.

a long || list can be very annoying to some. it has several weaknesses
and it is inflexible. i don't prefer it so that is all i have to say.

David Harmon

unread,

Dec 15, 2009, 8:32:04 PM12/15/09

to

On Tue, 15 Dec 2009 18:59:11 -0500 in comp.lang.perl.misc, "Uri Guttman"
<u...@StemSystems.com> wrote,

>i have seen too many subs which return undef on fail. in older perls
>that is problematic with ||.

Thanks, I will have to watch out for that.

Ben Morrow

unread,

Dec 15, 2009, 8:41:24 PM12/15/09

to

Quoth "Uri Guttman" <u...@StemSystems.com>:

>
> i have seen too many subs which return undef on fail. in older perls
> that is problematic with ||.

Can you explain how? AFAIK undef has always been false...

Ben

Uri Guttman

unread,

Dec 15, 2009, 10:00:07 PM12/15/09

to

>>>>> "BM" == Ben Morrow <b...@morrow.me.uk> writes:

BM> Quoth "Uri Guttman" <u...@StemSystems.com>:

>>
>> i have seen too many subs which return undef on fail. in older perls
>> that is problematic with ||.

BM> Can you explain how? AFAIK undef has always been false...

because you can't tell undef from '' or 0 or '0' with ||. 5.10 has //
which solves that problem.

David Harmon

unread,

Dec 16, 2009, 1:35:58 PM12/16/09

to

On Tue, 15 Dec 2009 22:00:07 -0500 in comp.lang.perl.misc, "Uri Guttman"
<u...@StemSystems.com> wrote,

>>>>>> "BM" == Ben Morrow <b...@morrow.me.uk> writes:
>
> BM> Quoth "Uri Guttman" <u...@StemSystems.com>:
> >>
> >> i have seen too many subs which return undef on fail. in older perls
> >> that is problematic with ||.
>
> BM> Can you explain how? AFAIK undef has always been false...
>
>because you can't tell undef from '' or 0 or '0' with ||. 5.10 has //
>which solves that problem.

But that's not a problem in this thread, where undef is completely
acceptable as a false result from a sub indicating failure. When you
wrote that undef was problematic with || I took that as meaning that it
didn't always work as documented in older perls. After searching, I can
find no indication of that being true. So it is not problematic; it
just doesn't always do everything you might imagine.

In another thread you suggest:
my $month = $ARGV[0] || '' ;

How are you supposed to distinguish month 0 from a missing arg in that
case?

Uri Guttman

unread,

Dec 16, 2009, 4:35:47 PM12/16/09

to

>>>>> "DH" == David Harmon <sou...@netcom.com> writes:

DH> On Tue, 15 Dec 2009 22:00:07 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> <u...@StemSystems.com> wrote,

>>>>>>> "BM" == Ben Morrow <b...@morrow.me.uk> writes:
>>
BM> Quoth "Uri Guttman" <u...@StemSystems.com>:
>> >>
>> >> i have seen too many subs which return undef on fail. in older perls
>> >> that is problematic with ||.
>>
BM> Can you explain how? AFAIK undef has always been false...
>>
>> because you can't tell undef from '' or 0 or '0' with ||. 5.10 has //
>> which solves that problem.

DH> But that's not a problem in this thread, where undef is completely
DH> acceptable as a false result from a sub indicating failure. When you
DH> wrote that undef was problematic with || I took that as meaning that it
DH> didn't always work as documented in older perls. After searching, I can
DH> find no indication of that being true. So it is not problematic; it
DH> just doesn't always do everything you might imagine.

DH> In another thread you suggest:
DH> my $month = $ARGV[0] || '' ;

you can't get undef from @ARGV.

jl_...@hotmail.com

unread,

Dec 16, 2009, 4:48:10 PM12/16/09

to

On Dec 14, 3:44 pm, "Jochen Lehmeier" <OJZGSRPBZ...@spammotel.com>
wrote to Uri:

> I am with you in that your second code is better than
> the first, you still have to convince me that your second
> code is "better" than (just to name an example)
>
> my $result=-1;
> if (valid())
> {
> lots of code
> }
> return $result;
>
> You seem to be very focused on "else". I still fail to
> see why. There are many ways to write code with or
> without "else", all with different lengths, your
> measurement of code simply seems to consist of the
> number of bytes and number of "else"s.

What Uri is trying to avoid is not "else"s per se, but rather
unnecessary blocks to begin with. Every time you employ a block (with
proper indentation, hopefully), you leave the maintainer necessarily
having to figure out how the logic differs between code that takes
that block and code that skips that block.

In the example you gave as you wrote it, it's easy to see that if
the if() block is not taken, then the function is returned from
immediately. But what if, in place of the "lots of code", there was
literally a hundred lines of code? If I was looking at that if()
condition, I'd know that in some cases the logic enters that block,
but I wouldn't know that in others the logic is done with that
function altogether.

In order to know that, in certain conditions, that we're done with
the function and we can return, I necessarily would have to scroll all
the way down to the end of the if() block, where I would notice that
the $result is returned untouched. It's much easier for me ("me"
being the maintainer, not the writer of the code) to know that when I
see:

return -1 if not valid();
lots of code
return result;

that non-valid inputs results in an immediate return.

Of course, this is just a simple example. In real life, I often
come across functions with the following skeleton:

sub f
{
my (...) = @_;

my $found = 0;
my $result;

if (...)
{
# do something quick here

if (...)
{
while (... && !$found)
{
if (...)
{
# do something quick here

if (...)
{
# lots of code

if (...)
{
# We found what we're looking for!
$result = ...;
$found = 1;
}
}
}
}
}
}
}

return $result;
}

All this function does is search an input set for something we
want, and when we find it, it gets returned. However, when we know
the inputs aren't what we want at the beginning, we still in the
function until the very end. And when we know the element we're at in
the while() loop isn't what we want, we're still in it for the whole
loop.

And when we finally do get the answer we want, we have to employ
the use of a $found variable to jump out of the loop to get to the end
of the function, where only then we can return.

And notice how many '}' we have at the end. Which matches to
what? Can you tell at a glance? I wrote the code myself, and I can't
immediately tell.

When maintaining, when I come across code like the above, I'll see
the first if() and wonder, "What happens to code that doesn't take
this route? <scroll, scroll, scroll> Oh, nothing." Then I'll come
across the second if(), scroll, and think the same thing. Same for
the third if().

Then I come across the first if() in the while() loop and wonder,
"What happens to code that doesn't take this route? <scroll, scroll>
Oh, this element has nothing more to do with this iteration of the
loop." Same with the next if().

And when I come across the final if() (the one that reaches the
what we're looking for), I think, how does setting the $found and
$result variables affect future behavior of this function? <scroll>
Oh, nothing more of consequence happens except that $result gets
returned."

Instead of using the code above as I wrote it, how about I re-write
it like this:

sub f
{
my (...) = @_;

return if not ...;

# do something quick here

return unless ...;

# do something quick here

return if not ...;

while (...)
{
next if not ...;

# do something quick here

next unless ...;

# lots of code

# Return the result if we found what we're looking for:
return $result if ...;
}
}

Is this new code any better? I think so. For one thing, it
doesn't depend on the $found and $result variables, as they aren't
needed here.

Also, look at those early "return" statements. They tell the
maintainer at-a-glance that the rest of the function is no longer
necessary if certain conditions are met. (You might think that's true
of the previous code, but if you have to scroll down hundreds of lines
to be greeted with half a dozen (or more) of '}'s (sometimes code
between them), you might find yourself wishing that there were a lot
less blocks to navigate through.

Note the "next" statements in the while() loop. They let you know
right away when the element you're examining is one you're interested
in and want to keep around. If you know you don't need it, you can
just invoke "next" and skip right to the next element.

Because of these "next" and extra "return" statements, I no longer
find myself wondering what follows code that doesn't meet the if()
conditions. I know right away what happens, without the need to jump
around to the end of the loop (all the while hoping that the '}' that
I think is the end of the loop really is the right one).

Also note that, with the exception of the top-level block that
defines the f() function, there is only one block, and it's just for
the while() loop. A maintainer can see that there is only one place
where the logic of the code diverges (excepting, of course, the early
exits and skips caused by the "return" and "next" statements).

Have you ever read a recipe in a culinary cookbook? If you have,
you might have noticed that most recipes consist of instructions that
are mostly nicely lined up with the page. Occasionally an instruction
will include multiple "sub-instructions" that are bulleted and/or
indented. However, excessive (and nested) bulleting and indeting is
frowned upon, since it tends to turn the recipe into a logic puzzle,
which is not the original intent of the recipe writer.

Likewise, I freely use early "return"s and "next"s, because I don't
want my code to become a puzzle to the maintainers of my code. I want
to make it clear when the rest of the function is no longer of any use
(by using a "return" statement). The maintainer shouldn't have to
scroll down dozens of lines and match up the right brace just to
figure that out for him/herself.

Similarly, If I'm looping through a list or data structure and
determine that the current element won't be needed, I'll just use a
"next" statemnet to skip it. Otherwise, the maintainer will
necessarily have to scroll down to the bottom of the loop (and match
up the proper braces) just to see that nothing else happens.

Nevertheless, some coders I've worked with still insist on using
the overly-indented style, adamantly avoiding the early "return"s and
"next"s. When asked why, one coder stated it's because he was taught
in a college C course that "every function should have exactly one
entrance point and exactly one exit point."

While that may be good avice for C (in C you often need to perform
"clean up" code to free up memory and close file handles), that advice
is not so good for languages like C++ and Perl, which encourage the
use of self-cleaning objects. (You can read up on "RAII" if you want
to learn more about it for C++.)

In properly written C++ and Perl, if you exit a block or a routine
early, the variables created in the scope(s) you just left simply
clean themselves up! With this kind of convenience, there's no reason
to prohibit "next"s and multiple "return" statements.

In truth, some coders will avoid "next"s and early "return"s simply
because they've never used them and are not used to them (and
sometimes they'll even use that argument to justify why nobody else
should use them, either!). But if this were a valid argument, then
everybody would necessarily have to code to the lowest common
denominator of knowledge, which is just plain silly. Imagine that,
whenever a new coder is hired, a company would have to examine the new
coder's knowledge to determine what keywords and coding styles should
never be used again. (You probably see where this is going, and why
avoiding code styles just because they're unfamiliar is a terrible
idea.)

The use of less blocks is also less error-prone. I've seen code
where someone accidentally put a line of code inbetween the wrong pair
of '}' braces (among half a dozen braces). When I was debugging the
code, I noticed that all but two of the '}' could have been eliminate
(by judicial use of "next" and "return"). If there had only been two
'}' to begin with, then the bug caused by the misplaced code probably
would have never existed, as the code probably wouldn't have been
misplaced in the first place.

So I think that Uri's point (and mine, definitely) is to try toward
the minimal number of blocks, since minimizing the number of blocks
tends to reduce code errors and is easier to maintain. And "next"s
and multiple "return"s may FEEL wrong to some coders (especially to
those who have been taught not to use them), but if used correctly
they are a real blessing to the maintainer.

I hope this clarifies things, Jochen.

-- Jean-Luc Romano

David Harmon

unread,

Dec 16, 2009, 5:22:13 PM12/16/09

to

On Wed, 16 Dec 2009 16:35:47 -0500 in comp.lang.perl.misc, "Uri Guttman"
<u...@StemSystems.com> wrote,

>>>>>> "DH" == David Harmon <sou...@netcom.com> writes:
> DH> In another thread you suggest:
> DH> my $month = $ARGV[0] || '' ;
>
>you can't get undef from @ARGV.

use Data::Dumper;
my $month = $ARGV[0];
print Dumper \$month;

yields:
$VAR1 = \undef;

J�rgen Exner

unread,

Dec 16, 2009, 5:55:32 PM12/16/09

to

"jl_...@hotmail.com" <jl_...@hotmail.com> wrote:
>On Dec 14, 3:44 pm, "Jochen Lehmeier" <OJZGSRPBZ...@spammotel.com>
>wrote to Uri:
>> I am with you in that your second code is better than
>> the first, you still have to convince me that your second
>> code is "better" than (just to name an example)
>>
>> my $result=-1;
>> if (valid())
>> {
>> lots of code
>> }
>> return $result;
>>
>> You seem to be very focused on "else". I still fail to
>> see why. There are many ways to write code with or
>> without "else", all with different lengths, your
>> measurement of code simply seems to consist of the
>> number of bytes and number of "else"s.
>
>
> What Uri is trying to avoid is not "else"s per se, but rather
>unnecessary blocks to begin with. Every time you employ a block (with
>proper indentation, hopefully), you leave the maintainer necessarily
>having to figure out how the logic differs between code that takes
>that block and code that skips that block.
>
> In the example you gave as you wrote it, it's easy to see that if
>the if() block is not taken, then the function is returned from
>immediately. But what if, in place of the "lots of code", there was
>literally a hundred lines of code?

While I agree with you in general this particular argument backfires.

If you have a block with hundred lines of code then it is overdue to be
split up into 6-10 subs. Yes, there are certainly examples where
exceedingly long blocks of code are appropriate, but in general the old
rule of thumb "No sub longer than a single [VT100] screen" still has its
merrits.

jue

J�rgen Exner

unread,

Dec 16, 2009, 5:59:08 PM12/16/09

to

"jl_...@hotmail.com" <jl_...@hotmail.com> wrote:
> Of course, this is just a simple example. In real life, I often
>come across functions with the following skeleton:
>
>sub f
>{
> my (...) = @_;
>
> my $found = 0;
> my $result;
>
> if (...)
> {
> # do something quick here
>
> if (...)
> {
> # do something quick here
>
> if (...)
> {
> while (... && !$found)
> {
> if (...)
> {
> # do something quick here
>
> if (...)
> {
> # lots of code
>
> if (...)
> {
> # We found what we're looking for!
> $result = ...;
> $found = 1;
> }
> }
> }
> }
> }
> }
> }
>
> return $result;
>}
>

[...]

> When maintaining, when I come across code like the above,

... then I wonder, if the original programmer has never heard of subs.

jue

Uri Guttman

unread,

Dec 16, 2009, 6:12:43 PM12/16/09

to

>>>>> "DH" == David Harmon <sou...@netcom.com> writes:

DH> On Wed, 16 Dec 2009 16:35:47 -0500 in comp.lang.perl.misc, "Uri Guttman"
DH> <u...@StemSystems.com> wrote,

>>>>>>> "DH" == David Harmon <sou...@netcom.com> writes:
DH> In another thread you suggest:
DH> my $month = $ARGV[0] || '' ;
>>
>> you can't get undef from @ARGV.

DH> use Data::Dumper;
DH> my $month = $ARGV[0];
DH> print Dumper \$month;

DH> yields:
DH> $VAR1 = \undef;

that is because you didn't test @ARGV for elements. please make
intelligent tests. if @ARGV has elements, those elements cannot be undef
because the shell can't pass undef via exec. checking $ARGV[0] directly
is bad coding but i didn't address that in this thread.

jl_...@hotmail.com

unread,

Dec 16, 2009, 6:35:32 PM12/16/09

to

On Dec 16, 3:55 pm, Jürgen Exner <jurge...@hotmail.com> wrote:
> If you have a block with hundred lines of code then it is
> overdue to be split up into 6-10 subs. Yes, there are certainly
> examples where exceedingly long blocks of code are appropriate,
> but in general the old rule of thumb "No sub longer than a
> single [VT100] screen" still has its merrits.

(For the record, I was going to type "hundreds of lines of code,"
but then I thought better of it and decided to write "a hundred lines
of code" instead.)

But my point is that I don't control how other code is written, and
if it I run across a function I didn't write that has literally has
hundreds of lines of code, then I'm stuck debugging it regardless.
(In other words, I can't refuse to debug it just because it doesn't
look well-written.)

The reason I encourage coders to use prodigious use of "next" and
"return" statements in Perl is because they are almost always done
correctly when used. In other words, their use almost always makes my
job as a maintainer easier.

However, I tend not to encourage breaking up long code into smaller
functions quite as often, for three main reasons:

1. Sometimes the algorithm is a valid -- yet long --
algorithm that works well as a single function and
would compromise the flow of the main logic if
broken up.

2. For some reason, breaking up functions causes
many programmers to resort to using global
variables (instead of correctly implementing
function parameter passing) and this is a HUGE
pet peeve of mine. (I'd rather deal with
overly-long functions than with unnecessary
global variables.)

3. Even if the function should be broken up,
a long function that doesn't rely on global
variables is not that much harder to debug
than one that is broken up into smaller
functions (especially if the function is
well written, with easily distinguishable
sections of code). In other words, a long
function might be a coding sin, but there
are plenty worse sins out there (in my opinion
as a maintainer).

I'm not saying that large functions shouldn't be broken up into
smaller ones; what I'm saying is that, if I were to write a
"Maintainer's Wish List", my wish to have coders use "next" and
"return" prodigiously (which cuts down on the over-use of bracketed
blocks) ranks much higher than my wish to break up large functions
into smaller ones (especially since many codes mistakenly think that
breaking up functions gives them permission to use global variables).

-- Jean-Luc