String concatenation operator

Andy Wardley

Nov 14, 2002, 7:19:47 AM

to perl6-l...@perl.org

Quoted from "Seven Deadly Sins of Introductory Programming Language
Design" [1] by Linda McIver and Damian Conway:

We have shown over one thousand novice programming students
the C/C++ expression:

"the quick brown fox" + "jumps over the lazy dog"

and asked them what they believe the effect of the + sign is.
Not one of them has ever suggested that the + sign is illegally
attempting to add the address of the locations of the first two
characters of the two literal strings. Without exception they
believed that the + should concatenate the two strings.

Makes perfect sense to me.

Can we overload + in Perl 6 to work as both numeric addition
and string concatenation, depending on the type of the operand
on the left?

I realise the answer is "probably not", given the number/string
ambiguity of Perl variables:

my $a = 123;
my $b = 456;
$a + $b; # 579 or 123456?

I quite like '_' as the string concatenation operator (so much so
that I added it to the Template Toolkit some time ago, confidently
telling people that it's what Perl 6 would use :-). It ties in
nicely with the 123_456 numerical style.

On the other hand, I'm not a big fan of using '~' to indicate
string context. The tilde (aka wobbly operator) seems much better
suited to smart matching, IMHO, being reminiscent of the "almost
equal to" operator (which I would attempt to include here if I
had the slightest clue how to make my keyboard speak Unicode).

Another option: could we quote operators to indicate string context?

$a "+" $b

This would tie in nicely with using [ ] to indicate vectorised
operators, although I realise that particular syntax has been
disvogued of late.

@a [+] @b

A

[1] http://www.csse.monash.edu.au/~damian/papers/ [2]
[2] Good paper, well worth a read. That Conway chap seems to know
his cookies. His name rings a bell, too...

Ken Fox

Nov 14, 2002, 9:34:55 AM

to Andy Wardley, perl6-l...@perl.org

Andy Wardley wrote:

> Can we overload + in Perl 6 to work as both numeric addition

> and string concatenation ...

Isn't there some nifty Unicode operator perl6 could enlist? ;)

How about concatenating adjacent operands? ANSI C does this
with string constants and it works very well. It would become
one of those great Perl sound bites too: the community
couldn't decide on the operator, so perl6 left it out.

- Ken

Smylers

Nov 14, 2002, 10:07:03 AM

to perl6-l...@perl.org

Andy Wardley wrote:

> Quoted from "Seven Deadly Sins of Introductory Programming Language
> Design" [1] by Linda McIver and Damian Conway:
>

> over one thousand novice programming students ...:

>
> "the quick brown fox" + "jumps over the lazy dog"
>

> ... they believed that the + should concatenate the two strings.

>
> Makes perfect sense to me.

Makes sense in a language where variables are typed, because the type of
operation can use the type of the variables. Having types in neither
variables nor operators causes confusion.

> Can we overload + in Perl 6 to work as both numeric addition and
> string concatenation, depending on the type of the operand on the
> left?
>
> I realise the answer is "probably not", given the number/string
> ambiguity of Perl variables:
>
> my $a = 123;
> my $b = 456;
> $a + $b; # 579 or 123456?

Indeed. And even more so with:

my $a = 123;
my $b = '456';
print $a + $b;

You suggest using the type of the left operand. So now when wanting to
perform addition the order of the operands can matter:

print $b + $a;

I realize that by definition the operand order is significant in
concatenation. (Some argue that this alone is sufficient reason for not
overloading C<+> for concatenation in any language.) But having the
order of operands determine which of addition or concatenation takes
place strikes me as very fragile.
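
For comparison, a minimal Perl 5 sketch of how separate operators sidestep the ordering problem. The variable names are illustrative only; the point is that C<+> gives the same answer whichever operand is the string, while C<.> is the only operator whose result depends on order:

```perl
use strict;
use warnings;

my $n = 123;      # a number
my $s = '456';    # a numeric string, e.g. as read from a form or a file

my $sum_ns = $n + $s;   # 579
my $sum_sn = $s + $n;   # 579, operand order is irrelevant for addition
my $cat_ns = $n . $s;   # '123456'
my $cat_sn = $s . $n;   # '456123', order matters only for concatenation

print "$sum_ns $sum_sn $cat_ns $cat_sn\n";   # prints "579 579 123456 456123"
```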

There are other languages which have attempted having both variables and
operators without types:

* Basic has traditionally had typed variables and C<+> for both
addition and concatenation. Microsoft 'Visual Basic' introduced
'variant' variables (in, I think, version 3) of no particular type.
To avoid the addition or concat problem they added the C<&> operator
to always mean concatenation (deprecating C<+> for concatenation,
but continuing to allow it where unambiguous for backwards
compatibility).

* I think JavaScript does what you're suggesting. I don't really know
JavaScript, but recently while trying to hack on code I didn't
understand I found I could access a previous array element with:

options[i - 1]

but, because C<i> was a string, this didn't give the following
element:

options[i + 1] // multiplies by 10, _then_ adds one!

and ended up with:

options[i - 0 + 1] // eurgh!

Please let's not go there with Perl.

* PHP has separate concatenation and addition operators, but tries to
get away with a single untyped equality operator. I posted a
message to this list a couple of weeks ago demonstrating some of the
ways this fails.

While I think that McIver and Conway are absolutely right for the
context in which they made the above observations, I don't think that
they necessarily apply to Perl:

* Data read in to Perl tends to arrive as strings. This applies
whether reading from the keyboard, getting values from a form on a
webpage, reading from a file, or retrieving records from a database.
Therefore you are very likely to end up with numeric data just
happening to be stored as a string initially. My example above:

my $b = '456';

is obviously contrived, but numbers stored as strings are common.
(See also previous PHP equality operator rant.)

This potential confusion doesn't occur in C++, therefore using C<+>
for concatenation is much more sensible.

* The paper points out what C++ actually does when C<+> is used with a
couple of strings (trying to add the addresses of the first
character of each string) is completely useless, therefore the
syntax may as well be redeployed in a fruitful way. Perl already
does something useful with this syntax.

Some beginners may use C<+> hoping to get concatenation, but I think
on discovering that C<+> has tried to add their strings
mathematically they will be able to understand this behaviour much
more so than the C++ behaviour.

* The paper is discussing what makes a good language to use when
trying to teach programming. The points made are not what makes a
language easy to learn per se, but what helps when trying to use a
language as the tool for teaching programming to complete beginners,
those who don't know anything at all about programming and are
struggling to come to terms with its concepts.

I don't think use in teaching programming is a primary aim of Perl
6. (Or, another way round, I think that making Perl 6 a great
language for teaching programming would make it too restrictive for
the rest of us.)

* If somebody uses C<+> in an attempt to concatenate two strings,
chances are that at least one of those strings isn't numeric. In
this case Perl uses a value of zero and displays a warning. In
other words, it's very often possible to identify an attempted abuse
of C<+> because it's being used somewhere that it doesn't make
sense.

If C<+> did double duty for concatenation then it would always make
some sort of sense. People would still write programs that did the
wrong thing, but there could no longer be a warning message pointing
this out because Perl wouldn't be able to notice.

Arguably the warning (or related diagnostic message) in this case
could be improved, for example by explicitly stating that C<+> only
does arithmetic and that if concatenation is desired then C<~> (or
whatever it is) should be used instead.

Not being able to use C<+> for concatenation may trip up some people
learning Perl some of the time, but I reckon it's a fairly easy hurdle
to get over and that it isn't the main cause of trouble people have with
learning Perl.

> Another option: could we quote operators to indicate string context?
>
> $a "+" $b

Hmmm ...

print $a "+" "+" "+" $b "+" "=" $a + $b;
print $a "+" qq("+") "+" $b "+" "=" $a "+" $b;

print "$a+$b=" "+" $a + $b;
print qq($a"+"$b=) "+" $a "+" $b;

print "$a+$b=$($a + $b)";
print qq($a"+"$b=$($a "+" $b));

I don't think I'm so keen on that idea.

Smylers

Michael G Schwern

Nov 14, 2002, 3:29:30 PM

to Andy Wardley, perl6-l...@perl.org

On Thu, Nov 14, 2002 at 12:19:47PM +0000, Andy Wardley wrote:
> Can we overload + in Perl 6 to work as both numeric addition
> and string concatenation, depending on the type of the operand
> on the left?
>
> I realise the answer is "probably not", given the number/string
> ambiguity of Perl variables:
>
> my $a = 123;
> my $b = 456;
> $a + $b; # 579 or 123456?

It's worse than that. What does this do:

sub add ($num1, $num2) {
    return $num1 + $num2;
}

add($foo, $bar);

There are obvious simple cases

$foo = 23; $bar = 42; (perl can figure this one out easy)
$foo = "foo"; $bar = "bar"; (perl can figure ditto)

but it rapidly gets ambiguous

$foo = 23; $bar = 'bar';
$foo = '23'; $bar = 42;

so maybe you can solve it with types:

sub add (number $num1, number $num2) {
...
}

but what about this very simple case:

# We read in from a file, so should Perl consider them
# strings or numbers?
@numbers = slurp 'number_file'; chomp @numbers;
$total = $numbers[0] + $numbers[1];

then you have to do something like this:

number @numbers = slurp 'number_file'; chomp @numbers;
$total = $numbers[0] + $numbers[1];

and I really, really, really don't want to even have to think about types
for basic tasks.
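
For reference, a small Perl 5 sketch of the file-reading case. It assumes an in-memory filehandle as a stand-in for 'number_file' and uses Perl 5's C<.> for concatenation. No type declarations are needed because the operator, not the data, selects the operation:

```perl
use strict;
use warnings;

# Stand-in for 'number_file': an in-memory filehandle holding two lines.
my $contents = "23\n42\n";
open my $fh, '<', \$contents or die "cannot open in-memory file: $!";
chomp(my @numbers = <$fh>);   # values arrive as strings: '23', '42'
close $fh;

my $total  = $numbers[0] + $numbers[1];   # + always adds:        65
my $joined = $numbers[0] . $numbers[1];   # . always concatenates: '2342'

print "$total $joined\n";   # prints "65 2342"
```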

No matter what you decide 23 + 'bar' should do (should it concat? should it
add? should it be an error?) it will be wrong to 2/3 of the population
because the add/concat idea violates the cardinal rule of overloading:

Don't make the same op do two different things.

While people might think string concatenation is the same as numeric
addition, it's really quite a different operation. If we were to try and
make + do similar things for both it would either be:

23 + 42 == 65;
'a' + 'c' == 'd';

or

'a' + 'c' == 'ac';
23 + 42 == 2342;

If you find yourself having to decide between two radically different
behaviors when a binary op is presented with two different types, something
is wrong.

The real question is: should adding two strings do anything at all?

And the real question below that one is: how far should type coercion go?

In Perl 5 it stops at a sane point:

$ perl -wle 'print "bar" + "foo"'
Argument "foo" isn't numeric in addition (+) at -e line 1.
Argument "bar" isn't numeric in addition (+) at -e line 1.
0

Ok, so string + string does something almost as odd as C does, but at least
it warns about it. And that's probably a good default way to handle it.
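
If the "should it be an error?" answer is wanted, Perl 5 can already be asked for it. A minimal sketch, assuming the standard warnings pragma and its 'numeric' category; the variable names are illustrative:

```perl
use strict;
use warnings FATAL => 'numeric';   # promote "isn't numeric" warnings to exceptions

my ($left, $right) = ('bar', 'foo');

# With the warning made fatal, the addition dies instead of yielding 0.
my $sum   = eval { $left + $right };
my $error = $@;

if (defined $sum) {
    print "sum = $sum\n";
} else {
    print "refused: $error";
}
```

The default (warn and carry on with 0) stays untouched for code that does not opt in.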

<copout type=standard>And you can always just change the behavior of
strings in a module.</copout>

--

Michael G. Schwern <sch...@pobox.com> http://www.pobox.com/~schwern/
Perl Quality Assurance <per...@perl.org> Kwalitee Is Job One
Monkey tennis

Richard Proctor

Nov 14, 2002, 4:10:07 PM

to Michael G Schwern, perl6-l...@perl.org, Andy Wardley

On Thu 14 Nov, Michael G Schwern wrote:
> On Thu, Nov 14, 2002 at 12:19:47PM +0000, Andy Wardley wrote:
> > Can we overload + in Perl 6 to work as both numeric addition
> > and string concatenation, depending on the type of the operand
> > on the left?

There have been times when I have wondered if string concatenation could be
done without any operator at all. Simply the placement of two things
next to each other as in $foo $bar or $foo$bar would silently concatenate
them. But then I feel there are some deep horrors and ambiguities that
I have failed to spot...

Richard

--
Personal Ric...@waveney.org http://www.waveney.org
Telecoms Ric...@WaveneyConsulting.com http://www.WaveneyConsulting.com
Web services Ric...@wavwebs.com http://www.wavwebs.com
Independent Telecomms Specialist, ATM expert, Web Analyst & Services

Michael G Schwern

Nov 14, 2002, 4:47:15 PM

to Richard Proctor, perl6-l...@perl.org, Andy Wardley

On Thu, Nov 14, 2002 at 09:10:07PM +0000, Richard Proctor wrote:
> There have been times when I have wondered if string concatenation could be
> done without any operator at all. Simply the placement of two things
> next to each other as in $foo $bar or $foo$bar would silently concatenate
> them. But then I feel there are some deep horrors and ambiguities that
> I have failed to spot...

Before this starts up again, I hereby sentence all potential repliers to
first read:

"string concatenation operator - please stop"
http://archive.develooper.com/perl6-l...@perl.org/msg06710.html

and then read *all* the old proposals and arguments about why they won't
work:

"s/./~/g"
http://archive.develooper.com/perl6-l...@perl.org/msg06512.html

"Sane "+" string concat prposal"
http://archive.develooper.com/perl6-l...@perl.org/msg06578.html

"YA string concat propsal"
http://archive.develooper.com/perl6-l...@perl.org/msg06598.html

"Dot can DWIM without whitespace"
http://archive.develooper.com/perl6-l...@perl.org/msg06627.html

"Another string concat proposal"
http://archive.develooper.com/perl6-l...@perl.org/msg06639.html

"YAYAYA string concat proposal"
http://archive.develooper.com/perl6-l...@perl.org/msg06650.html

"a modest proposal Re: s/./~/g"
http://archive.develooper.com/perl6-l...@perl.org/msg06672.html

--

Michael G. Schwern <sch...@pobox.com> http://www.pobox.com/~schwern/
Perl Quality Assurance <per...@perl.org> Kwalitee Is Job One

Here's hoping you don't become a robot!

Mark J. Reed

Nov 14, 2002, 5:13:13 PM

to perl6-l...@perl.org

On 2002-11-14 at 16:47:15, Michael G Schwern wrote:
> "string concatenation operator - please stop"
> http://archive.develooper.com/perl6-l...@perl.org/msg06710.html

BTW, the first link there - to the bikeshed story - is broken.
This is the correct link:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/misc.html#BIKESHED-PAINTING

(note that there is no underscore in ISO8859).

--
Mark REED | CNN Internet Technology
1 CNN Center Rm SW0831G | mark...@cnn.com
Atlanta, GA 30348 USA | +1 404 827 4754

Dan Sugalski

Nov 14, 2002, 6:00:18 PM

to Ken Fox, Michael G Schwern, Richard Proctor, perl6-l...@perl.org, Andy Wardley

At 5:57 PM -0500 11/14/02, Ken Fox wrote:

>Michael G Schwern wrote:
>>Before this starts up again, I hereby sentence all potential repliers to
>>first read:
>>
>>"string concatenation operator - please stop"
>>http://archive.develooper.com/perl6-l...@perl.org/msg06710.html
>

>The bike shed thing is like Godwin's Law. Only I don't know
>which side loses. ;)
>
>Wasn't one of the main problems with Jarkko's juxtaposition
>proposal that it would kill indirect objects? Have we chased
>our tail on this subject after the colon became required for
>indirect objects?

I dunno. Makes the direct object syntax interesting as well. What does:

$foo = any(Bar::new, Baz::new, Xyzzy::new);
$foo.run;

do?
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Ken Fox

Nov 14, 2002, 5:57:51 PM

to Michael G Schwern, Richard Proctor, perl6-l...@perl.org, Andy Wardley

Michael G Schwern wrote:
> Before this starts up again, I hereby sentence all potential repliers to
> first read:
>
> "string concatenation operator - please stop"
> http://archive.develooper.com/perl6-l...@perl.org/msg06710.html

The bike shed thing is like Godwin's Law. Only I don't know
which side loses. ;)

Wasn't one of the main problems with Jarkko's juxtaposition
proposal that it would kill indirect objects? Have we chased
our tail on this subject after the colon became required for
indirect objects?

If the assignment variant of the invisible concatenation
operator could be solved, juxtaposition seems like a reasonable
approach. (Line ending juxtaposition problems could be fixed with
a special rule similar to the '} by itself' rule.)

- Ken

Garrett Goebel

Nov 14, 2002, 6:29:22 PM

to Dan Sugalski, Ken Fox, Michael G Schwern, Richard Proctor, perl6-l...@perl.org, Andy Wardley

From: Dan Sugalski [mailto:d...@sidhe.org]

> At 5:57 PM -0500 11/14/02, Ken Fox wrote:
> >

> >Wasn't one of the main problems with Jarkko's juxtaposition
> >proposal that it would kill indirect objects? Have we chased
> >our tail on this subject after the colon became required for
> >indirect objects?
>

> I dunno. Makes the direct object syntax interesting as well.
> What does:
>
>
> $foo = any(Bar::new, Baz::new, Xyzzy::new);
> $foo.run;
>
> do?

If you're wondering what Joe "I can hardly keep up" Blogg thinks...

Assuming junctions have neither a property nor a method named 'run', I'd
assume that $foo in a want method context would delegate the run method
invocation to one of the object-eigenstates.

But what would:

$foo = all(Bar::new, Baz::new, Xyzzy::new);
$foo.run;

do?

Invoke the run method against all of the object-eigenstates? And if not in a
void context, return a junction containing their results?

--
Garrett Goebel
IS Development Specialist

ScriptPro Direct: 913.403.5261
5828 Reeds Road Main: 913.384.1008
Mission, KS 66202 Fax: 913.384.2180
www.scriptpro.com gar...@scriptpro.com

Timothy S. Nelson

Nov 15, 2002, 11:40:15 AM

to Ken Fox, Andy Wardley, perl6-l...@perl.org

This is another reason I want meta-operators. That way, we could
have a meta-operator which indicated that the operands should be treated as
strings instead of numbers. And you could have another meta-operator for
case-sensitivity (for string comparisons). Some of those combining unicode
symbols would be great for this.

On the subject of Unicode, I'm using Yudit as an editor, and SimPL as
the font; note that if you follow their instructions for globally adding SimPL
to Yudit as an allowed font, you'll have to delete your .yudit directory (or
modify it separately).
Yudit: http://www.yudit.org/
SimPL: http://www.vector.org.uk/v161/phil161.htm

:)

---------------------------------------------------------------------
| Name: Tim Nelson | Because the Creator is, |
| E-mail: way...@smartchat.net.au | I am |
---------------------------------------------------------------------

----BEGIN GEEK CODE BLOCK----
Version 3.1
GCS d? s: a-- C++>++++$ US+ P++ L++ E- W+++ N+ w+> M-- V- Y+>++
PGP->++ R(+) !tv B++ DI++++ D+ G e>++ h!/* y-
-----END GEEK CODE BLOCK-----

Larry Wall

Nov 16, 2002, 12:48:59 AM

to Richard Proctor, Michael G Schwern, perl6-l...@perl.org, Andy Wardley

On Thu, Nov 14, 2002 at 09:10:07PM +0000, Richard Proctor wrote:

: On Thu 14 Nov, Michael G Schwern wrote:
: > On Thu, Nov 14, 2002 at 12:19:47PM +0000, Andy Wardley wrote:
: > > Can we overload + in Perl 6 to work as both numeric addition
: > > and string concatenation, depending on the type of the operand
: > > on the left?
:
: There have been times when I have wondered if string concatenation could be
: done without any operator at all. Simply the placement of two things
: next to each other as in $foo $bar or $foo$bar would silently concatenate
: them. But then I feel there are some deep horrors and ambiguities that
: I have failed to spot...

Yes, and it has little to do with indirect objects. The Perl parser
differentiates many operators by whether it is expecting an operator
or a term, and *all* of those would be ambiguous if there were a
null operator. You can't even tell whether a + is unary or binary...

Sigh. I've said this any number of times, but it never seems to
sink in. Perl is so slick on this point that people don't realize
how much of this is happening already in Perl 5:

Char    Term              Operator
%       hash              mod
&       sub               "and"
*       glob              multiply, exponentiate
-       negate            subtract
+       noop              add
<       input, heredoc    less than, left shift...
.       number            concatenation
/       regex             division
?       regex             conditional

Perl 6 will do things differently, but the fact remains that we can't
have a juxtapositional operator in Perl, ever. The only place Perl 5
tries to get away with it is with indirect objects, and it's a mess,
and only works as well as it does because indirect objects aren't
general expressions. As it is, it has to do weird lookahead stuff like
distinguishing

print $x +1;

from

print $x + 1;

We're trying to avoid that kind of evil in Perl 6. We're going for
other evils this time...
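
A few rows of the term/operator table above can be sketched in runnable Perl 5. This is only an illustration with made-up variable names; each pair uses the same character once where the parser expects a term and once where it expects an operator:

```perl
use strict;
use warnings;

sub two { return 2 }

my %h       = (a => 1);   # '%' where a term is expected: hash sigil
my $mod     = 10 % 3;     # '%' where an operator is expected: modulus
my $called  = &two();     # '&' as a term: subroutine call
my $bit_and = 6 & 3;      # '&' as an operator: bitwise and
my $negated = -$mod;      # '-' as a term: unary negation
my $diff    = 7 - $mod;   # '-' as an operator: subtraction
my $half    = .5;         # '.' as a term: start of a number
my $cat     = 'a' . 'b';  # '.' as an operator: concatenation
my $matched = ('a/b' =~ /b/) ? 1 : 0;   # '/' as a term: regex
my $div     = 8 / 2;      # '/' as an operator: division

print join(' ', $mod, $called, $bit_and, $negated, $diff,
                $half, $cat, $matched, $div), "\n";
# prints "1 2 2 -1 6 0.5 ab 1 4"
```

With a null (juxtaposition) operator, every one of the term forms would also be a candidate right-hand operand, which is the ambiguity described above.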

Larry

Damian Conway

Nov 16, 2002, 8:46:50 PM

to perl6-l...@perl.org

Dan Sugalski pondered:

> What does:
>
> $foo = any(Bar::new, Baz::new, Xyzzy::new);
> $foo.run;
>
> do?

Creates a disjunction of three classnames, then calls the C<.run> method on
each, in parallel, and returns a disjunction of the results of the calls
(which, in the void context is ignored, or maybe optimized away).

Damian

Dan Sugalski

Nov 17, 2002, 1:03:50 PM

to perl6-l...@perl.org

I was afraid you'd say that. It does rather complicate things, as the
interpreter really isn't set up to be quantum for control flow. Can
we at least guarantee undefined order of operations on things? (I can
pitch heisenbunnies at people if it'll help)

Damian Conway

Nov 17, 2002, 3:39:55 PM

to perl6-l...@perl.org

Dan Sugalski wrote:

>> Creates a disjunction of three classnames, then calls the C<.run>
>> method on each, in parallel, and returns a disjunction of the results
>> of the calls (which, in the void context is ignored, or maybe
>> optimized away).
>
> I was afraid you'd say that.

Then you shouldn't have asked the question. ;-)

> It does rather complicate things, as the
> interpreter really isn't set up to be quantum for control flow.

QCF is definitely not required because "Junctions Are Not Quantum".
Normal threading is quite enough.

And even that's not *essential*. The Q::S module can do this now:

use Quantum::Superpositions;

my $test = all(\&tall, \&dark, \&handsome);
if ($test->()) { print "I'm in lurv!"}

my $frankie = any(Tall->new(), Dark->new(), Gruesome->new());
$frankie->spark();
$frankie->live();
$frankie->menace();
# etc.

and just evaluates the various calls serially.

It would be *vastly* better, though, to integrate junctive calls with
the standard threading behaviour.

> Can we at least guarantee undefined order of operations on things?

Yes. Please. I would certainly expect that the order of execution
is undefined, since the states of a junction are not themselves ordered.

Damian

Damian Conway

Nov 17, 2002, 4:22:45 PM

to perl6-l...@perl.org

Luke Palmer asked:

> Of course, there will be a pragma or something to instruct it to
> operate serially, yes?

I doubt it. Unless there's a pragma to instruct threads to operate
serially.

In any case, I'm not sure what such a pragma would buy you. The
ordering of evaluation would still be inherently unordered.

BTW, in thinking about it further, I realize that Dan is going
to have to tackle this issue anyway. There's fundamentally no
difference in the exigencies of:

$junction = $x | $y | $z;
foo($junction); # Call foo($x), foo($y), and foo($z)
# in parallel and collect the results
# in a disjunction

and

$junction = &f | &g | &h;
$junction($x); # Call f($x), g($x), and h($x)
# in parallel and collect the results
# in a disjunction

Damian

Luke Palmer

Nov 17, 2002, 4:08:50 PM

to dam...@conway.org, perl6-l...@perl.org

> Date: Mon, 18 Nov 2002 07:39:55 +1100
> From: Damian Conway <dam...@conway.org>

>
> It would be *vastly* better, though, to integrate junctive calls with
> the standard threading behaviour.

Of course, there will be a pragma or something to instruct it to
operate serially, yes?

Luke

Dan Sugalski

Nov 17, 2002, 5:51:10 PM

to perl6-l...@perl.org

At 7:39 AM +1100 11/18/02, Damian Conway wrote:
>Dan Sugalski wrote:
>
>>> Creates a disjunction of three classnames, then calls the C<.run>
>>> method on each, in parallel, and returns a disjunction of the results
>>> of the calls (which, in the void context is ignored, or maybe
>>> optimized away).
>>
>>I was afraid you'd say that.
>
>Then you shouldn't have asked the question. ;-)

Sometimes the answers to the questions I don't ask are scarier than
the answers to the ones I do... ;-P

>>It does rather complicate things, as the interpreter really isn't
>>set up to be quantum for control flow.
>
>QCF is definitely not required because "Junctions Are Not Quantum".
>Normal threading is quite enough.

[Snip]

>It would be *vastly* better, though, to integrate junctive calls with
>the standard threading behaviour.

Perl's standard threading behaviour's going to be rather heavyweight,
though. I'm not 100% sure we're going to want to go that route,
unless we can sharply restrict what the heisenbunnies can see.
(Though the presentation on Erlang at LL2 has got me thinking more
about efficient multithreading. I don't think we'll be able to use it
for perl, though)

>>Can we at least guarantee undefined order of operations on things?
>
>Yes. Please. I would certainly expect that the order of execution
>is undefined, since the states of a junction are not themselves ordered.

Good. We shall have to enforce that, then. Wedge some randomness into
the quantum thingies or something.

Iain 'Spoon' Truskett

Nov 17, 2002, 9:00:31 PM

to perl6-l...@perl.org

* Dan Sugalski (d...@sidhe.org) [18 Nov 2002 12:56]:

[...]

> Perl's standard threading behaviour's going to be
> rather heavyweight, though.

Silly question time: Why is it going to be rather heavyweight?
(Not complaining or berating, just wanting information =) )

> (Though the presentation on Erlang at LL2 has got me thinking more
> about efficient multithreading.

Good!

> I don't think we'll be able to use it
> for perl, though)

Not so good! =)

cheers,
--
Iain.

Dan Sugalski

Nov 17, 2002, 11:32:29 PM

to perl6-l...@perl.org

At 8:22 AM +1100 11/18/02, Damian Conway wrote:
>Luke Palmer asked:
>
>>Of course, there will be a pragma or something to instruct it to
>>operate serially, yes?
>
>I doubt it. Unless there's a pragma to instruct threads to operate
>serially.
>
>In any case, I'm not sure what such a pragma would buy you. The
>ordering of evaluation would still be inherently unordered.
>
>
>BTW, in thinking about it further, I realize that Dan is going
>to have to tackle this issue anyway. There's fundamentally no
>difference in the exigencies of:

I've been noticing things have been getting rather more quantum
lately. This may have some... interesting repercussions, as that has
some subtle and not so subtle ramifications in how the interpreter
needs to behave.

Dan Sugalski

Nov 17, 2002, 11:18:22 PM

to perl6-l...@perl.org

At 1:00 PM +1100 11/18/02, Iain 'Spoon' Truskett wrote:
>* Dan Sugalski (d...@sidhe.org) [18 Nov 2002 12:56]:
>
>[...]
>> Perl's standard threading behaviour's going to be
>> rather heavyweight, though.
>
>Silly question time: Why is it going to be rather heavyweight?
>(Not complaining or berating, just wanting information =) )

Well, the problem is shared data.

Firing off multiple interpreters isn't that big a deal, though there
is some overhead in initializing an interpreter. If we do Clever
Things, we can cut down the overhead, but there's still some, so
creating a new interpreter will not be dirt cheap. (Which is fine, as
it makes the common case faster)

The expensive part is the shared data. All the structures in an
interpreter are too large to act on atomically without any sort of
synchronization, so everything shared between interpreters needs to
have a mutex associated with it. Mutex operations are generally
cheap, but if you do enough of them they add up.

The threading model that perl leans towards is either a share-lots
scheme, or a share-nothing-but-copy scheme, both of which are pretty
expensive. The copy form, of course, requires copying data, which
isn't cheap. The share scheme requires lots of locking, as the core
data structures are too big for low-level atomic access.
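
The share scheme can be illustrated with Perl 5's own ithreads. This is a rough sketch, not a description of Parrot. It assumes a perl built with thread support and uses the core threads and threads::shared modules; the counter and worker names are made up. Every update to the shared variable has to take its lock, which is the per-access cost being described:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;   # explicitly shared between interpreters

sub bump {
    my ($times) = @_;
    for (1 .. $times) {
        lock($counter);    # held until the end of this loop iteration
        $counter++;
    }
    return;
}

# Each worker is a separate interpreter; only $counter is shared.
my @workers = map { threads->create(\&bump, 1000) } 1 .. 4;
$_->join for @workers;

print "counter = $counter\n";   # prints "counter = 4000"
```

Everything not marked shared is copied into each new thread, which is the other expensive option mentioned above.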

>> (Though the presentation on Erlang at LL2 has got me thinking more
>> about efficient multithreading.
>
>Good!
>
>> I don't think we'll be able to use it
>> for perl, though)
>
>Not so good! =)

Why? It's not perl's problem space. To do efficient large-scale
multithreading you need a shared-nothing system with fast message
passing and very little information being sent between threads.

Dave Whipp

Nov 18, 2002, 12:10:29 AM

to perl6-l...@perl.org

Dan Sugalski wrote:
> The expensive part is the shared data. All the structures in an
> interpreter are too large to act on atomically without any sort of
> synchronization, so everything shared between interpreters needs to have
> a mutex associated with it. Mutex operations are generally cheap, but if
> you do enough of them they add up.

Why do we need to use preemptive threads? If Parrot is a VM, then surely
the threading can be implemented at its level, or even higher. If it is
the VM that implements the threading, then its data structures don't
need to be locked. The main problem with that approach is that the
multithreading would not be able to preempt C-level callouts: but that
could be solved by spawning a true thread only when code makes calls out
of the parrot VM.

Dave.

Nicholas Clark

Nov 18, 2002, 9:57:29 AM

to Damian Conway, perl6-l...@perl.org

On Mon, Nov 18, 2002 at 08:22:45AM +1100, Damian Conway wrote:
> Luke Palmer asked:
>
> > Of course, there will be a pragma or something to instruct it to
> > operate serially, yes?
>
> I doubt it. Unless there's a pragma to instruct threads to operate
> serially.
>
> In any case, I'm not sure what such a pragma would buy you. The
> ordering of evaluation would still be inherently unordered.

If parrot had cheap threading within the same perl interpreter, then it
might buy you something. Without the pragma, there is no reason why parrot
should not use multiple threads (on a SMP machine) to evaluate the parts of
a junction in parallel, so you'd have to have your perl level code capable
of being re-entrant. With the pragma, you could at least specify that you
want one thing to finish before the next starts, even though there is still
no fixed order in which they take place.

But I'm not sure if parrot is going to give the perl interpreter cheap
threading. (Does the async IO mean that one parrot interpreter could
internally co-operatively thread perl in some cases?)

Nicholas Clark

Dan Sugalski

Nov 18, 2002, 11:52:35 AM

to Dave Whipp, perl6-l...@perl.org

Parrot's not just a VM--if we did our own threads that'd slow down
JIT-generated code universally, or forbid the use of the JIT when
running with threads, both of which are no good, not to mention all
the fun we'd have recreating all the threading mistakes of the past.
Plus we wouldn't be able to use multiple processors on systems that
have them.

It's not easy to do right, and there's no real benefit to be had in
doing it at all, so we're not. System threads are the way to go for
us.

Dan Sugalski

Nov 18, 2002, 11:54:19 AM

to perl6-l...@perl.org

At 2:57 PM +0000 11/18/02, Nicholas Clark wrote:
>But I'm not sure if parrot is going to give the perl interpreter cheap
>threading. (Does the async IO mean that one parrot interpreter could
>internally co-operatively thread perl in some cases?)

Oh, it could do it preemptively. And parrot can (and, I think, will)
provide inexpensive threading, but only in cases where there's
minimal mutable data sharing.

Matt Diephouse

Nov 18, 2002, 4:55:19 PM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote:

> BTW, in thinking about it further, I realize that Dan is going
> to have to tackle this issue anyway. There's fundamentally no
> difference in the exigencies of:
>
> $junction = $x | $y | $z;
> foo($junction); # Call foo($x), foo($y), and foo($z)
> # in parallel and collect the results
> # in a disjunction

Looking at that code, I'm wondering how you pass a junction. Suppose I
want to pass a junction to a subroutine instead of calling the sub with
each value of the junction... how would I do that?

m:att d:iephouse

Dan Sugalski

Nov 18, 2002, 5:19:19 PM

to perl6-l...@perl.org

At 9:05 AM +1100 11/19/02, Damian Conway wrote:

>matt diephouse wrote:
>
>>> $junction = $x | $y | $z;
>>> foo($junction); # Call foo($x), foo($y), and foo($z)
>>> # in parallel and collect the results
>>> # in a disjunction
>>
>>
>>Looking at that code, I'm wondering how you pass a junction.
>>Suppose I want to pass a junction to a subroutine instead of
>>calling the sub with each value of the junction... how would I do
>>that?
>

>Tell the sub that it's expecting an undistributed junction as its argument:

Hrm. What happens if the junction is then used as an iterator?

$junction = File::Open("foo") | File::Open("bar");
for (<$junction>) {
    ...
}

Which could get interesting if inside the for loop the code creates
more junctions and iterates over them. (Potentially ad infinitum)

And here I thought Quantum INTERCAL was a joke... :)

Damian Conway

Nov 18, 2002, 5:05:34 PM

to perl6-l...@perl.org

matt diephouse wrote:

>> $junction = $x | $y | $z;
>> foo($junction); # Call foo($x), foo($y), and foo($z)
>> # in parallel and collect the results
>> # in a disjunction
>
>
> Looking at that code, I'm wondering how you pass a junction. Suppose I
> want to pass a junction to a subroutine instead of calling the sub with
> each value of the junction... how would I do that?

Tell the sub that it's expecting an undistributed junction as its argument:

sub foo($param is junction) {...}

Damian
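
The two calling conventions can be modelled roughly in Python. This is only a sketch: Junction, autothread, double and count_states are hypothetical names invented for illustration, the states are evaluated serially rather than in parallel, and nothing here reflects an actual Perl 6 or Parrot implementation.

```python
# Hypothetical model of "distribute over the states" versus "is junction".
from functools import wraps


class Junction:
    """An unordered disjunction of states."""

    def __init__(self, *states):
        self.states = frozenset(states)

    def __eq__(self, other):
        return isinstance(other, Junction) and self.states == other.states

    def __hash__(self):
        return hash(self.states)

    def __repr__(self):
        return "any(" + ", ".join(sorted(map(repr, self.states))) + ")"


def autothread(func):
    """Default convention: call func once per state, collect the results."""
    @wraps(func)
    def wrapper(arg):
        if isinstance(arg, Junction):
            return Junction(*(func(state) for state in arg.states))
        return func(arg)
    return wrapper


@autothread
def double(x):
    # Plays the role of: sub foo($param) {...}
    return x * 2


def count_states(j):
    # Plays the role of: sub foo($param is junction) {...}
    # The junction arrives undistributed and can be inspected as a whole.
    return len(j.states)


j = Junction(1, 2, 3)
print(double(j))        # a junction of the three doubled values
print(count_states(j))  # the sub sees one argument holding three states
```

Marking the sub, rather than the call site, is what lets an ordinary sub stay unaware of junctions.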

Damian Conway

Nov 18, 2002, 7:15:18 PM

to perl6-l...@perl.org

Dan Sugalski wrote:

> Hrm. What happens if the junction is then used as an iterator?
>
> $junction = File::Open("foo") | File::Open("bar");
> for (<$junction>) {
>     ...
> }

In Larry's formulation that's just the same as:

while $_ := $junction.next { ... }

which, when called on a junction, C<next>s each state in parallel and
returns a junction of the returned values. In other words, in each iteration
$_ will have a disjunction of the next lines from files foo and bar.

> Which could get interesting if inside the for loop the code creates more
> junctions and iterates over them. (Potentially ad infinitum)

The C<for> loop is still single-threaded, only the values it's iterating
are multiplexed.

> And here I thought Quantum INTERCAL was a joke... :)

Junctions Aren't Quantum.

Damian
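
A small Python sketch of that iteration, under stated assumptions: Junction and next_line are hypothetical names, io.StringIO objects stand in for the two filehandles, and the choice that an exhausted handle simply drops out of the result is a guess, since the message does not say what happens when the files differ in length.

```python
import io


class Junction:
    """A disjunction of states (kept in a list so the output is printable)."""

    def __init__(self, *states):
        self.states = list(states)

    def __bool__(self):
        # Assumption: a junction with no states left counts as false.
        return bool(self.states)

    def next_line(self):
        # Advance every state once and return a junction of what they
        # yielded. Assumption: an exhausted handle contributes nothing.
        lines = [fh.readline() for fh in self.states]
        return Junction(*(line.rstrip("\n") for line in lines if line))


foo = io.StringIO("foo 1\nfoo 2\n")
bar = io.StringIO("bar 1\nbar 2\nbar 3\n")
handles = Junction(foo, bar)

seen = []
# One ordinary, single-threaded loop; only the value being iterated
# is multiplexed. Mirrors: while $_ := $junction.next { ... }
while current := handles.next_line():
    seen.append(sorted(current.states))

print(seen)
```

Each pass through the loop body sees one value that stands for a line from every file that still has one.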

Matt Diephouse

Nov 18, 2002, 6:59:58 PM

to perl6-l...@perl.org

Damian Conway wrote:

> sub foo($param is junction) {...}

Doesn't that go against perl's dynamic philosophy? That requires me to
type my methods where I may not want to. Let's say I have a sub that
logs errors:

sub log_error($fh, $error) { # filehandle and error msg
    $error_num++; # global
    print $fh: "$error_num: $error\n";
}

my $file = open "error.log";
log_error $file, "This message is phony";

However, during my debugging, I realize that I need two error logs
(Don't ask me why, I just do). Instead of changing the one line to

my $file = open "error.log" & "../some/other.log"; # I hope this is legal

I also need to change the subroutine now, because the error count will
be off, even though my change is temporary. It reduces the ability to
write subs that accept anything and DWIM. The question is when/how do
you choose whether to pass a junction or evaluate all of them. I think
that the solution would be best left out of the sub's signature though.
Of course this has to stop somewhere; you eventually have to pick a state.

m:att d:iephouse

Luke Palmer

Nov 18, 2002, 8:47:29 PM

to ma...@diephouse.com, perl6-l...@perl.org

> Date: Mon, 18 Nov 2002 18:59:58 -0500
> From: matt diephouse <ma...@diephouse.com>

It's either that or have your functions, which were perfectly logical,
suddenly be subject to junction logic. That is, $x == 2 and $x == 3
could both be true when your code relies on them not both firing.
I think it's a very good decision to make sure that functions know
they might be getting junctions, and to make that explicit.

Luke

> m:att d:iephouse
>
>
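
The hazard can be shown with a toy Python disjunction. Any and classify are hypothetical names for illustration only; the point is just that a function written with scalar logic in mind can have two "mutually exclusive" tests fire once a junction is passed in unannounced.

```python
class Any:
    """Toy disjunction: a comparison holds if it holds for any state."""

    def __init__(self, *states):
        self.states = set(states)

    def __eq__(self, other):
        return any(state == other for state in self.states)


def classify(x):
    # Written assuming x is a plain scalar, so at most one test can fire.
    hits = []
    if x == 2:
        hits.append("two")
    if x == 3:
        hits.append("three")
    return hits


print(classify(2))          # only the first test fires
print(classify(Any(2, 3)))  # both tests fire for the same argument
```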

David Wheeler

Nov 18, 2002, 10:46:32 PM

to Luke Palmer, ma...@diephouse.com, perl6-l...@perl.org

On Monday, November 18, 2002, at 05:47 PM, Luke Palmer wrote:

> It's either that or have your functions, which were perfectly logical,
> suddenly be subject to junction logic. That is, $x == 2 and $x == 3
> could both be true when your code relies on them not both firing.
> I think it's a very good decision to make sure that functions know
> they might be getting junctions, and to make that explicit.

My god, I just realized that junctions are going to *completely* do
away with the complaints of JAPH fans that Perl 6 will be too verbose,
too hard to make obscure...

Oh well, price of power, I guess.

Regards,

David

--
David Wheeler AIM: dwTheory
da...@wheeler.net ICQ: 15726394
http://david.wheeler.net/ Yahoo!: dew7e
Jabber: The...@jabber.org

Damian Conway

Nov 18, 2002, 11:00:43 PM

to perl6-l...@perl.org

matt diephouse wrote:

>> sub foo($param is junction) {...}

> Doesn't that go against perl's dynamic philosophy?

???

> That requires me to type my methods where I may not want to.
> Let's say I have a sub that logs errors:
>
> sub log_error($fh, $error) { # filehandle and error msg
>     $error_num++; # global
>     print $fh: "$error_num: $error\n";
> }
> my $file = open "error.log";
> log_error $file, "This message is phony";
>
> However, during my debugging, I realize that I need two error logs
> (Don't ask me why, I just do). Instead of changing the one line to
>
> my $file = open "error.log" & "../some/other.log"; # I hope this is legal

Under my junctive semantics it is. It simply calls C<open> twice, with
the two states, and returns a conjunction of the resulting filehandles.
Though you probably really want a *dis*junction there.

>
> I also need to change the subroutine now, because the error count will
> be off, even though my change is temporary. It reduces the ability to
> write subs that accept anything and DWIM.

So how does C<print> know to parallelize, rather than just pass in the
junction?

You can't have it both ways. Either the default is to parallelize at
the point a junction is passed to a subroutine, and you have to mark
subroutines that preserve their junctive arguments; or the default is
that junctions pass into subroutines, and you have to mark subroutines
that parallelize when given a junction.

I've thought about it at considerable length, and played around with
the Q::S module. I concluded that passing junctions into subroutines
by default is a Very Bad Idea. The reason, as Luke has already pointed
out, is that junctive logic is different from scalar logic. So most
subroutines won't be able to "accept anything and DWIM" anyway.

Damian

Damian Conway

Nov 18, 2002, 11:45:51 PM

to perl6-l...@perl.org

Dave Whipp wrote:

>>Under my junctive semantics it is. It simply calls C<open> twice, with
>>the two states, and returns a conjunction of the resulting filehandles.
>>Though you probably really want a *dis*junction there.
>
>

> The thing that's worrying me is: what happens when one of them throws an
> exception?

Then the exception propagates back to the calling context and the result
junction is never created.

> Can I catch half of the junction?

You mean: can you catch the exception generated by half a junction. Yes.

> Do the two threads ever join?

Yes. At the point of recombining the values of the parallel calls.

> Does the exception get deferred until after all the threads have completed?

I would doubt it.

> If both throw an exception: what happens then?

You just get one or the other, in no defined order.
(No, I'm *not* going to scare Dan by suggesting that $! ends up with
a junction of the two exceptions. ;-)

Damian

Dave Whipp

Nov 18, 2002, 11:38:59 PM

to perl6-l...@perl.org

"Damian Conway" <dam...@conway.org> wrote:
> > my $file = open "error.log" & "../some/other.log"; # I hope this is legal
>
> Under my junctive semantics it is. It simply calls C<open> twice, with
> the two states, and returns a conjunction of the resulting filehandles.
> Though you probably really want a *dis*junction there.

The thing that's worrying me is: what happens when one of them throws an
exception? Can I catch half of the junction? Do the two threads ever join?

Does the exception get deferred until after all the threads have completed?

If both throw an exception: what happens then?

Dave.

Dan Sugalski

Nov 19, 2002, 12:09:43 AM

to Damian Conway, perl6-l...@perl.org

At 3:45 PM +1100 11/19/02, Damian Conway wrote:

>Dave Whipp wrote:
>>Does the exception get deferred until after all the threads have completed?
>
>I would doubt it.

We're definitely going to need to nail the semantics down. Would one
thread throwing an exception require all the threads being aborted,
for example?

>>If both throw an exception: what happens then?
>
>You just get one or the other, in no defined order.
>(No, I'm *not* going to scare Dan by suggesting that $! ends up with
>a junction of the two exceptions. ;-)

Don't worry about me. I'm still trying to shake that whole quantum thing... :-P

Damian Conway

Nov 19, 2002, 2:09:01 AM

to perl6-l...@perl.org

Dan Sugalski wrote:

> We're definitely going to need to nail the semantics down. Would one
> thread throwing an exception require all the threads being aborted, for
> example?

I would imagine so. You can't reasonably build a junction out of values
that weren't successfully created. If you write:

$var = die();

you get an exception thrown. Why should it be any different if the rvalue
was trying to assume other non-exceptional values as well:

$var = foo() | bar() | die();

The whole point of junctions is "No Visible Parallelism": all the parallelism
occurs inside the junction constructor. At the end of construction your single
thread gets back a single scalar. And if the construction of that single
scalar involved an exception, your single thread should get that exception.

As for short-circuiting: why not? Junctions are inherently unordered, so
there's no guarantee which state of the junction is processed first
(whether they're being processed in parallel or series).

Damian
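
A minimal Python sketch of "No Visible Parallelism" for the failing case. build_junction, foo, bar and die are hypothetical stand-ins, and the states are evaluated one after another here, which the unordered semantics would permit but not require.

```python
def build_junction(*thunks):
    """Evaluate every thunk and return the set of results.

    If any thunk raises, the exception propagates to the caller and
    no partial junction is ever returned.
    """
    return frozenset(thunk() for thunk in thunks)


def foo():
    return "foo"


def bar():
    return "bar"


def die():
    raise RuntimeError("died")


var = None
caught = None
try:
    # Plays the role of: $var = foo() | bar() | die();
    var = build_junction(foo, bar, die)
except RuntimeError as err:
    caught = str(err)

print(var)     # still None: the assignment never happened
print(caught)  # the single exception seen by the single calling thread
```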

Dan Sugalski

Nov 19, 2002, 12:01:42 PM

to perl6-l...@perl.org

At 6:09 PM +1100 11/19/02, Damian Conway wrote:
>Dan Sugalski wrote:
>
>>We're definitely going to need to nail the semantics down. Would
>>one thread throwing an exception require all the threads being
>>aborted, for example?
>
>I would imagine so. You can't reasonably build a junction out of values
>that weren't successfully created.

Whups, misunderstanding there. I realize that we need to throw an
exception (or a junction of exception and not exception) if
evaluating one of the junction members throws one. The question is whether we
should evaluate them all regardless and then figure it out at the
end, and what to do with currently running junction evaluations if
we've spawned off multiple threads to evaluate them in parallel. I
expect I'm getting a bit too Quantum here, though.

I'm thinking that we shouldn't parallelize junction evaluation by
default. Dealing with threads has too many issues that must be dealt
with to spring it on unsuspecting programs.

>As for short-circuiting: why not? Junctions are inherently unordered, so
>there's no guarantee which state of the junction is processed first
>(whether they're being processed in parallel or series).

That's fine. One of the semantics I want nailed down. :)

Martin D Kealey

Nov 18, 2002, 9:34:18 PM

to Dave Whipp, perl6-l...@perl.org

On Mon, 2002-11-18 at 18:10, Dave Whipp wrote:
> Why do we need to use preemptive threads? If Parrot is a VM, then surely
> the threading can be implemented at its level, or even higher.

And what about *lower*? Like down among the CPUs?

I want Perl to run 128 times faster on a 128 CPU machine... now I know
that's not entirely realistic, but it should be able to run at least say
60 times faster.

It's not that we necessarily want *preemptive* threads, but if we can't
do that, we certainly can't do multiprocessor threads.

-Martin

Mark Biggar

Nov 20, 2002, 12:57:45 PM

to perl6-l...@perl.org

Martin D Kealey wrote:
> On Mon, 2002-11-18 at 18:10, Dave Whipp wrote:
>
>>Why do we need to use preemptive threads? If Parrot is a VM, then surely
>>the threading can be implemented at its level, or even higher.
>
>
> And what about *lower*? Like down among the CPUs?
>
> I want Perl to run 128 times faster on a 128 CPU machine... now I know
> that's not entirely realistic, but it should be able to run at least say
> 60 times faster.

Amdahl's law applies here: "no amount of parallelism will speed up
an inherently sequential algorithm"

--
Mark Biggar
mark.a...@attbi.com
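
For the numbers in question, the usual statement of Amdahl's law is speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The Python sketch below applies that formula; the function names are made up for illustration.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def required_parallel_fraction(target_speedup, processors):
    """Solve 1 / ((1 - p) + p / n) = s for p."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / processors)


p = required_parallel_fraction(60, 128)
print(round(p, 4))                     # fraction needed for 60x on 128 CPUs
print(round(amdahl_speedup(p, 128)))   # check: back to 60
print(amdahl_speedup(0.0, 128))        # fully sequential: no gain at all
```

By this formula, a 60-fold speedup on 128 CPUs needs roughly 99.1% of the work to be parallelizable.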

Damian Conway

Nov 20, 2002, 3:40:09 PM

to perl6-l...@perl.org

Dan Sugalski wrote:

> Whups, misunderstanding there. I realize that we need to throw an
> exception (or a junction of exception and not exception) if evaluating
> one of the junction members throws one. The question is whether we should evaluate
> them all regardless and then figure it out at the end, and what to do
> with currently running junction evaluations if we've spawned off
> multiple threads to evaluate them in parallel. I expect I'm getting a
> bit too Quantum here, though.

Not at all. It's an important question, especially if the other threads
have side effects. I suspect, however, that once one state of a junction
throws an exception, we should just kill off the other states immediately.

> I'm thinking that we shouldn't parallelize junction evaluation by
> default. Dealing with threads has too many issues that must be dealt
> with to spring it on unsuspecting programs.

Perhaps. But we need to think through the issues so that we can eventually
move to threaded implementations without changing the semantics.

Damian

Martin D Kealey

Nov 20, 2002, 9:22:57 PM

to Mark Biggar, perl6-l...@perl.org

On Thu, 2002-11-21 at 06:57, Mark Biggar wrote:

> Martin D Kealey wrote:
> > I want Perl to run 128 times faster on a 128 CPU machine... now I know
> > that's not entirely realistic, but it should be able to run at least say
> > 60 times faster.
>
> Amdahl's law applies here: "no amount of parallelism will speed up
> an inherently sequential algorithm"

True in the abstract, but in practice in most languages an awful lot of
algorithms that I<aren't> inherently sequential get serialized by the
compiler because it can't tell it's safe to do otherwise.

This is where pure-functional or applicative languages can have a big
performance win - because the compiler almost always I<can> see that
things are safe to parallelize.

-Martin

Simon Cozens

Nov 23, 2002, 2:34:47 PM

to perl6-l...@perl.org

Smy...@stripey.com (Smylers) writes:
> > ... they believed that the + should concatenate the two strings.
> >
> > Makes perfect sense to me.
>
> Makes sense in a language where variables are typed

It also makes sense in a language where values are typed. They just
have to be slightly more strongly typed than just "scalar". But Perl 6
is already going to support INT and STRING built-in types, right? So I
see no problem with + doing string concat. I could mention some other
languages (or at least, a language (of which I'm becoming considerably
more fond as I get to know it (especially having just come back from
Japan (excuse the jet lag)))) which takes this approach.

--
Sigh. I like to think it's just the Linux people who want to be on
the "leading edge" so bad they walk right off the precipice. (Craig
E. Groeschel)

Paul Johnson

Nov 23, 2002, 3:02:16 PM

to Simon Cozens, perl6-l...@perl.org

On Sat, Nov 23, 2002 at 07:34:47PM +0000, Simon Cozens wrote:

> I could mention some other
> languages (or at least, a language (of which I'm becoming considerably
> more fond as I get to know it (especially having just come back from
> Japan (excuse the jet lag)))) which takes this approach.

Lisp is Japanese?

--
Paul Johnson - pa...@pjcj.net
http://www.pjcj.net

Larry Wall

Nov 23, 2002, 4:05:20 PM

to perl6-l...@perl.org

On Sat, Nov 23, 2002 at 07:34:47PM +0000, Simon Cozens wrote:

: Smy...@stripey.com (Smylers) writes:
: > > ... they believed that the + should concatenate the two strings.
: > >
: > > Makes perfect sense to me.
: >
: > Makes sense in a language where variables are typed
:
: It also makes sense in a language where values are typed. They just
: have to be slightly more strongly typed than just "scalar". But Perl 6
: is already going to support INT and STRING built-in types, right? So I
: see no problem with + doing string concat. I could mention some other
: languages (or at least, a language (of which I'm becoming considerably
: more fond as I get to know it (especially having just come back from
: Japan (excuse the jet lag)))) which takes this approach.

While no assumption is going unquestioned for Perl 6, I do still
believe that the decision not to overload + for concatenation is one
of the few things I did right in Perl 1. When people look at $a + $b
in Perl they don't have to wonder what it means. Addition is such a
fundamental operation that it should be kept as clean as possible, both
for readability and for optimizability. (The two are not unrelated.)

There are several things I like about Ruby, but using + for string
concatenation is not one of them. It's another one of those areas
where the Principle of Least Astonishment is misapplied. Any language
that doesn't occasionally surprise the novice will pay for it by
continually surprising the expert. Ruby's scoping rules also fail
on this point, in my estimation.

Larry
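
The readability argument can be seen in a language that does overload +. In the Python sketch below, overloaded_plus, add and concat are hypothetical helpers: the first behaves like an overloaded +, while the other two imitate Perl's split into a numeric operator and a string operator (using int() as a simplified stand-in for Perl's numeric coercion).

```python
def overloaded_plus(a, b):
    # The meaning depends on the run-time types of the operands, so the
    # reader cannot tell from this line which operation will happen.
    return a + b


def add(a, b):
    # Like Perl's +: always numeric, operands are coerced to numbers.
    return int(a) + int(b)


def concat(a, b):
    # Like Perl's string concatenation: always textual.
    return str(a) + str(b)


print(overloaded_plus(123, 456))      # numeric addition
print(overloaded_plus("123", "456"))  # concatenation, same source text
print(add("123", 456))                # always addition
print(concat(123, "456"))             # always concatenation
```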

Simon Cozens

Nov 24, 2002, 3:25:35 AM

to perl6-l...@perl.org

la...@wall.org (Larry Wall) writes:
> While no assumption is going unquestioned for Perl 6, I do still
> believe that the decision not to overload + for concatenation is one
> of the few things I did right in Perl 1.

Fair enough. And maybe I'm getting ahead of myself (or behind myself)
anyway. Presumably those who want Ruby-style can do something resembling

sub operator:+(STRING, STRING) { $^a ~ $^b }

anyway.

--
I've looked at the listing, and it's right!
-- Joel Halpern
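
An opt-in variant like that amounts to dispatching on the types of both operands. A rough Python analogue, with a hand-rolled dispatch table (plus_variant and plus are hypothetical names, and this is not how any Perl 6 implementation resolves operators):

```python
_plus_variants = {}


def plus_variant(type_a, type_b):
    """Register an implementation of plus() for one pair of operand types."""
    def register(func):
        _plus_variants[(type_a, type_b)] = func
        return func
    return register


@plus_variant(int, int)
def _plus_int(a, b):
    return a + b


@plus_variant(str, str)
def _plus_str(a, b):
    # Plays the role of: sub operator:+(STRING, STRING) { $^a ~ $^b }
    return "".join((a, b))


def plus(a, b):
    """Pick the variant matching the run-time types of both operands."""
    try:
        variant = _plus_variants[(type(a), type(b))]
    except KeyError:
        raise TypeError("no plus() variant for these operand types") from None
    return variant(a, b)


print(plus(123, 456))
print(plus("the quick brown fox ", "jumps over the lazy dog"))
```

Mixed operands such as plus(123, "456") have no registered variant here and raise TypeError, so the ambiguity from the start of the thread has to be settled by whoever adds the variant.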