
string/list division


Juerd

Mar 28, 2005, 12:12:38 PM
to perl6-l...@perl.org
It was a matter of time, of course, after my last thread.

How often do we want chunks of a string or list? And how often do we
abuse a temporary copy and substr/splice for that?

What if instead of

    my @copy = @array;
    while (my @chunk = splice @copy, 0, $chunksize) {
        ...
    }  # ^1

we could just write

for @array [/] $chunksize -> @chunk { ... }

and instead of

    my $copy = $string;
    while (length(my $chunk = substr $copy, 0, $chunksize, '')) {
        ...
    }  # ^2

we could use

for $string ~/ $chunksize -> $chunk { ... }

I think it'd make life much easier.

Of course, [/] is subject to the same discussion as the other thread,
and should perhaps be (/) or */.
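To make the comparison concrete, here is a minimal Perl 5 sketch of the behaviour these operators would provide. The helper names `list_chunks` and `str_chunks` are made up for illustration; they are not part of the proposal or of any core module, and the handling of a short final chunk is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list into array refs of at most $size items.
sub list_chunks {
    my ($size, @items) = @_;
    die "chunk size must be positive\n" if $size < 1;
    my @chunks;
    # splice removes up to $size items from the front of the private copy.
    push @chunks, [ splice @items, 0, $size ] while @items;
    return @chunks;
}

# Hypothetical helper: split a string into pieces of at most $size characters.
sub str_chunks {
    my ($size, $string) = @_;
    die "chunk size must be positive\n" if $size < 1;
    # /s lets "." match newlines too; the last piece may be shorter.
    my @parts = $string =~ /.{1,$size}/sg;
    return @parts;
}

for my $chunk (list_chunks(2, 1 .. 5)) {
    print "@$chunk\n";
}
print "$_\n" for str_chunks(3, "abcdefgh");
```

Both helpers work on a copy of their argument, so the caller's array or string is left untouched, which is the bookkeeping the proposed operators would hide.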


Juerd

PS. http://tnx.nl/3689VBOF # consistency gone mad (expires in 1 day)

^1 Yes, I know it can be made more efficient by using the "list
reference" trick, sub { \@_ }->(LIST).
^2 Same thing, but with \substr. Too bad there is only one LVALUE,
because you can't keep a reference around. Will this be fixed in P6?
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html

Luke Palmer

Mar 28, 2005, 1:38:48 PM
to Juerd, perl6-l...@perl.org
Juerd writes:
> What if instead of
>
> my @copy = @array;
> while (my @chunk = splice @copy, 0, $chunksize) {
> ...
> } # ^1

Well, I for one never write that. I very seldom use splice.

>
> we could just write
>
> for @array [/] $chunksize -> @chunk { ... }

Well, we could also write:

for @array -> $a, $b, $c {...}

But if we're dealing with the chunks, rather than named pieces of the
chunks, then I wonder if this would DTRT:

for @array -> *@chunk is shape($chunksize) {...}

I have difficulty believing that that would work. Essentially we need a
way of saying that the arity of a parameter list is a specific size, but
that we want to put all the arguments into an array. And of course if
it were possible, then the above would be the way to do it.

Your "list mod" idea is interesting, though. I fear that adding too
many list operators will start to make us look like Haskell, where we
have *extremely* expressive single lines that take an hour to write and
an hour to read (I call this "concise hell"). I think the fact that we
have other WTDIs might save us from that, though[1].

I remember talking to an APL programmer at the last OSCON, who believed
that APL captured a program's essence. The fact that you could do very
complex things using combinatorial meta operators in very few characters
seemed to him to define the true nature of programming (Identifying
programming much more with mathematics than with engineering on this
front). It is questionable whether we should give him his essence in
Perl 6. On one hand, it's another paradigm, and Perl 6 is
multiparadigmatic so that everybody feels comfortable writing it. On
the other hand, the more paradigms we include, the more code readers
won't understand (and learning new paradigms is very much harder than
learning new syntax).

Still, Perl's module theory of encapsulation works somewhat well for a
multiparadigmatic language, where generally only one person has to
understand the code... but he has to understand it very well.

But I digress (one wonders how *that* came out of this proposal).

[1] Keep in mind that I don't necessarily consider concise hell to be a
bad thing. I just consider it very different from the direction Our
Successful Perl has gone so far, and certainly different from the way I
like to write/read code. But I'll use concise hell as a negative
connotation for the rest of this email, because I'm subjective like
that.

> and instead of
>
> my $copy = $string;
> while (length(my $chunk = substr $copy, 0, $chunksize, '')) {
> ...
> } # ^2

Which is very rare in the regexy world of Perl. I almost never use
C<substr>, since regexes do everything I ever wanted to do with strings
but was afraid to ask.

> we could use
>
> for $string ~/ $chunksize -> $chunk { ... }
>
> I think it'd make life much easier.

Supposing that we see a need for an operator like this.

I wonder if we're diving so far into the realm of consistency with this
proposal that it would be better to talk about the ~ and [ ]
metaoperators. And of course the + variants of these operators could
not exist since numbers behave quite differently from sequences.

So, let's consider for a moment what a string looks like if it were just
like a list of characters (barring Unicode discussions which I generally
try to avoid). Let's use ~[] for string-meta and *[] for list-meta.

~[~] is just like ~, but *[~] concatenates two lists, an operation Perl
has seemed to lack (not that that's been an issue).

*[<<] is shift, *[>>] is pop, ~[>>] is chop, ~[<<] is that other one
that people have requested.

I don't know what the meta-comparison operators would do. If we
define them so that: [$a, $b, $c] *[<] [$d, $e, $f] === $a < $d &&
$b < $e && $c < $f, then we find that *[<] and *[>] are mostly
adding syntax for something relatively uncommon that deserves a
name (and is starting to look like concise hell). *[==] on the
other hand is extremely useful, and is something that perl has
lacked in its core for ages.
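As a rough Perl 5 illustration of the element-wise reading above: the helper name `all_pairs` is hypothetical, and treating lists of different lengths as simply comparing false is an assumption, since that case is not defined here.

```perl
use strict;
use warnings;

# Hypothetical helper: true if both array refs have the same length and
# every pair of corresponding elements satisfies the comparison $cmp.
sub all_pairs {
    my ($cmp, $left, $right) = @_;
    return 0 unless @$left == @$right;    # assumption: length mismatch is false
    for my $i (0 .. $#$left) {
        return 0 unless $cmp->($left->[$i], $right->[$i]);
    }
    return 1;
}

my $lt = sub { $_[0] <  $_[1] };    # stands in for *[<]
my $eq = sub { $_[0] == $_[1] };    # stands in for *[==]

print all_pairs($lt, [1, 2, 3], [4, 5, 6]) ? "less\n"  : "not less\n";
print all_pairs($eq, [1, 2, 3], [1, 2, 3]) ? "equal\n" : "not equal\n";
```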

Uh, that's about all I can really think of for useful operators.
Perhaps I'm missing some (like your modulus, but that's defining a new
sequence mod operator anyway). All in all, I don't think meta operators
buy us very much in this case, except for more tickets to concise hell.

Perhaps a good thing to do here is to separate out a class of "sequence"
operators, like ~ and your [/], define them parallel-like for
strings and lists, and see how that looks. If I were the one doing
this, I would try to keep the list of sequence operators small (not
every operator need have a sequential analog).

Luke

Aaron Sherman

Mar 28, 2005, 3:09:56 PM
to Luke Palmer, Perl6 Language List
On Mon, 2005-03-28 at 13:38, Luke Palmer wrote:

> Your "list mod" idea is interesting, though. I fear that adding too
> many list operators will start to make us look like Haskell, where we
> have *extremely* expressive single lines that take an hour to write and
> an hour to read (I call this "concise hell").

Ah, to have access to the Concise Hell compiler... ;-)

Seriously, if you have a line of code which takes an hour to write and
an hour to read, is that any better or worse than the equivalent 10
pages of code that take an hour to write and an hour to read?

That said, one of Perl's largest PR problems has always been the power
of the syntax and grammar of the language. When people look at a program
and see:

my %uid_map = map {/^(\w+):[^:]*:(\d+)/} <>;

They tend to get a hair upset. It's not that they have to stare at this
any longer than they have to stare at the equivalent in C or C++ or
Java, but that they have this understanding of "how long it should take
to read a line of code," without respect to the semantic content of the
line.

In this one line we have (yeah, I know you can all read it, but I'm
going to list the components so that we stop and think about them) the
equivalent of a block of code with three loops (a read loop, a
match/store loop, and a hash builder loop), along with the implicit
behavior of <> over the pseudo-filehandle ARGV, along with the implicit
behavior of m/.../ in a list-context.

Write that in C and it would be a LOT of code (depending on the toolkits
you used).
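For reference, the three loops can be spelled out in Perl itself. This is only a sketch: the sub name `uid_map` is hypothetical, it takes its lines as arguments instead of reading `<>` so that it can be run on its own, and the sample input assumes passwd-style lines, which the regex suggests but nothing above states.

```perl
use strict;
use warnings;

# Hypothetical helper that makes the implicit loops of the one-liner explicit.
sub uid_map {
    my @lines = @_;    # loop 1 (the read loop) would fill this from <>

    # Loop 2: match each line and collect the captured (name, uid) pairs.
    # In list context a failed match contributes nothing.
    my @pairs;
    for my $line (@lines) {
        push @pairs, $line =~ /^(\w+):[^:]*:(\d+)/;
    }

    # Loop 3: build the hash from the flat list of pairs.
    my %map;
    while (my ($name, $uid) = splice @pairs, 0, 2) {
        $map{$name} = $uid;
    }
    return %map;
}

my %uid_map = uid_map(
    "root:x:0:0:root:/root:/bin/sh\n",
    "daemon:x:1:1::/:/bin/false\n",
);
print "$_ => $uid_map{$_}\n" for sort keys %uid_map;
```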

Which one is easier to read? I'd argue that the Perl is much easier to
read, but that's only IF you know Perl well enough that you can absorb
such semantic complexity at a glance.

The question is: should you be apologetic and/or fail to follow this
semantic compression down to its logical conclusion because some people
will find it off-putting.

I think that in the specific case of Perl 6, the answer is a resounding
NO for two reasons:

1. Perl 6 can (dangerous lack of the future subjunctive here)
introspect to a degree that almost no other language can (some
functional languages are exceptions here), so writing a "strict"
module that forces programmers to avoid these semantically rich
constructs (or limit their use to "reasonable situations") is at
your fingertips. You could even make it invalid at a parser
level to chain more than some given number of operations.
2. Parrot provides access to Perl 6 code from other languages, so
if Damian writes some wiz-bang module that everyone in the world
wants access to, even the people who don't like Perl 6 can do so
without having to get used to Perl 6. If APL had had Parrot, we
might all be running APL code to this day.

Because of this, Perl 6 need not apologize for its APL/Haskell-like
penchant for semantic compression. It can evolve along the lines that
Perl 4 and 5 made available to it, and not leave anyone behind.

--
Aaron Sherman <a...@ajs.com>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback


Luke Palmer

Mar 29, 2005, 9:13:49 AM
to Aaron Sherman, Perl6 Language List
Aaron Sherman writes:
> On Mon, 2005-03-28 at 13:38, Luke Palmer wrote:
>
> > Your "list mod" idea is interesting, though. I fear that adding too
> > many list operators will start to make us look like Haskell, where we
> > have *extremely* expressive single lines that take an hour to write and
> > an hour to read (I call this "concise hell").
>
> Ah, to have access to the Concise Hell compiler... ;-)
>
> Seriously, if you have a line of code which takes an hour to write and
> an hour to read, is that any better or worse than the equivalent 10
> pages of code that take an hour to write and an hour to read?

Depends on your reading style, but I think it is.

>
> That said, one of Perl's largest PR problems has always been the power
> of the syntax and grammar of the language. When people look at a program
> and see:
>
> my %uid_map = map {/^(\w+):[^:]*:(\d+)/} <>;

That line takes about the average amount of time for me to read, but for
a less experienced programmer, this is exactly what I'm talking about.
And indeed, if you did it in C, it would be a lot longer and more
verbose, even though each line would be easier to read than this one.

To illustrate my point, for the novice, this would be a nicer way to
write that:

    my %uid_map;
    while (<>) {
        my ($key, $value) = /^(\w+):[^:]*:(\d+)/;
        $uid_map{$key} = $value;
    }

Five lines instead of one, not ten pages instead of one. And after
having illustrated my point, I see why it's a Good Thing. It forces you
to name things. If the novice does not understand some syntax, he will
be guided by how you name your intermediate variables.

[snip]

> The question is: should you be apologetic and/or fail to follow this
> semantic compression down to its logical conclusion because some people
> will find it off-putting.
>
> I think that in the specific case of Perl 6, the answer is a resounding
> NO for two reasons:
>
> 1. Perl 6 can (dangerous lack of the future subjunctive here)
> introspect to a degree that almost no other language can (some
> functional languages are exceptions here), so writing a "strict"
> module that forces programmers to avoid these semantically rich
> constructs (or limit their use to "reasonable situations") is at
> your fingertips. You could even make it invalid at a parser
> level to chain more than some given number of operations.
> 2. Parrot provides access to Perl 6 code from other languages, so
> if Damian writes some wiz-bang module that everyone in the world
> wants access to, even the people who don't like Perl 6 can do so
> without having to get used to Perl 6. If APL had had Parrot, we
> might all be running APL code to this day.
>
> Because of this, Perl 6 need not apologize for its APL/Haskell-like
> penchant for semantic compression. It can evolve along the lines that
> Perl 4 and 5 made available to it, and not leave anyone behind.

These are definitely good points. You have to understand that I am
somewhat conflicted about my role in this thread. I am not opposed to
semantic/syntactic sophistication (like `%hash{dims @a}{@b} = @c` from
earlier today or your example), but I am opposed to semantic/syntactic
complexity. But you can't have one without the other. For example:

    # sophisticated (from Perl6::Placeholders)
    my $code = $2;
    my $vars = join ",", sort $code =~ m/(\$\^\w+)/g;
    my $decl = qq{my($vars)=\@_;};

    # complex (from my ass)
    my $decl = 'my(' . join(',', sort $2 =~ m/(\$\^\w+)/g) . ')=@_;';

It's the same code. And it's exactly what I illustrated earlier. It's
just that the latter is doing a bunch of unrelated stuff in one line,
and the other is keeping the "semantic units" together... and naming
them.

So I believe that my opposition to operators like `[/]` is that they
lend themselves to both these kinds of code, but the fear comes from the
latter. I don't think I would mind so much if the same operator were
called `chunks` (or even something sane ;-) instead of `[/]`. Because
then you could look it up (maybe we just need a big list of operators
with mnemonics and descriptions like we have for special variables in
Perl 5).

Luke
