Slices and iterators

Dan Sugalski

unread,

Jun 14, 2004, 12:53:11 PM6/14/04

to perl6-i...@perl.org

Since we're going to need these, and I'm in a documenting and
defining mood (yes, I am making a final decision on strings today.
Whee!) I figure we need to tackle them.

First, slices. Perl's got 'em, Python has them, Ruby, interestingly,
doesn't. (Sort of) A slice is a subset of elements in an aggregate.
They don't have to be contiguous, unique, or in any order. As an
example:

@foo = ('A', 'B', 'C', 'D');
@bar = @foo[0,2]; # A slice--elements 0 and 2

@bar is now ('A', 'C'). Or:

@bar = @foo[0,0,0,0,0,0];

@bar is now ('A', 'A', 'A', 'A', 'A', 'A').

I think for this to work we need to add a slice vtable entry. Not
because I'm particularly fond of vtable entries as such, but it's a
pretty fundamental operation. (Python devotes opcodes to it even)

The slice vtable entry should take as its parameter a slice pmc. This
should be an array of typed from/to values, so we can do something
like:

@foo[0..2,4..8,12..];

with three entries in the slice array--one with a from/to of 0/2, one
with 4/8, and one with 12/inf. Typed since these will be used with
hashes, and we'll need to differentiate between something that should
be taken as a string and something taken as an integer. (If the range
ends are PMCs, since they may behave differently depending on which
way they're read)

This vtable entry should return an iterator, which is why they're
here--not because I've any particular love of the things, but because
if someone does:

@foo = @bar[0..];

on an array that generates data randomly we'll get caught in an
infinite loop, which is generally a bad thing.

Since we're working on iterators, all aggregates should be able to
generate them, which leads to the iterator vtable entry (since
everyone wants to iterate over everything).

So, the proposal:

*) We add a slice vtable entry which takes a slice pmc and returns an interator
*) We add an iterator vtable entry which returns an interator for the PMC
*) We consider ways to make slices. I can see ops, or I can see basic
functions. Either is fine, depends on how often the things are used.
(Ops have less overhead, functions mean fewer ops)

Please, let discussion ensue. We'll decide on the slice creation
method in a day or two and then just make it all happen.
--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Luke Palmer

unread,

Jun 14, 2004, 3:21:45 PM6/14/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski writes:
> The slice vtable entry should take as its parameter a slice pmc. This
> should be an array of typed from/to values, so we can do something
> like:
>
> @foo[0..2,4..8,12..];
>
> with three entries in the slice array--one with a from/to of 0/2, one
> with 4/8, and one with 12/inf.

Perl also has:

@foo[0..12 :by(3)] # 0,3,6,9,12

PDL has affine slices.

To me, it seems like the best thing to do is to give slice an iterator,
and slice would return an iterator that maps keys to values. So, doing
C<@bar = @foo[0..2,4..8,12...]> would look something like:

Construct iterator for 0..2, 4..8, 12...
Call @foo->VTABLE_slice(iterator)
Initialize @bar from returned iterator

Iterators have the advantage over arrays since they can be infinite.
With arrays, how do you represent:

@foo[12... :by(3)]

Do we still have multidimensional keys?

Luke

Dan Sugalski

unread,

Jun 14, 2004, 3:30:09 PM6/14/04

to Luke Palmer, perl6-i...@perl.org

At 1:21 PM -0600 6/14/04, Luke Palmer wrote:
>Dan Sugalski writes:
>> The slice vtable entry should take as its parameter a slice pmc. This
>> should be an array of typed from/to values, so we can do something
>> like:
>>
>> @foo[0..2,4..8,12..];
>>
>> with three entries in the slice array--one with a from/to of 0/2, one
>> with 4/8, and one with 12/inf.
>
>Perl also has:
>
> @foo[0..12 :by(3)] # 0,3,6,9,12
>
>PDL has affine slices.

Yeah, but at some point you have to draw the line and say "This is as
far as we're going at the low level."

>Iterators have the advantage over arrays since they can be infinite.
>With arrays, how do you represent:
>
> @foo[12... :by(3)]

And that is probably well past it. :)

>Do we still have multidimensional keys?

Yes, we do.

Luke Palmer

unread,

Jun 14, 2004, 10:59:10 PM6/14/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski writes:
> At 1:21 PM -0600 6/14/04, Luke Palmer wrote:
> >Dan Sugalski writes:
> >> The slice vtable entry should take as its parameter a slice pmc. This
> >> should be an array of typed from/to values, so we can do something
> >> like:
> >>
> >> @foo[0..2,4..8,12..];
> >>
> >> with three entries in the slice array--one with a from/to of 0/2, one
> >> with 4/8, and one with 12/inf.
> >
> >Perl also has:
> >
> > @foo[0..12 :by(3)] # 0,3,6,9,12
> >
> >PDL has affine slices.
>
> Yeah, but at some point you have to draw the line and say "This is as
> far as we're going at the low level."
>
> >Iterators have the advantage over arrays since they can be infinite.
> >With arrays, how do you represent:
> >
> > @foo[12... :by(3)]
>
> And that is probably well past it. :)
>
> >Do we still have multidimensional keys?
>
> Yes, we do.

Then these are both clear arguments for giving an iterator to slice
rather than an array. We have no way to represent @foo[12... :by(3)],
so how do we represent it? We have multidimensional keys but no way to
slice by them.

The former is solved by constructing the lazy iterator for C<12... :by(3)>
and giving it to @foo's slice. The latter is solved by creating an
iterator that yields multidimensional keys.

What advantage does the simple array case give, other than the one fewer
opcode from extracting the iterator?

Luke

Leopold Toetsch

unread,

Jun 16, 2004, 5:25:04 AM6/16/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:
> *) We consider ways to make slices. I can see ops, or I can see basic
> functions. Either is fine, depends on how often the things are used.

I'll start from the end of the proposal. What about just extending the
keyed syntax:

Px = Py[0] # key
Px = Py["a"; 1] # multi key
Px = Py[0..2, 4] # slice
Px = Py["a", "b"] # slice
Px = Py[I0, I1..] # slice

So a slice PMC is basically a Key PMC with two additional flags per key
component: start_range .. end_range. The comma starts a new key
component. Key components are typed and constructed at compile time, so
we would have already what we need with low overhead.

BTW: is there something like:

Px = Py["a".."x"]

or even

Px = Py["aaa".."xxx"]

leo

Dan Sugalski

unread,

Jun 16, 2004, 10:51:37 AM6/16/04

to l...@toetsch.at, perl6-i...@perl.org

At 11:25 AM +0200 6/16/04, Leopold Toetsch wrote:
>Dan Sugalski <d...@sidhe.org> wrote:
>> *) We consider ways to make slices. I can see ops, or I can see basic
>> functions. Either is fine, depends on how often the things are used.
>
>I'll start from the end of the proposal. What about just extending the
>keyed syntax:
>
> Px = Py[0] # key
> Px = Py["a"; 1] # multi key
> Px = Py[0..2, 4] # slice
> Px = Py["a", "b"] # slice
> Px = Py[I0, I1..] # slice
>
>So a slice PMC is basically a Key PMC with two additional flags per key
>component: start_range .. end_range. The comma starts a new key
>component. Key components are typed and constructed at compile time, so
>we would have already what we need with low overhead.

That works for me. We'll want the slice to return a different PMC
type than a key so we can tell 'em apart, but other than that it's
good.

>BTW: is there something like:
>
> Px = Py["a".."x"]
>
>or even
>
> Px = Py["aaa".."xxx"]

Yes. Which reminds me--we need to have a syntax to distinguish
between key types. A key value can be an int, string, or PMC, and it
can be taken as an int, string, or PMC. Someone could do:

Px = Py[Sx]

but want Sx to be taken as an int for array access rather than a
string for key access. While that's an unusual case, this:

Px = Py[Pz]

won't be, and we'll need to note whether Pz should be taken as an
int, string, or PMC. (PMC for the fancier keying of hashes)

As I remember there were bits in key entries to note both the type of
the key value and how the key value should be taken, so I'd like to
add syntax for this at the pir/pasm level. A simple :i, :s, :p (for
int, string, or pmc) suffix to the key would be fine, with a missing
suffix defaulting to the type of the value. (So ints default to :i,
strings to :s, and pmcs to :p)

Dan Sugalski

unread,

Jun 16, 2004, 1:26:36 PM6/16/04

to Brent 'Dax' Royal-Gordon, l...@toetsch.at, perl6-i...@perl.org

At 10:06 AM -0700 6/16/04, Brent 'Dax' Royal-Gordon wrote:

>Dan Sugalski wrote:
>>Which reminds me--we need to have a syntax to distinguish between key types.
>

>Perl already gives us two of the three:
> Px[Iy]
> Px{Sy}
>
>For the third, I suggest we extend the analogy:
> Px<Pz>

Except it breaks really really badly for multidimensional keys:

Px[S1:i;S2:s;P3:i]

Eirik Berg Hanssen

unread,

Jun 16, 2004, 1:48:22 PM6/16/04

to Dan Sugalski, Brent 'Dax' Royal-Gordon, l...@toetsch.at, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> writes:

> At 10:06 AM -0700 6/16/04, Brent 'Dax' Royal-Gordon wrote:
>>Dan Sugalski wrote:
>>>Which reminds me--we need to have a syntax to distinguish between key types.
>>
>>Perl already gives us two of the three:
>> Px[Iy]
>> Px{Sy}
>>
>>For the third, I suggest we extend the analogy:
>> Px<Pz>
>
> Except it breaks really really badly for multidimensional keys:
>
> Px[S1:i;S2:s;P3:i]

Except it does not:

Px[S1]{S2}[P3]

Of course, that is not good for the syntactically diabetic ... but
more readable to human eyes, IMO. Compare:

Px[S1:i..I1,P1:i,P2:i..S2:i;P3:s]

vs:

Px[S1..I1,P1,P2..S2]{P3}

... not that I think anything remotely human would actually write
something like that in real code ... nevermind ...

Eirik
--
Boston's Irreversible Law of Clutter:
In any household, junk accumulates to fill the space available
for its storage.

Dan Sugalski

unread,

Jun 16, 2004, 1:56:36 PM6/16/04

to Eirik Berg Hanssen, Brent 'Dax' Royal-Gordon, l...@toetsch.at, perl6-i...@perl.org

At 7:48 PM +0200 6/16/04, Eirik Berg Hanssen wrote:
>Dan Sugalski <d...@sidhe.org> writes:
>
>> At 10:06 AM -0700 6/16/04, Brent 'Dax' Royal-Gordon wrote:
>>>Dan Sugalski wrote:
>>>>Which reminds me--we need to have a syntax to distinguish between
>>>>key types.
>>>
>>>Perl already gives us two of the three:
>>> Px[Iy]
>>> Px{Sy}
>>>
>>>For the third, I suggest we extend the analogy:
>>> Px<Pz>
>>
>> Except it breaks really really badly for multidimensional keys:
>>
>> Px[S1:i;S2:s;P3:i]
>
> Except it does not:
>
> Px[S1]{S2}[P3]
>
> Of course, that is not good for the syntactically diabetic ... but
>more readable to human eyes, IMO.

Yeah, but given that this code will be generated by compilers 90+% of
the time... the assembly generation and parsing of the assembly's
easier with the postfix notation.

Brent 'Dax' Royal-Gordon

unread,

Jun 16, 2004, 3:55:36 PM6/16/04

to Dan Sugalski, Eirik Berg Hanssen, l...@toetsch.at, perl6-i...@perl.org

Dan Sugalski wrote:
> Yeah, but given that this code will be generated by compilers 90+% of
> the time... the assembly generation and parsing of the assembly's easier
> with the postfix notation.

My understanding is that compilers will generate an AST, not textual
PIR. Thus, we are looking for a notation that's convenient for a human
who has to work at the level of a compiler. In fact, I wouldn't be
surprised if, post-1.0, this notation was never generated by anything
but a PIR decompiler.

Currently, most compilers *do* generate textual PIR--but IIRC most of
these compilers are written in high-level languages like Perl, where the
difference between the notations is:

print "["
for(@keys) {
print $_->data, ':', $letter{$_->type};
}
print "]";

vs.
for(@keys) {
print $left{$_->type}, $_->data, $right{$_->type};
}

You may have a point about parsing, since that's done in C. However, I
can't imagine it's *that* much more difficult to do (except that <>'s
use in comparison ops might make that particular pair of characters
unsuitable).

--
Brent "Dax" Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker

Oceania has always been at war with Eastasia.

Leopold Toetsch

unread,

Jun 19, 2004, 5:47:03 AM6/19/04

to Dan Sugalski, perl6-i...@perl.org

Dan Sugalski <d...@sidhe.org> wrote:

> Px = Py[Pz]

> won't be, and we'll need to note whether Pz should be taken as an
> int, string, or PMC. (PMC for the fancier keying of hashes)

> ... A simple :i, :s, :p [...] suffix

Do we really need that? The aggregate is responsible to do the right
thing. When C<Py> is a PerlHash PMC, it calls C<get_string_keyed()>. An
array PMC does C<get_integer_keyed>. The key type doesn't really matter.
The key must just provide these vtable functions.

What I'm missing?

leo