Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Adding support for different storage formats to ParallelArray (again)

49 views
Skip to first unread message

Herhut, Stephan A

unread,
Oct 9, 2012, 4:55:58 PM10/9/12
to dev-tech-js-en...@lists.mozilla.org, Sreeram, Jaswanth, Shpeisman, Tatiana, Hudson, Rick, Dave Herman, Shu-yu Guo (shu@rfrn.org)
Dear all,

during our last meeting we discussed adding support for different internal storage formats to ParallelArray but deferred further discussion to get Dave's opinion on it. So, to start the discussion again, here a summary:

We have looked at a lot of uses cases for River Trail and many of them include some form of visual computing. In both 2D and 3D cases, one typically wants the end result of the computation to be in a format that can be understood by either WebGL or 2D canvas. This means that one has to copy the contents of a ParallelArray to either a Float32 or Uint8Clamped array before being able to use the computed data. This operation should be as fast as possible, as it typically occurs at every frame and this is on the critical path.

To facilitate this, we have thought about adding conversion methods to ParallelArray, like toFloat32 and toUint8Clamped, that return a _frozen_ typed array (and buffer) that contains the values contained in a ParallelArray structure. Details of these operations have still to be worked out but to make this efficient, one really wants the ParallelArray to store the data in the right format from the start.

Also, when working with 2d canvas, our code very often contain explicit clamping to model the behavior of the Uint8ClampedArray. Ideally, we would want to have a ParallelArray object that mimics the typed array returned from canvas in that regard.

Initially, we had proposed that ParallelArray objects inherit the storage class form the array they were created from. This was rejected by Dave on the grounds that there is no precedent and that he dislikes the fact that the semantics of a ParallelArray object change implicitly based on how they were created. To address this, we have come up with an alternative proposal: The constructor of ParallelArray now gets an optional extra argument that specifies the semantics of the underlying storage by providing one of a predefined set of array constructors:

var a = new ParallelArray([1,2,3,4,5], Float32Array);

would create a new ParallelArray object that uses Float32 as _storage_ or at least it exhibits the behavior as if it was using Float32 as storage. Likewise, to create a ParallelArray from canvas (and preserve the storage semantics) one would use

var b = new ParallelArray(canvas.getContext('2d').getImageData().data, Uint8ClampedArray); // or similar, haven't actually tried this line

If the extra argument is omitted, it defaults to Array, i.e., the usual JavaScript storage rules apply. Methods that create a new ParallelArray out of an existing one, like map, inherit the storage semantics from the parent. This is implicit but at least the programmer has at least once explicitly expressed her intent.

There are some rough edges we have to smooth out. For instance, what happens if the elemental function returns objects but uses a Float32Array for storage? Is that allowed? Is it ok if those objects are array-like and contain only numbers? Are only ParallelArray objects of the same storage class allowed? How far can we go here?

So, if we could agree on the feasibility of this approach, more work needs to be done. I would rather not invest a lot of thought into it just for it to get shot down in the end. That's why I seek input (especially from Dave) early on.

Best
Stephan

Niko Matsakis

unread,
Oct 9, 2012, 5:54:16 PM10/9/12
to Herhut, Stephan A, Hudson, Rick, Shpeisman, Tatiana, Shu-yu Guo (shu@rfrn.org), dev-tech-js-en...@lists.mozilla.org, Dave Herman, Sreeram, Jaswanth
What do typed arrays do when you store an inappropriate value?

> Herhut, Stephan A <mailto:stephan....@intel.com>
> October 9, 2012 1:55 PM
> _______________________________________________
> dev-tech-js-engine-rivertrail mailing list
> dev-tech-js-en...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail

David Herman

unread,
Oct 9, 2012, 7:05:08 PM10/9/12
to Herhut, Stephan A, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, Sreeram, Jaswanth, Shu-yu Guo (shu@rfrn.org)
Typed arrays are generalized by and superseded by the ES6 binary data types library. This library provides a full generalization of first-class type descriptors, via a little combinator library. So instead of taking typed array constructors, a better API would take type descriptor values representing the element type:

var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
var paContainingFloat32s = new ParallelArray(/* whatever */, float32);

Dave

Herhut, Stephan A

unread,
Oct 9, 2012, 7:33:04 PM10/9/12
to David Herman, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, Sreeram, Jaswanth, Shu-yu Guo (shu@rfrn.org)
You are referring to http://wiki.ecmascript.org/doku.php?id=harmony:binary_data?

Great! Even better. Will this replace typed arrays or be synonymous with corresponding typed arrays?

Also, does your comment imply that you could live with this from a semantics standpoint?

Stephan

David Herman

unread,
Oct 9, 2012, 7:34:39 PM10/9/12
to Herhut, Stephan A, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, Sreeram, Jaswanth, Shu-yu Guo (shu@rfrn.org)
Essentially synonymous, at least, but hopefully truly synonymous. We'll actually standardize the typed array constructors, along with ArrayBuffer, in ES6, but they should be totally inter-compatible.

Dave

Herhut, Stephan A

unread,
Oct 9, 2012, 7:36:07 PM10/9/12
to David Herman, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, Sreeram, Jaswanth, Shu-yu Guo (shu@rfrn.org)
That answers the first part of my email :-D

Niko Matsakis

unread,
Oct 9, 2012, 7:56:40 PM10/9/12
to dhe...@mozilla.com, stephan....@intel.com, s...@rfrn.org, dev-tech-js-en...@lists.mozilla.org, rick....@intel.com, jaswanth...@intel.com, tatiana....@intel.com
I am wondering if we can just say that the backing store is an array of some type T where T is provided in the constructor. It could default to "any" to get the current behavior, essentially. 


Niko

(Sent from a cramped, unreliable keyboard)

Dave

_______________________________________________

Herhut, Stephan A

unread,
Oct 9, 2012, 8:24:06 PM10/9/12
to Niko Matsakis, dhe...@mozilla.com, s...@rfrn.org, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, Sreeram, Jaswanth, Shpeisman, Tatiana
Then the question is what that means for multi-dimensional arrays. Should we expose that they use a flattened representation or can one only have 1D ParallelArray objects when using other backing stores than any? If we expose that they have a flattened representation, than we will also need to define how deep we flatten, which in essence makes the current dynamic shape a static property of the layout.

Consider processing canvas data on a per pixel level. The map operation will want to use uint8clamped but the elemental function will return 4 element arrays. Are these converted to 0 when writing to the backing store or do we add 4 elements to the backing store and make the PA object an array of 4-tuples?

The beauty of dynamic shape is that we can always revert to a nested backing store yet, after the fact, create an n-dimensional array. If the constructor only defines storage semantics, we can still do that. If it gives the actual backing store, we can’t as an array of type T cannot simply be nested.

Another wild idea would be to have the programmer specify the exact type of the outcome (the one he wants). For the above example, that would be ArrayType( ArrayType(4, uint8clamped), <outer dim size>). This would then also specify a template for conversion of the results of the elemental function into what the ParallelArray object actually stores. Somewhat awkward to use but maybe good enough for a performance programmer? This should always be optional, though!

I currently still prefer dynamic shape with user defined backing-store semantics but might convince myself otherwise…

Stephan

From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko Matsakis
Sent: Tuesday, October 09, 2012 4:57 PM
To: dhe...@mozilla.com; Herhut, Stephan A
Cc: Shpeisman, Tatiana; dev-tech-js-en...@lists.mozilla.org; Hudson, Rick; Sreeram, Jaswanth; s...@rfrn.org
Subject: Re: Adding support for different storage formats to ParallelArray (again)

I am wondering if we can just say that the backing store is an array of some type T where T is provided in the constructor. It could default to "any" to get the current behavior, essentially.


Niko

(Sent from a cramped, unreliable keyboard)


-------- Original message --------
Subject: Re: Adding support for different storage formats to ParallelArray (again)
From: David Herman <dhe...@mozilla.com<mailto:dhe...@mozilla.com>>
To: "Herhut, Stephan A" <stephan....@intel.com<mailto:stephan....@intel.com>>
CC: "Shpeisman, Tatiana" <tatiana....@intel.com<mailto:tatiana....@intel.com>>,"dev-tech-js-en...@lists.mozilla.org<mailto:dev-tech-js-en...@lists.mozilla.org>" <dev-tech-js-en...@lists.mozilla.org<mailto:dev-tech-js-en...@lists.mozilla.org>>,"Hudson, Rick" <rick....@intel.com<mailto:rick....@intel.com>>,"Sreeram, Jaswanth" <jaswanth...@intel.com<mailto:jaswanth...@intel.com>>,"Shu-yu Guo (s...@rfrn.org<mailto:s...@rfrn.org>)" <s...@rfrn.org<mailto:s...@rfrn.org>>

Typed arrays are generalized by and superseded by the ES6 binary data types library. This library provides a full generalization of first-class type descriptors, via a little combinator library. So instead of taking typed array constructors, a better API would take type descriptor values representing the element type:

var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
var paContainingFloat32s = new ParallelArray(/* whatever */, float32);

Dave

On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A" <stephan....@intel.com<mailto:stephan....@intel.com>> wrote:

> Dear all,
>
> during our last meeting we discussed adding support for different internal storage formats to ParallelArray but deferred further discussion to get Dave's opinion on it. So, to start the discussion again, here a summary:
>
> We have looked at a lot of uses cases for River Trail and many of them include some form of visual computing. In both 2D and 3D cases, one typically wants the end result of the computation to be in a format that can be understood by either WebGL or 2D canvas. This means that one has to copy the contents of a ParallelArray to either a Float32 or Uint8Clamped array before being able to use the computed data. This operation should be as fast as possible, as it typically occurs at every frame and this is on the critical path.
>
> To facilitate this, we have thought about adding conversion methods to ParallelArray, like toFloat32 and toUint8Clamped, that return a _frozen_ typed array (and buffer) that contains the values contained in a ParallelArray structure. Details of these operations have still to be worked out but to make this efficient, one really wants the ParallelArray to store the data in the right format from the start.
>
> Also, when working with 2d canvas, our code very often contain explicit clamping to model the behavior of the Uint8ClampedArray. Ideally, we would want to have a ParallelArray object that mimics the typed array returned from canvas in that regard.
>
> Initially, we had proposed that ParallelArray objects inherit the storage class form the array they were created from. This was rejected by Dave on the grounds that there is no precedent and that he dislikes the fact that the semantics of a ParallelArray object change implicitly based on how they were created. To address this, we have come up with an alternative proposal: The constructor of ParallelArray now gets an optional extra argument that specifies the semantics of the underlying storage by providing one of a predefined set of array constructors:
>
> var a = new ParallelArray([1,2,3,4,5], Float32Array);
>
> would create a new ParallelArray object that uses Float32 as _storage_ or at least it exhibits the behavior as if it was using Float32 as storage. Likewise, to create a ParallelArray from canvas (and preserve the storage semantics) one would use
>
> var b = new ParallelArray(canvas.getContext('2d').getImageData().data, Uint8ClampedArray); // or similar, haven't actually tried this line
>
> If the extra argument is omitted, it defaults to Array, i.e., the usual JavaScript storage rules apply. Methods that create a new ParallelArray out of an existing one, like map, inherit the storage semantics from the parent. This is implicit but at least the programmer has at least once explicitly expressed her intent.
>
> There are some rough edges we have to smooth out. For instance, what happens if the elemental function returns objects but uses a Float32Array for storage? Is that allowed? Is it ok if those objects are array-like and contain only numbers? Are only ParallelArray objects of the same storage class allowed? How far can we go here?
>
> So, if we could agree on the feasibility of this approach, more work needs to be done. I would rather not invest a lot of thought into it just for it to get shot down in the end. That's why I seek input (especially from Dave) early on.
>
> Best
> Stephan

_______________________________________________
dev-tech-js-engine-rivertrail mailing list
dev-tech-js-en...@lists.mozilla.org<mailto:dev-tech-js-en...@lists.mozilla.org>
https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail

Niko Matsakis

unread,
Oct 9, 2012, 11:10:14 PM10/9/12
to Herhut, Stephan A, Hudson, Rick, Shpeisman, Tatiana, s...@rfrn.org, dev-tech-js-en...@lists.mozilla.org, dhe...@mozilla.com, Sreeram, Jaswanth
Yes, I was wondering about this question as well on the train ride
home. In general it seems that multidimensional map, dynamic shape, and
user-specified backing stores interact in a rather strange way that is
not especially intuitive to me.

For example, if you create an uint8[NxNx4] array and then use map, you
will be mapping over an uint8[Nx4] PA elements. We said that the array
which results from higher-order operations carries the same backing
store type as its origin... but that doesn't quite seem to make sense in
this scenario. What happens if you then return uint16[N] array? I know
this is the same question you were asking (or I think it was) but I
don't quite understand what answer you proposed. :)


Niko

> Herhut, Stephan A <mailto:stephan....@intel.com>
> October 9, 2012 5:24 PM
> *From:*Niko Matsakis [mailto:nikoma...@gmail.com] *On Behalf Of
> *Niko Matsakis
> *Sent:* Tuesday, October 09, 2012 4:57 PM
> *To:* dhe...@mozilla.com; Herhut, Stephan A
> *Cc:* Shpeisman, Tatiana;
> dev-tech-js-en...@lists.mozilla.org; Hudson, Rick;
> Sreeram, Jaswanth; s...@rfrn.org
> *Subject:* Re: Adding support for different storage formats to

Shu-yu Guo

unread,
Oct 10, 2012, 6:16:55 AM10/10/12
to Niko Matsakis, Hudson, Rick, Shpeisman, Tatiana, Herhut, Stephan A, dev-tech-js-en...@lists.mozilla.org, dhe...@mozilla.com, Sreeram, Jaswanth
Insofar that the dimensionality, inferred or otherwise, is intended to
reflect "logical" dimensions of the data, even if the dimensionality
of iteration may be different, I feel like the dimensionality of
iteration still should be constrained to the "logical" dimensions.

That is, an NxN image expressed as uint8[NxNx4] still seems weird to
me. In the new binary-data world, shouldn't it be something like
Color[NxN] where Color is StructType({ r: uint8, g: uint8, b: uint8
})? If we can pass any binary type descriptors to the ParallelArray
constructor, I'm leaning towards something like this:

var line1 = new ParallelArray([pixel1, ..., pixelN], Color); // 1D
var image = new ParallelArray([line1, ..., lineN]); // 2D, inferred

We should be able to figure out that |image| is a "binary data
ParallelArray", and back it with an ArrayBuffer.

Shape is inferred up to the struct type Color. So in Stephan's
example, the elemental function to process pixels would no longer
return an array of 3 (4? rgb + alpha?) uint8s, but a |new Color({ r:
..., g: ..., b: ... })|.

When we want to pass it to Canvas, we should be able to grab the
backing ArrayBuffer and make whatever view we need on it.

var passToCanvas = new Uint8ClampedArray(image.buffer);

The question is what should image.buffer do. I'm guessing that it
should eagerly pack its 2Dness into a new ArrayBuffer if it's not
already packed, but if already packed, return the existing
ArrayBuffer. In both cases the buffer should be frozen in some sense.

Is this sane? It's how I would use it, given my absolute zero
experience in graphics programming. :)

On Tue, Oct 9, 2012 at 8:10 PM, Niko Matsakis <ni...@alum.mit.edu> wrote:
>
> Yes, I was wondering about this question as well on the train ride home. In general it seems that multidimensional map, dynamic shape, and user-specified backing stores interact in a rather strange way that is not especially intuitive to me.
>
> For example, if you create an uint8[NxNx4] array and then use map, you will be mapping over an uint8[Nx4] PA elements. We said that the array which results from higher-order operations carries the same backing store type as its origin... but that doesn't quite seem to make sense in this scenario. What happens if you then return uint16[N] array? I know this is the same question you were asking (or I think it was) but I don't quite understand what answer you proposed. :)
>
>
> Niko
>
> Herhut, Stephan A
> October 9, 2012 5:24 PM
>
> Then the question is what that means for multi-dimensional arrays. Should we expose that they use a flattened representation or can one only have 1D ParallelArray objects when using other backing stores than any? If we expose that they have a flattened representation, than we will also need to define how deep we flatten, which in essence makes the current dynamic shape a static property of the layout.
>
>
>
> Consider processing canvas data on a per pixel level. The map operation will want to use uint8clamped but the elemental function will return 4 element arrays. Are these converted to 0 when writing to the backing store or do we add 4 elements to the backing store and make the PA object an array of 4-tuples?
>
>
>
> The beauty of dynamic shape is that we can always revert to a nested backing store yet, after the fact, create an n-dimensional array. If the constructor only defines storage semantics, we can still do that. If it gives the actual backing store, we can’t as an array of type T cannot simply be nested.
>
>
>
> Another wild idea would be to have the programmer specify the exact type of the outcome (the one he wants). For the above example, that would be ArrayType( ArrayType(4, uint8clamped), <outer dim size>). This would then also specify a template for conversion of the results of the elemental function into what the ParallelArray object actually stores. Somewhat awkward to use but maybe good enough for a performance programmer? This should always be optional, though!
>
>
>
> I currently still prefer dynamic shape with user defined backing-store semantics but might convince myself otherwise…
>
>
>
> Stephan
>
>
>
> From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko Matsakis
> Sent: Tuesday, October 09, 2012 4:57 PM
> To: dhe...@mozilla.com; Herhut, Stephan A
> Cc: Shpeisman, Tatiana; dev-tech-js-en...@lists.mozilla.org; Hudson, Rick; Sreeram, Jaswanth; s...@rfrn.org
> Subject: Re: Adding support for different storage formats to ParallelArray (again)
>
>
>
> I am wondering if we can just say that the backing store is an array of some type T where T is provided in the constructor. It could default to "any" to get the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to ParallelArray (again)
> From: David Herman <dhe...@mozilla.com>
> To: "Herhut, Stephan A" <stephan....@intel.com>
> CC: "Shpeisman, Tatiana" <tatiana....@intel.com>,"dev-tech-js-en...@lists.mozilla.org" <dev-tech-js-en...@lists.mozilla.org>,"Hudson, Rick" <rick....@intel.com>,"Sreeram, Jaswanth" <jaswanth...@intel.com>,"Shu-yu Guo (s...@rfrn.org)" <s...@rfrn.org>
>
> Typed arrays are generalized by and superseded by the ES6 binary data types library. This library provides a full generalization of first-class type descriptors, via a little combinator library. So instead of taking typed array constructors, a better API would take type descriptor values representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A" <stephan....@intel.com> wrote:
>
> > Dear all,
> >
> > during our last meeting we discussed adding support for different internal storage formats to ParallelArray but deferred further discussion to get Dave's opinion on it. So, to start the discussion again, here a summary:
> >
> > We have looked at a lot of uses cases for River Trail and many of them include some form of visual computing. In both 2D and 3D cases, one typically wants the end result of the computation to be in a format that can be understood by either WebGL or 2D canvas. This means that one has to copy the contents of a ParallelArray to either a Float32 or Uint8Clamped array before being able to use the computed data. This operation should be as fast as possible, as it typically occurs at every frame and this is on the critical path.
> >
> > To facilitate this, we have thought about adding conversion methods to ParallelArray, like toFloat32 and toUint8Clamped, that return a _frozen_ typed array (and buffer) that contains the values contained in a ParallelArray structure. Details of these operations have still to be worked out but to make this efficient, one really wants the ParallelArray to store the data in the right format from the start.
> >
> > Also, when working with 2d canvas, our code very often contain explicit clamping to model the behavior of the Uint8ClampedArray. Ideally, we would want to have a ParallelArray object that mimics the typed array returned from canvas in that regard.
> >
> > Initially, we had proposed that ParallelArray objects inherit the storage class form the array they were created from. This was rejected by Dave on the grounds that there is no precedent and that he dislikes the fact that the semantics of a ParallelArray object change implicitly based on how they were created. To address this, we have come up with an alternative proposal: The constructor of ParallelArray now gets an optional extra argument that specifies the semantics of the underlying storage by providing one of a predefined set of array constructors:
> >
> > var a = new ParallelArray([1,2,3,4,5], Float32Array);
> >
> > would create a new ParallelArray object that uses Float32 as _storage_ or at least it exhibits the behavior as if it was using Float32 as storage. Likewise, to create a ParallelArray from canvas (and preserve the storage semantics) one would use
> >
> > var b = new ParallelArray(canvas.getContext('2d').getImageData().data, Uint8ClampedArray); // or similar, haven't actually tried this line
> >
> > If the extra argument is omitted, it defaults to Array, i.e., the usual JavaScript storage rules apply. Methods that create a new ParallelArray out of an existing one, like map, inherit the storage semantics from the parent. This is implicit but at least the programmer has at least once explicitly expressed her intent.
> >
> > There are some rough edges we have to smooth out. For instance, what happens if the elemental function returns objects but uses a Float32Array for storage? Is that allowed? Is it ok if those objects are array-like and contain only numbers? Are only ParallelArray objects of the same storage class allowed? How far can we go here?
> >
> > So, if we could agree on the feasibility of this approach, more work needs to be done. I would rather not invest a lot of thought into it just for it to get shot down in the end. That's why I seek input (especially from Dave) early on.
> >
> > Best
> > Stephan
>
> _______________________________________________
> dev-tech-js-engine-rivertrail mailing list
> dev-tech-js-en...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail




--
shu

Niko Matsakis

unread,
Oct 10, 2012, 9:48:23 AM10/10/12
to Shu-yu Guo, Hudson, Rick, Shpeisman, Tatiana, Herhut, Stephan A, dev-tech-js-en...@lists.mozilla.org, dhe...@mozilla.com, Sreeram, Jaswanth
I think it probably makes sense to use a struct as the leaf type, unless
you wanted to apply some function to RGBA components individually. In
any case, whether or not you use uint8 or a struct type as the type of
the leaf isn't really the point I was getting at. Suppose I do this:

var x = new ParallelArray([X, Y], T, function(x, y) { ... }); //
where T is any type
var y = x.map(e => new ParallelArray(e, U));

Now x is backed by T, but what is Y backed by? The only choice seems to
be U. But we said that map takes its backing from the receiver. What
about if the map function returns parallel arrays with consistent shape
but inconsistent backing type?

At first I thought maybe you could "logical shape" idea to "logical
backing store". That is, if all of the parallel arrays are consistent,
you get the type T, otherwise you get "any". The problem, though, is
that we have to build the backing store before we know what data goes
into it! This isn't a problem in the example above, so much, but rather
in the case where map() returns values directly (that is, where the
resulting shape from map() is single-dimensional).

One other note: in the current binary data proposal, if you assign a
value to a location of type T which would require conversion, the result
is an exception. Therefore, if you have a parallel array constructor
like so:

new ParallelArray([22], Uint8, function(i) { return "Hello, world."; })

You would get an exception since "Hello, world" is not a number and
hence not convertible to Uint8. I think this is fine but it's worth
nothing.


Niko

Niko Matsakis

unread,
Oct 10, 2012, 10:37:49 AM10/10/12
to Shu-yu Guo, Hudson, Rick, Shpeisman, Tatiana, Herhut, Stephan A, dev-tech-js-en...@lists.mozilla.org, dhe...@mozilla.com, Sreeram, Jaswanth
To summarize some discussion between Shu and I on IRC:

In some sense, we *can* have a "logical backing store". But the
*physical backing store* that comes out of map() and friends would be
"any", since we must be prepared for any return value. But this seems
to defeat the whole point of this exercise. As I understood it, the
goal was to allow us to allocate something that is appropriately sized
and to give us permission to force the return value into this spot, so
that the user had some guarantees. So if the user says uint8 but
returns 257 we get to store 1 (257 % 256).

We would not have this permission with the "logical backing store"
model, instead we'd have to infer a backing store type that accommodates
all return values without losing any information. If we're back at the
place where we have to use inference to decide when we can allocate a
tighter backing store without losing information, which it sounds like
we are, then we haven't really gotten very far. In that case I might
prefer to just scrap the idea entirely and say "if you would like a
uint8 backing store, coerce your return values to uint8 and cross your
fingers that your engine implementors took the time to optimize that case".

Moreover, given that the binary data spec errors when an inappropriate
value is used, it seems mildly inconsistent that the *constructor* form
yields an error if you specify uint8 but return strings, but *map*
silently changes backing store type to any.


Niko


> Niko Matsakis <mailto:ni...@alum.mit.edu>
> October 10, 2012 6:48 AM
> Shu-yu Guo <mailto:s...@rfrn.org>
> October 10, 2012 3:16 AM
> --
> shu
> Niko Matsakis <mailto:ni...@alum.mit.edu>
> October 9, 2012 8:10 PM
> Yes, I was wondering about this question as well on the train ride
> home. In general it seems that multidimensional map, dynamic shape,
> and user-specified backing stores interact in a rather strange way
> that is not especially intuitive to me.
>
> For example, if you create an uint8[NxNx4] array and then use map, you
> will be mapping over an uint8[Nx4] PA elements. We said that the
> array which results from higher-order operations carries the same
> backing store type as its origin... but that doesn't quite seem to
> make sense in this scenario. What happens if you then return
> uint16[N] array? I know this is the same question you were asking (or
> I think it was) but I don't quite understand what answer you proposed. :)
>
>
> Niko
>
> Herhut, Stephan A <mailto:stephan....@intel.com>
> *From:*Niko Matsakis [mailto:nikoma...@gmail.com] *On Behalf Of
> *Niko Matsakis
> *Sent:* Tuesday, October 09, 2012 4:57 PM
> *To:* dhe...@mozilla.com; Herhut, Stephan A
> *Cc:* Shpeisman, Tatiana;
> dev-tech-js-en...@lists.mozilla.org; Hudson, Rick;
> Sreeram, Jaswanth; s...@rfrn.org
> *Subject:* Re: Adding support for different storage formats to
> ParallelArray (again)
>
> I am wondering if we can just say that the backing store is an array
> of some type T where T is provided in the constructor. It could
> default to "any" to get the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to
> ParallelArray (again)
> From: David Herman <dhe...@mozilla.com <mailto:dhe...@mozilla.com>>
> To: "Herhut, Stephan A" <stephan....@intel.com
> <mailto:stephan....@intel.com>>
> CC: "Shpeisman, Tatiana" <tatiana....@intel.com
> <mailto:dev-tech-js-en...@lists.mozilla.org>>,"Hudson,
> Rick" <rick....@intel.com <mailto:rick....@intel.com>>,"Sreeram,
> Jaswanth" <jaswanth...@intel.com
> <mailto:jaswanth...@intel.com>>,"Shu-yu Guo (s...@rfrn.org
> <mailto:s...@rfrn.org>)" <s...@rfrn.org <mailto:s...@rfrn.org>>
>
> Typed arrays are generalized by and superseded by the ES6 binary data
> types library. This library provides a full generalization of
> first-class type descriptors, via a little combinator library. So
> instead of taking typed array constructors, a better API would take
> type descriptor values representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A"
> <mailto:dev-tech-js-en...@lists.mozilla.org>
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail
>
> Niko Matsakis <mailto:ni...@alum.mit.edu>
> October 9, 2012 4:56 PM

Herhut, Stephan A

unread,
Oct 10, 2012, 1:51:25 PM10/10/12
to Niko Matsakis, Shu-yu Guo, Sreeram, Jaswanth, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, dhe...@mozilla.com, Shpeisman, Tatiana
I see basically two options here:


a) Use the full binary data proposal to describe the entire storage layout

b) Support only the numerical data types to describe the storage layout of leafs but not the results overall structure

If we go for a, which I think has been discussed here but is not what I aimed at initially, I think it is only feasible to have programmers specify the storage layout for every operation. So, for instance,

const Color = StructType({r:uint8, g:uint8, b:uint8, a:uint8});
var a = new ParallelArray([10], function () { return new Color({r:1, g:2, b:3, a:255});}, ArrayType(Color, 10));
var b = a.map(toGrayScale, ArrayType(uint8, 10)); // where toGrayScale converts rgba to a single luminance value

This tells the engine exactly how to represent data and also gives enough information to preallocate. In this setting, it also makes sense to reject values that cannot be converted into the specified storage layout. The drawback is that it is more difficult to use as the programmer has to do the type inference in his head. Then again, if he wants a certain storage layout, he probably has a reason for that. In the end, if binary data takes off and JS developers learn to write code that way, why shouldn’t they learn to write River Trail that way.

Option b is less restrictive and allows for dynamic inference. It won’t support the use of structures or similar but the user will at least be able to define a storage class for the values. The above example would then be

var a = new ParallelArray([10], function () { return [1,2,3,255];}, uint8); // inferred as ArrayType(ArrayType(uint8, 4), 10)
var b = a.map(toGrayScale); // uint8 inherited, structure inferred as ArrayType(uint8, 10)

We only tell the engine what type the leaves should have. The existing rules for dynamic shape etc. apply. Here, we should not fail but convert e.g. “hello” to 0. However, we could fail if the leaf of the dynamic shape is not a primitive. So

var a = new ParallelArray([4], function ( return “hello”;}, uint8) //-> <0,0,0,0>
var b = new ParallelArray([4], function ( return [1,2];}, uint8) //-> <<1,2>,…,<1,2>>

but

var x = new ParallelArray([4], function(i) { return new Array(i); } //-> fail, as the dynamic shape is [4] and leaves are not primitives.

I still think option b is easier to use (less specification overhead) and powerful enough. Option a is more expressive but it feels like writing typed code.

Stephan


From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko Matsakis
Sent: Wednesday, October 10, 2012 7:38 AM
To: Shu-yu Guo
Cc: Herhut, Stephan A; dhe...@mozilla.com; Shpeisman, Tatiana; dev-tech-js-en...@lists.mozilla.org; Hudson, Rick; Sreeram, Jaswanth
Subject: Re: Adding support for different storage formats to ParallelArray (again)

To summarize some discussion between Shu and I on IRC:

In some sense, we *can* have a "logical backing store". But the *physical backing store* that comes out of map() and friends would be "any", since we must be prepared for any return value. But this seems to defeat the whole point of this exercise. As I understood it, the goal was to allow us to allocate something that is appropriately sized and to give us permission to force the return value into this spot, so that the user had some guarantees. So if the user says uint8 but returns 257 we get to store 1 (257 % 256).

We would not have this permission with the "logical backing store" model, instead we'd have to infer a backing store type that accommodates all return values without losing any information. If we're back at the place where we have to use inference to decide when we can allocate a tighter backing store without losing information, which it sounds like we are, then we haven't really gotten very far. In that case I might prefer to just scrap the idea entirely and say "if you would like a uint8 backing store, coerce your return values to uint8 and cross your fingers that your engine implementors took the time to optimize that case".

Moreover, given that the binary data spec errors when an inappropriate value is used, it seems mildly inconsistent that the *constructor* form yields an error if you specify uint8 but return strings, but *map* silently changes backing store type to any.


Niko



[cid:image0...@01CDA6D2.AFC579F0]
Niko Matsakis<mailto:ni...@alum.mit.edu>
October 10, 2012 6:48 AM
I think it probably makes sense to use a struct as the leaf type, unless you wanted to apply some function to RGBA components individually. In any case, whether or not you use uint8 or a struct type as the type of the leaf isn't really the point I was getting at. Suppose I do this:

var x = new ParallelArray([X, Y], T, function(x, y) { ... }); // where T is any type
var y = x.map(e => new ParallelArray(e, U));

Now x is backed by T, but what is Y backed by? The only choice seems to be U. But we said that map takes its backing from the receiver. What about if the map function returns parallel arrays with consistent shape but inconsistent backing type?

At first I thought maybe you could "logical shape" idea to "logical backing store". That is, if all of the parallel arrays are consistent, you get the type T, otherwise you get "any". The problem, though, is that we have to build the backing store before we know what data goes into it! This isn't a problem in the example above, so much, but rather in the case where map() returns values directly (that is, where the resulting shape from map() is single-dimensional).

One other note: in the current binary data proposal, if you assign a value to a location of type T which would require conversion, the result is an exception. Therefore, if you have a parallel array constructor like so:

new ParallelArray([22], Uint8, function(i) { return "Hello, world."; })

You would get an exception since "Hello, world" is not a number and hence not convertible to Uint8. I think this is fine but it's worth nothing.


Niko
[cid:image0...@01CDA6D2.AFC579F0]
[cid:image0...@01CDA6D2.AFC579F0]
Niko Matsakis<mailto:ni...@alum.mit.edu>
October 9, 2012 8:10 PM
Yes, I was wondering about this question as well on the train ride home. In general it seems that multidimensional map, dynamic shape, and user-specified backing stores interact in a rather strange way that is not especially intuitive to me.

For example, if you create an uint8[NxNx4] array and then use map, you will be mapping over an uint8[Nx4] PA elements. We said that the array which results from higher-order operations carries the same backing store type as its origin... but that doesn't quite seem to make sense in this scenario. What happens if you then return uint16[N] array? I know this is the same question you were asking (or I think it was) but I don't quite understand what answer you proposed. :)


Niko
[cid:image0...@01CDA6D2.AFC579F0]
Herhut, Stephan A<mailto:stephan....@intel.com>
October 9, 2012 5:24 PM
Then the question is what that means for multi-dimensional arrays. Should we expose that they use a flattened representation or can one only have 1D ParallelArray objects when using other backing stores than any? If we expose that they have a flattened representation, than we will also need to define how deep we flatten, which in essence makes the current dynamic shape a static property of the layout.

Consider processing canvas data on a per pixel level. The map operation will want to use uint8clamped but the elemental function will return 4 element arrays. Are these converted to 0 when writing to the backing store or do we add 4 elements to the backing store and make the PA object an array of 4-tuples?

The beauty of dynamic shape is that we can always revert to a nested backing store yet, after the fact, create an n-dimensional array. If the constructor only defines storage semantics, we can still do that. If it gives the actual backing store, we can’t as an array of type T cannot simply be nested.

Another wild idea would be to have the programmer specify the exact type of the outcome (the one he wants). For the above example, that would be ArrayType( ArrayType(4, uint8clamped), <outer dim size>). This would then also specify a template for conversion of the results of the elemental function into what the ParallelArray object actually stores. Somewhat awkward to use but maybe good enough for a performance programmer? This should always be optional, though!

I currently still prefer dynamic shape with user defined backing-store semantics but might convince myself otherwise…

Stephan

From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko Matsakis
[cid:image0...@01CDA6D2.AFC579F0]

Niko Matsakis

unread,
Oct 10, 2012, 2:34:50 PM10/10/12
to Herhut, Stephan A, Hudson, Rick, Shu-yu Guo, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, dhe...@mozilla.com, Sreeram, Jaswanth
I have been assuming that we wanted to give the user a "no copy"
guarantee. That is, if the user specified a backing store, they could
convert to a typed array at zero cost. I think this guarantee is
incompatible with shape inference, for the reasons I spelled out
earlier. But perhaps that is not the intended guarantee. Perhaps the
guarantee is simply that the leaves will have been coerced to the
specified leaf type.

In that case, I don't see how it makes any difference what kind of leaf
type is specified, whether it be a struct or a uint8. The bottom line
is that for any invocation |pa.map(f)|, the array |pa| will have some
leaf type |T|. The function |f| can always return parallel arrays. Each
of those parallel arrays will have some leaf type |U| (the various
arrays returned may or may not have the same leaf type, of course). We
have to decide what to do when |T != U|. I think your proposal was that
we coerce |U| to |T|, throwing an error should such a coercion not be
permitted under the rules given in the BinaryData spec. Naturally this
will require that we copy the returned array(s) for which coercion is
needed.

I think the interaction between inferred shape and a specified leaf type
is semi-unclear. Or at least needs to be spelled out very carefully.
For example, I could imagine a semantics being something like this:
Given a call to |pa.map(f)| where |pa| has leaf type |T|:

- compute the inferred shape that is returned by |f|
- the "leaf type" is then the type found at the end of this
inferred shape
- each leaf value must be coercable to |T|

So, to get more specific, imagine that the source array |pa| has
type/shape uint8[2] and `map` returns a uint8[AxBxC] and a
uint16[AxBxC'] is returned. The logical shape is thus [AxB] and the
"leaf type" is ParallelArray. This is an error because ParallelArray is
not coercable to |uint8|.

There are other possibilities too. Like, perhaps what you meant is that
we could coerce the |uint16| that was returned to |uint8|?

For all this complexity we have to ask what this is buying us. To avoid
copies, the implementation must still use inference or speculation, just
as today. The advantage is that the implementation knows it has
permission to coerce numbers that don't fit the desired range and so
forth. So maybe it can be more aggressive in the use of optimized
backing stores, I'm not sure.


Niko

> Herhut, Stephan A <mailto:stephan....@intel.com>
> October 10, 2012 10:51 AM
>
> I see basically two options here:
>
> a)Use the full binary data proposal to describe the entire storage layout
>
> b)Support only the numerical data types to describe the storage layout
> *From:*Niko Matsakis [mailto:nikoma...@gmail.com] *On Behalf Of
> *Niko Matsakis
> *Sent:* Wednesday, October 10, 2012 7:38 AM
> *To:* Shu-yu Guo
> *Cc:* Herhut, Stephan A; dhe...@mozilla.com; Shpeisman, Tatiana;
> dev-tech-js-en...@lists.mozilla.org; Hudson, Rick;
> Sreeram, Jaswanth
> *Subject:* Re: Adding support for different storage formats to
> *Niko Matsakis* <mailto:ni...@alum.mit.edu>
> *Shu-yu Guo* <mailto:s...@rfrn.org>
> *Niko Matsakis* <mailto:ni...@alum.mit.edu>
>
> October 9, 2012 8:10 PM
>
> Yes, I was wondering about this question as well on the train ride
> home. In general it seems that multidimensional map, dynamic shape,
> and user-specified backing stores interact in a rather strange way
> that is not especially intuitive to me.
>
> For example, if you create an uint8[NxNx4] array and then use map, you
> will be mapping over an uint8[Nx4] PA elements. We said that the
> array which results from higher-order operations carries the same
> backing store type as its origin... but that doesn't quite seem to
> make sense in this scenario. What happens if you then return
> uint16[N] array? I know this is the same question you were asking (or
> I think it was) but I don't quite understand what answer you proposed. :)
>
>
> Niko
>
> *Herhut, Stephan A* <mailto:stephan....@intel.com>
> *From:*Niko Matsakis [mailto:nikoma...@gmail.com] *On Behalf Of
> *Niko Matsakis
> *Sent:* Tuesday, October 09, 2012 4:57 PM
> *To:* dhe...@mozilla.com <mailto:dhe...@mozilla.com>; Herhut, Stephan A
> *Cc:* Shpeisman, Tatiana;
> *Subject:* Re: Adding support for different storage formats to
> *Niko Matsakis* <mailto:ni...@alum.mit.edu>
> Niko Matsakis <mailto:ni...@alum.mit.edu>
> October 10, 2012 7:37 AM
> To summarize some discussion between Shu and I on IRC:
>
> In some sense, we *can* have a "logical backing store". But the
> *physical backing store* that comes out of map() and friends would be
> "any", since we must be prepared for any return value. But this seems
> to defeat the whole point of this exercise. As I understood it, the
> goal was to allow us to allocate something that is appropriately sized
> and to give us permission to force the return value into this spot, so
> that the user had some guarantees. So if the user says uint8 but
> returns 257 we get to store 1 (257 % 256).
>
> We would not have this permission with the "logical backing store"
> model, instead we'd have to infer a backing store type that
> accommodates all return values without losing any information. If
> we're back at the place where we have to use inference to decide when
> we can allocate a tighter backing store without losing information,
> which it sounds like we are, then we haven't really gotten very far.
> In that case I might prefer to just scrap the idea entirely and say
> "if you would like a uint8 backing store, coerce your return values to
> uint8 and cross your fingers that your engine implementors took the
> time to optimize that case".
>
> Moreover, given that the binary data spec errors when an inappropriate
> value is used, it seems mildly inconsistent that the *constructor*
> form yields an error if you specify uint8 but return strings, but
> *map* silently changes backing store type to any.
>
>
> Niko
>
>

Shu-yu Guo

unread,
Oct 11, 2012, 6:29:32 AM10/11/12
to Herhut, Stephan A, Hudson, Rick, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Niko Matsakis, dhe...@mozilla.com, Sreeram, Jaswanth
I don't see why the main con of option a) is that it's like writing typed
code. As long as we maintain expressiveness parity with binary data, I
think it will feel natural.

I'm not too invested in which way we go -- but if anything, the discussion
thus far, especially the analogy of inferred backing type with inferred
shape preventing us from getting "no copy" performance guarantee we wanted
in the first place, I'd be happier with less inference.

Shape inference is compelling as the shape of iteration might very well be
different from the shape of storage. I suppose there is a parallel here
with types also, but I don't find that use case compelling, or at least
difficult to reason about.

On Wed, Oct 10, 2012 at 10:51 AM, Herhut, Stephan A <
stephan....@intel.com> wrote:

> I see basically two options here:****
>
> ** **
>
> **a) **Use the full binary data proposal to describe the entire
> storage layout****
>
> **b) **Support only the numerical data types to describe the storage
> layout of leafs but not the results overall structure****
>
> ** **
>
> If we go for a, which I think has been discussed here but is not what I
> aimed at initially, I think it is only feasible to have programmers specify
> the storage layout for every operation. So, for instance,****
>
> ** **
>
> const Color = StructType({r:uint8, g:uint8, b:uint8, a:uint8});****
>
> var a = new ParallelArray([10], function () { return new Color({r:1, g:2,
> b:3, a:255});}, ArrayType(Color, 10));****
>
> var b = a.map(toGrayScale, ArrayType(uint8, 10)); // where toGrayScale
> converts rgba to a single luminance value****
>
> ** **
>
> This tells the engine exactly how to represent data and also gives enough
> information to preallocate. In this setting, it also makes sense to reject
> values that cannot be converted into the specified storage layout. The
> drawback is that it is more difficult to use as the programmer has to do
> the type inference in his head. Then again, if he wants a certain storage
> layout, he probably has a reason for that. In the end, if binary data takes
> off and JS developers learn to write code that way, why shouldn’t they
> learn to write River Trail that way.****
>
> ** **
>
> Option b is less restrictive and allows for dynamic inference. It won’t
> support the use of structures or similar but the user will at least be able
> to define a storage class for the values. The above example would then be*
> ***
>
> ** **
>
> var a = new ParallelArray([10], function () { return [1,2,3,255];},
> uint8); // inferred as ArrayType(ArrayType(uint8, 4), 10)****
>
> var b = a.map(toGrayScale); // uint8 inherited, structure inferred as
> ArrayType(uint8, 10)****
>
> ** **
>
> We only tell the engine what type the leaves should have. The existing
> rules for dynamic shape etc. apply. Here, we should not fail but convert
> e.g. “hello” to 0. However, we could fail if the leaf of the dynamic shape
> is not a primitive. So ****
>
> ** **
>
> var a = new ParallelArray([4], function ( return “hello”;}, uint8) //->
> <0,0,0,0>****
>
> var b = new ParallelArray([4], function ( return [1,2];}, uint8) //->
> <<1,2>,…,<1,2>>****
>
> ** **
>
> but****
>
> ** **
>
> var x = new ParallelArray([4], function(i) { return new Array(i); } //->
> fail, as the dynamic shape is [4] and leaves are not primitives.****
>
> ** **
>
> I still think option b is easier to use (less specification overhead) and
> powerful enough. Option a is more expressive but it feels like writing
> typed code. ****
>
> ** **
>
> Stephan****
>
> ** **
>
> ** **
>
> *From:* Niko Matsakis [mailto:nikoma...@gmail.com] *On Behalf Of *Niko
> Matsakis
> *Sent:* Wednesday, October 10, 2012 7:38 AM
> *To:* Shu-yu Guo
> *Cc:* Herhut, Stephan A; dhe...@mozilla.com; Shpeisman, Tatiana;
> dev-tech-js-en...@lists.mozilla.org; Hudson, Rick; Sreeram,
> Jaswanth
>
> *Subject:* Re: Adding support for different storage formats to
> ParallelArray (again)****
>
> ** **
>
> To summarize some discussion between Shu and I on IRC:
>
> In some sense, we *can* have a "logical backing store". But the *physical
> backing store* that comes out of map() and friends would be "any", since we
> must be prepared for any return value. But this seems to defeat the whole
> point of this exercise. As I understood it, the goal was to allow us to
> allocate something that is appropriately sized and to give us permission to
> force the return value into this spot, so that the user had some
> guarantees. So if the user says uint8 but returns 257 we get to store 1
> (257 % 256).
>
> We would not have this permission with the "logical backing store" model,
> instead we'd have to infer a backing store type that accommodates all
> return values without losing any information. If we're back at the place
> where we have to use inference to decide when we can allocate a tighter
> backing store without losing information, which it sounds like we are, then
> we haven't really gotten very far. In that case I might prefer to just
> scrap the idea entirely and say "if you would like a uint8 backing store,
> coerce your return values to uint8 and cross your fingers that your engine
> implementors took the time to optimize that case".
>
> Moreover, given that the binary data spec errors when an inappropriate
> value is used, it seems mildly inconsistent that the *constructor* form
> yields an error if you specify uint8 but return strings, but *map* silently
> changes backing store type to any.
>
>
> Niko
>
>
>
> ****
>
> ****
>
> *Niko Matsakis* <ni...@alum.mit.edu>****
>
> October 10, 2012 6:48 AM****
> Niko ****
>
> ****
>
> *Shu-yu Guo* <s...@rfrn.org>****
>
> October 10, 2012 3:16 AM****
> experience in graphics programming. :)****
>
>
>
>
>
> --
> shu****
>
> ****
>
> *Niko Matsakis* <ni...@alum.mit.edu>****
>
> October 9, 2012 8:10 PM****
>
> Yes, I was wondering about this question as well on the train ride home.
> In general it seems that multidimensional map, dynamic shape, and
> user-specified backing stores interact in a rather strange way that is not
> especially intuitive to me.
>
> For example, if you create an uint8[NxNx4] array and then use map, you
> will be mapping over an uint8[Nx4] PA elements. We said that the array
> which results from higher-order operations carries the same backing store
> type as its origin... but that doesn't quite seem to make sense in this
> scenario. What happens if you then return uint16[N] array? I know this is
> the same question you were asking (or I think it was) but I don't quite
> understand what answer you proposed. :)
>
>
> Niko****
>
> ****
>
> *Herhut, Stephan A* <stephan....@intel.com>****
>
> October 9, 2012 5:24 PM****
>
> Then the question is what that means for multi-dimensional arrays. Should
> we expose that they use a flattened representation or can one only have 1D
> ParallelArray objects when using other backing stores than any? If we
> expose that they have a flattened representation, than we will also need to
> define how deep we flatten, which in essence makes the current dynamic
> shape a static property of the layout. ****
>
> ****
>
> Consider processing canvas data on a per pixel level. The map operation
> will want to use uint8clamped but the elemental function will return 4
> element arrays. Are these converted to 0 when writing to the backing store
> or do we add 4 elements to the backing store and make the PA object an
> array of 4-tuples? ****
>
> ****
>
> The beauty of dynamic shape is that we can always revert to a nested
> backing store yet, after the fact, create an n-dimensional array. If the
> constructor only defines storage semantics, we can still do that. If it
> gives the actual backing store, we can’t as an array of type T cannot
> simply be nested.****
>
> ****
>
> Another wild idea would be to have the programmer specify the exact type
> of the outcome (the one he wants). For the above example, that would be
> ArrayType( ArrayType(4, uint8clamped), <outer dim size>). This would then
> also specify a template for conversion of the results of the elemental
> function into what the ParallelArray object actually stores. Somewhat
> awkward to use but maybe good enough for a performance programmer? This
> should always be optional, though!****
>
> ****
>
> I currently still prefer dynamic shape with user defined backing-store
> semantics but might convince myself otherwise…****
>
> ****
>
> Stephan****
>
> ****
>
> *From:* Niko Matsakis [mailto:nikoma...@gmail.com<nikoma...@gmail.com>]
> *On Behalf Of *Niko Matsakis
> *Sent:* Tuesday, October 09, 2012 4:57 PM
> *To:* dhe...@mozilla.com; Herhut, Stephan A
> *Cc:* Shpeisman, Tatiana; dev-tech-js-en...@lists.mozilla.org;
> Hudson, Rick; Sreeram, Jaswanth; s...@rfrn.org
> *Subject:* Re: Adding support for different storage formats to
> ParallelArray (again)****
>
> ****
>
> I am wondering if we can just say that the backing store is an array of
> some type T where T is provided in the constructor. It could default to
> "any" to get the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
> From: David Herman <dhe...@mozilla.com>
> To: "Herhut, Stephan A" <stephan....@intel.com>
> CC: "Shpeisman, Tatiana" <tatiana....@intel.com>,"
> dev-tech-js-en...@lists.mozilla.org" <
> dev-tech-js-en...@lists.mozilla.org>,"Hudson, Rick" <
> rick....@intel.com>,"Sreeram, Jaswanth" <jaswanth...@intel.com>,"Shu-yu
> Guo (s...@rfrn.org)" <s...@rfrn.org>
>
>
> ****
>
> Typed arrays are generalized by and superseded by the ES6 binary data
> types library. This library provides a full generalization of first-class
> type descriptors, via a little combinator library. So instead of taking
> typed array constructors, a better API would take type descriptor values
> representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A" <
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail****
>
> ****
>
> *Niko Matsakis* <ni...@alum.mit.edu>****
>
> October 9, 2012 4:56 PM****
>
> I am wondering if we can just say that the backing store is an array of
> some type T where T is provided in the constructor. It could default to
> "any" to get the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
> From: David Herman <dhe...@mozilla.com> <dhe...@mozilla.com>
> To: "Herhut, Stephan A" <stephan....@intel.com><stephan....@intel.com>
> CC: "Shpeisman, Tatiana" <tatiana....@intel.com><tatiana....@intel.com>
> ,"dev-tech-js-en...@lists.mozilla.org"<dev-tech-js-en...@lists.mozilla.org>
> <dev-tech-js-en...@lists.mozilla.org><dev-tech-js-en...@lists.mozilla.org>,"Hudson,
> Rick" <rick....@intel.com> <rick....@intel.com>,"Sreeram, Jaswanth"
> <jaswanth...@intel.com> <jaswanth...@intel.com>,"Shu-yu Guo (
> s...@rfrn.org)" <s...@rfrn.org> <s...@rfrn.org>
>
> ****
>
> Typed arrays are generalized by and superseded by the ES6 binary data
> types library. This library provides a full generalization of first-class
> type descriptors, via a little combinator library. So instead of taking
> typed array constructors, a better API would take type descriptor values
> representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A"
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail****
>



--
shu

Herhut, Stephan A

unread,
Oct 23, 2012, 12:58:36 PM10/23/12
to Herhut, Stephan A, Niko Matsakis, Shu-yu Guo, Shpeisman, Tatiana, dev-tech-js-en...@lists.mozilla.org, Hudson, Rick, dhe...@mozilla.com, Sreeram, Jaswanth
Dear all,

Just to summarize our discussion during out last meeting:

Nomenclature: I will use the concept of cell and frame shape in the following discussion. Here is an informal definition: The frame shape describes the structure that is created by the array operation. For example, in a comprehension constructor new ParallelArray([2,4], f) the frame shape would be [2,4]. This is essentially the scaffolding that the array operation creates to hold the values produced by the elemental function. I will use cell shape to refer to the shape of those values, i.e., the values produced by the elemental function. Thus, the cell shape is a property of the elemental function.

The key idea in the binary data proposal for ParallelArray then is to allow the programmer to provide type descriptions for the cell shape as an optional argument _to every array operation_. There is no implicit propagation nor inference. Also, the annotations are designed such that they are fully optional, i.e., the programmer can simply erase them and the code will still work, but might produce slightly different results due to lack of data coercions. Type annotations may only contain fixed size types, i.e., we do not allow ArrayType with dynamic length.

An example would be:
new ParallelArray([2,3], function (j,k) { return [j,k]; }, ArrayType(uint8, 2)}

Here, the resulting ParallelArray object would have a backing store of type ArrayType(ArrayType(ArrayType(uint8, 2), 3), 2). On write, we use the existing coercion rules to convert the result of the elemental function into the annotated cell type. If that fails (not even sure it can, depends on the final binary data spec), we throw the same exception that sequential code would throw in such situations.

Interaction with dynamic shape: In the above example, the result would have a dynamic shape of [2,3], as the leaf elements are not ParallelArray structures. If the programmer instead writes

new ParallelArray([2,3], function (j,k) { return new ParallelArray([j,k], ArrayType(uint8, 2)); }, ArrayType(uint8, 2)}

the dynamic shape will be [2,3,2]. Note the explicit type annotation for the inner ParallelArray. The backing store in this case would be the same as before.

Shu's criticism was that in the above case, the type annotations can no longer be understood as coercions on write, as the ParallelArray object returned by the elemental function is not actually coerced into a ArrayType(uint8, 2) but remains a ParallelArray object with a matching backing store. I have not come up with a solution to this. In the end, if we want to keep type annotations optional and therefore orthogonal to the rest of ParallelArray's workings (which I am strongly in favor of), we have to keep the same dynamic shape semantics as in the untyped case.

To allow programmers to make direct use of the binary data store, we will add a new method to ParallelArray objects

asBinary()

that returns a frozen view of the binary data backing store. If no type annotations were supplied, this view will be a nesting of ArrayType up to the length of the dynamic shape with Any as the element type. If called on data that was not created using type annotations, it is perfectly valid to copy the data to create the requested view.

I hope this covers the results of our discussion.

Stephan

> -----Original Message-----
> From: dev-tech-js-engine-rivertrail-
> bounces+stephan.a.herhut=inte...@lists.mozilla.org [mailto:dev-tech-js-
> engine-rivertrail-bounces+stephan.a.herhut=inte...@lists.mozilla.org] On
> Behalf Of Herhut, Stephan A
> Sent: Wednesday, October 10, 2012 10:51 AM
> To: Niko Matsakis; Shu-yu Guo
> Cc: Sreeram, Jaswanth; dev-tech-js-en...@lists.mozilla.org;
> Hudson, Rick; dhe...@mozilla.com; Shpeisman, Tatiana
> Subject: RE: Adding support for different storage formats to ParallelArray
> (again)
>
> I see basically two options here:
>
>
> a) Use the full binary data proposal to describe the entire storage layout
>
> b) Support only the numerical data types to describe the storage layout of
> leafs but not the results overall structure
>
> If we go for a, which I think has been discussed here but is not what I aimed
> at initially, I think it is only feasible to have programmers specify the storage
> layout for every operation. So, for instance,
>
> const Color = StructType({r:uint8, g:uint8, b:uint8, a:uint8}); var a = new
> ParallelArray([10], function () { return new Color({r:1, g:2, b:3, a:255});},
> ArrayType(Color, 10)); var b = a.map(toGrayScale, ArrayType(uint8, 10)); //
> where toGrayScale converts rgba to a single luminance value
>
> This tells the engine exactly how to represent data and also gives enough
> information to preallocate. In this setting, it also makes sense to reject values
> that cannot be converted into the specified storage layout. The drawback is
> that it is more difficult to use as the programmer has to do the type inference
> in his head. Then again, if he wants a certain storage layout, he probably has
> a reason for that. In the end, if binary data takes off and JS developers learn
> to write code that way, why shouldn’t they learn to write River Trail that way.
>
> Option b is less restrictive and allows for dynamic inference. It won’t support
> the use of structures or similar but the user will at least be able to define a
> storage class for the values. The above example would then be
>
> var a = new ParallelArray([10], function () { return [1,2,3,255];}, uint8); //
> inferred as ArrayType(ArrayType(uint8, 4), 10) var b = a.map(toGrayScale); //
> uint8 inherited, structure inferred as ArrayType(uint8, 10)
>
> We only tell the engine what type the leaves should have. The existing rules
> for dynamic shape etc. apply. Here, we should not fail but convert e.g.
> “hello” to 0. However, we could fail if the leaf of the dynamic shape is not a
> primitive. So
>
> var a = new ParallelArray([4], function ( return “hello”;}, uint8) //-> <0,0,0,0>
> var b = new ParallelArray([4], function ( return [1,2];}, uint8) //->
> <<1,2>,…,<1,2>>
>
> but
>
> var x = new ParallelArray([4], function(i) { return new Array(i); } //-> fail, as
> the dynamic shape is [4] and leaves are not primitives.
>
> I still think option b is easier to use (less specification overhead) and powerful
> enough. Option a is more expressive but it feels like writing typed code.
>
> Stephan
>
>
> From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko
> Matsakis
> Sent: Wednesday, October 10, 2012 7:38 AM
> To: Shu-yu Guo
> Cc: Herhut, Stephan A; dhe...@mozilla.com; Shpeisman, Tatiana; dev-
> tech-js-engi...@lists.mozilla.org; Hudson, Rick; Sreeram, Jaswanth
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
>
> To summarize some discussion between Shu and I on IRC:
>
> In some sense, we *can* have a "logical backing store". But the *physical
> backing store* that comes out of map() and friends would be "any", since we
> must be prepared for any return value. But this seems to defeat the whole
> point of this exercise. As I understood it, the goal was to allow us to allocate
> something that is appropriately sized and to give us permission to force the
> return value into this spot, so that the user had some guarantees. So if the
> user says uint8 but returns 257 we get to store 1 (257 % 256).
>
> We would not have this permission with the "logical backing store" model,
> instead we'd have to infer a backing store type that accommodates all return
> values without losing any information. If we're back at the place where we
> have to use inference to decide when we can allocate a tighter backing store
> without losing information, which it sounds like we are, then we haven't
> really gotten very far. In that case I might prefer to just scrap the idea
> entirely and say "if you would like a uint8 backing store, coerce your return
> values to uint8 and cross your fingers that your engine implementors took
> the time to optimize that case".
>
> Moreover, given that the binary data spec errors when an inappropriate
> value is used, it seems mildly inconsistent that the *constructor* form yields
> an error if you specify uint8 but return strings, but *map* silently changes
> backing store type to any.
>
>
> Niko
>
>
>
> [cid:image0...@01CDA6D2.AFC579F0]
> Niko Matsakis<mailto:ni...@alum.mit.edu>
> October 10, 2012 6:48 AM
> [cid:image0...@01CDA6D2.AFC579F0]
> Shu-yu Guo<mailto:s...@rfrn.org>
> October 10, 2012 3:16 AM
> graphics programming. :)
>
>
>
>
> --
> shu
> [cid:image0...@01CDA6D2.AFC579F0]
> Niko Matsakis<mailto:ni...@alum.mit.edu>
> October 9, 2012 8:10 PM
> Yes, I was wondering about this question as well on the train ride home. In
> general it seems that multidimensional map, dynamic shape, and user-
> specified backing stores interact in a rather strange way that is not especially
> intuitive to me.
>
> For example, if you create an uint8[NxNx4] array and then use map, you will
> be mapping over an uint8[Nx4] PA elements. We said that the array which
> results from higher-order operations carries the same backing store type as
> its origin... but that doesn't quite seem to make sense in this scenario. What
> happens if you then return uint16[N] array? I know this is the same question
> you were asking (or I think it was) but I don't quite understand what answer
> you proposed. :)
>
>
> Niko
> [cid:image0...@01CDA6D2.AFC579F0]
> Herhut, Stephan A<mailto:stephan....@intel.com>
> October 9, 2012 5:24 PM
> Then the question is what that means for multi-dimensional arrays. Should
> we expose that they use a flattened representation or can one only have 1D
> ParallelArray objects when using other backing stores than any? If we expose
> that they have a flattened representation, than we will also need to define
> how deep we flatten, which in essence makes the current dynamic shape a
> static property of the layout.
>
> Consider processing canvas data on a per pixel level. The map operation will
> want to use uint8clamped but the elemental function will return 4 element
> arrays. Are these converted to 0 when writing to the backing store or do we
> add 4 elements to the backing store and make the PA object an array of 4-
> tuples?
>
> The beauty of dynamic shape is that we can always revert to a nested
> backing store yet, after the fact, create an n-dimensional array. If the
> constructor only defines storage semantics, we can still do that. If it gives the
> actual backing store, we can’t as an array of type T cannot simply be nested.
>
> Another wild idea would be to have the programmer specify the exact type
> of the outcome (the one he wants). For the above example, that would be
> ArrayType( ArrayType(4, uint8clamped), <outer dim size>). This would then
> also specify a template for conversion of the results of the elemental
> function into what the ParallelArray object actually stores. Somewhat
> awkward to use but maybe good enough for a performance programmer?
> This should always be optional, though!
>
> I currently still prefer dynamic shape with user defined backing-store
> semantics but might convince myself otherwise…
>
> Stephan
>
> From: Niko Matsakis [mailto:nikoma...@gmail.com] On Behalf Of Niko
> Matsakis
> Sent: Tuesday, October 09, 2012 4:57 PM
> To: dhe...@mozilla.com<mailto:dhe...@mozilla.com>; Herhut, Stephan
> A
> Cc: Shpeisman, Tatiana; dev-tech-js-engine-
> river...@lists.mozilla.org<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>; Hudson, Rick; Sreeram, Jaswanth;
> s...@rfrn.org<mailto:s...@rfrn.org>
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
>
> I am wondering if we can just say that the backing store is an array of some
> type T where T is provided in the constructor. It could default to "any" to get
> the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
> From: David Herman
> <dhe...@mozilla.com<mailto:dhe...@mozilla.com>>
> To: "Herhut, Stephan A"
> <stephan....@intel.com<mailto:stephan....@intel.com>>
> CC: "Shpeisman, Tatiana"
> <tatiana....@intel.com<mailto:tatiana....@intel.com>>,"de
> v-tech-js-eng...@lists.mozilla.org<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>" <dev-tech-js-engine-
> river...@lists.mozilla.org<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>>,"Hudson, Rick"
> <rick....@intel.com<mailto:rick....@intel.com>>,"Sreeram,
> Jaswanth"
> <jaswanth...@intel.com<mailto:jaswanth...@intel.com>>,"Shu
> -yu Guo (s...@rfrn.org<mailto:s...@rfrn.org>)"
> <s...@rfrn.org<mailto:s...@rfrn.org>>
>
>
> Typed arrays are generalized by and superseded by the ES6 binary data types
> library. This library provides a full generalization of first-class type descriptors,
> via a little combinator library. So instead of taking typed array constructors, a
> better API would take type descriptor values representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A"
> <stephan....@intel.com<mailto:stephan....@intel.com>>
> dev-tech-js-engine-rivertrail mailing list dev-tech-js-engine-
> river...@lists.mozilla.org<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail
> [cid:image0...@01CDA6D2.AFC579F0]
> Niko Matsakis<mailto:ni...@alum.mit.edu>
> October 9, 2012 4:56 PM
> I am wondering if we can just say that the backing store is an array of some
> type T where T is provided in the constructor. It could default to "any" to get
> the current behavior, essentially.
>
>
> Niko
>
> (Sent from a cramped, unreliable keyboard)
>
>
> -------- Original message --------
> Subject: Re: Adding support for different storage formats to ParallelArray
> (again)
> From: David Herman
> <dhe...@mozilla.com><mailto:dhe...@mozilla.com>
> To: "Herhut, Stephan A"
> <stephan....@intel.com><mailto:stephan....@intel.com>
> CC: "Shpeisman, Tatiana"
> <tatiana....@intel.com><mailto:tatiana....@intel.com>,"de
> v-tech-js-eng...@lists.mozilla.org"<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org> <dev-tech-js-engine-
> river...@lists.mozilla.org><mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>,"Hudson, Rick"
> <rick....@intel.com><mailto:rick....@intel.com>,"Sreeram,
> Jaswanth"
> <jaswanth...@intel.com><mailto:jaswanth...@intel.com>,"Shu
> -yu Guo (s...@rfrn.org<mailto:s...@rfrn.org>)"
> <s...@rfrn.org><mailto:s...@rfrn.org>
>
> Typed arrays are generalized by and superseded by the ES6 binary data types
> library. This library provides a full generalization of first-class type descriptors,
> via a little combinator library. So instead of taking typed array constructors, a
> better API would take type descriptor values representing the element type:
>
> var paContainingUint8s = new ParallelArray(/* whatever */, uint8);
> var paContainingFloat32s = new ParallelArray(/* whatever */, float32);
>
> Dave
>
> On Oct 9, 2012, at 1:55 PM, "Herhut, Stephan A"
> <stephan....@intel.com><mailto:stephan....@intel.com>
> dev-tech-js-engine-rivertrail mailing list dev-tech-js-engine-
> river...@lists.mozilla.org<mailto:dev-tech-js-engine-
> river...@lists.mozilla.org>
> https://lists.mozilla.org/listinfo/dev-tech-js-engine-rivertrail
0 new messages