Multi-dimensional array, part 2

136 views
Skip to first unread message

Daryle Walker

unread,
Oct 4, 2013, 3:49:43 PM10/4/13
to std-pr...@isocpp.org
I figured out how to activate the web page feature of the GitHub repository for my array proposal.  It is at http://ctmacuser.github.io/multiarray-iso-proposal.  It should lead to the copy of the proposal copied there from the regular branch.
 
The next submission deadline is in a week, so I need a final check over.  Are there any mistakes or omissions?  Is anything unclear?
 
Daryle W.
 

Magnus Fromreide

unread,
Oct 4, 2013, 4:32:30 PM10/4/13
to std-pr...@isocpp.org
I think you need to put a lot more effort into the motivation section.

Explain why this is something that is needed.



In "Design Decisions" you are waving your hands trying to claim that
operator[]() calls are inherently slower than calls to the built in
operator[].
Do you have any data to support this claim?
Preferably data that support that it is infeasible to achieve the same
performance.
This is important as you are disabling user defined indexing for data
stored in your class based on this.



I am strongly opposed to misappropriating the function call operator for
indexing.

std::array<int, 17> anArray;
std::function<int(int)> fun(anArray);

One can have lots of fun with that function call operator.

/MF


Róbert Dávid

unread,
Oct 4, 2013, 7:02:35 PM10/4/13
to std-pr...@isocpp.org


2013. október 4., péntek 22:32:30 UTC+2 időpontban Magnus Fromreide a következőt írta:
On Fri, 2013-10-04 at 12:49 -0700, Daryle Walker wrote:
> I figured out how to activate the web page feature of the GitHub
> repository for my array proposal.  It is at
> http://ctmacuser.github.io/multiarray-iso-proposal.  It should lead to
> the copy of the proposal copied there from the regular branch.
>  
> The next submission deadline is in a week, so I need a final check
> over.  Are there any mistakes or omissions?  Is anything unclear?

I think you need to put a lot more effort into the motivation section.

Explain why this is something that is needed.



In "Design Decisions" you are waving your hands trying to claim that
operator[]() calls are inherently slower than calls to the built in
operator[].
Do you have any data to support this claim?
Preferably data that support that it is infeasible to achieve the same
performance.
This is important as you are disabling user defined indexing for data
stored in your class based on this.
I guess he referred to the std::array<X,n> implementation may have extra data besides the element, so indexing std::array<std::array<X,n>,m> with (a,b) "coordinates" isn't the same as indexing a X[n*m] with a*n+b.
 



I am strongly opposed to misappropriating the function call operator for
indexing.

std::array<int, 17> anArray;
std::function<int(int)> fun(anArray);

One can have lots of fun with that function call operator.
A multidimensional array needs to have multicoordinate indexing, otherwise it is useless.

Before C++11, it was done using the function call operator, because it was totally awkward to use operator[] (call syntax would been something like my_array_obj[array_indexes(1,2,3)];). I haven't seen any multiarray/linalg matrix library that started from zero using C++11, that's why all of them still use operator(). But this is no reason to have this new proposed class to use it.

As for operator[], the 'reference' implementation allows this, without any error:
array_md<int, 2, 3> my2dArray;
my2dArray[{1,2,3,4}] = 42;
Instead of initializer_list (as you cannot restrict the accepted init list size), I think it would be better to use std::array<size_type, dimensionality> (or something equivalent) as the parameter for operator[].

Also, I find linear addresses of a multidim array error-prone. Instead, I'd give a reshape() function that would allow us to reshape the contiguous memory layout to another dimension (internally, that can be done with a static_cast), then use it to reshape to an 1D array, then index it - with the linear index. The idea is to do something like Matlab's reshape (it does have linear indexing - and I get burned by that every now and then).

Regards, Robert

Daryle Walker

unread,
Oct 5, 2013, 6:46:42 AM10/5/13
to std-pr...@isocpp.org
On Friday, October 4, 2013 4:32:30 PM UTC-4, Magnus Fromreide wrote:
On Fri, 2013-10-04 at 12:49 -0700, Daryle Walker wrote:
> I figured out how to activate the web page feature of the GitHub
> repository for my array proposal.  It is at
> http://ctmacuser.github.io/multiarray-iso-proposal.  It should lead to
> the copy of the proposal copied there from the regular branch.
>  
> The next submission deadline is in a week, so I need a final check
> over.  Are there any mistakes or omissions?  Is anything unclear?

I think you need to put a lot more effort into the motivation section.

Explain why this is something that is needed.
 
  1. The current std::array wraps a built-in array into a Regular type.  Said wrapper also acts as a Standard (sequence) container.
  2. Why use multi-dimensional arrays?  I don't know; that's up to your choices for the application.
  3. The new std::array puts (1) and (2) together.  It wraps a multi-dimensional built-in array as a Regular type.  Also, the element type is what you originally specified, not some intermediate sub-array type.
  4. Declaring and using a built-in array involves a (series of) bracketed number.  This isn't compatible with C++11's variadic stuff.  The new std::array's extent specification and dereference calls both use comma-separated lists.
(See the reply to Robert David for your other questions.)
 
Daryle W.

 

Daryle Walker

unread,
Oct 5, 2013, 7:58:15 AM10/5/13
to std-pr...@isocpp.org
On Friday, October 4, 2013 7:02:35 PM UTC-4, Róbert Dávid wrote:
2013. október 4., péntek 22:32:30 UTC+2 időpontban Magnus Fromreide a következőt írta:
On Fri, 2013-10-04 at 12:49 -0700, Daryle Walker wrote:
> I figured out how to activate the web page feature of the GitHub
> repository for my array proposal.  It is at
> http://ctmacuser.github.io/multiarray-iso-proposal.  It should lead to
> the copy of the proposal copied there from the regular branch.
>  
> The next submission deadline is in a week, so I need a final check
> over.  Are there any mistakes or omissions?  Is anything unclear?
[SNIP] 
In "Design Decisions" you are waving your hands trying to claim that
operator[]() calls are inherently slower than calls to the built in
operator[].
Do you have any data to support this claim?
Preferably data that support that it is infeasible to achieve the same
performance.
This is important as you are disabling user defined indexing for data
stored in your class based on this.
I guess he referred to the std::array<X,n> implementation may have extra data besides the element, so indexing std::array<std::array<X,n>,m> with (a,b) "coordinates" isn't the same as indexing a X[n*m] with a*n+b. 
 
It doesn't have to be extra data; extra padding alone can be a disadvantage.  But I was talking about calling a member operator of a class versus a built-in operator, although inline-based optimizations can remove the difference.  But why hope for optimization when I can avoid the need under some circumstances?
 
The new array doesn't support user-defined indexing because built-in arrays don't.  It's not an every-feature class.  And how would the user-defined indexing be submitted to the class?  Where would it be stored?  How could either not break the std::array interface?  Why should users who never need index order scrambling have to pay for it?
 
Someone from the previous discussion wanted configurable indexing too.  I am coming up with an adaptor class template for this, like stack or queue.  There's no size-changing methods, just multi-coordinate access and storing the extents and index-priorities.  (The container is protected, so you can make a derived class if you need to add size-changing methods.)
 
I am strongly opposed to misappropriating the function call operator for
indexing.

std::array<int, 17> anArray;
std::function<int(int)> fun(anArray);

One can have lots of fun with that function call operator.
A multidimensional array needs to have multicoordinate indexing, otherwise it is useless.

Before C++11, it was done using the function call operator, because it was totally awkward to use operator[] (call syntax would been something like my_array_obj[array_indexes(1,2,3)];). I haven't seen any multiarray/linalg matrix library that started from zero using C++11, that's why all of them still use operator(). But this is no reason to have this new proposed class to use it.
 
operator[] always only takes one argument, so multi-coordinate indexing would need to submit a tuple object.  Or use operator() instead (or along with).  C++11 adds to the syntax of operator[] to take a braced-list.  You have to create an operator[] overload that takes a type that's compatible with a braced-list.  I chose std::initializer_list since that is what the example used.
 
Can you clarify what "any multiarray/linalg matrix library that started from zero" means?
 
As for operator[], the 'reference' implementation allows this, without any error:
array_md<int, 2, 3> my2dArray;
my2dArray[{1,2,3,4}] = 42;
Instead of initializer_list (as you cannot restrict the accepted init list size), I think it would be better to use std::array<size_type, dimensionality> (or something equivalent) as the parameter for operator[].
 
For the current containers, at() throws when given a bad index, while operator[] tries to use it even when it causes Undefined Behavior.  My operator[] does the same thing; it forfeits checking the list length or the individual values, even if it causes a crash.  Use at() again to check those and throw if they're found wanting, but forfeit speed due to those checks.
 
Someone from the previous discussion brought up using std::array<std::size_t, extent_count> as the index type for tuple-enabled operator[].  He also mentioned the down side.  We get an error if too many items are in the initializer, but not if there are too few.  In the latter case, the missing elements will be zero instead of causing an error.  There's also the cosmetic issue of a double brace pair is sometimes needed for a std::array "literal" because brace-elision isn't always accepted.  (And it's weird to use std::array as a helper type since this code will be redefining array!)
 
Also, I find linear addresses of a multidim array error-prone. Instead, I'd give a reshape() function that would allow us to reshape the contiguous memory layout to another dimension (internally, that can be done with a static_cast), then use it to reshape to an 1D array, then index it - with the linear index. The idea is to do something like Matlab's reshape (it does have linear indexing - and I get burned by that every now and then).
 
For "linear addresses of a multidim array," are you talking about the begin/end functions?  Without those, you can't use an array with Standard algorithms or range-for, or meet the Container requirements.  Your kind of reshaping would require a pointer or reference reinterpret_cast, especially since built-in arrays don't support static_cast at all.  (A static_cast would be more appropriate for changing the element type, not the extent layout.)  And what would happen if the source and destination type have different total element counts?  Or would such conversions be blocked?  A copy seems better all around, and I can change element type and/or count at the same time.  The linear order lets me define how elements transfer locations when the source and destination extent tuples are not equal.  (A specification of randomly scrambled order wouldn't be too useful.)
 
Besides one-dimensional arrays, you can't really get linear access outside of explicitly messing around with a copy of the return value from begin or data.
 
Daryle W.
 

morw...@gmail.com

unread,
Oct 5, 2013, 2:17:33 PM10/5/13
to std-pr...@isocpp.org
It's stillpossible to create a new multi-index standard type, let's say std::index<>, which would more or less be like this:
I tried to use it the way you wanted to. I works fine and static_assert is triggered when needed, allowing a meaningful
error message.

http://coliru.stacked-crooked.com/a/c678edd816def10d

Róbert Dávid

unread,
Oct 6, 2013, 6:13:55 PM10/6/13
to std-pr...@isocpp.org


2013. október 5., szombat 13:58:15 UTC+2 időpontban Daryle Walker a következőt írta:
 
Can you clarify what "any multiarray/linalg matrix library that started from zero" means?
Just a library that does not have any legacy features from the age before C++11, don't think too much into this :)

For the current containers, at() throws when given a bad index, while operator[] tries to use it even when it causes Undefined Behavior.  My operator[] does the same thing; it forfeits checking the list length or the individual values, even if it causes a crash.  Use at() again to check those and throw if they're found wanting, but forfeit speed due to those checks.
Your reference implementation adds a run-time check not just for out-of-range indexing, but also for the amount of indexes. That should be a compile-time error, for both operator[] and at().
 
Someone from the previous discussion brought up using std::array<std::size_t, extent_count> as the index type for tuple-enabled operator[].  He also mentioned the down side.  We get an error if too many items are in the initializer, but not if there are too few.  In the latter case, the missing elements will be zero instead of causing an error.  There's also the cosmetic issue of a double brace pair is sometimes needed for a std::array "literal" because brace-elision isn't always accepted.  (And it's weird to use std::array as a helper type since this code will be redefining array!)
Yeah, but this helper is a 1D array. 1D arrays will still have a operator[](size_t) parameter, at least because of compatibility.
 
 
Also, I find linear addresses of a multidim array error-prone. Instead, I'd give a reshape() function that would allow us to reshape the contiguous memory layout to another dimension (internally, that can be done with a static_cast), then use it to reshape to an 1D array, then index it - with the linear index. The idea is to do something like Matlab's reshape (it does have linear indexing - and I get burned by that every now and then).
 
For "linear addresses of a multidim array," are you talking about the begin/end functions?
No. An operator[](size_t). That should be given only for 1D arrays, I believe. While at it, I think begin() and end() also should give an iterator to an array with one-less extents (except for 1D arrays): similarly to
int a[][][] = {...}
begin(a);
giving a pointer to int[][]. I think it is both more logical, and compatible with built-in arrays with multiple extents.

And what would happen if the source and destination type have different total element counts?  Or would such conversions be blocked?
Blocked, of course. Using the same memory are for items of different size and/or amount is going straight against any type system. (((((Although: what about array<uint8_t, 4> vs array<_simd_vector_of_4_uint8, 1>?)))))
If one needs that, make a copy (rather, being different type, make a conversion).
 
  A copy seems better all around, and I can change element type and/or count at the same time.
I think it is quite bad to have my 3D array to be viewed in a 1D or 2D fashion, I need to make a copy.. (Alternatively to this reshape, an appropriate array_view could be a good solution as well.)

Regards, Robert

Daryle Walker

unread,
Oct 10, 2013, 10:12:22 AM10/10/13
to std-pr...@isocpp.org
On Sunday, October 6, 2013 6:13:55 PM UTC-4, Róbert Dávid wrote:
2013. október 5., szombat 13:58:15 UTC+2 időpontban Daryle Walker a következőt írta:
For the current containers, at() throws when given a bad index, while operator[] tries to use it even when it causes Undefined Behavior.  My operator[] does the same thing; it forfeits checking the list length or the individual values, even if it causes a crash.  Use at() again to check those and throw if they're found wanting, but forfeit speed due to those checks.
Your reference implementation adds a run-time check not just for out-of-range indexing, but also for the amount of indexes. That should be a compile-time error, for both operator[] and at().
 
at() throws on index value errors and, for the std::initializer_list overload, index amount errors.  The length is a run-time attribute; it cannot be checked to become a compile-time error.
 
The overloads of at() and operator() that take individual indexes already have a static_assert if there are too many indexes.  But my proposal has an enable_if or equivalent instead.  Fewer indexes is OK; the code stops at the appropriate level and returns a (built-in) inner array.  You can omit all indexes to get a reference to the (array) data object.  (It's your only choice when there are no extents.)
 
Also, I find linear addresses of a multidim array error-prone. Instead, I'd give a reshape() function that would allow us to reshape the contiguous memory layout to another dimension (internally, that can be done with a static_cast), then use it to reshape to an 1D array, then index it - with the linear index. The idea is to do something like Matlab's reshape (it does have linear indexing - and I get burned by that every now and then).
 
For "linear addresses of a multidim array," are you talking about the begin/end functions?
No. An operator[](size_t). That should be given only for 1D arrays, I believe. While at it, I think begin() and end() also should give an iterator to an array with one-less extents (except for 1D arrays): similarly to
int a[][][] = {...}
begin(a);
giving a pointer to int[][]. I think it is both more logical, and compatible with built-in arrays with multiple extents.
 
Having new std::array be a container of T, instead of some (possibly multidimensional) array-of-T, is one of the main points of the class.  Most of the functions work in units of T instead of a type users usually won't care about.  (And remember, array types have limited functionality.  Another one of the points of std::array is to provided a wrapper that makes arrays a Regular type.)
 
You can use array<remove_extent_t<ArrayType>, extent<ArrayType>::value>, where ArrayType is the built-in array type you're using, to get this effect.
 
And what would happen if the source and destination type have different total element counts?  Or would such conversions be blocked?
Blocked, of course. Using the same memory are for items of different size and/or amount is going straight against any type system. (((((Although: what about array<uint8_t, 4> vs array<_simd_vector_of_4_uint8, 1>?)))))
If one needs that, make a copy (rather, being different type, make a conversion).
 
  A copy seems better all around, and I can change element type and/or count at the same time.
I think it is quite bad to have my 3D array to be viewed in a 1D or 2D fashion, I need to make a copy.. (Alternatively to this reshape, an appropriate array_view could be a good solution as well.)
 
To transform arrays with different ranks, you have to explicitly call reshape_array to make a copy; it happens when the user actually wants it, not by accident.  If you want to reshape at the view level, something like reinterpret_cast<T(&)[the][new][shape]>(my_array()) should work.
 
Daryle W.
 
Reply all
Reply to author
Forward
0 new messages