operator[](...)


Ryan Nicholl

Nov 25, 2017, 2:08:25 PM
to ISO C++ Standard - Future Proposals

My suggestion is to allow overloading operator [] with multiple arguments, e.g.:

class image
{
  ...
public:
  ...

  pixel& operator[](size_t x, size_t y)
  {
    return internal_data[x+y*width];
  }
};

/// used like:

image my_image = ...;
...
pixel px = my_image[5, 3];

Would be most useful where I see [{...}] currently. E.g.:
output[{x,y}] = value;
when processing images.

Todd Fleming

Nov 25, 2017, 4:09:52 PM
to ISO C++ Standard - Future Proposals
This conflicts with already valid syntax:

int f(const std::vector<int>& v) {
   return v[5, 3];
}

Todd

Nicol Bolas

Nov 25, 2017, 4:11:21 PM
to ISO C++ Standard - Future Proposals
On Saturday, November 25, 2017 at 2:08:25 PM UTC-5, Ryan Nicholl wrote:
You've essentially undermined your proposal by highlighting a problem with most multi-argument `[]` proposals.

As it currently stands `output[x, y]` already has a meaning. Namely, it's equivalent to `output[y]`; the entire text between the `[]` is taken as a single expression, rather than a list of arguments.

So you have to invent a new syntax to make what you want possible. Namely, `output[{x, y}]`.

However, `{}` already has a meaning; it's a braced-init-list. Which means that all you need to do to make this syntax work right now... is create a simple index type:

struct two_indices { std::size_t x; std::size_t y; };

pixel& operator[](two_indices ixs)
{
  return internal_data[ixs.x + ixs.y * width];
}

Oh look, now it works.
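
For readers who want to try the index-type idea, here is a minimal sketch. The `pixel` layout, the `image` constructor, and the helper `write_and_read_red` are assumptions made up for illustration; only `two_indices` and the shape of `operator[]` come from the post above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the thread's `pixel` type.
struct pixel { unsigned char r = 0, g = 0, b = 0; };

// Index type from the post above.
struct two_indices { std::size_t x; std::size_t y; };

// Hypothetical image class; storage is row-major, as in the thread's formula.
class image {
public:
    image(std::size_t width, std::size_t height)
        : width_(width), internal_data_(width * height) {}

    // A braced-init-list argument such as {x, y} initializes `ixs`.
    pixel& operator[](two_indices ixs) {
        return internal_data_[ixs.x + ixs.y * width_];
    }

private:
    std::size_t width_;
    std::vector<pixel> internal_data_;
};

// Writes through [{x, y}] and reads the same element back.
inline int write_and_read_red() {
    image output(8, 4);
    output[{5, 3}].r = 200;
    return output[{5, 3}].r;
}
```

Calling `write_and_read_red()` should give back the value that was stored, since both subscripts name the same element.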

Indeed, you can create a struct that represents any arbitrary number of indices:

template<int index_count>
using index_list = std::array<std::size_t, index_count>;

pixel& operator[](index_list<2> ixs)
{
  auto [x, y] = ixs;
  return internal_data[x + y * width];
}
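
The same idea extended to three indices, as a rough sketch only. The `volume` class, its layout, and `volume_demo` are hypothetical; the count parameter is spelled `std::size_t` here to match `std::array`, and structured bindings need C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

template <std::size_t index_count>
using index_list = std::array<std::size_t, index_count>;

// Hypothetical 3D grid of ints stored contiguously, x varying fastest.
class volume {
public:
    volume(std::size_t width, std::size_t height, std::size_t depth)
        : width_(width), height_(height), data_(width * height * depth) {}

    int& operator[](index_list<3> ixs) {
        auto [x, y, z] = ixs;  // structured binding over the array
        return data_[x + width_ * (y + height_ * z)];
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<int> data_;
};

// Stores a value at {1, 2, 1} and reads it back.
inline int volume_demo() {
    volume v(4, 3, 2);
    v[{1, 2, 1}] = 42;
    return v[{1, 2, 1}];
}
```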




Jonathan Müller

Nov 25, 2017, 4:25:26 PM
to std-pr...@isocpp.org
On 25.11.2017 22:11, Nicol Bolas wrote:
> You've essentially undermined your proposal by highlighting a problem
> with most multi-argument `[]` proposals.
>
> As it currently stands `output[x, y]` already has a meaning. Namely,
> it's equivalent to `output[y]`; the entire text between the `[]` is
> taken as a single expression, rather than a list of arguments.
>
> So you have to invent a new syntax to make what you want possible.
> Namely, `output[{x, y}]`.

What would *technically* work is being more liberal when invoking operators:

`output[x][y]` would look for an `operator[]` for `decltype(output)`
taking a single argument.
If none is found, it looks for an `operator[]` on `decltype(output)`
taking two arguments. Here it will find one and invoke it.

Nicol Bolas

Nov 25, 2017, 5:24:57 PM
to ISO C++ Standard - Future Proposals
But what if you want both?

A common thing for matrix types is to be able to access a scalar member as well as being able to access a vector column. `matrix[1]` accesses a column (by reference or by copy); `matrix[1][2]` ought to access a scalar (by reference).

By using the `{}` syntax, you can get exactly what you want. 1D access returns a column. 2D access returns a reference to a scalar. And overload resolution tells which is which.

It's really the sane way to go. It even allows you to use UDLs to get row access for column-major matrices:

matrix[1]; //Accesses column.
matrix[1_row]; //Accesses row.
matrix[{1, 2}]; //Accesses scalar.

No need for a language solution when aggregate initialization and structured binding give us everything we need.
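
A sketch of how the three spellings could coexist through ordinary overload resolution. Everything here (`matrix3`, `row_index`, the `_row` literal, `matrix_demo`) is a hypothetical illustration, assuming a column-major 3x3 matrix of doubles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical strong type for row indices, produced by the _row literal.
struct row_index { std::size_t value; };

constexpr row_index operator""_row(unsigned long long v) {
    return row_index{static_cast<std::size_t>(v)};
}

// Hypothetical pair of indices: column first, then row.
struct two_indices { std::size_t column; std::size_t row; };

// Hypothetical column-major 3x3 matrix.
class matrix3 {
public:
    using vec3 = std::array<double, 3>;

    // matrix[c]: reference to a stored column.
    vec3& operator[](std::size_t column) { return columns_[column]; }

    // matrix[r_row]: a row is not contiguous in this layout, so return a copy.
    vec3 operator[](row_index r) const {
        return {columns_[0][r.value], columns_[1][r.value], columns_[2][r.value]};
    }

    // matrix[{c, r}]: reference to a single scalar.
    double& operator[](two_indices ix) { return columns_[ix.column][ix.row]; }

private:
    std::array<vec3, 3> columns_{};
};

// Writes one scalar, then reads it back through the column and row overloads.
inline double matrix_demo() {
    matrix3 m;
    m[{1, 2}] = 7.0;                     // column 1, row 2
    const double via_column = m[1][2];   // column access, then element
    const double via_row = m[2_row][1];  // row access (copy), then element
    return via_column + via_row;
}
```

Neither `row_index` nor `two_indices` converts implicitly from an integer, which is what keeps the three overloads from colliding.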

Vinnie Falco

Nov 25, 2017, 5:47:56 PM
to ISO C++ Standard - Future Proposals
On Saturday, November 25, 2017 at 11:08:25 AM UTC-8, Ryan Nicholl wrote:

My suggestion is to allow overloading operator [] with multiple arguments
 
Or, just use the power of your mind to imagine that the parentheses used
with operator() are actually brackets and go with something like this:

struct image
{
  pixel& operator()(size_t x, size_t y);
...
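
The snippet above is cut off, so the following is only a guess at what a complete version might look like. The `pixel` member, the constructor, the const overload, and `call_operator_demo` are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the thread's `pixel` type.
struct pixel { int value = 0; };

// Hypothetical image class indexed with operator() instead of operator[].
class image {
public:
    image(std::size_t width, std::size_t height)
        : width_(width), data_(width * height) {}

    pixel& operator()(std::size_t x, std::size_t y) {
        return data_[x + y * width_];
    }

    const pixel& operator()(std::size_t x, std::size_t y) const {
        return data_[x + y * width_];
    }

private:
    std::size_t width_;
    std::vector<pixel> data_;
};

// Writes through the mutable overload and reads through the const one.
inline int call_operator_demo() {
    image img(8, 4);
    img(5, 3).value = 9;
    const image& view = img;
    return view(5, 3).value;
}
```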


Jakob Riedle

Nov 27, 2017, 9:42:40 AM
to ISO C++ Standard - Future Proposals
I understand, that this breaks existing code. But the question is rather: How much?

Since I don't have access to a bigger code base, here is a regular expression for occurrences of this situation:
(?<!\[)\[([^[\](){}]*(?<!\[)\[(?1)\][^[\](){}]*|[^[\](){}]*\((?1)\)[^[\](){}]*|[^[\](){}]*\{(?1)\}[^[\](){}]*|[^[\](){}]*)+,(?1)+\]

Hope this helps,
Jakob

Jakob Riedle

Nov 27, 2017, 10:59:55 AM
to ISO C++ Standard - Future Proposals
As I was trying to search for the pattern in the Boost libraries, it turned out to be very slow and also matched lambda captures, comments, and several other things.

Here is the improved version:
(?:\n|^)(?:\/(?!\/|\*)|[^\/\n])*[a-zA-Z_)\]}>$]\s*(?<!\[)\[((?<!\[)\[(?1)*\]|\((?1)*\)|\{(?1)*\}|[^[\](){}])*,(?1)*\]

There are no false negatives that I know of.
The only false positives with this version are:
  • Templates being instantiated inside the brackets, e.g.
    foo[ bar<4,5>::value ]
  • Multiline comments whose start is on a line before the match, e.g.
    /**
       \code
       placeholder<int> _i;
       placeholder<double> _d;
       sregex rex = ( some >> regex >> here )
           [ ++_i, _d *= _d ];
       \endcode
    */

Yours,
Jakob 


Nicol Bolas

Nov 27, 2017, 11:37:30 AM
to ISO C++ Standard - Future Proposals
On Monday, November 27, 2017 at 9:42:40 AM UTC-5, Jakob Riedle wrote:
I understand, that this breaks existing code. But the question is rather: How much?

No, the question is whether any amount of breakage is worth it to get something we already effectively have. I don't agree that `[1, 2]` is sufficiently better than `[{1, 2}]` that we should permit it.

Especially since language-based multidimensional arrays won't be able to use it.


Thiago Macieira

Nov 27, 2017, 1:34:23 PM
to std-pr...@isocpp.org
On Monday, 27 November 2017 08:37:30 PST Nicol Bolas wrote:
> On Monday, November 27, 2017 at 9:42:40 AM UTC-5, Jakob Riedle wrote:
> > I understand, that this breaks existing code. But the question is rather:
> > How much?
>
> No, the question is whether *any* amount of breakage is worth it to get
> something we already effectively have. I don't agree that `[1, 2]` is
> sufficiently better than `[{1, 2}]` that we should permit it.

And it's not better than (1, 2), which anyone who has needed multi-dimensional
access in the last 20 years has used.

> Especially since language-based multidimensional arrays won't be able to
> use it.

They don't use the parenthetical syntax either, unfortunately.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Ryan Nicholl

Dec 14, 2017, 2:35:54 PM
to ISO C++ Standard - Future Proposals
Correct. And I am suggesting a backwards incompatible change.
It shouldn't be, in practice, any more difficult than the "A"_s backwards incompatible change, I would think.

// old code
a[b, c];

// new code
a[(b, c)];

So the fix is fairly simple.

Yes, it breaks backwards compatibility, but the fix could just be a pair of parentheses inserted by an automatic C++NN converter tool. I can't imagine that [a, b] is used very often.

Ryan Nicholl

Dec 14, 2017, 2:42:01 PM
to ISO C++ Standard - Future Proposals
Actually, here's a way to make the change 100% backwards compatible.

When foo[a, b, ...] is detected, the compiler must check for operator[](...) in the dereferenced type foo. If any overload of operator[] has multiple arguments, then it must call operator[](a, b, c). Otherwise, the call is operator[]((a, b, c)) etc. instead.
This check would be done during semantic analysis anyway, so I don't think it'd be very difficult to do.

So, vector[a, b] would use operator,(...), whereas image[a, b] would use operator[](size_t x, size_t y). So there you go, no backwards compatibility breaks.



Myriachan

Dec 14, 2017, 4:24:19 PM
to ISO C++ Standard - Future Proposals
On Thursday, December 14, 2017 at 11:42:01 AM UTC-8, Ryan Nicholl wrote:
Actually, here's a way to make the change 100% backwards compatible.

When foo[a, b, ...] is detected, the compiler must check for operator[](...) in the dereferenced type foo. If any overload of operator[] has multiple arguments, then it must call operator[](a, b, c). Otherwise, the call is operator[]((a, b, c)) etc. instead.
This check would be done during semantic analysis anyway, so I don't think it'd be very difficult to do.

So, vector[a, b] would use operator,(...), whereas image[a, b] would use operator[](size_t x, size_t y). So there you go, no backwards compatibility breaks.


I'd rather have a multi-argument operator [] work like existing multidimensional arrays, with each segment in separate [] groups.

class Meow
{
   ...
   int &operator[](size_t x, size_t y);
};

Meow meow;
meow[1][2] = 3;   // same as meow.operator[](1, 2) = 3;

However, this is grammatically and semantically problematic.  So I guess we'll live with either having to do [{1, 2}] or having operator [] return an object with its own operator [].

By the way, have you considered just overloading operator () on the object?  Then it's meow(1, 2) = 3;.

Melissa
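
For reference, the "operator [] returns an object with its own operator []" approach can be sketched as follows. The `row_proxy` class, the constructor, the row-major layout, and `proxy_demo` are hypothetical details chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Meow {
public:
    // Proxy for one row; it only stores a pointer to the row's first element.
    class row_proxy {
    public:
        explicit row_proxy(int* row_start) : row_start_(row_start) {}
        int& operator[](std::size_t y) const { return row_start_[y]; }
    private:
        int* row_start_;
    };

    Meow(std::size_t rows, std::size_t columns)
        : columns_(columns), data_(rows * columns) {}

    // The first [] selects a row and hands back the proxy.
    row_proxy operator[](std::size_t x) {
        return row_proxy(data_.data() + x * columns_);
    }

private:
    std::size_t columns_;
    std::vector<int> data_;
};

// meow[1][2] is two single-argument operator[] calls chained together.
inline int proxy_demo() {
    Meow meow(3, 4);
    meow[1][2] = 3;
    return meow[1][2];
}
```

The proxy is cheap here because it is a single pointer, but it is still an extra type the library has to define and maintain.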

adrian....@gmail.com

Dec 16, 2017, 11:24:02 AM
to ISO C++ Standard - Future Proposals


On Saturday, November 25, 2017 at 4:09:52 PM UTC-5, Todd Fleming wrote:
Does that work as a 2D ref?  Looks like it would be interpreted as two comma separated expressions, returning the 4th element of v.  5 would be ignored. 

adrian....@gmail.com

Dec 16, 2017, 11:40:42 AM
to ISO C++ Standard - Future Proposals


On Saturday, November 25, 2017 at 4:11:21 PM UTC-5, Nicol Bolas wrote:
You've essentially undermined your proposal by highlighting a problem with most multi-argument `[]` proposals.

As it currently stands `output[x, y]` already has a meaning. Namely, it's equivalent to `output[y]`; the entire text between the `[]` is taken as a single expression, rather than a list of arguments.

So you have to invent a new syntax to make what you want possible. Namely, `output[{x, y}]`.

Why would anyone currently want to use a comma separated list inside of a `[]` is beyond me.  It is confusing and could be misconstrued as a multidimensional array index by newer users to the language.  Yes, there is a work around, but this is a workaround to something that shouldn't need to be worked around as it makes it more difficult to use and understand the language.

Nicol Bolas

Dec 16, 2017, 11:45:34 AM
to ISO C++ Standard - Future Proposals, adrian....@gmail.com


On Saturday, December 16, 2017 at 11:40:42 AM UTC-5, adrian....@gmail.com wrote:


On Saturday, November 25, 2017 at 4:11:21 PM UTC-5, Nicol Bolas wrote:
You've essentially undermined your proposal by highlighting a problem with most multi-argument `[]` proposals.

As it currently stands `output[x, y]` already has a meaning. Namely, it's equivalent to `output[y]`; the entire text between the `[]` is taken as a single expression, rather than a list of arguments.

So you have to invent a new syntax to make what you want possible. Namely, `output[{x, y}]`.

Why would anyone currently want to use a comma separated list inside of a `[]` is beyond me.

It is legal code today, so changing its meaning is a breaking change.

Now, you can try to argue that it wouldn't be that much of a breaking change. But remember: you're making this change for the sole purpose of being slightly more convenient to do what we already can do with `[{...}]` syntax.

Since we're talking about a feature of such minor consequence, I would say that even minor breaking changes mean that it's not worth the risk.

Todd Fleming

Dec 16, 2017, 11:47:15 AM
to ISO C++ Standard - Future Proposals, adrian....@gmail.com
On Saturday, December 16, 2017 at 11:24:02 AM UTC-5, adrian....@gmail.com wrote:
This conflicts with already valid syntax:

int f(const std::vector<int>& v) {
   return v[5, 3];
}
 
Does that work as a 2D ref?  Looks like it would be interpreted as two comma separated expressions, returning the 4th element of v.  5 would be ignored. 

It returns v[3]. The syntax allows side effects, which I didn't use here.
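
A small sketch of the side-effect point. The `counter` parameter is a hypothetical addition, and the inner parentheses are also an addition, so the snippet does not depend on how a given language mode treats an unparenthesized comma inside `[]`.

```cpp
#include <cassert>
#include <vector>

// Returns v[3]. The left operand of the comma is evaluated for its side
// effect and its value is then discarded.
inline int f(const std::vector<int>& v, int& counter) {
    return v[(++counter, 3)];  // parenthesized form of v[++counter, 3]
}
```

With `v = {10, 11, 12, 13, 14, 15}` and `counter` starting at 0, the call returns 13 and leaves `counter` at 1.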

adrian....@gmail.com

Dec 17, 2017, 7:00:49 PM
to ISO C++ Standard - Future Proposals, adrian....@gmail.com


On Saturday, December 16, 2017 at 11:45:34 AM UTC-5, Nicol Bolas wrote:


On Saturday, December 16, 2017 at 11:40:42 AM UTC-5, adrian....@gmail.com wrote:


On Saturday, November 25, 2017 at 4:11:21 PM UTC-5, Nicol Bolas wrote:
 
Since we're talking about a feature of such minor consequence, I would say that even minor breaking changes mean that it's not worth the risk.

Hmmm, C/C++'s way to make/index arrays has always been a bit strange, but looking into this further, I think I agree for a different reason.  The weird syntax would just get more confusing if multidimensional indexes were allowed in a different format than is used already.

Jake Arkinstall

Dec 18, 2017, 12:04:41 PM
to std-pr...@isocpp.org


On 16 Dec 2017 16:45, "Nicol Bolas" <jmck...@gmail.com> wrote:
It is legal code today, so changing its meaning is a breaking change.

Not necessarily though. If the array access operator accepts N arguments and M > N arguments are provided, the comma operator can apply. If it accepts N arguments and N arguments are provided, it handles them accordingly. 

If we follow that rule, then this is not a breaking change - simply because we have no multi-dimensional array access operator.

If this proposal does gain some traction, I'd like to see a rectangular array class to go along with it, both as a usage example and for the standard library.

Nicol Bolas

Dec 18, 2017, 1:13:20 PM
to ISO C++ Standard - Future Proposals


On Monday, December 18, 2017 at 12:04:41 PM UTC-5, Jake Arkinstall wrote:


On 16 Dec 2017 16:45, "Nicol Bolas" <jmck...@gmail.com> wrote:
It is legal code today, so changing its meaning is a breaking change.

Not necessarily though. If the array access operator accepts N arguments and M > N arguments are provided, the comma operator can apply. If it accepts N arguments and N arguments are provided, it handles them accordingly.

If we follow that rule, then this is not a breaking change - simply because we have no multi-dimensional array access operator.

But then it becomes a very confusing change. It causes the parsing of an "expression" to change based on non-local information.

That is, if you see `thing(5, 3, 4)`, then you know that those commas are not comma operators; they're argument separators in a function call. Making `thing[5, 3, 4]` sometimes be comma operators and sometimes be argument separators is weird. Weirder still would be having the first comma be a separator and the second being an operator, in the case of the matching `operator[]` taking only 2 arguments.

Remember: comma-as-operator has different semantics from comma-as-function-argument-separator. In the former case, the comma operator guarantees left-to-right ordering of the sub-expressions. In the latter case, it does not.
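
The difference in ordering guarantees can be observed with a small logging helper. This is only a sketch; `note`, `second`, and the two demo functions are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Appends a tag to the log and returns a value, so evaluation order is visible.
inline int note(std::string& log, char tag, int value) {
    log += tag;
    return value;
}

// Ignores its first argument and returns the second.
inline int second(int, int b) { return b; }

// Comma operator: the left operand is sequenced before the right one,
// so the log is always "ab".
inline std::string comma_operator_order() {
    std::string log;
    const int result = (note(log, 'a', 1), note(log, 'b', 2));
    (void)result;
    return log;
}

// Argument separator: the order in which the two arguments are evaluated
// is unspecified, so the log may be "ab" or "ba".
inline std::string argument_order() {
    std::string log;
    (void)second(note(log, 'a', 1), note(log, 'b', 2));
    return log;
}
```

Only the first function has a result that portable code can rely on; the second one merely shows that both arguments get evaluated.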

Matthew Woehlke

Dec 22, 2017, 9:05:42 PM
to std-pr...@isocpp.org
On 2017-12-16 11:40, adrian....@gmail.com wrote:
> Why would anyone currently want to use a comma separated list inside of a
> `[]` is beyond me. It is confusing and could be misconstrued as a
> multidimensional array index by newer users to the language. Yes, there is
> a work around, but this is a workaround to something that shouldn't need to
> be worked around as it makes it more difficult to use and understand the
> language.

I think, if this sort of proposal is to ever have any chance, we should
start by deprecating use of the comma operator inside of []s, unless
enclosed in ()s.

IOW:

a[b, c]; // deprecated
a[(b, c)]; // okay, same meaning

For that matter, we could go further and deprecate the comma operator
except in a few special cases (e.g. when enclosed in ()s). This might
even be a worthwhile change in its own right.

--
Matthew

Jakob Riedle

Dec 26, 2017, 5:43:00 PM
to ISO C++ Standard - Future Proposals
+1, Great Idea!

Ryan Nicholl

Dec 30, 2017, 7:40:16 PM
to ISO C++ Standard - Future Proposals
Why would anyone currently want to overload operator `[]` is beyond me.  It is confusing and could be misconstrued as an array index by newer users to the language.  Yes, there is a work around, but this is a workaround to something that shouldn't need to be worked around as it makes it more difficult to use and understand the language. (/sarcasm)

schreiber...@gmail.com

Jan 6, 2018, 7:38:50 AM
to ISO C++ Standard - Future Proposals
I think it is clear that the motivation for allowing multi-argument overloads to operator[] is purely syntactic sugar. One can achieve the same thing (almost, see below) with current language constructs, such as:
matrix[{1,2}];
matrix(1,2);
matrix[1][2];

None of these constructs is ideal, though.

The first requires an extra pair of curly braces for no apparent good reason, and it gives yet another use for braces that a new user would have to get their head around. I understand that this is not truly "another use for braces", because it uses already existing C++ mechanisms, but from a functional point of view, users will not want to know if "is this an initializer list? or a constructor call? or a structured binding? or..."; they will need to remember "use bracket for array indexing, and add braces for array multi-indexing". Worse, a new user might even try to see if it compiles without the braces, or simply forget them. And compile it will. Perhaps with a warning from the compiler about an unused statement, but note that there is no possibility for a library solution to warn about this error.

The second does not look like an array indexing operation, but like a function call or a constructor. This is the solution that most libraries nowadays adopt (see references below). It is not dangerous as the above, but it is irritating because it does not convey the correct intent. Syntax highlighters need to parse the definition of "matrix" to know whether "matrix(1,2)" is a function call or not. As a result, most highlighters I have seen treat "matrix(1,2)" as a function call, which is incorrect. Then you have "matrix[1]" for sequential element access (i.e., as laid out in memory), and "matrix(1,2)" for structured element access, why the need for two separate notations when the concept is the same? It's another cognitive burden placed on the user.

The third fixes the issues of the above two, but it also has several drawbacks of its own. First, it is not possible to disentangle cases where one wants sequential access (go through the array as it is laid out in memory) and structured access (follow the array's multi-dimensional shape). "matrix[1]" represents the second row (or column), not a scalar value. This can still be done by going through a proxy class/function, e.g., "matrix.sequential[1]", so I would say it is not such a big deal. Second, "matrix[1][2]" requires splitting the indexing between two function calls, and creating a temporary for "matrix[1]". This may or may not imply runtime overheads, but certainly will increase compile time and library complexity. In case of >2D arrays, it also makes it impossible to spot cases where the user forgot to specify the last index (i.e., "matrix[1][2]" instead of "matrix[1][2][3]") at the location of the indexing; the error (if any) will happen later when the user tries to use "matrix[1][2]" as a scalar.

Lastly, all above solutions still do not fix the issue that "matrix[1,2]" is currently well defined and most certainly does not do what anyone would want it to.

So yes, allowing "matrix[1,2]" as an operator[] overload is a breaking change, but it is, in my opinion, one case where the benefits greatly outweigh the costs. And this is not a niche case. Data science is becoming an increasingly important discipline nowadays, and I think C++ is lagging behind other languages like Python, R, etc. (even though C++ outperforms them all), in part because it lacks such simple things.

If this has to happen in 2025, after we have flagged the comma operator inside brackets as deprecated, so be it...

References:
Blazelib: https://bitbucket.org/blaze-lib/blaze/wiki/Matrix%20Operations#!element-access
Eigen: https://eigen.tuxfamily.org/dox/group__TutorialMatrixClass.html
xtensor: https://xtensor.readthedocs.io/en/latest/expression.html#element-access

Nicol Bolas

Jan 6, 2018, 10:41:04 AM
to ISO C++ Standard - Future Proposals, schreiber...@gmail.com
... why would they think it's "a structured binding"? That uses `auto []` syntax. And "an initializer list" is a "constructor call" (or at least, it can be).

I get the general idea that, without experience with list initialization, it looks like odd syntax. However:
 
they will need to remember "use bracket for array indexing, and add braces for array multi-indexing".

They could simply be taught to always use `[{1}]`. Currently, using `{}` with regular array indexing doesn't work, but we could change it to allow it, initializing as if you had done `some_integer_type{1}`.

Worse, a new user might even try to see if it compiles without the braces, or simply forget them. And compile it will.

Only if the type allows 1D indexing as well. If it only allows 2D indexing, it won't compile.

Also, even in the cases of types that allow both, it's highly unlikely that what gets returned by 1D indexing will be compatible with what gets returned by 2D indexing. So odds are the entire code won't compile. And while the compile error likely won't direct you to the problem statement, at least it isn't silently executing.
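
The "won't compile without the braces" claim can be checked at compile time with a small detection trait. This is a sketch under assumptions: `image_2d_only`, `image_both`, and `has_subscript` are hypothetical names, and `std::void_t` needs C++17.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

struct two_indices { std::size_t x; std::size_t y; };

// Hypothetical type that offers only 2D indexing.
struct image_2d_only {
    int storage[12] = {};
    int& operator[](two_indices ix) { return storage[ix.x + ix.y * 4]; }
};

// Hypothetical type that offers 1D indexing as well.
struct image_both {
    int storage[12] = {};
    int& operator[](std::size_t i) { return storage[i]; }
    int& operator[](two_indices ix) { return storage[ix.x + ix.y * 4]; }
};

// Detects whether `t[index]` is well-formed for an lvalue of type T.
template <class T, class Index, class = void>
struct has_subscript : std::false_type {};

template <class T, class Index>
struct has_subscript<
    T, Index,
    std::void_t<decltype(std::declval<T&>()[std::declval<Index>()])>>
    : std::true_type {};

// With only the 2D overload, a plain integer index is rejected, so a comma
// expression that collapses to one integer cannot silently compile.
static_assert(!has_subscript<image_2d_only, std::size_t>::value,
              "2D-only type must reject a single integer index");
static_assert(has_subscript<image_2d_only, two_indices>::value,
              "2D-only type accepts the index struct");

// With both overloads, the single-integer form is accepted.
static_assert(has_subscript<image_both, std::size_t>::value,
              "type with both overloads accepts a single integer index");
```

The rejection relies on `two_indices` being an aggregate with no implicit conversion from an integer.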

Perhaps with a warning from the compiler about an unused statement, but note that there is no possibility for a library solution to warn about this error.

The second does not look like an array indexing operation, but like a function call or a constructor. This is the solution that most libraries nowadays adopt (see references below). It is not dangerous as the above, but it is irritating because it does not convey the correct intent. Syntax highlighters need to parse the definition of "matrix" to know whether "matrix(1,2)" is a function call or not. As a result, most highlighters I have seen treat "matrix(1,2)" as a function call, which is incorrect.

But it is a function call. It's either a call to `operator()` or a call to a constructor.
 
Then you have "matrix[1]" for sequential element access (i.e., as laid out in memory), and "matrix(1,2)" for structured element access, why the need for two separate notations when the concept is the same? It's another cognitive burden placed on the user.

The third fixes the issues of the above two, but it also has several drawbacks of its own. First, it is not possible to disentangle cases where one wants sequential access (go through the array as it is laid out in memory) and structured access (follow the array's multi-dimensional shape). "matrix[1]" represents the second row (or column), not a scalar value. This can still be done by going through a proxy class/function, e.g., "matrix.sequential[1]", so I would say it is not such a big deal. Second, "matrix[1][2]" requires splitting the indexing between two function calls, and creating a temporary for "matrix[1]". This may or may not imply runtime overheads, but certainly will increase compile time and library complexity. In case of >2D arrays, it also makes it impossible to spot cases where the user forgot to specify the last index (i.e., "matrix[1][2]" instead of "matrix[1][2][3]") at the location of the indexing; the error (if any) will happen later when the user tries to use "matrix[1][2]" as a scalar.

Lastly, all above solutions still do not fix the issue that "matrix[1,2]" is currently well defined and most certainly does not do what anyone would want it to.

So yes, allowing "matrix[1,2]" as an operator[] overload is a breaking change, but it is, in my opinion, one case where the benefits greatly outweigh the costs. And this is not a niche case. Data science is becoming an increasingly important discipline nowadays, and I think C++ is lagging behind other languages like Python, R, etc. (even though C++ outperforms them all), in part because it lacks such simple things.

Are you seriously telling me that people would use C++ for these applications, but are warded off just because they can't use `[]` and would have to resort to `()`? I find myself doubtful that any person is picking languages based on trivial syntax like that.

I think you're exaggerating the importance of this syntax.

Jake Arkinstall

Jan 6, 2018, 10:53:33 AM
to std-pr...@isocpp.org
On Sat, Jan 6, 2018 at 3:41 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Are you seriously telling me that people would use C++ for these applications, but are warded off just because they can't use `[]` and would have to resort to `()`? I find myself doubtful that any person is picking languages based on trivial syntax like that.

I think you're exaggerating the importance of this syntax.

I agree with you that this is far from a reason to abandon the language. That being said, it is still a part of the language that IMO is in need of improvement, which is why we're here.

Nicol Bolas

Jan 6, 2018, 11:33:08 AM
to ISO C++ Standard - Future Proposals
My overall point is this.

We can all agree that `[1, 2]` would be ideal. But because this already has meaning in C++, we can't change it without going through a round of deprecation. So you'd be looking at 6-9 years before we could even add the language feature that lets us give `[1, 2]` the meaning we want.

Is this feature worth that wait? Is it worth the effort of deprecating comma expressions in brackets? Or should we just encourage the use of alternatives?

I say it'd be easier to add a language feature to allow `[{1, 2}]` to work on language arrays than to make `[1, 2]` work. Let's canonize that idiom by adding it to the language. That's something that could (in theory) happen in the C++20 time frame, since it doesn't break backwards compatibility.

So you can either wait 6-9 years for perfection, or get something right now that is almost as good.

schreiber...@gmail.com

Jan 6, 2018, 12:07:22 PM
to ISO C++ Standard - Future Proposals
On Saturday, January 6, 2018 at 5:33:08 PM UTC+1, Nicol Bolas wrote:
My overall point is this.

We can all agree that `[1, 2]` would be ideal. But because this already has meaning in C++, we can't change it without going through a round of deprecation. So you'd be looking at 6-9 years before we could even add the language feature that lets us give `[1, 2]` the meaning we want.

Is this feature worth that wait? Is it worth the effort of deprecating comma expressions in brackets? Or should we just encourage the use of alternatives?

I say it'd be easier to add a language feature to allow `[{1, 2}]` to work on language arrays than to make `[1, 2]` work. Let's canonize that idiom by adding it to the language. That's something that could (in theory) happen in the C++20 time frame, since it doesn't break backwards compatibility.

So you can either wait 6-9 years for perfection, or get something right now that is almost as good.

To reply to your previous message, yes I think people get turned off C++ because of small awkwardnesses like these. Of course it's not one such little detail that tips the balance, but the whole package of it. We're slowly making things simpler with each iteration of the language (range-based loops, auto, etc), and I think we should continue down that road to make C++ as intuitive to use as possible, if it is to compete with cute newborn languages.

As for your proposal. Consider someone new to the language seeing the "[{1,2}]" construct, and wondering why it is spelled like this. Can you imagine what they will think when they are told "yeah, it is because [1,2] is a valid syntax that throws away 1 and uses 2 as an index"? My bet is on "why on earth would someone want that? and why is it given a simpler, more accessible syntax?". It does not inspire trust in how the language was designed. So I'm willing to wait for perfect. If I know anything about C++ standardization, it is that 6-9 years of wait is as good as it gets ;)

Jake Arkinstall

Jan 6, 2018, 12:12:10 PM
to std-pr...@isocpp.org
I understand that, and waiting 6-9 years is fine with me. For things like this, it's worth it.

That being said, I'm only 99% convinced that we need deprecation before reaping the benefits. The 1% is hanging on to the idea of making this only impact cases where operator[] has multiple arguments, which is currently a compilation error and thus cannot impact legacy code, and deprecating the comma operator for single-argument operator[]s. Is this something that would take a lot of effort for compilers?

Even if that 1% hope gets shot down, I'm still happy waiting through the deprecation period. Making do with what we already have is not progressive.

Nicol Bolas

Jan 6, 2018, 12:57:25 PM
to ISO C++ Standard - Future Proposals, schreiber...@gmail.com
On Saturday, January 6, 2018 at 12:07:22 PM UTC-5, schreiber...@gmail.com wrote:
On Saturday, January 6, 2018 at 5:33:08 PM UTC+1, Nicol Bolas wrote:
My overall point is this.

We can all agree that `[1, 2]` would be ideal. But because this already has meaning in C++, we can't change it without going through a round of deprecation. So you'd be looking at 6-9 years before we could even add the language feature that lets us give `[1, 2]` the meaning we want.

Is this feature worth that wait? Is it worth the effort of deprecating comma expressions in brackets? Or should we just encourage the use of alternatives?

I say it'd be easier to add a language feature to allow `[{1, 2}]` to work on language arrays than to make `[1, 2]` work. Let's canonize that idiom by adding it to the language. That's something that could (in theory) happen in the C++20 time frame, since it doesn't break backwards compatibility.

So you can either wait 6-9 years for perfection, or get something right now that is almost as good.

To reply to your previous message, yes I think people get turned off C++ because of small awkwardnesses like these. Of course it's not one such little detail that tips the balance, but the whole package of it. We're slowly making things simpler with each iteration of the language (range-based loops, auto, etc), and I think we should continue down that road to make C++ as intuitive to use as possible, if it is to compete with cute newborn languages.

As for your proposal. Consider someone new to the language seeing the "[{1,2}]" construct, and wondering why it is spelled like this.

Why do they care? A new user needs to learn that things are spelled as they're spelled.

"Why" is not an important matter for their learning at this point. Syntax is whatever it is.
 
Can you imagine what they will think when they are told "yeah, it is because [1,2] is a valid syntax that throws away 1 and uses 2 as an index"?

But that's not why. They should be told "The expression `1, 2` is a valid expression that throws away 1 and uses 2. So we wrap it up in an initializer list so that it will be treated as a sequence of values and not an expression." That `[]` is involved is not really the point.

My bet is on "why on earth would someone want that? and why is it given a simpler, more accessible syntax?". It does not inspire trust in how the language was designed.

This one wart will not be noticed among the thousands of other warts C++ has. And however much you may think things in C++ are becoming simpler, the number of warts increases with each revision.

Structured binding is quite a wart. Having comparison operator overloading other than `operator<=>` is a wart. Overloading `operator<<` for writing to streams is a wart. And so on.

Also, the "accessible" argument is trivial at best. You're talking about two characters here. They're even free to think of `[{` and `}]` as being the opening and closing syntax. It's perfectly reasonable to just tell new users "that's the syntax; there is no 'why'."

So I'm willing to wait for perfect. If I know anything about C++ standardization, is that 6-9 years of wait is as good as it gets ;)

It's not just about the wait time. It's also about the pain of it. Getting such a change standardized will not be easy, nor will people changing their code be easy. Those facts, coupled with the relative unimportance of the eventual goal, makes it highly unlikely that this would get standardized. The committee has better things to spend time on, and C++ users have better things to do than add `()` to perfectly functional code.

It isn't worth the wait, and it isn't worth the code changes. The perfect syntax isn't worth the effort needed to get there.
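
For reference, the code change under discussion is the addition of parentheses around an existing comma expression inside `[]`. A minimal sketch, assuming a hypothetical helper `read_after_counting` that is not from this thread: the parenthesized form keeps today's meaning (evaluate the left operand, index with the right one) whether or not the bare comma is later deprecated or given a new meaning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, named for this sketch only.
// The legacy spelling of the subscript would be: v[++counter, i]
inline int read_after_counting(const std::vector<int>& v, int& counter, std::size_t i)
{
    // The parentheses make the whole comma expression a single subscript
    // argument: ++counter is evaluated for its side effect, then i is the index.
    return v[(++counter, i)];
}
```

Whether rewrites like this are common enough to matter in real code bases is exactly the point in dispute above; the sketch only shows what a single one looks like.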

Richard Hodges

Jan 6, 2018, 1:18:42 PM
to std-pr...@isocpp.org
I think there's a way to move the [1, 2] syntax forward a little more aggressively while reducing compatibility clashes.

given the expression

    x, y

This currently means "evaluate x, evaluate y, return a [prvalue?-]reference to y"

Imagine there is a small language change such that what it actually produced was a

std::comma_expansion<decltype(x), decltype(y)>

as if by:

return std::comma_expansion<decltype(x), decltype(y)> { x, y };


And that a comma_expansion&& was implicitly convertible to decltype(y)&& 

Now, legacy operator[] methods would continue to work, as would a new operator[] written to accept a std::comma_expansion:

(quick and dirty) example code:

#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Mock-up only: in the proposal this type would be supplied by the standard
// library; user code may not normally add declarations to namespace std.
namespace std {

template<class...Ts>
struct comma_expansion : std::tuple<Ts...>
{
    using underlying_tuple = std::tuple<Ts...>;

    // Non-copyable.
    comma_expansion(comma_expansion const&) = delete;
    comma_expansion& operator=(comma_expansion const&) = delete;

    using underlying_tuple::underlying_tuple;

    static constexpr std::size_t size() { return sizeof...(Ts); }

    // sizeof...(Ts) is used directly below so that these checks do not
    // depend on calling size() inside the class body.
    static_assert(sizeof...(Ts) != 0, "");

    using last_type = decltype(std::get<sizeof...(Ts) - 1>(std::declval<underlying_tuple>()));

    using last_rvalue_ref = std::add_rvalue_reference_t<last_type>;

    // Legacy behaviour: an rvalue comma_expansion converts to its last element.
    operator last_rvalue_ref() &&
    {
        return std::get<size() - 1>(std::move(*this));
    }
};
}

struct array_2d
{
    auto operator[](int y)
    {
        // as before
    }

    auto operator[](std::comma_expansion<int, int>&& xy)
    {
        // now detects [1, 2]
    }
};

int main()
{
    // with a small compiler change, this could be written as
    // int test = 1, 2;
    int test = std::comma_expansion<int, int>{ 1, 2 };
    assert(test == 2);

    array_2d a;
    // and this could be written as
    // a[4, 5];
    a[std::comma_expansion<int, int>{ 4, 5 }];
}





schreiber...@gmail.com

Jan 6, 2018, 1:50:03 PM
to ISO C++ Standard - Future Proposals, schreiber...@gmail.com
On Saturday, January 6, 2018 at 6:57:25 PM UTC+1, Nicol Bolas wrote:
> On Saturday, January 6, 2018 at 12:07:22 PM UTC-5, schreiber...@gmail.com wrote:
>> On Saturday, January 6, 2018 at 5:33:08 PM UTC+1, Nicol Bolas wrote:
>>> My overall point is this.
>>>
>>> We can all agree that `[1, 2]` would be ideal. But because this already has meaning in C++, we can't change it without going through a round of deprecation. So you'd be looking at 6-9 years before we could even add the language feature that lets us give `[1, 2]` the meaning we want.
>>>
>>> Is this feature worth that wait? Is it worth the effort of deprecating comma expressions in brackets? Or should we just encourage the use of alternatives?
>>>
>>> I say it'd be easier to add a language feature to allow `[{1, 2}]` to work on language arrays than to make `[1, 2]` work. Let's canonize that idiom by adding it to the language. That's something that could (in theory) happen in the C++20 time frame, since it doesn't break backwards compatibility.
>>>
>>> So you can either wait 6-9 years for perfection, or get something right now that is almost as good.
>>
>> To reply to your previous message, yes I think people get turned off C++ because of small awkwardnesses like these. Of course it's not one such little detail that tips the balance, but the whole package of it. We're slowly making things simpler with each iteration of the language (range-based loops, auto, etc), and I think we should continue down that road to make C++ as intuitive to use as possible, if it is to compete with cute newborn languages.
>>
>> As for your proposal: consider someone new to the language seeing the "[{1,2}]" construct, and wondering why it is spelled like this.
>
> Why do they care? A new user needs to learn that things are spelled as they're spelled.
>
> "Why" is not an important matter for their learning at this point. Syntax is whatever it is.

For the first few lessons/tutorials maybe, but not in the long run. Understanding "why" is an essential part of learning. At least that is how I learn.

>> Can you imagine what they will think when they are told "yeah, it is because [1,2] is a valid syntax that throws away 1 and uses 2 as an index"?
>
> But that's not why. They should be told "The expression `1, 2` is a valid expression that throws away 1 and uses 2. So we wrap it up in an initializer list so that it will be treated as a sequence of values and not an expression." That `[]` is involved is not really the point.

True, but that does not make it any better :) That is a lot of concepts required to understand a statement that should be among the most basic in the language. Comma operator. Initializer list. Construction of a temporary. And that `[]` is involved is precisely the point, because the user will want to understand how array indexing works, and they will have to go through all the aforementioned bits to get their answer.

>> My bet is on "why on earth would someone want that? and why is it given a simpler, more accessible syntax?". It does not inspire trust in how the language was designed.
>
> This one wart will not be noticed among the thousands of other warts C++ has. And however much you may think things in C++ are becoming simpler, the number of warts increases with each revision.

Because it is so central to how I write C++, I have certainly noticed it. And seeing how this topic comes back regularly, the OP and I are not the only ones. While you are probably right that the number of warts has increased as new features were added to the language, I think the "core" of C++ (i.e., the code that the 90% writes) overall has fewer warts. Otherwise the committee would be doing a poor job.

> It's not just about the wait time. It's also about the pain of it. Getting such a change standardized will not be easy, nor will people changing their code be easy. Those facts, coupled with the relative unimportance of the eventual goal, make it highly unlikely that this would get standardized. The committee has better things to spend time on, and C++ users have better things to do than add `()` to perfectly functional code.
>
> It isn't worth the wait, and it isn't worth the code changes. The perfect syntax isn't worth the effort needed to get there.

Waiting is cheap. I've been doing it for years for things like reflection, or modules. It hasn't cost me anything so far ;) And the code changes, I am convinced, are so few as to be nonexistent. I think you are overestimating the usage rate of this odd pattern, which IMO is such bad practice that it ought to be ill-formed and deprecated regardless of our present discussion.

Matthew Woehlke

Jan 8, 2018, 12:05:01 PM
to std-pr...@isocpp.org, Nicol Bolas
On 2018-01-06 10:41, Nicol Bolas wrote:
> On Saturday, January 6, 2018 at 7:38:50 AM UTC-5, schreiber...@gmail.com
> wrote:
>> The second does not look like an array indexing operation, but like a
>> function call or a constructor. This is the solution that most libraries
>> nowadays adopt (see references below). It is not dangerous as the above,
>> but it is irritating because it does not convey the correct intent. Syntax
>> highlighters need to parse the definition of "matrix" to know whether
>> "matrix(1,2)" is a function call or not. As a result, most highlighters I
>> have seen treat "matrix(1,2)" as a function call, which is incorrect.
>
> But it *is* a function call. It's either a call to `operator()` or a call
> to a constructor.

Heh, that was exactly my reaction also. However, I think what he means
is that syntax highlighting will make `matrix` look like the *name* of a
function. Which... it isn't, of course; `matrix` here is an *object
instance*.

(Whether any syntax highlighter should even be attempting to make such a
distinction is another argument. I believe katepart did at one point,
and it was decided too troublesome and was dropped. At any rate, good
idea or not, there *are* highlighters that try to do this...)

On 2018-01-06 11:33, Nicol Bolas wrote:
> My overall point is this.
>
> We can all agree that `[1, 2]` would be ideal. But because this already has
> meaning in C++, we can't change it without going through a round of
> deprecation. So you'd be looking at 6-9 years before we could even add the
> language feature that lets us give `[1, 2]` the meaning we want.
>
> Is this feature worth that wait? Is it worth the effort of deprecating
> comma expressions in brackets? Or should we just encourage the use of
> alternatives?

Possibly. In particular, the answer to your second question may be "yes"
even if the answer to the first question is "no". (And, if it is, that
drastically lowers the bar for eventually expanding the syntax per the
first question.)

--
Matthew

Scott Dolim

Jan 15, 2018, 5:28:52 PM
to ISO C++ Standard - Future Proposals
How about this (admittedly offbeat) syntax:

    x[1; 2; 3]    // use semicolons to separate multidimensional indexes

The compiler would find the longest prefix of that argument list that matches an overload of operator[] on x's class (e.g., an X::operator[](int a, int b) would match the first two).  It would peel off that many arguments to form "x.operator[](...)" with them, and then apply any remaining indexes likewise to the return value.

This lets you write the same indexing syntax for classes modeling multidimensional arrays (like a Matrix), as for expressions forming array-of-array types like char*[].  Using this in a template supports instantiation with both kinds of types.  The target value just has to obey the dimension contract, not a specific signature of operator[].  It treats Matrix2D<float[]> indifferently to Matrix3D<float>.  It also lets Matrix supply an operator[](int x, int y) for fast element access alongside an operator[](int x) for row access, and each can be specialized for their own task.

(Note, I'm not sure multidimensional operator[] is a hill that needs dying on, but the "uniform indexing syntax" idea above just came to me while reading the thread, and I thought I'd put it on the table.)
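
The "longest prefix" rule can be approximated in today's language with a helper function, which may make the idea easier to evaluate. The following is only a sketch under stated assumptions: `multi_index`, `apply_from`, `accepts`, and `Grid` are hypothetical names invented here, a member function named `subscript` stands in for a multi-argument operator[], and intermediate results are assumed to be references or pointers into the original object. It tries the longest run of remaining indexes against `subscript`, shortens the run until something matches, falls back to a plain single-argument `[]`, and then continues on the result with whatever indexes are left.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// accepts<Obj, Tuple, Seq>: true when obj.subscript(...) is callable with the
// tuple elements selected by the index sequence Seq.
template<class Obj, class Tuple, class Seq, class = void>
struct accepts : std::false_type {};

template<class Obj, class Tuple, std::size_t... I>
struct accepts<Obj, Tuple, std::index_sequence<I...>,
               std::void_t<decltype(std::declval<Obj&>().subscript(
                   std::get<I>(std::declval<const Tuple&>())...))>>
    : std::true_type {};

// The index sequence Begin, Begin + 1, ..., Begin + Count - 1.
template<std::size_t Begin, std::size_t... I>
constexpr std::index_sequence<(Begin + I)...> offset_by(std::index_sequence<I...>)
{
    return {};
}

template<std::size_t Begin, std::size_t Count>
using range = decltype(offset_by<Begin>(std::make_index_sequence<Count>{}));

template<class Obj, class Tuple, std::size_t... I>
decltype(auto) call_subscript(Obj& obj, const Tuple& args, std::index_sequence<I...>)
{
    return obj.subscript(std::get<I>(args)...);
}

// Applies the indexes args[Begin...] to obj, first trying to consume Count of
// them in one subscript() call and shrinking Count until something matches.
template<std::size_t Begin, std::size_t Count, class Obj, class Tuple>
decltype(auto) apply_from(Obj& obj, const Tuple& args)
{
    constexpr std::size_t total = std::tuple_size_v<Tuple>;
    constexpr std::size_t rest = total - Begin - Count;  // left after this step
    if constexpr (Count == 1) {
        // Nothing longer matched: use the ordinary one-argument [].
        if constexpr (rest == 0) {
            return obj[std::get<Begin>(args)];
        } else {
            auto&& next = obj[std::get<Begin>(args)];
            return apply_from<Begin + 1, rest>(next, args);
        }
    } else if constexpr (accepts<Obj, Tuple, range<Begin, Count>>::value) {
        if constexpr (rest == 0) {
            return call_subscript(obj, args, range<Begin, Count>{});
        } else {
            auto&& next = call_subscript(obj, args, range<Begin, Count>{});
            return apply_from<Begin + Count, rest>(next, args);
        }
    } else {
        return apply_from<Begin, Count - 1>(obj, args);
    }
}

// Stand-in for the proposed x[i; j; k] syntax.
template<class Obj, class... Index>
decltype(auto) multi_index(Obj& obj, Index... i)
{
    static_assert(sizeof...(Index) > 0, "at least one index is required");
    const std::tuple<Index...> args{i...};
    return apply_from<0, sizeof...(Index)>(obj, args);
}

// Hypothetical 2x2 grid whose cells are arrays of 3 floats; subscript(r, c)
// plays the role of a two-argument operator[].
struct Grid {
    float cells[2][2][3];
    float* subscript(int r, int c) { return cells[r][c]; }
};

}  // namespace sketch
```

With these definitions, `sketch::multi_index(g, 1, 0, 2)` on a `Grid` uses `subscript(1, 0)` and then a plain `[2]` on the returned pointer, while the same call on a raw `float[2][2][3]` applies three plain subscripts. That is the "dimension contract" behaviour described above, expressed as a library function rather than as new syntax.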


Cleiton Santoia

Jan 16, 2018, 1:44:44 PM
to ISO C++ Standard - Future Proposals

I think that the C++ behavior of obj[x,y] is awkward.

When using this syntax, GCC already warns us:
main.cpp:24:10: warning: left operand of comma operator has no effect [-Wunused-value]

In the other contexts where [] can appear (lambda captures or structured bindings), x and y are two different things passed to whatever [] will do; only in operator[] is this "pay for two, get one" parameter passing allowed. This is a little embarrassing to explain...

Regardless of new uses for operator[], I'm in favor of deprecating this.

That said, it easily follows that a multi-parameter operator[] could use the same rules as a multi-parameter operator().
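
To make the contrast concrete, here is a small sketch (the function name `comma_contexts` is made up for this example) that puts the three uses of `[]` side by side. The subscript is written with parentheses so that it compiles the same way regardless of how a bare comma inside `[]` is treated; the bare form `v[x, y]` is the legacy spelling being discussed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical demo function, named for this sketch only.
inline int comma_contexts()
{
    int x = 1, y = 2;

    // Lambda capture list: x and y are two separate captures.
    auto sum = [x, y] { return x + y; };

    // Structured binding: a and b are two separate names.
    auto [a, b] = std::pair<int, int>{x, y};

    // Subscript: the comma expression collapses to its right operand, so
    // only y is used as the index. The cast to void marks x as deliberately
    // discarded.
    std::vector<int> v{10, 20, 30};
    int picked = v[(static_cast<void>(x), y)];

    return sum() + a + b + picked;
}
```

In the first two contexts both names contribute to the result; in the subscript only `y` does.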






> How about this (admittedly offbeat) syntax:
>
>     x[1; 2; 3]    // use semicolons to separate multidimensional indexes

This is strange: what happens if someone tries x[1,2,3;2;3,2,1], mixing ',' and ';'?

 