Quick question/suggestion: wouldn't it be feasible/reasonable to have
array items defaulting to the type of the array itself in literals?
An example would explain it better. E.g. This:
[]Type{Type{...},Type{...}}
Would be optionally spelled as
[]Type{{...},{...}}
Can someone see a downside to this?
--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/twitter
i'd like to see something like this.
it would need a grammar change, as currently the values inside
an initialiser block are simple expressions, but this would have to
change so that, in this one case only, braces would be allowed,
and imply a nested initialiser block.
i quite like this particular take on it, as it doesn't try
to make initialiser blocks without types more
generally assignment compatible, which has
the semantics (and the syntax) seems quite straightforward.
from this:
p := [][]Point{
[]Point{Point{0, 0}, Point{0, 20}, Point{20, 0}, Point{20, 20}},
[]Point{Point{0, -50}, Point{500, 500}},
}
to this:
p := [][]Point{
{{0, 0}, {0, 20}, {20, 0}, {20, 20}},
{{0, -50}, {500, 500}},
}
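
For reference, a minimal sketch of the two spellings side by side, written against present-day Go, where the shorter form is accepted. The Point struct and its X and Y fields are assumptions made for illustration; the thread never defines Point.

```go
package main

import (
	"fmt"
	"reflect"
)

// Point is a hypothetical type; the thread does not show its definition.
type Point struct{ X, Y int }

// explicit spells out every inner type, as in the "from this" form.
func explicit() [][]Point {
	return [][]Point{
		[]Point{Point{0, 0}, Point{0, 20}, Point{20, 0}, Point{20, 20}},
		[]Point{Point{0, -50}, Point{500, 500}},
	}
}

// elided leaves the inner types out, as in the "to this" form.
func elided() [][]Point {
	return [][]Point{
		{{0, 0}, {0, 20}, {20, 0}, {20, 20}},
		{{0, -50}, {500, 500}},
	}
}

// same reports whether both spellings build the same value.
func same() bool { return reflect.DeepEqual(explicit(), elided()) }

func main() {
	fmt.Println(same())
}
```

Both functions build the same value; only the spelling of the literal differs.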
var foo []string = {"hello", "world"}
which would be equivalent to
var foo = []string{"hello", "world"}
This doesn't gain us much, but then the same rules would apply for
variable initialization as for array/slice initialization: braces
without a type preceding them would have the (known) implied type. It
seems nicer if one could avoid special rules that only apply in one
circumstance.
David
--
David Roundy
I agree that if you were to do this, it should apply to all cases
where the type of the composite literal is known, rather than be a
special case.
In cases like:
var x interface{} = {5, 2, 3}
or
arr := []interface{}{{5, 4}, {3, 2}}
which I seem to recall as examples of reasons not to do this, either
one of two things could happen:
Error: Untyped composite literal not assignable to type interface{}
or untyped composite literals are assumed to be []interface{} when
assigned to an interface (as is the case with numeric literals and int
or float).
The argument was that it gets complex trying to infer the type of the
nested array, but as long as the type is known, then I don't see this
as a stumbling block.
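
The comparison with numeric literals can be seen in a small sketch. It only shows the existing default-type rule for untyped constants stored in an interface; the untyped composite literal discussed above is not valid Go, so the last value uses an explicit []interface{} type. The helper name dynamicType is made up for this example.

```go
package main

import "fmt"

// dynamicType reports the type actually stored in an interface value.
func dynamicType(v interface{}) string { return fmt.Sprintf("%T", v) }

func main() {
	// Untyped constants take their default type when the context is interface{}.
	var i interface{} = 5
	var f interface{} = 2.5

	// A composite literal has no such default, so its type is written out.
	var s interface{} = []interface{}{5, 2, 3}

	fmt.Println(dynamicType(i), dynamicType(f), dynamicType(s))
}
```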
that's what i meant when i mentioned initialiser blocks being
generally assignment-compatible (in my editing, i seem to have
deleted the word "problems" after "which has").
the difficulty with making initialiser blocks assignment
compatible is firstly grammatical: an expression is currently a legal
statement, but a block statement is not an expression - how would
you distinguish an initialiser block from a block statement when
they're now both legal in the same place?
secondly, there's the difficulty of working out the implied type.
what, for instance, would this print?
func(x interface{}){fmt.Printf("%T\n", x)}({nil, os.Stdout})
neither of those problems occur in Gustavo's proposal, AFAICS.
The reason is so that when you're checking an expression
you don't have to carry type information down with you.
When you're looking at the Type{...} inside
[]Type{Type{...},Type{...}}
it doesn't matter that it's inside the []Type{...} wrapping.
You just type check it normally and then pass the result up.
If you omitted the inner "Type" words then you'd be looking
at {...} and have no idea what to do with it (unless you pass
the context down). Parsing this kind of thing is one of
the thorniest parts of writing a C type checker. We made
a conscious decision to make the language simpler at the
cost of a little repetition in tables like this.
I think allowing you to omit the inner "Type" words would
also mean using a different grammar for initialization than
for ordinary expressions, another complication from C that
we'd like to avoid.
From a compiler writer's perspective, Go actually feels
like a simpler language than C despite being so much more
expressive. I'd like to keep it that way.
Russ
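
A toy sketch of what "passing the context down" means. This is not the Go compiler's code: Type, Lit and check are hypothetical names, and the model only tracks whether a literal's type is known, nothing else.

```go
package main

import (
	"errors"
	"fmt"
)

// Type is a toy type: either a named type or a slice of Elem.
type Type struct {
	Name string
	Elem *Type
}

// Lit is a toy composite literal; Type is nil when the source wrote a bare {...}.
type Lit struct {
	Type  *Type
	Elems []*Lit
}

var errNoContext = errors.New("untyped composite literal with no known type")

// check resolves lit's type. expected is the element type implied by the
// enclosing literal, or nil when there is no enclosing literal.
func check(lit *Lit, expected *Type) (*Type, error) {
	t := lit.Type
	if t == nil {
		t = expected // the one place where context flows downwards
	}
	if t == nil {
		return nil, errNoContext
	}
	for _, e := range lit.Elems {
		if _, err := check(e, t.Elem); err != nil {
			return nil, err
		}
	}
	return t, nil
}

// typedOuter models []Point{{...}, {...}}.
func typedOuter() *Lit {
	point := &Type{Name: "Point"}
	return &Lit{Type: &Type{Elem: point}, Elems: []*Lit{{}, {}}}
}

// bareOuter models {{...}, {...}} with no type written anywhere.
func bareOuter() *Lit {
	return &Lit{Elems: []*Lit{{}, {}}}
}

func main() {
	_, err := check(typedOuter(), nil)
	fmt.Println("typed outer literal:", err)
	_, err = check(bareOuter(), nil)
	fmt.Println("bare outer literal:", err)
}
```

With every inner type written out, the expected parameter would never be consulted, which is the bottom-up property described above.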
http://golang.org/doc/go_programming_faq.html#nested_array_verbose
(I expected it to be under the language design FAQ.)
- Evan
it's true that the type information would have to be
passed downwards, but this is only true for initialisation blocks,
and doesn't mean that it would be needed for other expressions.
the difference of this proposal from previous proposals
is that the type is always available and ready to be
passed downwards. i think that the only part of the
type checker that would have to change is typecheckcomplit.
> I think allowing you to omit the inner "Type" words would
> also mean using a different grammar for initialization than
> for ordinary expressions, another complication from C that
> we'd like to avoid.
i don't think it does. AFAICS, an extra
| '{' keyval_list '}'
within the keyval_list production would suffice, grammatically.
> From a compiler writer's perspective, Go actually feels
> like a simpler language than C despite being so much more
> expressive. I'd like to keep it that way.
again, i think that this proposal could keep things simple
while cutting out a lot of repetition.
caveat: i have only had the briefest of glances at the compiler source :-)
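
As a rough check against present-day tooling (not the 2010 yacc grammar discussed here), the standard go/parser package accepts a bare {...} as an element and records it as a composite literal with no type. The helper name elidedCount is made up for this sketch.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// elidedCount parses src as a Go expression and counts the composite
// literals whose type was left out.
func elidedCount(src string) (int, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return 0, err
	}
	n := 0
	ast.Inspect(expr, func(node ast.Node) bool {
		if lit, ok := node.(*ast.CompositeLit); ok && lit.Type == nil {
			n++
		}
		return true
	})
	return n, nil
}

func main() {
	for _, src := range []string{
		"[]Type{Type{1, 2}, Type{3, 4}}",
		"[]Type{{1, 2}, {3, 4}}",
	} {
		n, err := elidedCount(src)
		fmt.Println(src, n, err)
	}
}
```

The parser only notes that the type is absent; filling it in is left to the type checker.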
ignoring the syntactical difficulty, doing this in general is awkward - type
information has to pass both up and down the tree.
consider:
var a, b *os.File
var c, d *bytes.Buffer
var f func([][]os.Reader)
x := {{a, b}, {c, d}}
f(x)
to type check this, the type checker has to unify the expected
type of the argument to f with the bottom-up types of the actual
argument. []*os.File must unify with []*bytes.Buffer to
this is considerably more complex than the current
situation.
what is the type of x?
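
For contrast, a sketch of the case where the outer type is written out, so nothing has to be unified: the element type is simply handed down to the inner literals. It uses io.Reader where the example above writes os.Reader, which is an assumption about the intended interface, and f's body is invented so the program has something to run.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// f stands in for the function in the example; it counts the readers it is given.
func f(rs [][]io.Reader) int {
	n := 0
	for _, row := range rs {
		n += len(row)
	}
	return n
}

// count builds the nested literal with only the outermost type spelled out.
func count() int {
	var a, b *os.File
	var c, d *bytes.Buffer
	// Each inner {...} is a []io.Reader; a, b, c and d are converted to
	// io.Reader one by one, so no common type has to be inferred.
	x := [][]io.Reader{{a, b}, {c, d}}
	return f(x)
}

func main() {
	fmt.Println(count())
}
```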
Here's what I said:
>> Error: Untyped composite literal not assignable to type interface{}
>>
>> or untyped composite literals are assumed to be []interface{} when
>> assigned to an interface (as is the case with numeric literals and int
>> or float).
It doesn't infer the type from the contents but from the context. If
the type cannot be inferred from the context, then it is either an
error, or it will assume a default type ([]interface{} or
map[interface{}]interface{}). The error is probably simpler and less
prone to causing problems. By the same coin, it's more restrictive
than the alternative.
To clarify,
> x := {{a, b}, {c, d}}
Would be an error:
Type of x cannot be inferred from untyped composite literal
Or it could be equivalent to:
[]interface{}{[]interface{}{a, b}, []interface{}{c, d}}
Thus, f(x) would be an error, due to a type mismatch. The first option would be simpler.
If we have to choose between sometimes confusingly an error and
sometimes confusingly not-at-all-what-you-would-expect, then neither
option is satisfactory.
I agree with "sometimes confusingly not-at-all-what-you-would-expect"
for the default type. I do not agree that the error is confusing
(though it may be unexpected if you wanted the compiler to guess the
type of the composite literal from its contents). As long as you
accept that the type of a composite literal is dependent on its
context, which is the same rule as with numeric literals, then this
error makes sense.
If you need extra hand holding, change the error to:
Type of x cannot be inferred from untyped composite literal. Use typed
composite literal.
The problem is easy enough to fix, and the error can be however
helpful you want it to be, as opposed to this similar error already in
the language:
var x interface{} = 2147483648
constant 2147483648 overflows int
On the surface, this is a confusing error. I never said it was an int.
Why doesn't the compiler infer a good type for it? It would be clear
if the compiler choked on this and said "untyped numeric literal
2147483648 is not assignable to type interface{}; use a type cast to
specify its type". However the Go authors chose ease of use (not
needing to specify the type when it's an int, which is most of the
time) over clarity in the errors. Once you know that the type of an
integer literal depends on its context, and in the absence of
context, int is used, then you won't be at all fazed by this error.
The fact is that any error is confusing if you don't know the
language. A measure of a good error is if the way to fix it is clear
from the error text. In this case, the error can be however handholdy
you like it, much more so than many errors that we already accept, and
the reason for it is fairly simple and intuitive once you know the
principle it's based on, not to mention that that same principle is
already present in the language.
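
A small sketch of the fix for the numeric case: give the constant an explicit type with a conversion. Whether the bare form compiles depends on the width of int, so only the converted form is shown. The helper names typeOf and bigConstantType are made up for this example.

```go
package main

import "fmt"

// typeOf reports the type stored in an interface value.
func typeOf(v interface{}) string { return fmt.Sprintf("%T", v) }

// bigConstantType stores the constant with an explicit type, so the default
// type int is never involved.
func bigConstantType() string {
	var x interface{} = int64(2147483648)
	return typeOf(x)
}

func main() {
	fmt.Println(bigConstantType())
}
```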
http://codereview.appspot.com/2226048
On 27 September 2010 13:39, Gustavo Niemeyer <gus...@niemeyer.net> wrote:
That's very interesting, thanks Roger. Quite a short change indeed,
and not too magical at all.
The reason we didn't do anything about this before is simple: Afraid of complexities that arose in previous languages that tried to do this, we decided to require full types everywhere but with the plan to improve matters once we understood things, knowing that existing programs would never break if we later admitted abbreviated forms. With more experience now, we're ready to address the issue. Here's a simple step that we are sure works and seems to help in almost all cases.
Proposal:
Allow elision of the type from the element(s) of a composite literal if:
- the outside type is array, slice or map (not struct)
- the element is a composite literal value (not a pointer)
That's the whole thing.
This simple variant covers the great majority of existing verbosity in the code base. It avoids the thorny issues that arise when the outer type is a struct, in effect deferring those cases for now (and maybe indefinitely).
With this proposal,
[]T {
T{ 1, 2, 3 },
}
can be written
[]T {
{ 1, 2, 3 },
}
But note that
[]*T {
&T{ 1, 2, 3 },
}
cannot be simplified under this proposal because &T is not a type. We have discussed this case at length and, for now, have decided to leave it out for simplicity and to avoid conflating expressions and types in the proposal. Something might be done later.
-rob
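
A sketch of the forms the proposal covers, written against present-day Go. T's fields A, B and C are assumptions; the proposal only shows T{ 1, 2, 3 }.

```go
package main

import "fmt"

// T is a hypothetical three-field struct standing in for the proposal's T.
type T struct{ A, B, C int }

// values uses the elided form for the three outer kinds named in the
// proposal: slice, array and map.
func values() ([]T, [2]T, map[string]T) {
	s := []T{{1, 2, 3}}
	a := [2]T{{1, 2, 3}, {4, 5, 6}}
	m := map[string]T{"first": {1, 2, 3}}
	return s, a, m
}

// pointers keeps &T written out, as the proposal requires.
func pointers() []*T {
	return []*T{&T{1, 2, 3}}
}

func main() {
	s, a, m := values()
	fmt.Println(s, a, m, *pointers()[0])
}
```

The current Go specification also allows &T to be elided when the element type is *T, so that case was indeed handled later.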
http://codereview.appspot.com/2299041/
A few files in that CL (the ones with a delta from patch 1
in patch 2: asn1_test, marshal_test) have hand-made changes
that are now possible: many one-off types can be omitted
entirely. For example, look at
http://codereview.appspot.com/2299041/diff/2001/src/pkg/big/rat_test.go
where the types setStringTest, floatStringTest, and so on
can be eliminated entirely.
If we decide to do this, we will add a flag to gofmt to make
the patch 1 changes (just stripping T when possible) so that
people can update source code as they wish. It will not be
required to omit the type.
Russ
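
A sketch of the "one-off types can be omitted" point: once elements need no type name, a test table can use an anonymous struct type. The table contents and the names upperTests and failures are invented for this example and are not taken from the CL.

```go
package main

import (
	"fmt"
	"strings"
)

// upperTests needs no named type: each element's type is implied by the slice.
var upperTests = []struct {
	in, out string
}{
	{"go", "GO"},
	{"Gopher", "GOPHER"},
}

// failures counts the table entries that strings.ToUpper does not satisfy.
func failures() int {
	n := 0
	for _, tt := range upperTests {
		if strings.ToUpper(tt.in) != tt.out {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("failures:", failures())
}
```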
Out of interest, what are the "thorny issues"?
Structs are heterogeneous, so there are mysterious
errors one can imagine getting when adding or removing
fields in the struct. (The imagination is strongly
influenced by similar situations in C.)
In contrast, the types covered by this proposal
have completely uniform element types.
This isn't necessarily the last word, but we want to
move carefully and deliberately. We think everyone
is comfortable with eliding the T in the cases covered
here. Let's build up some experience with that and
then decide whether and how to move forward.
Russ
Go is designed with big software in mind, so some ideas that make sense in small programs start to cause problems at scale. If you didn't have to specify the types in an un-tagged struct literal, someone might one day rearrange the order of the elements of a struct, causing code far away from the edit to break mysteriously. This sort of thing can happen much more often, and even more mysteriously, when you have hundreds of programmers in the code base. Thinking like this has influenced a number of properties of Go, particularly in the package system but also in the way it requires you to be explicit when it really matters.
Tagged struct literals, you say? Well, that's another case we might handle. But let's start with a minimal, simple change first.
-rob
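
A sketch of the reordering hazard with untagged struct literals, and of how tagged (keyed) literals avoid it. Before and After are hypothetical types standing for the same struct on either side of an edit that swaps two fields.

```go
package main

import "fmt"

// Before is the struct as originally written.
type Before struct{ Width, Height int }

// After is the same struct once someone has swapped the two fields.
type After struct{ Height, Width int }

// positionalWidths shows the untagged literal silently changing meaning.
func positionalWidths() (int, int) {
	return Before{3, 4}.Width, After{3, 4}.Width
}

// keyedWidths shows the tagged literal surviving the reordering.
func keyedWidths() (int, int) {
	return Before{Width: 3, Height: 4}.Width, After{Width: 3, Height: 4}.Width
}

func main() {
	fmt.Println(positionalWidths())
	fmt.Println(keyedWidths())
}
```

Because both fields are ints, the untagged literal still compiles after the swap, which is what makes the breakage mysterious.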
Definite +1 for this, BTW, and congrats to Gustavo for the original suggestion.
On 11 Oct 2010 21:09, "Rob 'Commander' Pike" <r...@google.com> wrote:
On Oct 11, 2010, at 12:53 PM, Russ Cox wrote:
> On Mon, Oct 11, 2010 at 15:47, roger peppe <rogpep...
Looks great indeed. Thank you very much.
> A few files in that CL (the ones with a delta from patch 1
> in patch 2: asn1_test, marshal_test) have hand-made changes
> that are now possible: many one-off types can be omitted
> entirely. For example, look at
Nice, hadn't foreseen the simplification in these cases.
That's great. It covers precisely the boilerplate which felt
unnecessarily overwhelming to repeat. Thanks, and +1 as well.
> cannot be simplified under this proposal because &T is not a type. We have discussed this case at length and, for now, have decided to leave it out for simplicity and to avoid conflating expressions and types in the proposal. Something might be done later.
Sounds reasonable. I'm also not sure at this point if eliding "&T"
entirely would make it less understandable, and "&{...}" doesn't
really feel great.
This reminded me of an issue that has (vaguely) bothered me for some
time, and which I've brought up before, which is the use of ints
rather than new types in the standard libraries. It seems to me to be
a very similar question. Defining separate Gid and Uid types (or for
open flags) doesn't help so much with small programs, but for larger
programs that might pass these things around and store them in large
structs (where they might get reordered some day), having separate
types could catch precisely the same sorts of bugs you're concerned
about with untagged literals. I'd much rather see type casts when
something marginally unsafe is done (like comparing a Uid with a Gid,
or using a Gid as a Uid, in both cases usually meaning a bug), and
have the safety of knowing that if fields get reordered I won't ever
end up accidentally using the Uid as a length, the Gid as the Uid and
the length as the Uid...
--
David Roundy
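
A sketch of the distinct-types idea. Uid and Gid here are defined locally for illustration; they are not the standard library's declarations.

```go
package main

import "fmt"

// Uid and Gid are hypothetical distinct integer types.
type Uid int
type Gid int

// sameNumber compares the two kinds of ID. The conversions make the
// "marginally unsafe" step visible at the call site.
func sameNumber(u Uid, g Gid) bool {
	// Writing u == g directly would not compile: mismatched types Uid and Gid.
	return int(u) == int(g)
}

func main() {
	fmt.Println(sameNumber(Uid(1000), Gid(1000)))
}
```

Passing a Gid where a Uid is expected is likewise rejected at compile time, which is the class of reordering bug described above.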