--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Hi,I wonder whether someone here would be able to explain to me the rationale for append() being the way it is. The idiomatic way for appending a new element to a slice seems to be this:type X struct {Foo int}slice := []X{X{1}, X{2}, X{3}}slice = append(slice, X{4})But if X is a larger struct, this is somewhat inefficient, since the object has to be allocated first somewhere on the heap,
just to be copied a moment after to the end of the array underlying the slice. To avoid this, the way I understand it, the slice capacity has to be checked manually, the slice potentially has to be expanded, then you have to take a pointer to an empty cell in the slices array and allocate the new object directly there. So to initialize a new struct directly using the underlying array you have to write this:slice := []X{X{1}, X{2}}l := len(slice)if(l == cap(slice)) {nslice := make([]X, cap(slice)*2)copy(nslice, slice)slice = nslice}slice = slice[0:l+1]x := &slice[l]x.Foo = 4Maybe the compiler optimizes it, maybe commonly the performance penalty isn't big, but I still do wonder how do the Go concepts fit together in the end. Is providing a primitive that would allow to directly initialize a new object at the end of the slice somehow not possible or leading to problems? Even just conceptually, that would seem to make more sense to me, so I wonder what am I missing out. I am trying to understand the design of the language, not trying to criticize anything... Another thing I wonder about is why append is a non-destructive operation, or at least why aren't there two versions like a non-destructive "concat" and destructive "append" based on passing a pointer to the slice.Cheers,Jarosław Rzeszótko
I wonder whether someone here would be able to explain to me the rationale for append() being the way it is. The idiomatic way for appending a new element to a slice seems to be this:
type X struct {Foo int}slice := []X{X{1}, X{2}, X{3}}slice = append(slice, X{4})But if X is a larger struct, this is somewhat inefficient, since the object has to be allocated first somewhere on the heap, just to be copied a moment
after to the end of the array underlying the slice. To avoid this, the way I understand it, the slice capacity has to be checked manually, the slice potentially has to be expanded, then you have to take a pointer to an empty cell in the slices array and allocate the new object directly there. So to initialize a new struct directly using the underlying array you have to write this:slice := []X{X{1}, X{2}}l := len(slice)if(l == cap(slice)) {nslice := make([]X, cap(slice)*2)copy(nslice, slice)slice = nslice}slice = slice[0:l+1]x := &slice[l]x.Foo = 4
--
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
I specifically mentioned I don't mean the compiler, I don't know what the compiler does in detail, but it for sure doesn't generate equivalent assembly for the two cases I have posted. I don't know if there is a detailed enough spec of Go anywhere to substantiate it, but when I say:
x := X{}
It means "Allocate an instance of X on the stack of the current procedure".
When I then say:
append(slice, x)
And the signature of append is:
func append(slice []Type, elems ...Type) []Type
so that elems are passed by value and not by reference, a copy of "x" has to be made on the stack frame for append. So you have "x" on the stack for whatever procedure you are in, "x" on the stack frame for append, and finally "x" lands in the array underlying the slice, so the only place where you really wanted it in. Those are just the good old semantics of function calls and of call-by-value vs call-by-reference. It's irrelevant what the compiler does, in a language with the semantics of Go appending a new element efficiently looks like my Push() I posted earlier. Push() might have other problems, I don't know if it's feasible as a practical proposal, all I was hoping for is an informed discussion of the rationale behind append(), and so far I am getting a lot of pointers to Go tutorials instead.
. I don't know if there is a detailed enough spec of Go anywhere to substantiate it, but when I say:
x := X{}
It means "Allocate an instance of X on the stack of the current procedure".
That said, when you say that "x := X{}" means "allocate an instance of
X on the stack of the current procedure" you are not describing the
situation accurately. "x := X{}" just means "create a new X named x".
The compiler can and will generate code that does things other than
putting an instance of X on the stack. Sometimes it will put an
instance of X on the heap. Sometimes it won't create any instance of
X at all. Sometimes it will create some fields of X but not others.
And in principle it might sometimes create X in a newly-appended
array, though I'm not aware of any Go compilers that actually
implement that optimization.
Ian
| all I was hoping for is an informed discussion of the rationale behind append(),
It's useful functionality that got written out long-hand lots of times before
append was added to the language, it's efficient (enough), optimisationopportunities are available when required, and its easy to explain.
Chris
Hi,
W dniu środa, 15 stycznia 2014 15:45:28 UTC+1 użytkownik chris dollin napisał:| all I was hoping for is an informed discussion of the rationale behind append(),
It's useful functionality that got written out long-hand lots of times before
append was added to the language, it's efficient (enough), optimisationopportunities are available when required, and its easy to explain.
Chris
It is not easy to explain, as long as you want to understand what's going on, as illustrated by this discussion.
You could use the existing semantics of the language to express a function for creating a new element directly in the array with much simpler semantics
and that would work efficiently without any extra compiler support. As said, whether it's allocated on the heap or on the stack doesn't change the fact that two copies are being made
On Wed, Jan 15, 2014 at 3:23 PM, <szt...@gmail.com> wrote:
> x := X{}
>
> It means "Allocate an instance of X on the stack of the current procedure".
Its sematics are ("it means"): _Declare_ a variable x in current scope
and _initialize_ it with value X{}. Observable effects are: Now x can
be used as an operand in expressions and its value, until overwtitten
eventually, will be X{}. BTW, those effects can be sometimes provided
without ever allocating a home for x when a smart compiler kicks in.
Nowhere in the specs is ever stack or heap mentioned. The compiler is
free to put the variable in a register, if it fits there, for example.
Or maybe on the stack, if the compiler cannot optimize x's completely
"home" away, which it sometimes can as mentioned above.
As the code explicitly asks to declare x and initialize x - there's no
doubt that any eventual allocation, register or stack, is
"unnecessary". One down.
> When I then say:
>
> append(slice, x)
>
> And the signature of append is:
>
> func append(slice []Type, elems ...Type) []Type
>
> so that elems are passed by value and not by reference, a copy of "x" has to
> be made on the stack frame for append. So you have "x" on the stack for
> whatever procedure you are in, "x" on the stack frame for append, and
> finally "x" lands in the array underlying the slice, so the only place where
> you really wanted it in. Those are just the good old semantics of function
> calls and of call-by-value vs call-by-reference.
Nowhere in the specs is this implementation detail mandated. I don't
know how other compilers, but I'm aware of compilers which, for code
f(a, b)
evaluate a and b on the stack and then call f. There's, in the general
case, no need to copy a or b if they are just literals or when the
compiler can deduce that x (above) is used only as an argument to f
(or append or whatever) as discussed earlier.
If a or b are, say local variables being still "live" after f
completes, than the semantics of pass by value do require, in the
first approximation, a copy to the f's invocation record. However,
there's nothing "unnecessary" in such copying as that what "pass by
value" is about. Second down.
Let's return to the OP code and its claim about "allocate two
unnecessary copies":
type X struct {
Foo int
}
slice := []X{X{1}, X{2}, X{3}}
slice = append(slice, X{4})
From the above discussion I conclude: Nothing in the specs prevents
the compiler to create just _one_ instance of the literal value X{4},
be it register or stack. That one value must exist at some point
anyway (it gets passed to append), so it's necessary/inevitable to
have it somewhere. (Modulo having an autogenerated specialized version
of append for appending X{}, which is in theory also possible.)
-j
On 15 January 2014 14:53, <szt...@gmail.com> wrote:Hi,
W dniu środa, 15 stycznia 2014 15:45:28 UTC+1 użytkownik chris dollin napisał:| all I was hoping for is an informed discussion of the rationale behind append(),
It's useful functionality that got written out long-hand lots of times before
append was added to the language, it's efficient (enough), optimisationopportunities are available when required, and its easy to explain.
Chris
It is not easy to explain, as long as you want to understand what's going on, as illustrated by this discussion.It's easy to explain if what you want to understand is the /effect/ of callingappend, rather than detailed internal implementation choices. The effect is
defined; the implementation choices can vary.
Whether actual copying -- picking the bytes up from some location and
dropping them doen at another -- happens a lot, a bit, or not at all is at
the whim of the compiler and the accidents of the parameters passed.
You could use the existing semantics of the language to express a function for creating a new element directly in the array with much simpler semanticsSimpler semantcis, how? Such a function still needs to (possibly) allocate
a new bigger underlying array and someone still needs to copy the desiredappended value to the end of the revised slice array.
and that would work efficiently without any extra compiler support. As said, whether it's allocated on the heap or on the stack doesn't change the fact that two copies are being made
/may/ be being made.
W dniu środa, 15 stycznia 2014 16:06:41 UTC+1 użytkownik chris dollin napisał:On 15 January 2014 14:53, <szt...@gmail.com> wrote:Hi,
W dniu środa, 15 stycznia 2014 15:45:28 UTC+1 użytkownik chris dollin napisał:| all I was hoping for is an informed discussion of the rationale behind append(),
It's useful functionality that got written out long-hand lots of times before
append was added to the language, it's efficient (enough), optimisationopportunities are available when required, and its easy to explain.
Chris
It is not easy to explain, as long as you want to understand what's going on, as illustrated by this discussion.It's easy to explain if what you want to understand is the /effect/ of callingappend, rather than detailed internal implementation choices. The effect is
defined; the implementation choices can vary.
Whether actual copying -- picking the bytes up from some location and
dropping them doen at another -- happens a lot, a bit, or not at all is at
the whim of the compiler and the accidents of the parameters passed.
append() has a signature that indicates that it will copy the value, that's the point of having values and pointers in the language. If the compiler does something else, then in does so contrary to the semantics of the language.
On Wed, Jan 15, 2014 at 6:48 AM, <szt...@gmail.com> wrote:
>
> W dniu środa, 15 stycznia 2014 15:40:54 UTC+1 użytkownik Ian Lance Taylor
> napisał:
>>
>> In fact, since Go doesn't have constructors at all, there is no way to
>> express what you want in Go. There is no way to say "turn this memory
>> location into this value" without expressing it as a copy. Of course
>> in many cases the copy will not actually occur. But conceptually it
>> must always be a copy.
>
>
> Why? How is there any copy made in my Push() implementation? You only end up
> with an uninitialized piece of memory for a moment, but that's true anyway
> in Go a lot of times, since it doesn't force you to initialize structs, and
> then you fill it out with the values.
That turns out not to be the case. There is no concept of
uninitialized memory in Go. All memory always holds a valid value for
its type. In implementation terms, what this means is that all memory
is initialized either by the zero value (which is always all zeros) or
by some other valid value.
>> That said, when you say that "x := X{}" means "allocate an instance of
>> X on the stack of the current procedure" you are not describing the
>> situation accurately. "x := X{}" just means "create a new X named x".
>> The compiler can and will generate code that does things other than
>> putting an instance of X on the stack. Sometimes it will put an
>> instance of X on the heap. Sometimes it won't create any instance of
>> X at all. Sometimes it will create some fields of X but not others.
>> And in principle it might sometimes create X in a newly-appended
>> array, though I'm not aware of any Go compilers that actually
>> implement that optimization.
>
> Okay, that's a point, but in practice it likely still has to be allocated
> either on the heap or on the stack, and then has to be copied as it is
> passed by value to append, and then finally copied to the array.
None of these statements is true in all cases. The append function is
part of the language, and not only is the generated code not required
to call a function in all cases, in actual fact for some compilers it
does not call a function in all cases.
Ian
W dniu środa, 15 stycznia 2014 16:06:41 UTC+1 użytkownik chris dollin napisał:
append() has a signature that indicates that it will copy the value,
that's the point of having values and pointers in the language. If the compiler does something else, then in does so contrary to the semantics of the language.
You could use the existing semantics of the language to express a function for creating a new element directly in the array with much simpler semanticsSimpler semantcis, how? Such a function still needs to (possibly) allocate
a new bigger underlying array and someone still needs to copy the desiredappended value to the end of the revised slice array.
I illustrated it with my Push() function several posts earlier. You can initialize the struct directly using a pointer to the right piece of the underlying array.
and that would work efficiently without any extra compiler support. As said, whether it's allocated on the heap or on the stack doesn't change the fact that two copies are being made
/may/ be being made.
And likely are currently being made.
On 15 January 2014 15:15, <szt...@gmail.com> wrote:
W dniu środa, 15 stycznia 2014 16:06:41 UTC+1 użytkownik chris dollin napisał:
append() has a signature that indicates that it will copy the value,The signature makes no difference; all Go function, append included,pass by value. Sometimes the value is a pointer. That doesn't mean
that the value is copied; it means it must /behave as if/ the value is copied.
that's the point of having values and pointers in the language. If the compiler does something else, then in does so contrary to the semantics of the language.
That's not true, if you can't tell from the program's external effects that"something else" has happened.
type T struct {A, B int}func eg() {a := T{1, 2}b := T{3, 4}a.A += 1b.B += 1
}An implementation is free to reorder the two declarations. It is free
to reorder the two assignments. It is free to delete all four statements.
Are these "something elses" contrary to the semantics of the language?What about inlining? Is that contrary to the semantics of the language?
You could use the existing semantics of the language to express a function for creating a new element directly in the array with much simpler semanticsSimpler semantcis, how? Such a function still needs to (possibly) allocate
a new bigger underlying array and someone still needs to copy the desiredappended value to the end of the revised slice array.
I illustrated it with my Push() function several posts earlier. You can initialize the struct directly using a pointer to the right piece of the underlying array.I don't see that the /semantics/ is any simpler.and that would work efficiently without any extra compiler support. As said, whether it's allocated on the heap or on the stack doesn't change the fact that two copies are being made
/may/ be being made.
And likely are currently being made.The significant thing is that it's /allowed/ to not make the copies you
insist /must/ be made.(Also, Ian's posts.)
Chris
--
Chris "allusive" Dollin
Hi,
W dniu środa, 15 stycznia 2014 16:46:57 UTC+1 użytkownik chris dollin napisał:On 15 January 2014 15:15, <szt...@gmail.com> wrote:
W dniu środa, 15 stycznia 2014 16:06:41 UTC+1 użytkownik chris dollin napisał:
append() has a signature that indicates that it will copy the value,The signature makes no difference; all Go function, append included,pass by value. Sometimes the value is a pointer. That doesn't mean
that the value is copied; it means it must /behave as if/ the value is copied.
that's the point of having values and pointers in the language. If the compiler does something else, then in does so contrary to the semantics of the language.
That's not true, if you can't tell from the program's external effects that"something else" has happened.
type T struct {A, B int}func eg() {a := T{1, 2}b := T{3, 4}a.A += 1b.B += 1
}An implementation is free to reorder the two declarations. It is free
to reorder the two assignments. It is free to delete all four statements.
Are these "something elses" contrary to the semantics of the language?What about inlining? Is that contrary to the semantics of the language?
I think there is a huge difference of degree between inlining and "optimizing" a call-by-value to be in fact a call-by-reference.
If you don't write things reasonably conforming to the semantics of the language, you can't just expect the compiler to be able to optimize it away, because you are then not giving the compiler the knowledge necessary to do it, see below.
Those are valid points in general, but not necessarily completely valid here, replying to both you and rog here, the struct might require initialization in the form of an arbitrarily complicated algorithm, and I doubt a compiler can analyze precisely from which parts of your program this initialization algorithm consists of and then apply those parts directly to the arrays memory instead of applying it to whatever the program says to apply it to and then doing a copy, and in the process make sure that the transformation is completely safe to do.
At the very least it's not a trivial optimization, while implementing something like Push() as a primitive is comparatively simple.
ChrisAt the very least it's not a trivial optimization, while implementing something like Push() as a primitive is comparatively simple.
That wasn't what people wanted in the times before append. It's no usebeing efficient if you don't get used.
--
Chris "allusive" Dollin
Hi,
W dniu środa, 15 stycznia 2014 17:15:51 UTC+1 użytkownik chris dollin napisał:That wasn't what people wanted in the times before append. It's no usebeing efficient if you don't get used.
If this is simply "what people wanted", then I don't have any more questions.
the struct might require initialization in the form of an arbitrarily complicated algorithm
W dniu czwartek, 16 stycznia 2014 00:46:40 UTC+1 użytkownik Niklas Schnelle napisał:Why do you think append would be conceptually more efficient with pass-by-reference? Say we are appending to a float64 slice, then copying 8 bytes into a truefloat array (that's backing the slice) will result in a much faster accessible []float64 than the reference based []*float64 that would be the only way if append() would usepass-by-reference in the C sense of passing a pointer.
All arguments are non-pointer values, and I think this signature was meant to suggest that none of the elems will be modified by append, because the appends *function body* will just copy the element to the underlying array. But this is what the go specification says about function calls:
"In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution."
So it is conceptually wrong to use a non-pointer value argument to suggest that there will be a copy made in the function
*body*, the copy takes place before function execution. A call-by-value means making a copy on the stack of the function called, before the function is at all executed, if you at all care about keeping clear semantics. So append to stay efficient
and conceptually clear should take pointers for the elements to be appended, so if you are appending an array of float64, *[]float64, a pointer to an array of float64, not a stack-copy of an array of float64. To express more specifically what happens to the appended elements, append should really take a pointer to a constant, but Go doesn't have const as I understand. In general, as said earlier, I think append should only handle concatenating elements already allocated earlier, and there should be a new primitive that would, just like append, ensure the array underlying the slices has room for one more element, reallocate to a new memory place if needed, and return a pointer to the next "free" element of the array. This is the best way to express the idea of adding a new element to a slice using the zoo of concepts Go has.
The current approach wastes time, space, and is confusing the concepts of the language.
On Thu, Jan 16, 2014 at 1:21 AM, <szt...@gmail.com> wrote:W dniu czwartek, 16 stycznia 2014 00:46:40 UTC+1 użytkownik Niklas Schnelle napisał:Why do you think append would be conceptually more efficient with pass-by-reference? Say we are appending to a float64 slice, then copying 8 bytes into a truefloat array (that's backing the slice) will result in a much faster accessible []float64 than the reference based []*float64 that would be the only way if append() would usepass-by-reference in the C sense of passing a pointer.This is only a conceptual signature for append, append is a compiler builtin function, whichmeans that the compiler can blend the normal rules.
All arguments are non-pointer values, and I think this signature was meant to suggest that none of the elems will be modified by append, because the appends *function body* will just copy the element to the underlying array. But this is what the go specification says about function calls:
"In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution."you're still mixing the semantics of one statement and the actual implementation of that statement.as long as the behavior visible to the user is the same, the implementation is free to do whatever it want.Also, as I said above, append is not a normal function (you can't define your own append in Go, so thisparagraph actually might not describe what happens when you call the builtin function append.
So it is conceptually wrong to use a non-pointer value argument to suggest that there will be a copy made in the functionyou keep mention conceptually. conceptually everything must be allocated on heap, how does that sound?it's actually an optimization to allocate object on stack when it doesn't escape.
So why you analyze the performance of a conceptual thing? Discussing performance implicitly meansan implementation, and having an implementation means give concrete meanings to all concepts.*body*, the copy takes place before function execution. A call-by-value means making a copy on the stack of the function called, before the function is at all executed, if you at all care about keeping clear semantics. So append to stay efficientand conceptually clear should take pointers for the elements to be appended, so if you are appending an array of float64, *[]float64, a pointer to an array of float64, not a stack-copy of an array of float64. To express more specifically what happens to the appended elements, append should really take a pointer to a constant, but Go doesn't have const as I understand. In general, as said earlier, I think append should only handle concatenating elements already allocated earlier, and there should be a new primitive that would, just like append, ensure the array underlying the slices has room for one more element, reallocate to a new memory place if needed, and return a pointer to the next "free" element of the array. This is the best way to express the idea of adding a new element to a slice using the zoo of concepts Go has.because you based all this arguments on false assumptions, they are false.
The current approach wastes time, space, and is confusing the concepts of the language.
the Push you mentioned isn't better, besides it's much harder to use.for example, how to do append(s, a1, a2, a3)?do you need to Push multiple times? Or have a Push that can return any number of pointers?How you could write that line in terms of your Push?
Hi,
W dniu czwartek, 16 stycznia 2014 07:47:57 UTC+1 użytkownik minux napisał:On Thu, Jan 16, 2014 at 1:21 AM, <szt...@gmail.com> wrote:W dniu czwartek, 16 stycznia 2014 00:46:40 UTC+1 użytkownik Niklas Schnelle napisał:Why do you think append would be conceptually more efficient with pass-by-reference? Say we are appending to a float64 slice, then copying 8 bytes into a truefloat array (that's backing the slice) will result in a much faster accessible []float64 than the reference based []*float64 that would be the only way if append() would usepass-by-reference in the C sense of passing a pointer.This is only a conceptual signature for append, append is a compiler builtin function, whichmeans that the compiler can blend the normal rules.
The conceptual signature should still make sense when you for a moment forget that a compiler exists and just think on the level of the programming language as an abstract thing, it should adequately describe the semantics of append(). Why would append do a copy of each of elems before execution? And yet that's what the signature of append() says it does, according to go language spec. Are you saying the signature of append can say anything because it's a builtin and the compiler will anyway interpret it whatever way it wants?
W dniu czwartek, 16 stycznia 2014 07:47:57 UTC+1 użytkownik minux napisał:The conceptual signature should still make sense when you for a moment forget that a compiler exists and just think on the level of the programming language as an abstract thing, it should adequately describe the semantics of append(). Why would append do a copy of each of elems before execution? And yet that's what the signature of append() says it does, according to go language spec. Are you saying the signature of append can say anything because it's a builtin and the compiler will anyway interpret it whatever way it wants?
All arguments are non-pointer values, and I think this signature was meant to suggest that none of the elems will be modified by append, because the appends *function body* will just copy the element to the underlying array. But this is what the go specification says about function calls:
"In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution."you're still mixing the semantics of one statement and the actual implementation of that statement.as long as the behavior visible to the user is the same, the implementation is free to do whatever it want.Also, as I said above, append is not a normal function (you can't define your own append in Go, so thisparagraph actually might not describe what happens when you call the builtin function append.
I am not mixing anything, that's how the signature of append describes what append does according to the language spec,
if the compiler does something else, it only makes things even more messy to understand. In case of Go, language concepts happen to intersect with some implementation details, as you have seen the language spec explicitly mentions that a copy is being made before function body execution.
So it is conceptually wrong to use a non-pointer value argument to suggest that there will be a copy made in the functionyou keep mention conceptually. conceptually everything must be allocated on heap, how does that sound?it's actually an optimization to allocate object on stack when it doesn't escape.
Pointer and non-pointer values are concepts of the language, the heap is not, neither is the stack, it's not even mentioned once in the spec.
So why you analyze the performance of a conceptual thing? Discussing performance implicitly meansan implementation, and having an implementation means give concrete meanings to all concepts.*body*, the copy takes place before function execution. A call-by-value means making a copy on the stack of the function called, before the function is at all executed, if you at all care about keeping clear semantics. So append to stay efficientand conceptually clear should take pointers for the elements to be appended, so if you are appending an array of float64, *[]float64, a pointer to an array of float64, not a stack-copy of an array of float64. To express more specifically what happens to the appended elements, append should really take a pointer to a constant, but Go doesn't have const as I understand. In general, as said earlier, I think append should only handle concatenating elements already allocated earlier, and there should be a new primitive that would, just like append, ensure the array underlying the slices has room for one more element, reallocate to a new memory place if needed, and return a pointer to the next "free" element of the array. This is the best way to express the idea of adding a new element to a slice using the zoo of concepts Go has.because you based all this arguments on false assumptions, they are false.
No, they are not false, you guys just keep using this "appeal to compiler" to justify a language flaw. It's unbelievably
irritating, because A) this is a flaw regardless of whatever the compiler does, B) the compiler does not optimize this currently, and might never be able to for 100% of the cases, as was discussed.
The current approach wastes time, space, and is confusing the concepts of the language.
the Push you mentioned isn't better, besides it's much harder to use.for example, how to do append(s, a1, a2, a3)?do you need to Push multiple times? Or have a Push that can return any number of pointers?How you could write that line in terms of your Push?
I have already mentioned that you should distinguish between the case of concatenating slices, for which append with a pointer argument for elems would be OK, and the case of initializing a new element in the slice. Push() is meant to fix the issues append() has with the latter case.
append() is just intellectually ugly to me, that's all, and I am very surprised people here are defending it almost fanatically like it's a gift from the elders or something, I thought at least one person will simply agree with me that it isn't the best thing under the sun, even if the whole issue isn't a very big deal in most cases. Then you again invent this sort of circular argument, of the sort that was mentioned already, that append() is totally fine because if you would like to do LargeStruct[] then append() would make a copy and that's slow and you would avoid using append() then... I don't get it. You can't just preallocate a larger slice, because you might not know in advance how many elements you will want to add. In addition to wasting (not too much) time, append() also wastes space. Why do it wrong if you can do it right?
On Thu, Jan 16, 2014 at 2:05 AM, <szt...@gmail.com> wrote:W dniu czwartek, 16 stycznia 2014 07:47:57 UTC+1 użytkownik minux napisał:The conceptual signature should still make sense when you for a moment forget that a compiler exists and just think on the level of the programming language as an abstract thing, it should adequately describe the semantics of append(). Why would append do a copy of each of elems before execution? And yet that's what the signature of append() says it does, according to go language spec. Are you saying the signature of append can say anything because it's a builtin and the compiler will anyway interpret it whatever way it wants?
the signature is just for reference how you can use append, not how append take those argumentsbecause it's a builtin function, not a normal function.
All arguments are non-pointer values, and I think this signature was meant to suggest that none of the elems will be modified by append, because the appends *function body* will just copy the element to the underlying array. But this is what the go specification says about function calls:
"In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution."you're still mixing the semantics of one statement and the actual implementation of that statement.as long as the behavior visible to the user is the same, the implementation is free to do whatever it want.Also, as I said above, append is not a normal function (you can't define your own append in Go, so thisparagraph actually might not describe what happens when you call the builtin function append.
I am not mixing anything, that's how the signature of append describes what append does according to the language spec,also, that signature is actually not correct. because you can append a string to a byte slice.if the compiler does something else, it only makes things even more messy to understand. In case of Go, language concepts happen to intersect with some implementation details, as you have seen the language spec explicitly mentions that a copy is being made before function body execution.
My point is you can't discuss performance of a language by looking at the spec.Also, you are actually wrong to cite the function call section of the spec, because at the startof that section, it says:"Given an expression f of function type F,"Does append fits this requirement? what the function type of append?Also, if it's a function, then you can assign to a variable, can you do this in Go?f := append
So it is conceptually wrong to use a non-pointer value argument to suggest that there will be a copy made in the functionyou keep mention conceptually. conceptually everything must be allocated on heap, how does that sound?it's actually an optimization to allocate object on stack when it doesn't escape.
Pointer and non-pointer values are concepts of the language, the heap is not, neither is the stack, it's not even mentioned once in the spec.
So why you analyze the performance of a conceptual thing? Discussing performance implicitly meansan implementation, and having an implementation means give concrete meanings to all concepts.
*body*, the copy takes place before function execution. A call-by-value means making a copy on the stack of the function called, before the function is at all executed, if you at all care about keeping clear semantics. So append to stay efficientand conceptually clear should take pointers for the elements to be appended, so if you are appending an array of float64, *[]float64, a pointer to an array of float64, not a stack-copy of an array of float64. To express more specifically what happens to the appended elements, append should really take a pointer to a constant, but Go doesn't have const as I understand. In general, as said earlier, I think append should only handle concatenating elements already allocated earlier, and there should be a new primitive that would, just like append, ensure the array underlying the slices has room for one more element, reallocate to a new memory place if needed, and return a pointer to the next "free" element of the array. This is the best way to express the idea of adding a new element to a slice using the zoo of concepts Go has.because you based all this arguments on false assumptions, they are false.
No, they are not false, you guys just keep using this "appeal to compiler" to justify a language flaw. It's unbelievablyWhat is the flaw of append?Performance? No, you can discuss the performance without making reference to an implementation,and that's why we keep mention compiler here.
irritating, because A) this is a flaw regardless of whatever the compiler does, B) the compiler does not optimize this currently, and might never be able to for 100% of the cases, as was discussed.
The current approach wastes time, space, and is confusing the concepts of the language.
the Push you mentioned isn't better, besides it's much harder to use.for example, how to do append(s, a1, a2, a3)?do you need to Push multiple times? Or have a Push that can return any number of pointers?How you could write that line in terms of your Push?
I have already mentioned that you should distinguish between the case of concatenating slices, for which append with a pointer argument for elems would be OK, and the case of initializing a new element in the slice. Push() is meant to fix the issues append() has with the latter case.
OK, let's discuss how you can use Push to write:append(s1, X{1}, X{2})Push as you proposed can only initialize one element, which IMO is the flaw.(also, you failed to consider that allocating new slice already zeros the slice,so it actually doesn't reduce the amount of unnecessary memory initialization,be it zeroing or copying another non-zero value)
a = append(a, make([]T, j)...)
Then your code can continue to enquire and place elements into the already-allocated space.
--
On Thu, Jan 16, 2014 at 6:05 PM, <szt...@gmail.com> wrote:The conceptual signature should still make sense when you for a moment forget that a compiler exists and just think on the level of the programming language as an abstract thing, it should adequately describe the semantics of append(). Why would append do a copy of each of elems before execution?
append() is just intellectually ugly to me
The end result is: you artifically demonstrated a pathological
case for which your custom approach works better.
I demonstrated you that appending bytes using stock append()
works faster, and doing this *is* very common.
What to take from this? "There's no silver bullet." [1]
Built-in append() works OK for a broad range of cases on
recent hardware. Basically every data type instances of which
has size <= unsafe.Sizeof(unitptr) will work faster with append()'s
"pass by value" than with yours "pass via pointer".
If you've profiled your program and identified append() is
a bottleneck, you either start to store pointers to objects
in your slice instead of objects themselves or write custom
code like you did.
On Thu, Jan 16, 2014 at 11:06 AM, <szt...@gmail.com> wrote:
>> Show us _one_ well written piece of Go code which passes [100
>> million]int arrays by value to append and we could consider this
>> discussion rational again.
>>
>> -j
>
>
> The benchmark has to fix certain numbers to be able to meaningfully compare
> between the two programs, if you just stop and think about it for just one
> minute before posting the next response about how I should use Go features
> better, you will easily see that the problem illustrated can occur in a real
> program.
Where can I see such "real" program? Do experienced Go programmers
really write code appending [100 million]int elements by value?
> 95% of the time you don't know at compile time how many elements to
> allocate, and have to allocate a new larger array and copy the contents of
> the old one to the new as you go and encounter new inputs, so you use
> append. My benchmark measures how well append() handles such a scenario
> under the assumption that a certain number of elements of certain size are
> in effect finally appended. The point that in the benchmark program you
> could directly access the j-th element of the slice and initialize it is
> irrelevant, because the moment you do it the benchmark stops measuring
> anything interesting.
I suggest to research using real Go code bases like for example the
stdlib. You'll find that elements of slices tend to be typically in
several orders of magnitude smaller than the unreal example of [100
million]int. Next step is to realize that the design of append is
optimized for the/such _majority_ of uses and it thus will not
necessarily scale to _every possible_ use. Yet another step would be,
in theory, that if append would have been designed for typical use of
appending a [100 million]int element, then it quite probably would
have been designed taking a element's pointer.
On the opposite, designing it taking a pointer would probably hurt
many, if not most of the cases where the element is really small. Like
byte or few bytes. Which you can see happening all over the code in
the wild. And if it's more than few bytes it's probably the case of
appending a []byte, so the value is just three words.
> It's just glitch, it's already not worth the time this discussion took, but
> the kind of response I get here, it's just ridiculous. It's like an all time
> ongoing competition in who will first show the OP is an idiot who doesn't
> know how to use the language, without doing any serious attempt at
> understanding what's being said. It took away any sympathy I had for the Go
> language itself.
I don't know why discussion "metadata" are being repeatedly discussed
by you, but I see others as focused on the technical side of the
"problem" as possible. I don't recall anyone else here is concerned
about something being 'ridiculous', about 'sympathy' and other
non-technical issues.
I, for one, think that the OP claims are technically wrong and the
later attempts to render them correct by unreal examples don't make it
any better. Also, I think that if you're gonna, by chance, code in Go
for some time, you'll find using append often; while it typically will
not be an unnecessary and/or avoidable bottleneck of your programs.
On Thu, Jan 16, 2014 at 6:05 PM, <szt...@gmail.com> wrote:The conceptual signature should still make sense when you for a moment forget that a compiler exists and just think on the level of the programming language as an abstract thing, it should adequately describe the semantics of append(). Why would append do a copy of each of elems before execution?
And why would it not? If I don't care about computer internals ("efficiency"), from mathematical point of view presence of intermediate copy is irrelevant to the result. If I *do* care about computer internals, I do care about *full depth* of the toolchain, including optimizations, I don't see why should I stop in the middle of the road; and then, copy by value may be often actually more efficient today, for example because of cache locality. Or may not. But I shall not care anyway until I actually measure this as performance bottleneck in a concrete real app; only then I shall search for possible optimizations.
append() is just intellectually ugly to me
append() is just intellectually pretty to me.
In real life, when I put a box on a stack of boxes, I tend to first stuff it and close it, and only then put it on top of the stack. I rarely do opposite (i.e. put the box on stack top, then fill, then close). This matches the current syntax of append() for me, and is intellectually simpler to me, also through separation of concerns and more "functional" approach.
Also, append() has some more advantages for me over Push as described above, including that I can easily, without necessity of intermediate values, express with it many things, including:
vec = append(vec, Foo{
Bar: "bar",
Baz: "baz",
}, Foo{
Bar: "bar2",
Baz: "baz2",
})
and common slice operations (although they tend to often be overly verbose still): https://code.google.com/p/go-wiki/wiki/SliceTricks
I tried here to discuss with your concerns about "concept" as I could understand it (thus the "boxes" metaphor). I've seen also concerns about efficiency/optimizations/implementation - I'm not sure which of those are actually more important to you in this discussion; it might, I believe, be good idea for easier discussion to focus on one concern, if one is most important, or to list them all explicitly and separately.
Just my $0.02, anyway. I have no affiliation etc.
/M.
What point precisely was "explained" to me? [...] that there are some
tradeoffs present [ ... ] I try to understand one aspect of it [Go] in more depth.
Hi,
W dniu środa, 15 stycznia 2014 09:36:10 UTC+1 użytkownik Abhijit Kadam napisał:
On Wednesday, January 15, 2014 1:52:49 PM UTC+5:30, andrey mirtchovski wrote:> I think what he means is avoid the creation of temporary X{4} that is
> created and copied to slice's array
if the underlying array has the capacity to accept "X{4}" at any
location then that memory is already allocated and the only way to
initialize it to a non-zero element is either to set the appropriate
fields explicitly or to copy a whole struct en masse.I think he knows copy en masse happens and he is expecting former option (set appropriate fields explicitly). This is interesting as it is imperative to me that the check for capacity code would be in append function and by that time the temporary is already created and passed to it.Also I see no major problem with current behavior.
Exactly, thank you for putting in the effort to understand the question and choosing a wise interpretation and not the most naive one. That's the whole point, it's not like setting the fields explicitly and doing a copy are equivalent - the struct can potentially be large and have a large initialization cost, that will then be paid twice with the current implementation of append. I think it would potentially be interesting to be able to initialize the struct directly in the memory in which it will later be stored.
Cheers,
Jarosław Rzeszótko
Hi,I wonder whether someone here would be able to explain to me the rationale for append() being the way it is. The idiomatic way for appending a new element to a slice seems to be this:type X struct {Foo int}slice := []X{X{1}, X{2}, X{3}}slice = append(slice, X{4})But if X is a larger struct, this is somewhat inefficient, since the object has to be allocated first somewhere on the heap, just to be copied a moment after to the end of the array underlying the slice. To avoid this, the way I understand it, the slice capacity has to be checked manually, the slice potentially has to be expanded, then you have to take a pointer to an empty cell in the slices array and allocate the new object directly there. So to initialize a new struct directly using the underlying array you have to write this:slice := []X{X{1}, X{2}}l := len(slice)if(l == cap(slice)) {nslice := make([]X, cap(slice)*2)copy(nslice, slice)slice = nslice}slice = slice[0:l+1]x := &slice[l]x.Foo = 4Maybe the compiler optimizes it, maybe commonly the performance penalty isn't big, but I still do wonder how do the Go concepts fit together in the end. Is providing a primitive that would allow to directly initialize a new object at the end of the slice somehow not possible or leading to problems? Even just conceptually, that would seem to make more sense to me, so I wonder what am I missing out. I am trying to understand the design of the language, not trying to criticize anything... Another thing I wonder about is why append is a non-destructive operation, or at least why aren't there two versions like a non-destructive "concat" and destructive "append" based on passing a pointer to the slice.Cheers,Jarosław Rzeszótko
On Thu, 16 Jan 2014 03:09:59 -0800 (PST)
szt...@gmail.com wrote:
> > The end result is: you artifically demonstrated a pathological
> > case for which your custom approach works better.
> > I demonstrated you that appending bytes using stock append()
> > works faster, and doing this *is* very common.
>
> Why do you say I demonstrated it "artifically"?
As many people said earlier in this thread, it's hard to find
Go code written by a serious programmer that would use append()
in the context similar to that you have demonstrated -- that is,
with types instances of which are this large.
It does not render your testing code "wrong" (in whatever sense
you take it to be) it just demonstrates it's artifical.
If you're not convinced, here's another example: you might write a
piece of code (in whatever language you prefer) that would try to
allocate more than 3 GiB and run it on a commodity 32-bit OS --
it's oblidged to fail but then again typical programs working on 32-bit
systems do not try to allocate that much memory all at the same time.
> I said all the time from the start it's just a little quirk, and I am
> curious about the rationale, and all the way until this point people
> have in various way denied it's at all an issue,
I'm in the same camp with those people. I was just pleased that you
have finally turned to presenting hard data and then tried to present
another set of hard data demonstrating a case of using append() pretty
opposite to yours. That wasn't to defeat your point but rather to
demonstrate with numbers the use case append() is tailored for.