>Stephen Sprunk wrote:
>> On 21-Oct-12 01:26, Terje Mathisen wrote:
>> > Ivan Godard wrote:
>> >> This is especially a notational problem for our Mill, in which hardware
>> >> call operations can return more than one result. It works fairly
>> >> naturally for Ada and other languages with OUT parameters, but there's
>> >> no good way to reach it from C.
>> > I believe I have seen a C compiler which had an intrinsic that did
>> > expose this, but you needed an address parameter to get back the
>> > remainder, something like
>> I can think of innumerable cases where it would be useful to return
>> multiple values without having to mess about with temporary structures
>> or pass-by-pointer. It would be a nice touch if it were also legal to
>> call the above function like this:
>> Adding this to the language seems a rather obvious request, so I have to
>> assume that there's some serious obstacle that isn't so obvious?
>This has merit for many normal functions to return multiple values
>as a result rather than through argument list pointers. The syntax is
>clear and the implementation would not be difficult.
I am sorry, but that is seriously mistaken. All that is true in a
cleaner language, but is completely the opposite of the truth in
the utter mess that C and C++ have become. It was discussed in
WG14 when I was on it, and rejected on those grounds. We didn't
consider using braces rather than parentheses, but I am sure that
I could think of some syntactic nasties with those, especially in
C++. For example, remember that assignment is an operator and
delivers an lvalue, and is less binding than the conditional and
comma operators.
> >> Adding this to the language seems a rather obvious request, so I have to
> >> assume that there's some serious obstacle that isn't so obvious?
> >This has merit for many normal functions to return multiple values
> >as a result rather than through argument list pointers. The syntax is
> >clear and the implementation would not be difficult.
> I am sorry, but that is seriously mistaken. All that is true in a
> cleaner language, but is completely the opposite of the truth in
> the utter mess that C and C++ have become. It was discussed in
> WG14 when I was on it, and rejected on those grounds. We didn't
> consider using braces rather than parentheses, but I am sure that
> I could think of some syntactic nasties with those, especially in
> C++. For example, remember that assignment is an operator and
> delivers an lvalue, and is less binding than the conditional and
> comma operators.
Nick,
I was on WG-14 at the same time. I am not advocating a C change
but some form of language linguistics to allow more than one return
value from a function.
There is some precedent for this in the Pascal procedures where
a variable may be declared in the argument list as a var and values
may be returned that way. The C problem is multiple values may
be returned but only through a pointer.
>> >> Adding this to the language seems a rather obvious request, so I have to
>> >> assume that there's some serious obstacle that isn't so obvious?
>> >This has merit for many normal functions to return multiple values
>> >as a result rather than through argument list pointers. The syntax is
>> >clear and the implementation would not be difficult.
>> I am sorry, but that is seriously mistaken. All that is true in a
>> cleaner language, but is completely the opposite of the truth in
>> the utter mess that C and C++ have become. It was discussed in
>> WG14 when I was on it, and rejected on those grounds. We didn't
>> consider using braces rather than parentheses, but I am sure that
>> I could think of some syntactic nasties with those, especially in
>> C++. For example, remember that assignment is an operator and
>> delivers an lvalue, and is less binding than the conditional and
>> comma operators.
>I was on WG-14 at the same time. I am not advocating a C change
>but some form of language linguistics to allow more than one return
>value from a function.
>There is some precedent for this in the Pascal procedures where
>a variable may be declared in the argument list as a var and values
>may be returned that way. The C problem is multiple values may
>be returned but only through a pointer.
Sorry - may I plead increasing age and degenerating memory?
But my point stands. Pascal does not have the same syntactic
horrors - for example, assignment is a statement (like in
Fortran) - but the real problem is in the semantic horrors,
which are made worse by allowing this. For example
int * a, * b;
{a[5],b[10]} = ({a[10],b[5]} += {3,2});
And, of course, that's a simple case. It was bad enough in C90,
but what C99 and (even worse) C11 have done to the semantics of
this area beggars description. The only simple resolution would
be to create the concept of an assignment statement, but I am
pretty sure that wouldn't solve the issues in C11 or be acceptable
to C++.
In principle, I agree that it's trivial. Lots of languages
permit it - Matlab and Python, to name but two.
<already5cho...@yahoo.com> wrote:
>On Oct 21, 5:37 pm, Walter Banks <wal...@bytecraft.com> wrote:
>> n...@cam.ac.uk wrote:
>> > >> Adding this to the language seems a rather obvious request, so I have to
>> > >> assume that there's some serious obstacle that isn't so obvious?
>> > >This has merit for many normal functions to return multiple values
>> > >as a result rather than through argument list pointers. The syntax is
>> > >clear and the implementation would not be difficult.
>> > I am sorry, but that is seriously mistaken. All that is true in a
>> > cleaner language, but is completely the opposite of the truth in
>> > the utter mess that C and C++ have become. It was discussed in
>> > WG14 when I was on it, and rejected on those grounds. We didn't
>> > consider using braces rather than parentheses, but I am sure that
>> > I could think of some syntactic nasties with those, especially in
>> > C++. For example, remember that assignment is an operator and
>> > delivers an lvalue, and is less binding than the conditional and
>> > comma operators.
>> Nick,
>> I was on WG-14 at the same time. I am not advocating a C change
>> but some form of language linguistics to allow more than one return
>> value from a function.
>> There is some precedent for this in the Pascal procedures where
>> a variable may be declared in the argument list as a var and values
>> may be returned that way. The C problem is multiple values may
>> be returned but only through a pointer.
>> Walter..
>Why returning structure is not good enough?
Because the two values returned frequently don't always have a
relationship that makes it sensible to put them into a structure.
Consider the most obvious use for such a thing: the ability to return
a value and a result/status code from a function - you'd almost never
really want those two in a structure together.
> In article <5083BC9C.43D0...@bytecraft.com>,
> Walter Banks <wal...@bytecraft.com> wrote:
>> Stephen Sprunk wrote:
>>> On 21-Oct-12 01:26, Terje Mathisen wrote:
>>>> Ivan Godard wrote:
>>>>> This is especially a notational problem for our Mill, in which hardware
>>>>> call operations can return more than one result. It works fairly
>>>>> naturally for Ada and other languages with OUT parameters, but there's
>>>>> no good way to reach it from C.
>>>> I believe I have seen a C compiler which had an intrinsic that did
>>>> expose this, but you needed an address parameter to get back the
>>>> remainder, something like
>>> I can think of innumerable cases where it would be useful to return
>>> multiple values without having to mess about with temporary structures
>>> or pass-by-pointer. It would be a nice touch if it were also legal to
>>> call the above function like this:
>>> Adding this to the language seems a rather obvious request, so I have to
>>> assume that there's some serious obstacle that isn't so obvious?
>> This has merit for many normal functions to return multiple values
>> as a result rather than through argument list pointers. The syntax is
>> clear and the implementation would not be difficult.
> I am sorry, but that is seriously mistaken. All that is true in a
> cleaner language, but is completely the opposite of the truth in
> the utter mess that C and C++ have become. It was discussed in
> WG14 when I was on it, and rejected on those grounds. We didn't
> consider using braces rather than parentheses, but I am sure that
> I could think of some syntactic nasties with those, especially in
> C++. For example, remember that assignment is an operator and
> delivers an lvalue, and is less binding than the conditional and
> comma operators.
> Regards,
> Nick Maclaren.
I agree that tuple syntax (all those { , , , } approaches) is a syntactic problem in C. More important, it makes code *very* hard to read and understand - too many balls in the air for mere mortals.
However, what's wrong with OUT parameters?
float divide(float divisor, float dividend,
OUT float remainder);
float X, Y, Z, W;
X = divide(Y, Z, W);
by making the OUT parameters optional (compiler passes an ignored temp) then you get:
X = divide(Y, Z);
and
; divide(Y, Z, W);
for normal / and %. INOUT is an obvious extension that is widespread in other languages, as is explicit IN for the documentation benefit.
I don't see any syntactic problems adding these to C. As a notation it's clumsy because you need a handy L-value, but single-carried-value is a psychological assumption that is a holdover from the linear scan of reading text; we don't "read" graphs, only arcs. Hence there doesn't seem to be anything better than to stash all but one result in a name where you can pick them up later when you are done with the "prime" result.
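The closest thing available in standard C today is a pointer parameter that the caller may leave null. The following is only a sketch of that emulation, not the proposed OUT syntax: the function name, the argument order, and the use of int rather than float (to keep the example free of the math library) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the quotient of dividend / divisor and,
   if remainder is non-NULL, also stores the remainder through it.
   Passing NULL stands in for "OUT parameter omitted". */
int divide(int dividend, int divisor, int *remainder)
{
    assert(divisor != 0);            /* division by zero is undefined in C */
    if (remainder != NULL)
        *remainder = dividend % divisor;
    return dividend / divisor;       /* truncates toward zero (C99 and later) */
}
```

A caller that wants both results writes `q = divide(y, z, &r);`, and one that wants only the quotient writes `q = divide(y, z, NULL);`. Unlike a true OUT parameter, the callee has to test the pointer at run time and the caller still needs an addressable object.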
> n...@cam.ac.uk wrote:
>> I am sorry, but that is seriously mistaken. All that is true in a
>> cleaner language, but is completely the opposite of the truth in
>> the utter mess that C and C++ have become. It was discussed in
>> WG14 when I was on it, and rejected on those grounds. We didn't
>> consider using braces rather than parentheses, but I am sure that
>> I could think of some syntactic nasties with those, especially in
>> C++. For example, remember that assignment is an operator and
>> delivers an lvalue, and is less binding than the conditional and
>> comma operators.
> Nick,
> I was on WG-14 at the same time. I am not advocating a C change
> but some form of language linguistics to allow more than one return
> value from a function.
> There is some precedent for this in the Pascal procedures where
> a variable may be declared in the argument list as a var and values
> may be returned that way. The C problem is multiple values may
> be returned but only through a pointer.
Wouldn't a struct return solve this syntax problem?
Terje
-- - <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Robert Wessel wrote:
> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
>> Why returning structure is not good enough?
> Because the two values returned frequently don't always have a
> relationship that makes it sensible to put them into a structure.
> Consider the most obvious use for such a thing: the ability to return
> a value and a result/status code from a function - you'd almost never
> really want those two in a structure together.
No, but if that struct is declared locally and the two (or more?) return value components are immediately copied (i.e. moved) to the final location, then most compilers should be able to detect those moves as NOPs and get rid of them.
Terje
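A minimal sketch of the struct-return pattern discussed here, for the value-plus-status case. The type, the function, and the status values are hypothetical names chosen for illustration; nothing in the thread specifies them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical carrier type: it exists only to get two values out of lookup(). */
struct lookup_result {
    int value;   /* meaningful only when status == 0 */
    int status;  /* 0 on success, nonzero on failure */
};

/* Returns table[index] together with a status code. */
struct lookup_result lookup(const int *table, size_t len, size_t index)
{
    assert(table != NULL);
    struct lookup_result r = { 0, 1 };   /* assume failure until proven otherwise */
    if (index < len) {
        r.value = table[index];
        r.status = 0;
    }
    return r;
}
```

The caller declares a local `struct lookup_result t = lookup(tab, n, i);` and immediately copies `t.value` and `t.status` to wherever they belong. Whether the compiler removes those copies depends on inlining and on the ABI, which is the point argued in the rest of the thread.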
In article <k6194d$31...@speranza.aioe.org>,
Ivan Godard <igod...@pacbell.net> wrote:
>I agree that tuple syntax (all those { , , , } approaches) is a
>syntactic problem in C. More important, it makes code *very* hard to
>read and understand - too many balls in the air for mere mortals.
I don't have a problem with it in a language with a simpler and
cleaner syntax, but I agree about it in C.
>However, what's wrong with OUT parameters?
> float divide(float divisor, float dividend,
> OUT float remainder);
> float X, Y, Z, W;
> X = divide(Y, Z, W);
>by making the OUT parameters optional (compiler passes an ignored temp)
>then you get:
> X = divide(Y, Z);
>and
> ; divide(Y, Z, W);
>for normal / and %. INOUT is an obvious extension that is widespread in
>other languages, as is explicit IN for the documentation benefit.
As in C++ reference arguments, yes. It would resolve some issues
because all side-effects and sequencing in calculating the lvalues
would be done before calling the function. What it would NOT do
is to provide proper multi-valued function semantics, without
inventing some semantics and semantic mechanism to store the
multiple results only on return. And that is the aspect that I
think is intractable.
None of this is linguistically fundamental, but I am referring to
the specifics of C, where this area is already an ungodly mess.
In article <bpmdl9-o8q1....@ntp-sure.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>Robert Wessel wrote:
>> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
>>> Why returning structure is not good enough?
>> Because the two values returned frequently don't always have a
>> relationship that makes it sensible to put them into a structure.
>> Consider the most obvious use for such a thing: the ability to return
>> a value and a result/status code from a function - you'd almost never
>> really want those two in a structure together.
>No, but if that struct is declared locally and the two (or more?) return
>value components are immediately copied (i.e. moved) to the final
>location, then most compilers should be able to detect those moves as
>NOPs and get rid of them.
The difference between theory and practice is less in theory than
it is in practice :-(
> ... It was discussed in WG14 when I was on it, and rejected on those
> grounds. We didn't consider using braces rather than parentheses,
> but I am sure that I could think of some syntactic nasties with
> those, especially in C++. For example, remember that assignment is
> an operator and delivers an lvalue, and is less binding than the
> conditional and comma operators.
That's why I disregarded the (more obvious, to me) parenthesis syntax;
the following already has a defined meaning:
except that I'm not sure the LHS would be an lvalue, and even if it were
there would be an obvious type mismatch and the quotient would get lost.
OTOH, using braces results in something that is not a compound statement
because there is no internal ";", so the only logical way to parse it is
as an anonymous struct.
Likewise, I would expect the following naïve implementation of
divmod6432() to work as well:
Again, without the internal ";", there is no valid way to parse this
except as an anonymous struct.
Aside from the parsing issues (and I assume there are other
complications I haven't considered there, as I know it's one of my weak
points), the behavior seems relatively easy to specify since you can
easily translate the above examples to the following equivalent code:
It is true one could argue that no truly new functionality is being
added to the language, just syntactic sugar, but this particular sugar
would be incredibly useful--and there are plenty of other examples of
syntactic sugar in C ([], ->, ++, etc).
S
-- Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
> <already5cho...@yahoo.com> wrote:
>> Why returning structure is not good enough?
> Because the two values returned frequently don't always have a
> relationship that makes it sensible to put them into a structure.
> Consider the most obvious use for such a thing: the ability to return
> a value and a result/status code from a function - you'd almost never
> really want those two in a structure together.
So what if they have no "relationship"? They are immediately extracted
from the temporary structure as soon as the function returns, thanks to
the anonymous struct syntax I proposed.
And, arguably, they do have a "relationship" of sorts, being multiple
return values from the same function invocation.
S
> Robert Wessel wrote:
>> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
>>> Why returning structure is not good enough?
>> Because the two values returned frequently don't always have a
>> relationship that makes it sensible to put them into a structure.
>> Consider the most obvious use for such a thing: the ability to return
>> a value and a result/status code from a function - you'd almost never
>> really want those two in a structure together.
> No, but if that struct is declared locally and the two (or more?) return
> value components are immediately copied (i.e. moved) to the final
> location, then most compilers should be able to detect those moves as
> NOPs and get rid of them.
Unfortunately, that would only be the case if the function were inlined.
Many (most?) ABIs specify that if a function returns a struct, the
caller passes a hidden argument with a pointer to a struct on the stack
that receives the return values. You would need to redefine all those
ABIs to allow multiple return values via registers (as with function
arguments), not just one, for it to be efficient in non-inlined cases.
S
> But my point stands. Pascal does not have the same syntactic
> horrors - for example, assignment is a statement (like in
> Fortran) - but the real problem is in the semantic horrors,
> which are made worse by allowing this. For example
> int * a, * b;
> {a[5],b[10]} = ({a[10],b[5]} += {3,2});
> And, of course, that's a simple case.
Well, I would never think of writing such a monstrosity, but its meaning
is obviously (to me) equivalent to this:
a[5] = a[10] += 3,
b[10] = b[5] += 2;
except that there is no sequence point between the two, so it would be
undefined if a==b or a and b overlapped in certain ways.
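The element-wise reading above can be written as ordinary C today. This is a minimal sketch, assuming the two arrays do not overlap; the function name and the `restrict` qualifiers are additions for illustration and are not part of the original example.

```c
#include <assert.h>

/* Hypothetical helper spelling out the element-wise reading of
   {a[5],b[10]} = ({a[10],b[5]} += {3,2});
   Both arrays must have at least 11 elements and must not overlap. */
void tuple_update(int *restrict a, int *restrict b)
{
    assert(a != b);          /* catches only the most obvious aliasing case */
    a[5]  = (a[10] += 3);    /* a[10] is updated, then its new value is stored in a[5]  */
    b[10] = (b[5]  += 2);    /* b[5]  is updated, then its new value is stored in b[10] */
}
```

Written as two statements, the two halves are sequenced one after the other, so this version fixes an order that the tuple form would have to define for itself.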
> The only simple resolution would be to create the concept of an
> assignment statement, but I am pretty sure that wouldn't solve the
> issues in C11 or be acceptable to C++.
Why would you need an assignment statement to make anonymous structs work?
S
>> n...@cam.ac.uk wrote:
>>> I am sorry, but that is seriously mistaken. All that is true in a
>>> cleaner language, but is completely the opposite of the truth in
>>> the utter mess that C and C++ have become. It was discussed in
>>> WG14 when I was on it, and rejected on those grounds. We didn't
>>> consider using braces rather than parentheses, but I am sure that
>>> I could think of some syntactic nasties with those, especially in
>>> C++. For example, remember that assignment is an operator and
>>> delivers an lvalue, and is less binding than the conditional and
>>> comma operators.
>> Nick,
>> I was on WG-14 at the same time. I am not advocating a C change
>> but some form of language linguistics to allow more than one return
>> value from a function.
>> There is some precedent for this in the Pascal procedures where
>> a variable may be declared in the argument list as a var and values
>> may be returned that way. The C problem is multiple values may
>> be returned but only through a pointer.
> Wouldn't a struct return solve this syntax problem?
> Terje
If you look at the machine code for a struct return you will discover that they are returned via a pointer regardless of what it looks like in the source. The caller passes a pointer to a temporary region, and the callee fills it in. This means stores to assign the results, and later loads to use them; painful increases in memory bandwidth and load/store unit utilization, not to mention the necessary load latency to use the result after the return. Given:
struct S {int i; ...};
S F1() { ... }
void F2(int) { ... }
F2(F1().i);
there will be at least a load delay between the calls, even on an OOO machine.
In contrast, a multi-result call in asm just leaves the several results in registers, where they are immediately usable by the caller. The above code would have at most a register move delay between the calls. This practice violates all call ABIs I have ever heard of. It could be used for intra-module calls where the ABI rules are explicitly waived, but I have also never seen a compiler that can recognize a multi-result idiom and use an in-register protocol.
In contrast, Ada and other languages with OUT parameters routinely return them in registers using an ABI that allows for that usage. Inter-language calls are restricted to the lowest common denominator of course, i.e. C protocol.
Ivan Godard wrote:
> On 10/21/2012 10:05 AM, Terje Mathisen wrote:
>> Wouldn't a struct return solve this syntax problem?
>> Terje
> If you look at the machine code for a struct return you will discover
> that they are returned via a pointer regardless of what it looks like in
> the source.
As another poster noted: What about inlining?
IMHO such a tiny piece of code really doesn't make sense unless you inline it, since the CALL/RETURN overhead (just in code space) is far larger than the actual DIV opcode which generates both return values.
I.e. I'm really arguing for what's effectively a compiler intrinsic, but still portable code.
Terje
n...@cam.ac.uk wrote:
> As in C++ reference arguments, yes. It would resolve some issues
> because all side-effects and sequencing in calculating the lvalues
> would be done before calling the function. What it would NOT do
> is to provide proper multi-valued function semantics, without
> inventing some semantics and semantic mechanism to store the
> multiple results only on return. And that is the aspect that I
> think is intractable.
What I have done previously is to define a pair of 32-bit return values as a single 64-bit value:
This makes perfect sense in this particular case on x86, since EDX:EAX is the 64-bit return pair, and the same registers contain the two results of a 64/32->(32,32) DIV opcode. :-)
Terje
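The code from the post above did not survive in this copy of the thread. The following is a sketch of the packing idea it describes, not the original: all names are hypothetical, and the layout (quotient in the low half, remainder in the high half) is chosen to mirror the EDX:EAX pairing mentioned in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64/32 -> (32,32) divide: quotient in the low 32 bits of the
   result, remainder in the high 32 bits. */
uint64_t divmod_packed(uint64_t dividend, uint32_t divisor)
{
    /* High half below the divisor guarantees a nonzero divisor and a
       quotient that fits in 32 bits. */
    assert((dividend >> 32) < divisor);
    uint32_t quotient  = (uint32_t)(dividend / divisor);
    uint32_t remainder = (uint32_t)(dividend % divisor);
    return ((uint64_t)remainder << 32) | quotient;
}

/* Accessors for the two halves of the packed result. */
static inline uint32_t packed_quotient(uint64_t p)  { return (uint32_t)p; }
static inline uint32_t packed_remainder(uint64_t p) { return (uint32_t)(p >> 32); }
```

A caller unpacks with `packed_quotient(p)` and `packed_remainder(p)`. The trick only works while both results fit in one integer return value.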
> n...@cam.ac.uk wrote:
>> As in C++ reference arguments, yes. It would resolve some issues
>> because all side-effects and sequencing in calculating the lvalues
>> would be done before calling the function. What it would NOT do
>> is to provide proper multi-valued function semantics, without
>> inventing some semantics and semantic mechanism to store the
>> multiple results only on return. And that is the aspect that I
>> think is intractable.
> What I have done previously is to define a pair of 32-bit return values
> as a single 64-bit value:
> This makes perfect sense in this particular case on x86, since EDX:EAX
> is the 64-bit return pair, and the same registers contain the two
> results of a 64/32->(32,32) DIV opcode. :-)
> Terje
Until you want to return a double and a status.
div/mod here is merely an easily understood example of multiple return values. While any given example can be klugified into an intrinsic or other ad hocery, the general case remains.
> Ivan Godard wrote:
>> On 10/21/2012 10:05 AM, Terje Mathisen wrote:
>>> Wouldn't a struct return solve this syntax problem?
>>> Terje
>> If you look at the machine code for a struct return you will discover
>> that they are returned via a pointer regardless of what it looks like in
>> the source.
> As another poster noted: What about inlining?
> IMHO such a tiny piece of code really doesn't make sense unless you
> inline it, since the CALL/RETURN overhead (just in code space) is far
> larger than the actual DIV opcode which generates both return values.
> I.e. I'm really arguing for what's effectively a compiler intrinsic, but
> still portable code.
> Terje
Then use a longer example: a function that returns both sin and cos is much cheaper than two calls, one for each, and it's not uncommon to want both. Sure, you can inline - but a correct library function that deals with rounding modes, NaNs and infs, etc. is a bit bigger than you'd want to inline.
Or look at the mess that are Unix syscalls and error returns. Some return zero on failure, some zero on success, some a negative, some a status, some a null pointer, some don't report at all - and most with the hidden errno out argument. And all uniformly ignored - tell me the last time you checked the return value of printf (yes, it can fail).
Contrast a uniform convention in which the result of every call is the error enum, and any values to be returned are OUT parameters.
We use a library that intercepts every syscall with a macro of the same name. The macro checks for and converts error returns to C++ throw, while still returning the (checked) data value of the original signature. It's saved my butt more times than I like to admit.
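As a sketch of the uniform convention described above (every call returns an error enum, and data comes back through OUT parameters), emulated in plain C with a pointer. The enum, its members, and the function name are hypothetical; only `strtol` and `errno` are real library facilities.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status enum: the function result is always one of these. */
enum status { STATUS_OK = 0, STATUS_INVALID, STATUS_RANGE };

/* Parses a decimal integer. The parsed value is delivered through *out,
   which is written only when STATUS_OK is returned. */
enum status parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return STATUS_INVALID;       /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return STATUS_RANGE;         /* does not fit in an int */
    *out = (int)v;
    return STATUS_OK;
}
```

Every call site then has the same shape, `if (parse_int(s, &n) != STATUS_OK) ...`, regardless of what kind of value is being returned.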
On Oct 21, 9:49 pm, Ivan Godard <igod...@pacbell.net> wrote:
> On 10/21/2012 10:05 AM, Terje Mathisen wrote:
> If you look at the machine code for a struct return you will discover
> that they are returned via a pointer regardless of what it looks like in
> the source. The caller passes a pointer to a temporary region, and the
> callee fills it in. This means stores to assign the results, and later
> loads to use them;
True for Microsoft calling conventions.
According to my understanding of "System V Application Binary
Interface on AMD64 Architecture Processor", only partially true
(a.k.a. false) on x64 GNU/Linux, Solaris, FreeBSD and, maybe, some
other OSes.
System V ABI has rather complex rules that I don't understand
completely. But the bottom line in our case is that
struct ab { uint64_t a,b; } is passed back in RAX and RDX.
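For illustration, a function of the following shape is the kind of case being described. This is a sketch: `divmod64` is a hypothetical name, and whether the result really travels in RAX and RDX depends on the target ABI and the compiler, so it is worth checking the generated assembly (for example with `cc -O2 -S`) rather than taking it on trust.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit integer members: 16 bytes in total. */
struct ab { uint64_t a, b; };

/* Hypothetical example: quotient in .a, remainder in .b, returned by value. */
struct ab divmod64(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);
    struct ab r = { dividend / divisor, dividend % divisor };
    return r;
}
```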
> painful increases in memory bandwidth and load/store
> unit utilization, not to mention the necessary load latency to use the
> result after the return.
Even when it is passed on stack, it's not that painful relative to
the time of 128b/64b division itself.
Memory is practically never involved. Of all today's x86 processors,
even L2 is involved only on Bulldozer.
Load/store unit is, indeed, utilized, but it's no big deal.
Load latency also shouldn't be involved, at least on more advanced
processors, because the case is extremely easy for store-to-load
forwarding hardware.
> On Oct 21, 9:49 pm, Ivan Godard <igod...@pacbell.net> wrote:
>> On 10/21/2012 10:05 AM, Terje Mathisen wrote:
>> If you look at the machine code for a struct return you will discover
>> that they are returned via a pointer regardless of what it looks like in
>> the source. The caller passes a pointer to a temporary region, and the
>> callee fills it in. This means stores to assign the results, and later
>> loads to use them;
> Even when it is passed on stack, it's not that painful relative to
> the time of 128b/64b division itself.
> Memory is practically never involved. Of all today's x86 processors,
> even L2 is involved only on Bulldozer.
> Load/store unit is, indeed, utilized, but it's no big deal.
> Load latency also shouldn't be involved, at least on more advanced
> processors, because the case is extremely easy for store-to-load
> forwarding hardware.
Good point; the store buffers act in effect like a L0 cache.
Does anyone know the actual timing on a load from decode to availability for something that has a hard data dependence when the load is satisfied in the store buffers?
<step...@sprunk.org> wrote:
>On 21-Oct-12 12:08, Terje Mathisen wrote:
>> Robert Wessel wrote:
>>> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
>>>> Why returning structure is not good enough?
>>> Because the two values returned frequently don't always have a
>>> relationship that makes it sensible to put them into a structure.
>>> Consider the most obvious use for such a thing: the ability to return
>>> a value and a result/status code from a function - you'd almost never
>>> really want those two in a structure together.
>> No, but if that struct is declared locally and the two (or more?) return
>> value components are immediately copied (i.e. moved) to the final
>> location, then most compilers should be able to detect those moves as
>> NOPs and get rid of them.
>Unfortunately, that would only be the case if the function were inlined.
>Many (most?) ABIs specify that if a function returns a struct, the
>caller passes a hidden argument with a pointer to a struct on the stack
>that receives the return values. You would need to redefine all those
>ABIs to allow multiple return values via registers (as with function
>arguments), not just one, for it to be efficient in non-inlined cases.
While I don't want to quibble over the definition of "many", it's not
uncommon for (very) short structures to be returned in registers. For
example, both MS and GCC using various "fast" calling conventions will
return at least some structures of 8 bytes in a register pair in
x86-32.
tmsw.no"> wrote:
>Robert Wessel wrote:
>> On Sun, 21 Oct 2012 09:00:04 -0700 (PDT), Michael S
>>> Why returning structure is not good enough?
>> Because the two values returned frequently don't always have a
>> relationship that makes it sensible to put them into a structure.
>> Consider the most obvious use for such a thing: the ability to return
>> a value and a result/status code from a function - you'd almost never
>> really want those two in a structure together.
>No, but if that struct is declared locally and the two (or more?) return
>value components are immediately copied (i.e. moved) to the final
>location, then most compilers should be able to detect those moves as
>NOPs and get rid of them.
True enough, at least for functions that are inlined or have very
short returned structures meeting certain requirements, but it's
certainly not a pretty solution. For many libraries you'd need to
define a whole passel of "return" structures. And the introduction of
otherwise meaningless temporary variables is unlikely to help the
clarity of the code.
Ivan Godard wrote:
> On 10/21/2012 4:25 PM, Michael S wrote:
>> Even when it is passed on stack, it's not that painful relative to
>> the time of 128b/64b division itself.
>> Memory is practically never involved. Of all today's x86 processors,
>> even L2 is involved only on Bulldozer.
>> Load/store unit is, indeed, utilized, but it's no big deal.
>> Load latency also shouldn't be involved, at least on more advanced
>> processors, because the case is extremely easy for store-to-load
>> forwarding hardware.
> Good point; the store buffers act in effect like a L0 cache.
> Does anyone know the actual timing on a load from decode to availability
> for something that has a hard data dependence when the load is satisfied
> in the store buffers?
I'm pretty sure that the answer is "that depends on the actual cpu model", but that the real answer is "less than or equal to an L1 cache hit".
I.e. a read from memory that hits in a store buffer looks like a single-cycle operation as seen from the surrounding code.
Terje
> On 10/21/2012 4:25 PM, Michael S wrote:
>> On Oct 21, 9:49 pm, Ivan Godard <igod...@pacbell.net> wrote:
>>> On 10/21/2012 10:05 AM, Terje Mathisen wrote:
>>> If you look at the machine code for a struct return you will discover
>>> that they are returned via a pointer regardless of what it looks like in
>>> the source. The caller passes a pointer to a temporary region, and the
>>> callee fills it in. This means stores to assign the results, and later
>>> loads to use them;
>> Even when it is passed on stack, it's not that painful relative to
>> the time of 128b/64b division itself.
>> Memory is practically never involved. Of all today's x86 processors,
>> even L2 is involved only on Bulldozer.
>> Load/store unit is, indeed, utilized, but it's no big deal.
>> Load latency also shouldn't be involved, at least on more advanced
>> processors, because the case is extremely easy for store-to-load
>> forwarding hardware.
> Good point; the store buffers act in effect like a L0 cache.
> Does anyone know the actual timing on a load from decode to availability
> for something that has a hard data dependence when the load is satisfied
> in the store buffers?
On all of the machines I am familiar with the store buffers have the same latency as the L1 cache.
You *could* try to make store-to-load forwarding faster than the L1 cache. But it is a pain to deal with many different latencies.
Moreover, machines have now begun to "registerify" memory:
to treat "store M[A]:=reg1; ... load reg2 := M[A]"
as equivalent to "move reg2 := reg1", even if reg1 gets overwritten in between.
(Actually, "store M[A]:=reg1; ... move reg2 := reg1".)
Eliding the store is not done so much yet, because it *might* be visible to another processor.
However, eliding the store if there is another store to the same address is done.