Integrating multiple-result function call hardware into C++ - prior work?


iv...@ootbcomp.com

Jun 9, 2014, 9:49:29 PM
to std-pr...@isocpp.org
I work on the Mill CPU, a novel architecture for general-purpose computation. The Mill hardware has a number of features that are not present in current CPUs, and we plan to implement extensions in clang and gcc to accommodate those aspects that do not fit naturally as libraries or intrinsics. We want our extensions to be "C++-ish" in style, and to conform to best-practice thinking if there is any. Info on the Mill at http://millcomputing.com/docs.

One of the Mill features is the ability for Mill hardware to return more than one result from a function. These are true value returns, not reference arguments. If a returned value is a large object then we use the same implicit reference trick as is used for large function arguments, but anything that would pass in the registers on a conventional CPU can be returned by value on the Mill. VARRESULTS is also supported using a mechanism similar to VARARGS. For a use-case, we find a very common idiom is for a function to have one main value result and a second status result. However, multiple value results are often convenient or efficient too; for example, it is possible to compute both sin() and cos() of an angle in little more than the time to compute one of them. The library function computes and returns both, and the caller uses one or the other or both as needed.
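To make the two use-cases concrete, here is a minimal sketch of how they are usually spelled in standard C++ today. The names `sin_cos` and `checked_divide` are hypothetical, and this version simply calls the standard math functions rather than sharing any work between them; it only illustrates the shape of the interface.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper: returns {sin(angle), cos(angle)} as one pair.
// A tuned library could share the argument reduction; this sketch does not.
std::pair<double, double> sin_cos(double angle) {
    return std::make_pair(std::sin(angle), std::cos(angle));
}

// Hypothetical "main value plus status" function: .second is false on failure.
std::pair<int, bool> checked_divide(int num, int den) {
    if (den == 0) return std::make_pair(0, false);
    return std::make_pair(num / den, true);
}

void demo_two_results() {
    std::pair<double, double> sc = sin_cos(0.0);
    assert(sc.first == 0.0 && sc.second == 1.0);
    assert(checked_divide(7, 2).second);
    assert(!checked_divide(1, 0).second);
}
```

The caller is free to read only `.first` or only `.second`, which matches the "uses one or the other or both" description.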

The question is how this facility should be integrated (as an extension) into the language. Languages (such as Ada) that have OUT parameters have a natural way to express return-by-value, so something like:
 
            int foo(int x, int y, _out int z) { ...}

would provide declarator syntax. However, it's not at all clear how to extend the return statement; tuples would be ungainly, and the natural argument list runs afoul of the comma operator. There are also evaluation order issues: should the actual argument of an OUT parameter be evaluated before or after the call?
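For comparison, the nearest standard C++ spelling of the `_out` declarator is a non-const reference parameter. The sketch below is only that approximation: the function body and the name `demo_out_parameter` are made up for illustration, and the callee writes through a reference rather than returning a second value, which is exactly the difference the proposed extension is meant to remove.

```cpp
#include <cassert>

// Hypothetical body: returns the sum and writes the product through z.
// "Out" is only a convention here; the language sees an ordinary reference.
int foo(int x, int y, int& z) {
    z = x * y;
    return x + y;
}

void demo_out_parameter() {
    int product = 0;                 // the caller must supply storage up front
    int sum = foo(3, 4, product);
    assert(sum == 7);
    assert(product == 12);
}
```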

A different approach, followed by some other languages, uses modified function-like notation for multiple results, possibly something like:

            {int, int} foo(int x, int y) {...};

which seems more C++-flavored than the out-parameter approach. However, the return statement remains a problem, and in addition the syntactic place of a function call becomes problematic. Possibly:

            a = b + {*, c}foo(1,2);  // means that first result is used in the add, while the second is assigned to c

seems tolerable, but potentially forces the creation of otherwise unnecessary placeholder variables.
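As a point of reference, this is roughly what the same statement looks like in standard C++ if the two results are modeled as a `std::tuple`. It is a sketch under that assumption; the body of `foo` and the helper name `add_first_store_second` are invented for the example. Note that this spelling needs its own named temporary.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical two-result function, modeled as a tuple of (sum, product).
std::tuple<int, int> foo(int x, int y) {
    return std::make_tuple(x + y, x * y);
}

// Standard C++ rendering of: a = b + {*, c}foo(1,2);
int add_first_store_second(int b, int& c) {
    const std::tuple<int, int> results = foo(1, 2);  // the extra placeholder
    c = std::get<1>(results);                        // second result goes to c
    return b + std::get<0>(results);                 // first result feeds the add
}

void demo_placeholder() {
    int c = 0;
    int a = add_first_store_second(10, c);
    assert(a == 13 && c == 2);
}
```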

Has C++ multi-result syntax and semantics been explored before? Can anyone supply citations?

Ivan Godard

David Krauss

Jun 9, 2014, 9:50:53 PM
to std-pr...@isocpp.org

On 2014–06–10, at 9:49 AM, iv...@ootbcomp.com wrote:

> http://millcomputing.com/docs.

403 forbidden error.

David Krauss

Jun 9, 2014, 10:07:46 PM
to std-pr...@isocpp.org
On 2014–06–10, at 9:49 AM, iv...@ootbcomp.com wrote:

Has C++ multi-result syntax and semantics been explored before? Can anyone supply citations?

The simplest analogy is returning a class type by value. This has existed since before C++ split from C. The C++ standard library functions std::equal_range (and its associative container counterpart), std::mismatch, and others (including the C standard library function family *div()) return multiple values this way.
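A short sketch of the library functions named above, each returning two results packed in a class type. The sample data and the function name `demo_library_multi_results` are arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

void demo_library_multi_results() {
    // std::div returns quotient and remainder together in a struct.
    std::div_t qr = std::div(17, 5);
    assert(qr.quot == 3 && qr.rem == 2);

    // std::equal_range returns a pair of iterators bounding the matches.
    std::vector<int> sorted = {1, 2, 2, 2, 5};
    auto range = std::equal_range(sorted.begin(), sorted.end(), 2);
    assert(range.second - range.first == 3);

    // std::mismatch returns the first differing position in each sequence.
    std::vector<int> other = {1, 2, 2, 7, 5};
    auto diff = std::mismatch(sorted.begin(), sorted.end(), other.begin());
    assert(*diff.first == 2 && *diff.second == 7);
}
```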

You can also perform lifetime analysis on instructions pulled in by function inlining, and promote a pass-by-reference parameter to registers.

Generally speaking, CPU architectures do not dictate that a single register holds a function return value. Any of x86, x64, 68K, and PowerPC allow multiple general-purpose registers to be preserved on return, and those are only the ones I personally know. It’s a choice made by the ABI, guided by statistics. One register provides sufficient storage for 90% of function calls. Function returns are uncommon enough to render useless any special architectural considerations, aside from preventing extremely expensive conditions like pipeline flushes and branch mispredictions.
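To illustrate the ABI point with one case I am fairly sure of: under the System V x86-64 calling convention, a small struct of two integers is returned in the RAX:RDX register pair rather than through memory. The sketch below assumes that ABI and an 8-byte `long`; other ABIs (the Microsoft x64 convention, for instance) make a different choice for a struct of this size. The names `QuotRem` and `divide` are hypothetical.

```cpp
#include <cassert>

// Two integer results packed in a small, trivially copyable struct.
// Assumption: System V x86-64, where this 16-byte aggregate is classified
// INTEGER and comes back in RAX:RDX instead of a hidden memory buffer.
struct QuotRem {
    long quot;
    long rem;
};

QuotRem divide(long num, long den) {
    return QuotRem{num / den, num % den};
}

void demo_register_pair_return() {
    QuotRem r = divide(17, 5);
    assert(r.quot == 3 && r.rem == 2);
}
```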

Ivan Godard

Jun 9, 2014, 10:23:15 PM
to std-pr...@isocpp.org
Odd. Works for me, and it's a public site.


On 6/9/2014 6:50 PM, David Krauss wrote:

Ivan Godard

Jun 9, 2014, 10:29:43 PM
to std-pr...@isocpp.org
We know why other CPUs do not have hardware support, but that's not the issue: right or wrong, we do have such support. The question is how to make it naturally available as an extension of C++.

And yes, with sufficient analysis the compiler may be able to figure out in some (many?) cases that a regular C++ single-valued function can use the hardware operation, but that too is not the issue: we are looking for a language extension by which the programmer can state his intention that the function has some specific number of results other than zero or one, whether or not the compiler could figure it out.

Without requiring bogus dummy class types, or detecting that a reference type is really not a reference type, or other kludgery.

David Krauss

Jun 9, 2014, 10:32:07 PM
to std-pr...@isocpp.org
Apparently I have been blocked by “project honey pot.”

You might consider an established host like GitHub.

David Krauss

Jun 9, 2014, 10:44:44 PM
to std-pr...@isocpp.org
On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:

We know why other CPUs do not have hardware support

Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?

, but that's not the issue: right or wrong, we do have such support. The question is how to make it naturally available as an extension of C++.

You asked for existing practice and literature. If extending the language is part of your homework assignment, so be it, but again you need to state what difference needs to be made.

And yes, with sufficient analysis the compiler may be able to figure out in some (many?) cases that a regular C++ single-valued function can use the hardware operation, but that too is not the issue: we are looking for a language extension by which the programmer can state his intention that the function has some specific number of results other than zero or one, whether or not the compiler could figure it out.

Without requiring bogus dummy class types, or detecting that a reference type is really not a reference type, or other kludgery.

If you dig into implementations of other multiple-return languages such as Python, Haskell, ML, etc, you will find that multiple return values are always implemented as single values of tuple type. Nothing is lost in such a formalism so there’s no reason not to model things that way. All that C++ lacks is declarations that unpack tuples. Such a feature is hard to reconcile with the C legacy of declarations supporting introduction of multiple names of the same type. But, we will probably see it added eventually.
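For readers who have not used it, the closest thing C++11 offers to an unpacking declaration is `std::tie`, which assigns tuple components into variables that already exist. The sketch below uses a hypothetical `lookup` function; the point is that the names must be declared before the call, which is the gap described above.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical lookup returning a value, a status flag, and a message.
std::tuple<int, bool, std::string> lookup(int key) {
    if (key < 0) return std::make_tuple(0, false, std::string("negative key"));
    return std::make_tuple(key * 2, true, std::string());
}

void demo_unpack() {
    // No unpacking declaration exists, so the names are declared first...
    int value = 0;
    bool ok = false;
    // ...and std::tie assigns into them; std::ignore drops a component.
    std::tie(value, ok, std::ignore) = lookup(21);
    assert(ok && value == 42);
}
```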

However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine independent and machine-specific extensions are outside the scope of this discussion list.


Patrick Michael Niedzielski

Jun 9, 2014, 11:18:59 PM
to std-pr...@isocpp.org
On Tue, 2014-06-10 at 10:43 +0800, David Krauss wrote:
> Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?

I know you're having trouble following the link given for some reason,
but I'm relatively sure this talk (which I found through that link),
describes what he means by "having hardware support" for multiple
returns: <https://www.youtube.com/watch?v=QGw-cy0ylCc>. I can't say for
sure, because I'm still watching it right now, but from the description,
this seems to be it.

> You asked for existing practice and literature. If extending the language is part of your homework assignment, so be it, but again you need to state what difference needs to be made.

Why was this "homework assignment" comment necessary? Be civil.
The past few emails you've sent in this thread have not come off as
civil.


Ivan:

How necessary do you think the syntactic extensions are? Ignoring, for
a second, the VARRESULTS functionality, do you think that this could
effectively be implemented without any changes to syntax, as in:

std::tuple<A, B, C> foo(D d, E e);

with each element of the tuple being able to be mapped to your return
values? For a compiler targeting your architecture, is there anything
that prevents the optimizer from decomposing the std::tuple into
multiple, in-hardware return values? I apologize, I'm still fuzzy on
what your hardware support for multiple return values means. But what
cases could not be covered by this solution, which doesn't need any
language changes?

The VARRESULTS functionality may be more problematic. Let's say:

/* ... */ bar();

returns a variable number of return values. Because the number of
return values can vary at runtime, the return type of bar cannot be
deduced at compile time. Thus,

auto i = bar(); // what is decltype(i)? we don't know

So we definitely need something more complicated than something this
simple. You may want to look into the recent proposal N4025 on classes
whose sizes can only be figured out at runtime. If bar() returns a type
of this sort, the compiler could potentially apply an optimization
similar to the one I described above. Someone else here may be able
to disprove me on that.
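To show the typing problem rather than the hardware mechanism, here is one way a variable number of results can be modeled in standard C++ today: a container whose length is only known at run time. This is purely an illustrative stand-in (the parameter `how_many` and the squared values are invented), and it allocates on the heap, which a hardware VARRESULTS facility presumably would not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a VARRESULTS function: the number of results
// depends on a runtime argument, so the static type cannot encode the count.
std::vector<int> bar(std::size_t how_many) {
    std::vector<int> results;
    for (std::size_t i = 0; i < how_many; ++i) {
        results.push_back(static_cast<int>(i * i));
    }
    return results;
}

void demo_runtime_count() {
    auto r = bar(3);             // decltype(r) is std::vector<int> either way
    assert(r.size() == 3);       // the count is only observable at run time
    assert(r[2] == 4);
}
```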

Cheers,
Patrick

Ivan Godard

Jun 10, 2014, 12:21:54 AM
to std-pr...@isocpp.org

On 6/9/2014 7:43 PM, David Krauss wrote:

On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:

We know why other CPUs do not have hardware support

Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?

With a few historical exceptions, other CPUs do not have a call operation at all; they have a branch-and-link. On a Mill "f(a,b,c)" is one hardware operation, total. There is neither preamble in the caller other than the evaluation of the argument expressions, nor postamble within the callee. The call operation saves all state, exits the current frame and enters the new one, and passes all the arguments, in hardware, in one clock. Yes, it's a CISC.


, but that's not the issue: right or wrong, we do have such support. The question is how to make it naturally available as an extension of C++.

You asked for existing practice and literature. If extending the language is part of your homework assignment, so be it, but again you need to state what difference needs to be made.
I wish to inquire for suggestions or current and/or historic language practice in support of functions returning more than one by-value result. The notation should be convenient, intuitive and not convoluted or unaesthetic, nor introduce superfluous variables.

If possible.



And yes, with sufficient analysis the compiler may be able to figure out in some (many?) cases that a regular C++ single-valued function can use the hardware operation, but that too is not the issue: we are looking for a language extension by which the programmer can state his intention that the function has some specific number of results other than zero or one, whether or not the compiler could figure it out.

Without requiring bogus dummy class types, or detecting that a reference type is really not a reference type, or other kludgery.

If you dig into implementations of other multiple-return languages such as Python, Haskell, ML, etc, you will find that multiple return values are always implemented as single values of tuple type. Nothing is lost in such a formalism so there’s no reason not to model things that way. All that C++ lacks is declarations that unpack tuples. Such a feature is hard to reconcile with the C legacy of declarations supporting introduction of multiple names of the same type. But, we will probably see it added eventually.
I mentioned that approach as a possible notation. However, it is ambiguous, because tuples are first class objects, so it is not obvious whether the function is to return one tuple-type object, or several by-value values. The difference is most evident when one of the tuple components has neither copy nor move constructors.

However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine independent
Well, there are those that say it incorporates a PDP-11. I've programmed PDP-11s, and I will say the C part of C++ looks remarkably familiar.

and machine-specific extensions are outside the scope of this discussion list.

That's a reasonable answer, thank you. Can you suggest a forum which might be interested in such discussion?

Ivan

Ivan Godard

Jun 10, 2014, 12:26:50 AM
to std-pr...@isocpp.org
Ah - that's the company security interface. You have been blocked
because your ISP or address range is a known source of spam or attack
vectors. You might check your system for viruses, or consider using a
different address. I'll let the site administrator know that you have
been blocked.

Ivan Godard

Jun 10, 2014, 12:51:57 AM
to std-pr...@isocpp.org

On 6/9/2014 8:18 PM, Patrick Michael Niedzielski wrote:
> On Tue, 2014-06-10 at 10:43 +0800, David Krauss wrote:
>> Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?
> I know you're having trouble following the link given for some reason,
> but I'm relatively sure this talk (which I found through that link),
> describes what he means by "having hardware support" for multiple
> returns: <https://www.youtube.com/watch?v=QGw-cy0ylCc>. I can't say for
> sure, because I'm still watching it right now, but from the description,
> this seems to be it.
Yes, that's the right talk.
>> You asked for existing practice and literature. If extending the language is part of your homework assignment, so be it, but again you need to state what difference needs to be made.
> Why was this "homework assignment" comment necessary? Be civil.
> The past few emails you've sent in this thread have not come off as
> civil.
>
>
> Ivan:
>
> How necessary do you think the syntactic extensions are? Ignoring, for
> a second, the VARRESULTS functionality, do you think that this could
> effectively be implemented without any changes to syntax, as in:
>
> std::tuple<A, B, C> foo(D d, E e);
Yes, that's a well-known solution. The problem is that it forces the
introduction of dummy variables (A/B/C). Not only is this inelegant (and
inefficient absent some heroics in the compiler) but there are problems
when the return component type(s) lack copy constructors.

Consider a function F with two scalar results, and a function G
accepting two arguments of the scalar type, where the intent is that the
results of F will be the arguments of G - in the opposite order. I was
hoping for something like:
G([lab:f()] lab$1, lab$0)
rather than the armwaving required to use tuples (note that I am very
much not proposing that syntax; my purpose here is to gather suggestions
for the syntax).
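For concreteness, the tuple-based version of this F-into-G example looks something like the sketch below in standard C++. The bodies of `f` and `G` are hypothetical; what matters is the named temporary and the `std::get` calls needed to reverse the order.

```cpp
#include <cassert>
#include <tuple>

std::tuple<int, int> f() { return std::make_tuple(1, 2); }  // hypothetical F
int G(int a, int b) { return a * 10 + b; }                  // hypothetical G

int call_g_with_f_reversed() {
    auto results = f();   // a named temporary is required to use both parts
    return G(std::get<1>(results), std::get<0>(results));
}

void demo_reversed() {
    assert(call_g_with_f_reversed() == 21);  // G(2, 1)
}
```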

> with each element of the tuple being able to be mapped to your return
> values? For a compiler targeting your architecture, is there anything
> that prevents the optimizer from decomposing the std::tuple into
> multiple, in-hardware return values?
Not that I know of, unless the function is bound at link time and not
visible to the optimizer. In that case the distinction between returning
a tuple-type object and returning multiple values becomes ambiguous.
This problem is present whenever the optimizer is proposed to finesse
imputed types that don't actually exist in the type system. I'm hoping
to actually extend the type system for function types to include genuine
multi-result types. We expect to add whatever extension we come up with
to clang and gcc; this is not an academic exercise.
> I apologize, I'm still fuzzy on
> what your hardware support for multiple return values means. But what
> cases could not be covered by this solution, which doesn't need any
> language changes?
It is unclear to me that the tuple solution works when the result
component types lack copy constructors.
> The VARRESULTS functionality may be more problematic. Let's say:
>
> /* ... */ bar();
>
> returns a variable number of return values. Because the number of
> return values can vary at runtime, the return type of bar cannot be
> deduced at compile time. Thus,
>
> auto i = bar(); // what is decltype(i)? we don't know
>
> So we definitely need something more complicated than something this
> simple. You may want to look into the recent proposal N4025 on classes
> whose sizes can only be figured out at runtime. If bar() returns a type
> of this sort, the compiler could potentially apply an optimization
> similar to the one I described above. Someone else here may be able
> to disprove me on that.
That sounds promising. Please forgive me, I am not part of the language
community and I don't know where I would find N4025; the first few pages
of Google yield nothing promising. Can you give me a pointer?

Thank you.

Ivan

Thomas Braxton

Jun 10, 2014, 12:57:36 AM
to std-pr...@isocpp.org

On Mon, Jun 9, 2014 at 11:51 PM, Ivan Godard <iv...@millcomputing.com> wrote:
N4025

https://isocpp.org/files/papers/n4025.pdf

David Krauss

Jun 10, 2014, 1:01:47 AM
to std-pr...@isocpp.org
On 2014–06–10, at 12:21 PM, Ivan Godard <iv...@MillComputing.com> wrote:


On 6/9/2014 7:43 PM, David Krauss wrote:

On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:

We know why other CPUs do not have hardware support

Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?

With a few historical exceptions, other CPUs do not have a call operation at all; they have a branch-and-link. On a Mill "f(a,b,c)" is one hardware operation, total. There is neither preamble in the caller other than the evaluation of the argument expressions, nor postamble within the callee. The call operation saves all state, exits the current frame and enters the new one, and passes all the arguments, in hardware, in one clock. Yes, it's a CISC.

You’ve changed the topic from returns to calls, but your answer still reveals the truth: current architectures do not have function return operations at all, they have branch-to-link (and perhaps load some registers) operations. Whatever registers aren’t overwritten by such an instruction are eligible to return data back to the caller.

The entire notion of subroutines is a higher-level construct which helps to organize programs.

For what it’s worth, the notion of instructions is also somewhat arbitrary. As a CISC instruction is decoded you could call its constituent encodings instructions instead. Or, you could consider the sequence of RISC instructions accomplishing a function call to be one instruction. If you squint hard enough, then all the ISAs look the same, and they can all be represented with something like LLVM IR.

I wish to inquire for suggestions or current and/or historic language practice in support of functions returning more than one by-value result. The notation should be convenient, intuitive and not convoluted or unaesthetic, nor introduce superfluous variables.

If possible.

Those criteria are subjective. If you’re serious about solving the problem, I urge you to consider the solution used by essentially all functional and procedural languages, which is to pack the multiple values into one tuple.

Data-types are also merely human constructs that help organize programs. Even if you invent another formalism that avoids calling the return values a tuple, it will still be equally valid to treat them as such, because the difference is only in notation.

I mentioned that approach as a possible notation. However, it is ambiguous, because tuples are first class objects, so it is not obvious whether the function is to return one tuple-type object, or several by-value values. The difference is most evident when one of the tuple components has neither copy nor move constructors.

Returning a value without a move or copy constructor is only partially supported by the language: the caller must bind the result to a local (non-member) reference. Personally I believe this limitation to be unjustified, and in the long run I’d like to see it eliminated. However, others believe differently.
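A minimal sketch of that partial support, as I understand the C++11 rules: a braced return list-initializes the result in place, and the caller binds it to a reference. The type `Pinned` and both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical type with neither copy nor move construction.
struct Pinned {
    int value;
    Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

// A braced return list-initializes the result object directly, so no copy
// or move constructor is needed. The constructor must not be explicit.
Pinned make_pinned(int v) {
    return {v};
}

int use_pinned() {
    // Binding to a reference extends the temporary's lifetime to the scope.
    const Pinned& p = make_pinned(7);
    return p.value;
}

void demo_pinned_return() {
    assert(use_pinned() == 7);
}
```

Writing `Pinned p = make_pinned(7);` instead would need an accessible copy or move constructor under the C++11 and C++14 rules, which is the limitation being described.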

As for current programming practice, it’s safe to assume that return values are moveable or copyable.

Note that a partial workaround does exist, as piecewise construction of tuples. See std::piecewise_construct. This will accomplish a non-moveable object inside a tuple, but all tuple elements must be constructed simultaneously.
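A sketch of that workaround follows. One caveat, to the best of my knowledge: the `std::piecewise_construct` constructor belongs to `std::pair`, and `std::tuple` has no direct equivalent, so the example uses a pair. `Pinned` is again a hypothetical non-copyable, non-movable type.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical type with neither copy nor move construction.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

int sum_of_pinned_pair() {
    // Each forward_as_tuple carries the constructor arguments for one member;
    // both members are built in place, with no Pinned copy or move.
    std::pair<Pinned, Pinned> both(std::piecewise_construct,
                                   std::forward_as_tuple(1),
                                   std::forward_as_tuple(2));
    return both.first.value + both.second.value;
}

void demo_piecewise() {
    assert(sum_of_pinned_pair() == 3);
}
```

Both members are constructed in the same expression, which is the "all elements must be constructed simultaneously" restriction.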

As for whether you might want a new mechanism, as opposed to more sugary tuples, are you prepared to sacrifice support or performance for programs that use tuples (or other aggregates) for this purpose?

However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine independent
Well, there are those that say it incorporates a PDP-11. I've programmed PDP-11s, and I will say the C part of C++ looks remarkably familiar.
and machine-specific extensions are outside the scope of this discussion list.

That's a reasonable answer, thank you. Can you suggest a forum which might be interested in such discussion?

The one on your website appears to be a natural choice. You should also discuss with your customers, who will be the first to write code using your extensions.

David Krauss

Jun 10, 2014, 1:20:30 AM
to std-pr...@isocpp.org
On 2014–06–10, at 12:51 PM, Ivan Godard <iv...@MillComputing.com> wrote:

Consider a function F with two scalar results, and a function G accepting two arguments of the scalar type, where the intent is that the results of F will be the arguments of G - in the opposite order. I was hoping for something like:
                        G([lab:f()] lab$1, lab$0)
rather than the armwaving required to use tuples (note that I am very much not proposing that syntax; my purpose here is to gather suggestions for the syntax).

The only way for the C++ programmer to route values into functions is to pass them individually. Some languages support unpacking primitive tuples as arguments (and C++ only requires adding a wrapper function), but in your example the programmer would still need to name the items individually.

However, programmer notation has no bearing on object code generation. If your hardware has some distinct addresses for multiple return values, which need to be coded in a subsequent CALL instruction, then your code generator will need to detect the way data is being routed given intermediate-level program representation.

with each element of the tuple being able to be mapped to your return
values?  For a compiler targeting your architecture, is there anything
that prevents the optimizer from decomposing the std::tuple into
multiple, in-hardware return values?
Not that I know of, unless the function is bound at link time and not visible to the optimizer. In that case the distinction between returning a tuple-type object and returning multiple values becomes ambiguous. This problem is present whenever the optimizer is proposed to finesse imputed types that don't actually exist in the type system. I'm hoping to actually extend the type system for function types to include genuine multi-result types. We expect to add whatever extension we come up with to clang and gcc; this is not an academic exercise.

The first task in a compiler port is to design an ABI, so I suggest you specify aggregate return values to go into multiple registers. Is there a reason this wouldn't work as the general case? (And is there any other possible programmer intent in returning an aggregate type?)

 I apologize, I'm still fuzzy on
what your hardware support for multiple return values means.  But what
cases could not be covered by this solution, which doesn't need any
language changes?
It is unclear to me that the tuple solution works when the result component types lack copy constructors.

I’ve addressed this in my previous message. Also, you can define an extension to separately construct tuple items, and still have the result be an ABI-compatible tuple. In fact, I don’t see why it can’t be contained within a 100% compatible library extension:

// Return a tuple of uninitialized storage.
template< typename ... t >
std::tuple< t ... > make_uninitialized_tuple();

// Initialize one tuple element.
/* All elements must be initialized before the tuple lifetime ends
 (i.e. before return or any exception may be thrown). */
template< std::size_t n, typename tuple_type, typename ... construct_arg >
void initialize_tuple_element( tuple_type & t, construct_arg && ... a );

The exception safety is probably a deal-breaker, though.

David Krauss

Jun 10, 2014, 1:35:47 AM
to std-pr...@isocpp.org
On 2014–06–10, at 11:18 AM, Patrick Michael Niedzielski <patrickni...@gmail.com> wrote:

Why was this "homework assignment" comment necessary?  Be civil.

Homework is the only justification for ignoring a widely implemented solution for the sake of implementing something new.

There’s no shame in doing homework, but insisting on inventing something for novelty’s sake is Wrong. What we have here is a man in search of a solution in search of a problem. He doesn’t know what his extension is yet, he just knows he wants some kind of an extension. C++ handles multiple returns essentially the same as any other language, but it’s a “kludge” because it’s not bespoke.

Presupposition that new solutions are needed is not the way an engineer should think. Prototypes are built using the parts on hand, and no evidence has so far been presented that the parts do not fit. If we need more detail about multiple returns, it should be described within the immediate request and not left in the middle of an hours-long video series.

The past few emails you've sent in this thread have not come off as
civil.

The first post immediately raised red flags. The assertion that a new CPU architecture is completely different from everything else is immediately suspicious, because after 60+ years of experimentation there are few completely novel ideas. To communicate architectural details to a literate audience, it is expedient to reference literature. Following the YouTube link and checking the website via archive.org reinforces my impression that the materials, by intention or not, are geared to an illiterate audience of potential investors. This rubs me the wrong way.

Anyone with the experience to do what Ivan says he did, knows that multiple returns are commonplace in handwritten assembler for any conventional register-based architecture. (Even more common for stack machines.) So the assertion that they’re unsupported is just disingenuous. Even variable length returns are easily done with a call stack, and there is precedent for putting the stack in rename registers. This has nothing to do with C++, by the way.


“DSP code, all those loops are software pipelined, and software pipelines have unbounded ILP.”

Even DSP filters often have dependencies over loop iterations.

“Rather few general-purpose loops can be software pipelined. They contain really nasty things like function calls, that are full of control.”

No, they cannot be pipelined because of Amdahl’s law: you can only go so far before hitting a dependent computation.

“Any of you who do do chip design know I’m lying through my teeth”

Well, I have done a smidgen of physical circuit design, and I’d like to see what magic network wiring can be a game-changer.


I’ll stop now and get back to work, but civility is not the best thing when it comes to scammers. Big dreams are well and good, but asking for money for research that requires no materials, misrepresenting the degree of originality, is unjustified. Claiming that additional language support is needed when it’s not, when requiring such extensions is a historical reason for the failure of projects just like this, is disingenuous. There’s no problem with Ivan pursuing personal projects, but the connections drawn in messages here between C++ notation and machine code suggest a lack of the familiarity with modern compilers that is essential to ISA design. Acting like a seasoned pro with a comp-arch business plan, but without familiarity with the software development stack, is fraud.

TL;DR: If he wants to succeed, he will avoid language extensions and use pure LLVM IR as source. If the project is for real, it can take my flak. If it’s not, I’m doing skeptics a valuable service with public criticism.

Ivan Godard

Jun 10, 2014, 1:56:35 AM
to std-pr...@isocpp.org

On 6/9/2014 10:01 PM, David Krauss wrote:

On 2014–06–10, at 12:21 PM, Ivan Godard <iv...@MillComputing.com> wrote:


On 6/9/2014 7:43 PM, David Krauss wrote:

On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:

We know why other CPUs do not have hardware support

Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?

With a few historical exceptions, other CPUs do not have a call operation at all; they have a branch-and-link. On a Mill "f(a,b,c)" is one hardware operation, total. There is neither preamble in the caller other than the evaluation of the argument expressions, nor postamble within the callee. The call operation saves all state, exits the current frame and enters the new one, and passes all the arguments, in hardware, in one clock. Yes, it's a CISC.

You’ve changed the topic from returns to calls, but your answer still reveals the truth: current architectures do not have function return operations at all, they have branch-to-link (and perhaps load some registers) operations. Whatever registers aren’t overwritten by such an instruction are eligible to return data back to the caller.

The Mill is a new architecture. The return operation (also a hardware primitive) unwinds what the hardware call operation does. The Mill has no general registers, and values are SSA.

The entire notion of subroutines is a higher-level construct which helps to organize programs.

And it is up to the hardware designer to decide how close the hardware primitive should map to the language primitive. Yes, a subroutine is a construct to organize programs, but so is an add. At root all the hardware you need is two bricks and a roll of toilet paper; all else beyond that is cost-benefit. In the case of the Mill, our call and return ops are significantly higher level than is common, and multi-return falls out of the design. We find multi-return useful in all sorts of odd places, were it not so clumsy to use from conventional languages that lack OUT parameters.


For what it’s worth, the notion of instructions is also somewhat arbitrary. As a CISC instruction is decoded you could call its constituent encodings as instructions instead. Or, you could consider the sequence of RISC instructions accomplishing a function call to be one instruction. If you squint hard enough, then all the ISAs look the same, and they can all be represented with something like LLVM IR.

"Instruction" is not an arbitrary notion from the viewpoint of a decoder/dispatcher, and "operation" is not an arbitrary notion from the view of a functional unit like an ALU. We are in the process of porting our toolchain to LLVM, and have found that the IR makes some really egregious assumptions about the character of the eventual target. For example, it assumes that pointers are integers.

I wish to inquire for suggestions or current and/or historic language practice in support of functions returning more than one by-value result. The notation should be convenient, intuitive and not convoluted or unaesthetic, nor introduce superfluous variables.

If possible.

Those criteria are subjective. If you’re serious about solving the problem, I urge you to consider the solution used by essentially all functional and procedural languages, which is to pack the multiple values into one tuple.

I am looking for better, or ideas leading toward better.

Data-types are also merely human constructs that help organize programs. Even if you invent another formalism that avoids calling the return values a tuple, it will still be equally valid to treat them as such, because the difference is only in notation.

Are you sure of that? Consider a function returning a value of type T, and a function returning a single-component tuple whose component is of type T. Is there truly no T for which the behavior of the two is different? I had thought there was.


I mentioned that approach as a possible notation. However, it is ambiguous, because tuples are first class objects, so it is not obvious whether the function is to return one tuple-type object, or several by-value values. The difference is most evident when one of the tuple components has neither copy nor move constructors.

Returning a value without a move or copy constructor is only partially supported by the language: the caller must bind the result to a local (non-member) reference. Personally I believe this limitation to be unjustified, and in the long run I’d like to see it eliminated. However, others believe differently.
I would support you.


As for current programming practice, it’s safe to assume that return values are moveable or copyable.

Note that a partial workaround does exist, as piecewise construction of tuples. See std::piecewise_construct. This will accomplish a non-moveable object inside a tuple, but all tuple elements must be constructed simultaneously.
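A minimal sketch of that workaround, with one caveat: the piecewise constructor is, as far as I know, a feature of std::pair rather than std::tuple, so the sketch uses a pair. `Pinned` and `sum_of_pinned_pair` are hypothetical names chosen for illustration only.

```cpp
#include <tuple>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

int sum_of_pinned_pair() {
    // Both members are constructed in place, at the same time, from the
    // argument tuples; no Pinned object is ever copied or moved.
    std::pair<Pinned, Pinned> p(std::piecewise_construct,
                                std::forward_as_tuple(1),
                                std::forward_as_tuple(2));
    return p.first.value + p.second.value;
}
```

The pair itself still cannot be copied or returned by value here, which is the "partial" part of the workaround.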

Thank you; I'll treat that as a homework assignment.

As for whether you might want a new mechanism, as opposed to more sugary tuples, are you prepared to sacrifice support or performance for programs that use tuples (or other aggregates) for this purpose?

The intent is that the extensions (this case, and others in process, which I may bring here) are proprietary, although if clang/gcc accepts them as contributions then they will be available to others. As for sacrificing performance, we are considering alternatives precisely because the tuple approach so badly blows out the hardware performance.

Consider a function with two int results, and an expression that wants the sum of them. On a Mill, using assembly language, that is two hardware operations - a call and an add. I have tried the corresponding tuple code in various compilation systems and targets available to me, and have been unable to get below 12 operations and some were over fifty. Your mileage may vary; if you have a platform that gets it to two then please post the source code used in the test, the target machine, and the compiler version and settings.
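For concreteness, a sketch of what the "corresponding tuple code" for this test might look like. The names `two_results` and `sum_of_results` are assumptions, not the code that was actually measured; a fair measurement would presumably put the callee in a separate translation unit so that it cannot simply be inlined.

```cpp
#include <tuple>

// Hypothetical callee with two int results, packed into one tuple.
std::tuple<int, int> two_results(int x) {
    return std::make_tuple(x + 1, x * 2);
}

// The caller has to unpack into named locals before it can add them.
int sum_of_results(int x) {
    int a = 0;
    int b = 0;
    std::tie(a, b) = two_results(x);
    return a + b;
}
```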


However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine independent
Well, there are those that say it incorporates a PDP-11. I've programmed PDP-11s, and I will say the C part of C++ looks remarkably familiar.
and machine-specific extensions are outside the scope of this discussion list.

That's a reasonable answer, thank you. Can you suggest a forum which might be interested in such discussion?

The one on your website appears to be a natural choice. You should also discuss with your customers, who will be the first to write code using your extensions.

As might be expected for a hardware company, our forums are mostly populated by those interested in architecture rather than languages. Hence I have come here.

I am moderately competent in language design - I was in on the revision of Algol68 and my name is in the Revised Report; I was on the Green team that won the Ada competition; I am the designer of the Mary family of languages, which I'm sure you have never heard of; have had my fingers in other languages to a lesser extent; and have done nearly a dozen compilers for a variety of languages and targets, one still in active use today forty years after I wrote it. However, I remain happy to seek the assistance of others; it is not possible to know everything in a field, and there is wisdom in crowds. I thank you for your time.

Ivan Godard

Jun 10, 2014, 2:12:54 AM6/10/14
to std-pr...@isocpp.org

On 6/9/2014 10:20 PM, David Krauss wrote:

On 2014–06–10, at 12:51 PM, Ivan Godard <iv...@MillComputing.com> wrote:

Consider a function F with two scalar results, and a function G accepting two arguments of the scalar type, where the intent is that the results of F will be the arguments of G - in the opposite order. I was hoping for something like:
                        G([lab:f()] lab$1, lab$0)
rather than the armwaving required to use tuples (note that I am very much not proposing that syntax; my purpose here is to gather suggestions for the syntax).
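A sketch of the tuple-style version being compared against, assuming int results and using std::pair for brevity. The bodies of `F` and `G` are placeholders invented for the example; the point is the named temporary that the caller has to introduce.

```cpp
#include <utility>

// Placeholder F: two scalar results packed into a pair.
std::pair<int, int> F() { return {1, 2}; }

// Placeholder G: combines its arguments so that their order is observable.
int G(int first, int second) { return first * 10 + second; }

int call_g_with_swapped_results() {
    const auto results = F();                 // superfluous named variable
    return G(results.second, results.first);  // results of F, opposite order
}
```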

The only way for the C++ programmer to route values into functions is to pass them individually. Some languages support unpacking primitive tuples as arguments (and C++ only requires adding a wrapper function), but in your example the programmer would still need to name the items individually.
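The wrapper function mentioned here could look roughly like the following in C++14. This is a sketch under assumptions: `unpack_call`, `unpack_impl` and `subtract` are hypothetical names, not an existing library facility.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Expands the tuple elements into individual call arguments.
template <class F, class Tuple, std::size_t... I>
decltype(auto) unpack_impl(F&& f, Tuple&& t, std::index_sequence<I...>) {
    return std::forward<F>(f)(std::get<I>(std::forward<Tuple>(t))...);
}

// Calls f with the elements of t as its arguments, in order.
template <class F, class Tuple>
decltype(auto) unpack_call(F&& f, Tuple&& t) {
    constexpr std::size_t n = std::tuple_size<std::decay_t<Tuple>>::value;
    return unpack_impl(std::forward<F>(f), std::forward<Tuple>(t),
                       std::make_index_sequence<n>{});
}

// Example callee whose result depends on argument order.
int subtract(int a, int b) { return a - b; }
```

With this, `unpack_call(subtract, std::make_tuple(7, 2))` passes the elements in tuple order; reordering them, as in the F/G example, still requires naming each element.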

However, programmer notation has no bearing on object code generation. If your hardware has some distinct addresses for multiple return values, which need to be coded in a subsequent CALL instruction, then your code generator will need to detect the way data is being routed given intermediate-level program representation.
It doesn't have such addresses. The above code fragment (in Mill assembler) totals two call operations on one instruction (the Mill is a wide-issue machine in which an instruction can contain more than one operation, and yes, the operations can be calls). The actual asm code for the above is:
        calln(f, 2), call0(g, b1, b0);
where comma separates ops and semicolon separates instructions.


with each element of the tuple being able to be mapped to your return
values?  For a compiler targeting your architecture, is there anything
that prevents the optimizer from decomposing the std::tuple into
multiple, in-hardware return values?
Not that I know of, unless the function is bound at link time and not visible to the optimizer. In that case the distinction between returning a tuple-type object and returning multiple values becomes ambiguous. This problem is present whenever the optimizer is proposed to finesse imputed types that don't actually exist in the type system. I'm hoping to actually extend the type system for function types to include genuine multi-result types. We expect to add whatever extension we come up with to clang and gcc; this is not an academic exercise.

The first task in a compiler port is to design an ABI, so I suggest you specify aggregate return values to go into multiple registers. Is there a reason this wouldn't work as the general case?
Yes, there are reasons. To begin with, there are no registers.
(And is there any other possible programmer intent in returning an aggregate type?)
How about assigning to a variable of the aggregate type? Or passing to a function that expects the aggregate type? Or to printf, for a comical case.

David Krauss

Jun 10, 2014, 2:38:09 AM6/10/14
to std-pr...@isocpp.org
On 2014–06–10, at 1:56 PM, Ivan Godard <iv...@MillComputing.com> wrote:

The Mill is a new architecture. The return operation (also a hardware primitive) unwinds what the hardware call operation does. The Mill has no general registers, and values are SSA.

The point is that existing architectures support multiple returns, both through returned aggregates and inlined calls.

And it is up to the hardware designer to decide how close the hardware primitive should map to the language primitive.

Language primitives don’t map directly to hardware primitives. These days, you really need to design hardware for an abstraction for which code generation is practical. You will need to support generic static compilation for the likes of C, and generic JIT for the likes of JavaScript. Anything else isn’t really general-purpose, unless GPGPU is also “general-purpose.”

We are in the process of porting our toolchain to LLVM, and have found that the IR makes some really egregious assumptions about the character of the eventual target. For example, it assumes that pointers are integers.

According to http://llvm.org/docs/LangRef.html#pointer-type , each type has a corresponding pointer type as in C, but also there are multiple address spaces. The requirement that casts to integer type work may be a consequence of C support, or relying on a toolchain designed for C, but that’s not LLVM’s fault. In theory you should be able to do without (not support) such casts, but why? Any bit-pattern can be an unsigned int.
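To make the cast under discussion concrete, here is a small C++ sketch of the pointer-to-integer round trip that C-family front ends tend to rely on. `pointer_round_trips` is a hypothetical helper, and `std::uintptr_t` is an optional type that a target is not obliged to provide.

```cpp
#include <cstdint>

// Converts a pointer to an integer and back, and reports whether the
// original pointer value was recovered.
bool pointer_round_trips(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}
```

A target whose pointers carry more than a plain address would have to decide what such a conversion preserves, which is presumably where the assumption becomes a problem.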

As a semester project (back about 10 years ago), I targeted LLVM to a machine I designed with separate address and data registers (or something like that) and discovered that the generic compiler components were less amenable to things that are less like x86. Indeed, it’s not perfect, but it’s a lot better than nothing. I kludged my architecture and made it work; in real life there would have been a migration path back to where I’d intended to be.

Consider what happened to IA-64. For a long time it was thought capable of a little more, if the compiler were just better. Ultimately it turned out to be merely on par, but at tremendous implementation cost.

Data-types are also merely human constructs that help organize programs. Even if you invent another formalism that avoids calling the return values a tuple, it will still be equally valid to treat them as such, because the difference is only in notation.

Are you sure of that? Consider a function returning a value of type T, and a function returning a single-component tuple whose component is of type T. Is there truly no T for which the behavior of the two is different? I had thought there was.

I don’t follow. I’m talking about computer science tuples, not std::tuple. If you invent a parallel system, it will still only parallel what we already have.

As for whether you might want a new mechanism, as opposed to more sugary tuples, are you prepared to sacrifice support or performance for programs that use tuples (or other aggregates) for this purpose?

The intent is that the extensions (this case, and others in process, which I may bring here) are proprietary, although if clang/gcc accepts them as contributions then they will be available to others. As for sacrificing performance, we are considering alternatives precisely because the tuple approach so badly blows out the hardware performance.

Why? Do you need lifetime analysis to follow values through tuple packing/unpacking?

Consider a function with two int results, and an expression that wants the sum of them. On a Mill, using assembly language, that is two hardware operations - a call and an add. I have tried the corresponding tuple code in various compilation systems and targets available to me, and have been unable to get below 12 operations and some were over fifty. Your mileage may vary; if you have a platform that gets it to two then please post the source code used in the test, the target machine, and the compiler version and settings.

ICC reduces my test to nothing. It tracked the values through the pack and unpack, and added them as constants.

http://goo.gl/TZEABO

Perhaps this test is too simplistic, but where do you need the complexity to be added?

As might be expected for a hardware company, our forums are mostly populated by those interested in architecture rather than languages. Hence I have come here.

Specific examples of code that should be fast are more-or-less on topic here. But, given the complexity of a competitive optimizer, you’re going to need to seriously leverage LLVM and/or GCC. Their development mailing lists should also be invaluable. You should be able to fix the bugs without resorting to language extensions, at least as far as function inlining and LTO can reach.

David Krauss

Jun 10, 2014, 2:49:04 AM6/10/14
to std-pr...@isocpp.org
I mean, is there a reason to treat multiple returns separately from aggregate returns.

Of course there are hardware registers, regardless of what the ISA codes.

>> (And is there any other possible programmer intent in returning an aggregate type?)
> How about assigning to a variable of the aggregate type? Or passing to a function that expects the aggregate type? Or to printf, for a comical case.

I don’t see how any of these are a problem. If you’re crossing a linkage boundary where code on either side doesn’t know the optimal format for the other side, then the data needs to be “serialized” in a standard way. Otherwise, you really need lifetime analysis and inlining to eliminate the pack/unpack operation. Not a super tall order, but perhaps a serious chink in the armor.

You will need to model the “belt” as registers to get anywhere. But this is all below the conceptual level of C++, which really only specifies that values get from point A to point B, and that things whose addresses are taken must reside definitively at a particular address.

Thiago Macieira

Jun 10, 2014, 2:55:31 AM6/10/14
to std-pr...@isocpp.org
Em seg 09 jun 2014, às 22:56:29, Ivan Godard escreveu:
> Consider a function with two int results, and an expression that wants
> the sum of them. On a Mill, using assembly language, that is two
> hardware operations - a call and an add. I have tried the corresponding
> tuple code in various compilation systems and targets available to me,
> and have been unable to get below 12 operations and some were over
> fifty. Your mileage may vary; if you have a platform that gets it to two
> then please post the source code used in the test, the target machine,
> and the compiler version and settings.

On the x86-64 and IA-64 Linux ABIs, the return of an aggregate of 2 integrals
should be done entirely on registers. So code like this:

struct C { long i, j; };
C f();
long g()
{
    auto v = f();
    return v.i + v.j;
}

Does expand to a call instruction and an add (plus any frame marker overhead).

I don't know why the above doesn't work for std::tuple -- it must be violating
one of the ABI rules for return in registers, but I haven't verified which one.
In any case, that's a subject to be discussed in the ABI forums, not in the C++
standard group.
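One possible cause, offered as an assumption and not verified in this thread: under the Itanium C++ ABI, a class type with a non-trivial copy constructor, move constructor or destructor is returned through memory, and some standard library implementations give std::tuple user-provided special members. The sketch below uses `std::is_trivially_copyable` as a rough proxy for that rule; the variable names are hypothetical, and the tuple result depends on the library in use.

```cpp
#include <tuple>
#include <type_traits>

struct C { long i, j; };

// A plain aggregate of two longs is trivially copyable.
constexpr bool struct_is_trivially_copyable =
    std::is_trivially_copyable<C>::value;

// Implementation-dependent: may be false where std::tuple defines its own
// copy/move operations.
constexpr bool tuple_is_trivially_copyable =
    std::is_trivially_copyable<std::tuple<long, long>>::value;
```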

The fact is that C++ does allow multiple values to be returned, even if the
syntax is clumsy. Your architecture does not need an extension. We all just
need an unclumsy solution, for all architectures.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Ivan Godard

Jun 10, 2014, 3:06:08 AM6/10/14
to std-pr...@isocpp.org
Dear me! I seem to have pushed a few buttons. Unintentionally I assure you.


On 6/9/2014 10:35 PM, David Krauss wrote:

On 2014–06–10, at 11:18 AM, Patrick Michael Niedzielski <patrickni...@gmail.com> wrote:

Why was this "homework assignment" comment necessary?  Be civil.

Homework is the only justification for ignoring a widely implemented solution for the sake of implementing something new.

Not ignored, but not satisfied with either.
There’s no shame in doing homework, but insisting on inventing something for novelty’s sake is Wrong. What we have here is a man in search of a solution in search of a problem. He doesn’t know what his extension is yet, he just knows he wants some kind of an extension. C++ handles multiple returns essentially the same as any other language, but it’s a “kludge” because it’s not bespoke.

I find it to be kludgy (IMHO) because it breaks the natural flow of expressions, and I deem expression languages to be the superior approach to imperatives. Granted YMMV, and I show my Algol68 and Mary heritage. However, other languages have a much less lumpy exterior when faced with this problem: multiple returns express naturally in pure dataflow languages, and even Ada does a better job using OUT parameters.

Presupposition that new solutions are needed is not the way an engineer should think. Prototypes are built using the parts on hand, and no evidence has so far been presented that the parts do not fit.
Are you really advocating that, so long as you can squint and make a problem look kinda like a nail, there's no merit in considering what else than hammers might be usable?

If we need more detail about multiple returns, it should be described within the immediate request and not left in the middle of an hours-long video series.

My apologies; I had assumed that all readers of this board would understand what a "function returning multiple results" meant. Actually, I'm not sure I could explain it better.

The past few emails you've sent in this thread have not come off as
civil.

The first post immediately raised red flags. The assertion that a new CPU architecture is completely different from everything else is immediately suspicious, because after 60+ years of experimentation there are few completely novel ideas.
Quite true. Nearly all the CPU advances in the last 30 years have been in the fab; a Haswell would look very familiar to the designer of the 360/91, or even the STRETCH. Unfortunately, Moore's law is no longer with us. The last major architectural advance was the introduction of caches.

And that's why we felt that there was a market for a new architecture. It couldn't be the incremental evolutionary improvement that produced what we use today; to break a startup into the market requires something that is *very* different. We put in a decade doing that, and you are welcome to visit our site to see how it works to the extent that you are able to; our published material is intended primarily for architects and hardware designers.

To communicate architectural details to a literate audience, it is expedient to reference literature.
For academics, I suppose. We have found that to describe dynamic processes (and the guts of a CPU is nothing but dynamic) it really helps to have pictures, and especially animated pictures. As the purpose of our published material is to explain the machine to those who could implement it, we don't really care if they qualify as "literate".

Following the YouTube link and checking the website via archive.org reinforces my impression that the materials, by intention or not, are geared to an illiterate audience of potential investors. This rubs me the wrong way.

You leapt to conclusions. No "investor" without a background in CPU architecture would understand it.
Anyone with the experience to do what Ivan says he did, knows that multiple returns are commonplace in handwritten assembler for any conventional register-based architecture.
Yes; there's no problem in asm. I came here hoping for C++ help.

(Even more common for stack machines.) So the assertion that they’re unsupported is just disingenuous. Even variable length returns are easily done with a call stack, and there is precedent for putting the stack in rename registers. This has nothing to do with C++, by the way.


“DSP code, all those loops are software pipelined, and software pipelines have unbounded ILP.”

Even DSP filters often have dependencies over loop iterations.

While part of what we do for pipes is covered in other talks (see http://millcomputing.com/docd/metadata), our upcoming talk (early July at Facebook) will go into it in greater detail, and in particular how the hardware deals with loop-carried data, including reductions.

“Rather few general-purpose loops can be software pipelined. They contain really nasty things like function calls, that are full of control."

No, they cannot be pipelined because of Amdahl’s law: you can only go so far before hitting a dependent computation.

Well, I suppose that is a good argument for leaving things as they are.

“Any of you who do do chip design know I’m lying through my teeth”

Well, I have done a smidgen of physical circuit design, and I’d like to see what magic network wiring can be a game-changer.

None of our advantages are at the circuit or fab level; we use bog-standard cell libraries.


I’ll stop now and get back to work, but civility is not the best thing when it comes to scammers.
You might try it :-)

Big dreams are well and good, but asking money for research that requires no materials, misrepresenting the degree of originality, is unjustified.
I'm not sure where you got the idea that we are asking for money. There is no such request on our site, and we are not seeking grants of any kind. We do now have investors, who have sought us out, but for the first decade of our work we were entirely a self-funded bootstrap organization. I have been full time on the Mill from the beginning, and this year is the first time that I have had a paycheck. Frankly, being called a scammer I find rather objectionable; that such behavior is permitted here speaks poorly for this board. My compliments to Patrick Niedzielski; thank you for speaking for civility.


Claiming that additional language support is needed when it's not,
That is not my claim: I stated that I found the conventional alternatives somewhat unsatisfactory (and gave reasons) and ask for help and advice in search for alternatives that could be used as extensions. There may not be alternatives; if I were sure then I would not have needed to bother this board. To respond that the search is wrongheaded and due to base motives will certainly serve to keep disruptive outsiders away from the pure and righteous.

Perhaps that is the intent, but it's not the way we run our railroad in the Mill project.

and such requirement of extensions being a historical reason for failure of projects just like this, is disingenuous. There’s no problem with Ivan pursuing personal projects, but the connections drawn in messages here between C++ notation and machine code suggest a lack of the familiarity with modern compilers that is essential to ISA design. Acting like a seasoned pro with a comp-arch business plan, but without familiarity with the software development stack, is fraud.

Yeah. This is pointless, so I'll withdraw. If any lurkers would like to continue this off the board you can reach me at ivan@millcomputing .com, or post comments at http://millcomputing/forums.

TL;DR: If he wants to succeed, he will avoid language extensions and use pure LLVM IR as source. If the project is for real, it can take my flak. If it’s not, I’m doing skeptics a valuable service with public criticism.

And we thank you for your public-spirited gesture.

Ivan

Ivan Godard

Jun 10, 2014, 4:13:13 AM6/10/14
to std-pr...@isocpp.org

On 6/9/2014 11:54 PM, Thiago Macieira wrote:
> <snip>

> The fact is that C++ does allow multiple values to be returned, even
> if the syntax is clumsy. Your architecture does not need an extension.
> We all just need an unclumsy solution, for all architectures.
Agreed.

However, I have an architecture to support, and have neither time nor
money nor interest in improving C++ outside that context, except as a
side effect. We are a commercial company, not an academy or government
agency, and if we are going to pay engineers to work on compilers it has
to directly benefit our product. On the other hand, we all benefit from
the community and we want to support it to the extent we can, so if we
were to come up with an unclumsy solution then we would contribute it to
clang and gcc so others could have practical experience with it, and
eventually standardize it if deemed worthy.

I had hoped that others might already have found the present C++ clumsy
enough that they had already kicked ideas around but just hadn't had the
resources to put them in the compilers to try out. Then our compiler
work, on our nickel, with our platform underneath, would give the
language community a real opportunity for exploration and experience,
while getting better use out of our hardware.

Instead I learn that not only is the present mechanism not clumsy, it is
so perfect as to be above examination. So much for my fraudulent attempt
to scam C++.

Ivan

Thiago Macieira

Jun 10, 2014, 4:43:10 AM6/10/14
to std-pr...@isocpp.org
Em ter 10 jun 2014, às 01:13:10, Ivan Godard escreveu:
> I had hoped that others might already have found the present C++ clumsy
> enough that they had already kicked ideas around but just hadn't had the
> resources to put them in the compilers to try out. Then our compiler
> work, on our nickel, with our platform underneath, would give the
> language community a real opportunity for exploration and experience,
> while getting better use out of our hardware.
>
> Instead I learn that not only is the present mechanism not clumsy, it is
> so perfect as to be above examination. So much for my fraudulent attempt
> to scam C++.

Did you search the archives? Did you see the email that pointed to the
paper on this exact subject?

From reading all the emails in this thread, it seems that you got off on the
wrong foot by trying to make something that is special for your architecture.
You can bet that 99% of the people here couldn't care less about your
architecture. We're not stakeholders in it -- and many of us, including me,
are actually stakeholders in competitors.

Instead, the discussion should have focused only and exclusively on multiple
result. See also the threads "C++ named tuple", "Unpacking syntax for tuples
using auto", and a few others.

So, no, the status is not that we think the current solution is perfect. Far
from it. There's a lot of room to improve and you can and should contribute to
it.

But your attitude isn't helping.

Ivan Godard

Jun 10, 2014, 5:30:13 AM6/10/14
to std-pr...@isocpp.org

On 6/10/2014 1:42 AM, Thiago Macieira wrote:
> Em ter 10 jun 2014, às 01:13:10, Ivan Godard escreveu:
>> I had hoped that others might already have found the present C++ clumsy
>> enough that they had already kicked ideas around but just hadn't had the
>> resources to put them in the compilers to try out. Then our compiler
>> work, on our nickel, with our platform underneath, would give the
>> language community a real opportunity for exploration and experience,
>> while getting better use out of our hardware.
>>
>> Instead I learn that not only is the present mechanism not clumsy, it is
>> so perfect as to be above examination. So much for my fraudulent attempt
>> to scam C++.
> Did you search the archives?
No. I have no idea where the archives are. I did ask for pointers, if
you recall. I came here after finding nothing new to me in a Google of
"multi-result functions in C++", expecting that I might find better
search keys here, or other useful pointers.
> Did you see the email that pointed to the
> paper on this exact subject?
No. Whose email was it? Can you repeat the pointer? Pretty much
everything I have seen in postings so far has appeared to be about the
tuple solution, but I don't need info about that.
> From reading all the emails in this thread, it seems that you got off on the
> wrong foot by trying to make something that is special for your architecture.
> You can bet that 99% of the people here couldn't care less about your
> architecture. We're not stakeholders in it -- and many of us, including me,
> are actually stakeholders in competitors.
Well, in all innocence I was trying to motivate why I showed up a
stranger on this board, admittedly unfamiliar with its denizens and
conventions. I thought providing the context was a courtesy, but it
appears to have been seen as an attack.

It had been my understanding that a common path for additions to the
standard was for an interested party with an idea to implement the
idea as an extension in one of the widely used compilers (or boost for
something that could be done as a library). After experience with the
implementation, the community would then reject the idea, accept it, or
improve it, and eventually someone (possibly not the original
implementer) would draft it up as a proposal with all the proper
phrasing, and eventually, if not rejected, the idea would be enshrined.

As we will be porting the compilers anyway, it seemed not much more work
to see what we could come up with and incorporate it; if the idea turned
out to be sound then all benefit, and if not then the experience with
implementation and use should soon reveal that. I even naively thought
that the community here would be happy that someone was willing to pay
for the work involved in such exploration.
> Instead, the discussion should have focused only and exclusively on multiple
> result. See also the threads "C++ named tuple", "Unpacking syntax for tuples
> using auto", and a few others.
I shall, although the titles suggest more on tuples. Can you expand on
"a few others"?
> So, no, the status is not that we think the current solution is perfect. Far
> from it. There's a lot of room to improve and you can and should contribute to
> it.
Begging your pardon, but why would I want to after such a welcome? At
least in the circles that I do contribute to, people sometimes tell me
my stuff won't work, and people sometimes tell me it could be done
better, and people sometimes tell me the goals and approach are
wrong-headed, and people sometimes tell me I'm a newbie and should be
quiet until I learn more, but they don't call me a fraud.
> But your attitude isn't helping.
>
Really?

Ville Voutilainen

Jun 10, 2014, 6:12:42 AM6/10/14
to std-pr...@isocpp.org
On 10 June 2014 11:42, Thiago Macieira <thi...@macieira.org> wrote:
> Instead, the discussion should have focused only and exclusively on multiple
> result. See also the threads "C++ named tuple", "Unpacking syntax for tuples
> using auto", and a few others.
>
> So, no, the status is not that we think the current solution is perfect. Far
> from it. There's a lot of room to improve and you can and should contribute to
> it.


I have thought of figuring out a way to have multiple return values in
c++ a couple of times. Every time, I have failed to find sufficient
motivation for it. I can return pairs, tuples, custom structs,
function-local structs (in c++14, with return type deduction), or use
return-via argument (with either default-argument pointers that users
don't need to provide, or with multiple overloads so that the
don't-care-cases don't bother either the caller or the callee), and the
remaining cases where I'd really want multiple independent return values
seem vanishingly rare.
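A short sketch of two of the alternatives listed above, assuming C++14: a function-local struct returned via return type deduction, and return-via-argument with a defaulted pointer. All names (`div_mod`, `div_with_optional_remainder`, `remainder_through_pointer`) are made up for illustration.

```cpp
// C++14: a function-local struct returned through return type deduction.
auto div_mod(int n, int d) {
    struct Result { int quotient; int remainder; };
    return Result{n / d, n % d};
}

// Return-via-argument: callers that don't care about the remainder
// simply omit the last argument.
int div_with_optional_remainder(int n, int d, int* remainder = nullptr) {
    if (remainder != nullptr) {
        *remainder = n % d;
    }
    return n / d;
}

// A caller that does want the second result passes a pointer to a local.
int remainder_through_pointer(int n, int d) {
    int r = 0;
    div_with_optional_remainder(n, d, &r);
    return r;
}
```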

The cases where I'd want to unpack into independent variables instead of
just using a struct are even more vanishingly rare. Yeah, sure, python
and lisp give me nice tools for such vanishingly rare cases, but I fail
to convince even myself that such a facility is necessary, let alone
convince anyone else. Solutions that muck with declarator syntax
certainly don't seem worth the trouble.

Selling multiple return values as a new feature is going to be quite hard,
I expect. The motivation seems weak, the use cases seem rare, and it has
been discussed quite many times with no new results coming out of those
discussions.

Ivan Godard

Jun 10, 2014, 6:36:32 AM6/10/14
to std-pr...@isocpp.org
Users of languages with OUT and INOUT parameters seem to make quite
heavy use of them, which suggests that the (semantically equivalent)
multi-result functions should be equally common. Perhaps the infrequent
need that you report is due in part to the inconvenience of the
notation, so you never got into the habit of using value-result arguments?

Users of those languages also assert that it is much easier to prove
correctness, and in general to reason about programs, when using
value-result rather than when using pass-by-reference; this is
especially true when reasoning about concurrent behavior in
multithreaded applications. While it is true that a compiler can (often)
determine that a by-reference parameter is really an OUT parameter, it
is rather more difficult for a mere human to do so, and even compilers
have problems when the analysis scope is less than whole program
optimization.
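
One way to picture the reasoning argument is the aliasing hazard that reference out-parameters allow and value results rule out. The sketch below is an editorial illustration only, not taken from any of the languages mentioned; the function names are hypothetical.

```cpp
// Reference out-parameter: the result depends on whether `in` and `out`
// refer to the same object, which the signature alone does not tell you.
void twice_plus_one_ref(const int& in, int& out) {
    out = 1;        // writes through `out` first ...
    out += 2 * in;  // ... then reads `in`, which may be the same object
}

// Value in, value out: the callee cannot observe or disturb the caller's
// variables, so the result depends only on the argument value.
int twice_plus_one_val(int in) {
    return 1 + 2 * in;
}
```

Called as `twice_plus_one_ref(x, y)` with distinct variables, both versions agree; called as `twice_plus_one_ref(x, x)`, the reference version reads back its own partial result.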

Ville Voutilainen

Jun 10, 2014, 6:47:38 AM6/10/14
to std-pr...@isocpp.org
On 10 June 2014 13:36, Ivan Godard <iv...@millcomputing.com> wrote:
> Users of languages with OUT and INOUT parameters seem to make quite heavy
> use of them, which suggests that the (semantically equivalent) multi-result
> functions should be equally common. Perhaps the infrequent need that you
> report is due in part to the inconvenience of the notation, so you never got
> into the habit of using value-result arguments?

For all purposes I've ever needed them, I already have out parameters in
C++, and I have had zero problems with T&; I've never felt a need for an
'out T'. There is hardly any inconvenience of notation; returning a struct
is as convenient as I need it to be. Tuples are less than convenient due to
having to use std::get<N>, which isn't very readable. Pairs are more
convenient, although .first and .second are not exactly meaning-conveying
either.
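
The readability difference described above shows up mostly at the call site. A rough sketch, with hypothetical names (`Extent`, `extent_*`, `width_*`) and assuming a non-empty input vector:

```cpp
#include <algorithm>
#include <tuple>
#include <utility>
#include <vector>

struct Extent { int lo; int hi; };

// All functions here assume a non-empty vector.
Extent extent_struct(const std::vector<int>& v) {
    const auto mm = std::minmax_element(v.begin(), v.end());
    return {*mm.first, *mm.second};
}

std::pair<int, int> extent_pair(const std::vector<int>& v) {
    const Extent e = extent_struct(v);
    return {e.lo, e.hi};
}

std::tuple<int, int> extent_tuple(const std::vector<int>& v) {
    const Extent e = extent_struct(v);
    return std::make_tuple(e.lo, e.hi);
}

// Call sites: named members say what they mean ...
int width_struct(const std::vector<int>& v) {
    const Extent e = extent_struct(v);
    return e.hi - e.lo;
}

// ... .first/.second require remembering the order ...
int width_pair(const std::vector<int>& v) {
    const auto p = extent_pair(v);
    return p.second - p.first;
}

// ... and std::get<N> adds indices on top of that.
int width_tuple(const std::vector<int>& v) {
    const auto t = extent_tuple(v);
    return std::get<1>(t) - std::get<0>(t);
}
```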

> Users of those languages also assert that it is much easier to prove
> correctness, and in general to reason about programs, when using
> value-result rather than when using pass-by-reference; this is especially
> true when reasoning about concurrent behavior in multithreaded applications.
> While it is true that a compiler can (often) determine that a by-reference
> parameter is really an OUT parameter, it is rather more difficult for a mere
> human to do so, and even compilers have problems when the analysis scope is
> less than whole program optimization.


I should probably mention that if std::expected lands into C++, there's yet
fewer cases where I'd ever resort to returning multiple values.

But by all means, if you want, write a proposal. I'm merely trying to point
out what the likely counterarguments are going to be. And do keep in mind
that those counterarguments will likely be raised by EWG members who have a
fair bit of influence. The proposals for this facility have never succeeded
before, so having it succeed now would, to me, logically, require better
motivation/rationale than the earlier attempts. Thus far I don't think I
have seen any new arguments for the facility.

Edward Catmur

Jun 10, 2014, 7:07:46 AM6/10/14
to std-pr...@isocpp.org, iv...@millcomputing.com
> Consider a function F with two scalar results, and a function G
> accepting two arguments of the scalar type, where the intent is that the
> results of F will be the arguments of G - in the opposite order. I was
> hoping for something like:
>                           G([lab:f()] lab$1, lab$0)
> rather than the armwaving required to use tuples (note that I am very
> much not proposing that syntax; my purpose here is to gather suggestions
> for the syntax).

Three ways to write that with current syntax:

void x() {
  [](auto lab) { return g(std::get<1>(lab), std::get<0>(lab)); }(f());
}
void y() {
  [](decltype(f()) lab=f()) { return g(std::get<1>(lab), std::get<0>(lab)); }();
}
void z() {
  auto lab = f();
  g(std::get<1>(lab), std::get<0>(lab));
}

Clang compiles all three identically (with f returning std::tuple<int, double>):

 pushq %rax
 callq f()
 movl %eax, %edi
 popq %rax
 jmp g(double, int)                  # TAILCALL

If clang can generate (to my eyes) optimal code on x86-64, an architecture not specifically designed for multiple argument return, then it should be able to do the same on your architecture, just as long as you get the ABI right. There doesn't seem to be any justification for new syntax from the point of view of performance.

I'm not saying the above syntax is perfect - it's clearly pretty clumsy - but it's clearly expressive enough to communicate to both the reader and the optimizer the intended behavior.

With Peter Sommerlad's apply[1] and a little helper function permute one can write:

std::apply(g, permute<1, 0>(f()))

Again, clang produces identical code.
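
The post does not show `permute`, so here is one way such a helper might look, together with a C++14 stand-in for the proposed `apply`. Everything in this sketch is an assumption made for illustration: `apply_sketch` is a hypothetical name used to avoid clashing with any library `apply`, and `f` and `g` are placeholder functions with the signatures mentioned in the post.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper: copy the elements of `t` at indices I... into a new tuple.
template <std::size_t... I, class Tuple>
auto permute(const Tuple& t) {
    return std::make_tuple(std::get<I>(t)...);
}

// Minimal stand-in for the proposed apply: expand the tuple into arguments.
template <class F, class Tuple, std::size_t... I>
decltype(auto) apply_impl(F&& fn, Tuple&& t, std::index_sequence<I...>) {
    return std::forward<F>(fn)(std::get<I>(std::forward<Tuple>(t))...);
}

template <class F, class Tuple>
decltype(auto) apply_sketch(F&& fn, Tuple&& t) {
    using Indices =
        std::make_index_sequence<std::tuple_size<std::decay_t<Tuple>>::value>;
    return apply_impl(std::forward<F>(fn), std::forward<Tuple>(t), Indices{});
}

// Placeholder functions with the shapes used in the post.
std::tuple<int, double> f() { return std::make_tuple(2, 0.5); }
double g(double d, int i) { return d * i; }

// Pass f's results to g in the opposite order.
double call_swapped() { return apply_sketch(g, permute<1, 0>(f())); }
```

Whether a given compiler reduces this to the same code as the hand-written versions is something to check per target; the sketch only shows that the expression can be written in library code.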

Ivan Godard

Jun 10, 2014, 7:23:39 AM6/10/14
to std-pr...@isocpp.org

On 6/10/2014 4:07 AM, Edward Catmur wrote:
> > Consider a function F with two scalar results, and a function G
> > accepting two arguments of the scalar type, where the intent is that the
> > results of F will be the arguments of G - in the opposite order. I was
> > hoping for something like:
> >                           G([lab:f()] lab$1, lab$0)
> > rather than the armwaving required to use tuples (note that I am very
> > much not proposing that syntax; my purpose here is to gather suggestions
> > for the syntax).
>
> Three ways to write that with current syntax:
>
> void x() {
>   [](auto lab) { return g(std::get<1>(lab), std::get<0>(lab)); }(f());
> }
> void y() {
>   [](decltype(f()) lab=f()) { return g(std::get<1>(lab), std::get<0>(lab)); }();
> }
> void z() {
>   auto lab = f();
>   g(std::get<1>(lab), std::get<0>(lab));
> }
>
> Clang compiles all three identically (with f returning std::tuple<int, double>):
>
>  pushq %rax
>  callq f()
>  movl %eax, %edi
>  popq %rax
>  jmp g(double, int)                  # TAILCALL


<snip>

Thank you; this is much better than I had seen before. If the compiler is as effective on other cases then it lays my concerns about performance to rest.

Matthew Woehlke

Jun 10, 2014, 12:59:28 PM6/10/14
to std-pr...@isocpp.org
On 2014-06-10 06:47, Ville Voutilainen wrote:
> Tuples are less than convenient due to having to use
> std::get<N>, which isn't very readable.

It occurred to me... syntactically, is there a reason why it would be
hard to implement:

std::tuple<int, double, char, long> a;
auto a1 = a<1>;     // a1 has type double
auto a3 = a<-1>;    // a3 has type long
auto ax = a<2:>;    // ax has type tuple<char, long>
auto ay = a<:2,3>;  // ay has type tuple<int, double, long>

...? (Similar for other Pythonic slicing.)

IOW, t<n> where 't' is a tuple would be shorthand for std::get<n>(t).
Then also allow recomposing tuples with slicing and multiple slicing
arguments using ','.

This, plus tuple argument unpacking, would solve the problem of passing
partial or out-of-order return values as arguments:

g(<*>(f()<1,0>));

(Using '<*>' this time for tuple-to-argument unpacking, as it's less
ambiguous than just '*' to tell the compiler that something other than
'operator*' is going on.)
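
For comparison, part of the slicing idea can already be approximated in library code. The sketch below is an illustration under assumptions: `tuple_slice` and `tuple_from_end` are hypothetical names, not existing or proposed standard facilities, and they copy elements rather than referring to them.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

template <std::size_t Offset, class Tuple, std::size_t... I>
auto tuple_slice_impl(const Tuple& t, std::index_sequence<I...>) {
    return std::make_tuple(std::get<Offset + I>(t)...);
}

// Elements [Begin, End), roughly the proposed a<Begin:End>.
template <std::size_t Begin, std::size_t End, class Tuple>
auto tuple_slice(const Tuple& t) {
    static_assert(Begin <= End && End <= std::tuple_size<Tuple>::value,
                  "slice out of range");
    return tuple_slice_impl<Begin>(t, std::make_index_sequence<End - Begin>{});
}

// N-th element counted from the end, roughly the proposed a<-N>.
template <std::size_t N, class Tuple>
auto tuple_from_end(const Tuple& t) {
    static_assert(N >= 1 && N <= std::tuple_size<Tuple>::value,
                  "index out of range");
    return std::get<std::tuple_size<Tuple>::value - N>(t);
}
```

With `std::tuple<int, double, char, long> a`, `tuple_from_end<1>(a)` yields the `long` and `tuple_slice<2, 4>(a)` yields a `std::tuple<char, long>`. The call syntax is clearly heavier than `a<-1>` or `a<2:>`, which is the point of the suggestion above.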

> I should probably mention that if std::expected lands into C++,
> there's yet fewer cases where I'd ever resort to returning multiple
> values.

Of course, I'd claim that std::expected *is* a multiple return value
:-). So is std::pair, std::tuple, structs, etc. :-). Which is why I
don't understand why there is such an issue here; we already *have*
multiple return values. If Ivan's API can't deal with them in a
reasonable manner, that's not the language's fault.

(Though, that said, previous comments suggest that maybe there *is* a
problem with std::tuple being treated as an aggregate. If so, that
possibly *would* be the language's fault, assuming it isn't just a bug
in the library implementation.)

I do think there is room to have better syntactic sugar, e.g. '<int,
int> foo();' as shorthand for 'std::tuple<int, int> foo();', tuple
unpacking, and possible tuple slicing as suggested above.

And possibly even at some point '<int...>' as shorthand for some
yet-unknown mechanism for returning a VLA of 'int'.

--
Matthew

Ville Voutilainen

Jun 10, 2014, 1:58:19 PM6/10/14
to std-pr...@isocpp.org
On 10 June 2014 19:59, Matthew Woehlke <mw_t...@users.sourceforge.net> wrote:
>> I should probably mention that if std::expected lands into C++,
>> there's yet fewer cases where I'd ever resort to returning multiple
>> values.
> Of course, I'd claim that std::expected *is* a multiple return value
> :-). So is std::pair, std::tuple, structs, etc. :-). Which is why I
> don't understand why there is such an issue here; we already *have*
> multiple return values. If Ivan's API can't deal with them in a
> reasonable manner, that's not the language's fault.

I think we agree there. I see very little need for core language
multiple-return-values.

> I do think there is room to have better syntactic sugar, e.g. '<int,
> int> foo();' as shorthand for 'std::tuple<int, int> foo();', tuple
> unpacking, and possible tuple slicing as suggested above.


If someone can come up with a decent proposal that assures there are no
specification or implementation difficulties with such things, I'm sure the
committee will listen. I doubt it will be so easy to either implement or
specify, and I don't think tuples deserve such special syntax that stands out
from what we already have.

Thiago Macieira

Jun 11, 2014, 9:55:46 AM6/11/14
to std-pr...@isocpp.org
On Tue, 10 Jun 2014, at 02:30:10, Ivan Godard wrote:
> On 6/10/2014 1:42 AM, Thiago Macieira wrote:
> > Did you search the archives?
>
> No. I have no idea where the archives are. I did ask for pointers, if
> you recall. I came here after finding nothing new to me in a Google of
> "multi-result functions in C++", expecting that I might find better
> search keys here, or other useful pointers.

This is a mailing list. All public mailing lists have archives, probably
multiple ones, on the internet. Etiquette rules for mailing lists say that you
should always consult the archive to see if your question has been answered
before.

Maybe you didn't know it. Now you do. The link to the mailing list's website
is in the footer of every single email you receive. And it's indexed by
Google.

> > Did you see the email that pointed to the
> > paper on this exact subject?
>
> No. Whose email was it? Can you repeat the pointer? Pretty much
> everything I have seen in postings so far has appeared to be about the
> tuple solution, but I don't need info about that.

Thomas Braxton's email dated Mon, 9 Jun 2014 23:57:35 -0500 (Message-Id
CAPHJ0U03X=WYR6mHtEApK8Khq09nEY...@mail.gmail.com). He was
simply giving you the pointer to the N4025 paper.

> > From reading all the emails in this thread, it seems that you got off on
> > the wrong foot by trying to make something that is special for your
> > architecture. You can bet that 99% of the people here couldn't care
> > less about your architecture. We're not stakeholders in it -- and many
> > of us, including me, are actually stakeholders in competitors.
>
> Well, in all innocence I was trying to motivate why I showed up a
> stranger on this board, admittedly unfamiliar with its denizens and
> conventions. I thought providing the context was a courtesy, but it
> appears to have been seen as an attack.

Context is good, but it was not perceived as context. It was perceived mostly
as the motivation/agenda, and as setting your hardware apart. You can
understand that we don't really care to help set your hardware apart from
ours...

What's more, tying language features to hardware capabilities is really
frowned upon. That's a recipe for getting something rejected by the standards
committee. Instead, like I said before, you should approach this from the "how
do I make the language better" point of view. What is the problem you're
trying to solve? Are you sure that it requires a language extension to do it?
Can't we find a better solution in different ways?

And by the way, when calling conventions are involved, the C++ language
follows in close lockstep with C, so you should also approach WG14 about this.

> It had been my understanding that a common path for additions to the
> standard was for an interested party with an idea to implement the
> idea as an extension in one of the widely used compilers (or boost for
> something that could be done as a library). After experience with the
> implementation, the community would then reject the idea, accept it, or
> improve it, and eventually someone (possibly not the original
> implementer) would draft it up as a proposal with all the proper
> phrasing, and eventually, if not rejected, the idea would be enshrined.

Correct.

> As we will be porting the compilers anyway, it seemed not much more work
> to see what we could come up with and incorporate it; if the idea turned
> out to be sound then all benefit, and if not then the experience with
> implementation and use should soon reveal that. I even naively thought
> that the community here would be happy that someone was willing to pay
> for the work involved in such exploration.

Also true. However, you are talking about a subject that has been discussed
before, with very different solutions from what you were proposing. If you add
that to the misunderstanding on the context/motivation, it sounded like you
were just trying to push a particular agenda, as opposed to improving C++ for
everyone.

> > Instead, the discussion should have focused only and exclusively on
> > multiple result. See also the threads "C++ named tuple", "Unpacking
> > syntax for tuples using auto", and a few others.
>
> I shall, although the titles suggest more on tuples. Can you expand on
> "a few others"?

Not really. I do remember lengthy discussions and those are the two I could
easily find. And since this seems to be a particularly popular topic, there may
have been discussions before my joining this mailing list and in the standards
committee itself. So I think there's more to be read, but I can't be sure.

> > So, no, the status is not that we think the current solution is perfect.
> > Far from it. There's a lot of room to improve and you can and should
> > contribute to it.
>
> Begging your pardon, but why would I want to after such a welcome? At

From our point of view, we did everything right and the one at fault was you.
In any case, Internet requires thick skin. So I am trying right now to clear
up the misunderstanding so we can work together.

> least in the circles that I do contribute to, people sometimes tell me
> my stuff won't work, and people sometimes tell me it could be done
> better, and people sometimes tell me the goals and approach are
> wrong-headed, and people sometimes tell me I'm a newbie and should be
> quiet until I learn more, but they don't call me a fraud.

Tony V E

Jun 11, 2014, 2:22:26 PM6/11/14
to std-pr...@isocpp.org
On Wed, Jun 11, 2014 at 9:55 AM, Thiago Macieira <thi...@macieira.org> wrote:
> From our point of view, we did everything right and the one at fault was you.

I don't know, I think some comments went too far.
 
> In any case, Internet requires thick skin. So I am trying right now to clear
> up the misunderstanding so we can work together.


How about we just start over.

Basically:

- new hardware. OK, interesting, but whatever - it needs to be a useful language feature regardless of hardware.
- multiple return values: discussed in the past, some want it, some don't think it is worth it

Note the "worth" part - adding language features (not just library features) to the standard is very costly in terms of time (of committee members) and resulting complexity - the language is too complicated already.

You might go so far as to say "we almost all want it, we almost all don't think it is worth it".

Personally, I'd like it.  I'd also like { a, b } to be a tuple (or convertible to tuple), basically.  For return, I would definitely head down the road of

    { a, b } = f();
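
For reference, the closest spelling available today uses `std::tie` with previously declared variables. A minimal sketch, where `lookup` and `describe` are hypothetical names standing in for `f` and its caller:

```cpp
#include <string>
#include <tuple>

// Stand-in for a function with two results.
std::tuple<int, std::string> lookup() {
    return std::make_tuple(42, std::string("answer"));
}

std::string describe() {
    int a;
    std::string b;
    // Current equivalent of `{ a, b } = f();`: the variables must be
    // declared (and default-constructible) before the assignment.
    std::tie(a, b) = lookup();
    return b + "=" + std::to_string(a);
}
```

The need to declare `a` and `b` up front, with explicit types and default construction, is part of what the `{ a, b } = f();` wish is reacting to.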

Is it worth it?  I could paraphrase Ville as "I already have 7 workarounds, I don't need it".  But to me that is a sign of needing it.

!! But only if it is elegant !!

And that's the hard part.  I have no references to previous discussions, but it has been tackled (unsuccessfully) in the past.

Also, it isn't a small feature - it isn't just return.  It is also packing and unpacking, declaration, etc.  It touches so many parts of the language.

So I might paraphrase Ville slightly differently "I already have 7 workarounds, I don't need it... that badly".

Tony

Ivan Godard

Jun 11, 2014, 3:03:17 PM6/11/14
to std-pr...@isocpp.org

On 6/11/2014 11:22 AM, Tony V E wrote:



> On Wed, Jun 11, 2014 at 9:55 AM, Thiago Macieira <thi...@macieira.org> wrote:
> > From our point of view, we did everything right and the one at fault was you.
>
> I don't know, I think some comments went too far.
>
> > In any case, Internet requires thick skin. So I am trying right now to clear
> > up the misunderstanding so we can work together.
>
> How about we just start over.

<snip>

> So I might paraphrase Ville slightly differently "I already have 7 workarounds, I don't need it... that badly".

Yes, I have understood that the answer to my question "are there better ideas around that just need an implementation for experience" was: no. Not the answer I was hoping for, but I am answered.

> Tony

And I thank you for your gracious answer to a newbie. I'll go away now.

But before I do, I'll comment on the line from Thiago that you quote: "In any case, Internet requires thick skin".

Why should it?

I participate in many boards, forums, and standards groups, and each such group has a "feel", a character, that is as unmistakable as the feel you get walking into a strange bar. Some bars require a thick skin - or an exit to a different watering hole where the beer doesn't come with testosterone chasers.

Ivan

Ville Voutilainen

Jun 11, 2014, 3:42:35 PM6/11/14
to std-pr...@isocpp.org
On 11 June 2014 21:22, Tony V E <tvan...@gmail.com> wrote:
> Also, it isn't a small feature - it isn't just return. It is also packing
> and unpacking, declaration, etc. It touches so many parts of the language.

The unpacking syntax you showed as an example also nicely complicates
the implementation for handling declarations. Having worked on that, I may
be personally biased against making it any more complicated than it already
is, unless there are very, very strong reasons for it. Multiple return values
do not pass that bar for me.

> So I might paraphrase Ville slightly differently "I already have 7
> workarounds, I don't need it... that badly".


I don't have 7 work-arounds, I have 7 alternative ways to solve the problem,
one of which (a named struct with named members, any invariants I choose,
and any semantics I want to express) is vastly superior to multiple return
values. The only case where I've found a real need for multiple return
values and especially the unpacking of those into single variables is quick
prototypes. Well, C++ can't be a language for everything; perhaps it's
sub-optimal for quick prototypes. I can't say that's a problem for me.

Richard Smith

Jun 11, 2014, 4:39:07 PM6/11/14
to std-pr...@isocpp.org, Herb Sutter
+1

I don't think it's fair to blame a bad experience in a particular group on "the internet"; messages come from people. If useful contributors are being discouraged from participating due to aggressive or antagonistic responses here, that is a problem that we should be thinking about how to address.

gmis...@gmail.com

Jun 11, 2014, 10:09:16 PM6/11/14
to std-pr...@isocpp.org, hsu...@microsoft.com
I think this particular thread just got off to a bad start and isn't typical of the more general "problem" I think this forum has.

I watched this thread as it unfolded and the initial post appeared to me (wrongly as it turns out) to be some kind of "scam/troll" post. I found the idea that somebody was working on a CPU that specialized in this particular "problem" highly unlikely. That made me sceptical about whether the post was real, and I wasn't sure whether to take it seriously or not.

But what confirmed that misunderstanding was some of the phrasing used by the poster, which seemed to be pretty dislikeable. In particular, it was the part in the conversation where the poster appeared to be effectively saying "I only care about my cpu and nothing else, yet I still expect you to help me with my niche CPU and extend C++ to support it anyway". As in, "I don't care about anything else, but you should.".

Given that this is the hello, "iso" cpp forum, a statement like that to this forum, of all forums, would seem to me like being the surest way to be shown the door!

Standardization and niche aren't instant easy bedfellows, I would think. That doesn't mean one can't help though, but the way things initially came across, it didn't make it look like help was warranted.

I hope the poster will take that comment to heart and not be offended by it. I hope that explains to the OP, why things might have started off badly, even if it doesn't fully justify all of how things went.

So overall, given the circumstance, I'm not entirely sure how this particular thread could have played out differently. I personally can see why some of the apparent hostility was drawn even if I do think it would have been better if it hadn't happened.

But things did settle down eventually. It just took a few posts to realise the OP is a serious person trying to do something serious, if unusual (to me) and "we" got the right answer in the end which is Tony's conclusion: Misunderstanding, start again.

But as a new-ish person myself here, I agree with Richard's +1, which seems to allude to there still being a general problem even if this particular thread doesn't directly demonstrate it (IMHO).

To avoid talking too abstractly, Ville makes a great example of the "problem", or not, with this forum. Ville is usually 95% right, in my opinion, if not higher. Many of the ideas he's shot down, I initially thought had potential, but when I went away and tried to advance them more, they just weren't that great. Ville was right. He usually is.

But I wonder, for the experts (which probably means most people on this site compared to the outside world at large), isn't it a case of "with great power comes great responsibility", and doesn't that mean entertaining mere mortals (more) when they argue a bad idea badly?

That will make this site more like Reddit, which I would say is bad, but I actually spend as much time on Reddit reading things as I do here (and it's increasing). I can't say the standard there is higher, but sometimes I can say the "feel" is.

I wouldn't want to be without either site, but if C++ is to start growing again (assuming it has stalled or shrunk), I can only see that being achieved through everybody being more welcoming to new (aka not new) *bad* ideas - to encourage that interest and growth and tomorrow's experts.

If the best of both worlds could be achieved, wouldn't that be great?

Anyway, you have my great idea on how to make this site more approachable. Will that make it better? I don't know, but there's one way to find out.

Let the arrows begin! :)

PS
Yes, of course I shoot arrows too on my bad days; I no doubt will continue to do so too.

morw...@gmail.com

Jun 12, 2014, 3:48:24 AM6/12/14
to std-pr...@isocpp.org
On Wednesday, 11 June 2014 21:42:35 UTC+2, Ville Voutilainen wrote:
> I don't have 7 work-arounds, I have 7 alternative ways to solve the problem,
> one of which (a named struct with named members, any invariants I choose,
> and any semantics I want to express) is vastly superior to multiple return
> values. The only case where I've found a real need for multiple return
> values and especially the unpacking of those into single variables is quick
> prototypes. Well, C++ can't be a language for everything; perhaps it's
> sub-optimal for quick prototypes. I can't say that's a problem for me.

There is another area where I often want multiple return values or tuple
unpacking. It is when the returned values cannot be "named" but are not
a "collection" of values either. It often happens in maths. When I write a
function to solve a quadratic equation, I want it to return the two solutions,
but the return values are nothing more than "two solutions". They don't need
a name of their own. So I return a std::pair or a std::array and then use
std::get when I want to use the values. But it would be somewhat cleaner if
the values could be unpacked automatically into new variables without having
to go through a pair/tuple/array.
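
As a concrete picture of the quadratic case, here is a minimal sketch of the current approach. The names `solve_quadratic` and `root_gap` are hypothetical, and the code assumes `a != 0` and a non-negative discriminant.

```cpp
#include <array>
#include <cmath>

// Returns the two real roots of a*x^2 + b*x + c = 0, smaller first when a > 0.
// Assumes a != 0 and b*b - 4*a*c >= 0.
std::array<double, 2> solve_quadratic(double a, double b, double c) {
    const double root = std::sqrt(b * b - 4 * a * c);
    return {{(-b - root) / (2 * a), (-b + root) / (2 * a)}};
}

// The two results have no natural names, so the call site falls back on
// indices via std::get.
double root_gap(double a, double b, double c) {
    const auto roots = solve_quadratic(a, b, c);
    return std::get<1>(roots) - std::get<0>(roots);
}
```

For `x^2 - 3x + 2` this returns the roots 1 and 2; the `std::get<0>` / `std::get<1>` step is the part that direct unpacking into two new variables would remove.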

But these are the only times where I "strongly" want such a feature. Most
of the time, I find it neater to use structures with named values.