Have C++ multi-result syntax and semantics been explored before? Can anyone supply citations?
We know why other CPUs do not have hardware support, but that's not the issue: right or wrong, we do have such support. The question is how to make it naturally available as an extension of C++.
And yes, with sufficient analysis the compiler may be able to figure out in some (many?) cases that a regular C++ single-valued function can use the hardware operation, but that too is not the issue: we are looking for a language extension by which the programmer can state his intention that the function has some specific number of results other than zero or one, whether or not the compiler could figure it out.
Without requiring bogus dummy class types, or detecting that a reference type is really not a reference type, or other kludgery.
On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:
We know why other CPUs do not have hardware support
Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?
, but that's not the issue: right or wrong, we do have such support. The question is how to make it naturally available as an extension of C++.
You asked for existing practice and literature. If extending the language is part of your homework assignment, so be it, but again you need to state what difference needs to be made.
And yes, with sufficient analysis the compiler may be able to figure out in some (many?) cases that a regular C++ single-valued function can use the hardware operation, but that too is not the issue: we are looking for a language extension by which the programmer can state his intention that the function has some specific number of results other than zero or one, whether or not the compiler could figure it out.
Without requiring bogus dummy class types, or detecting that a reference type is really not a reference type, or other kludgery.
If you dig into implementations of other multiple-return languages such as Python, Haskell, ML, etc, you will find that multiple return values are always implemented as single values of tuple type. Nothing is lost in such a formalism so there’s no reason not to model things that way. All that C++ lacks is declarations that unpack tuples. Such a feature is hard to reconcile with the C legacy of declarations supporting introduction of multiple names of the same type. But, we will probably see it added eventually.
However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine-independent, and machine-specific extensions are outside the scope of this discussion list.
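For concreteness, the closest approximation available today is library-level unpacking with std::tie; a minimal sketch, with f and the local names invented for illustration:

#include <tuple>

std::tuple<int, double> f();          // placeholder multi-result function

void caller() {
    int i;
    double d;
    std::tie(i, d) = f();             // unpack the returned tuple into two locals
}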
On 6/9/2014 7:43 PM, David Krauss wrote:
With a few historical exceptions, other CPUs do not have a call operation at all; they have a branch-and-link. On a Mill "f(a,b,c)" is one hardware operation, total. There is neither preamble in the caller other than the evaluation of the argument expressions, nor postamble within the callee. The call operation saves all state, exits the current frame and enters the new one, and passes all the arguments, in hardware, in one clock. Yes, it's a CISC.
On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:
We know why other CPUs do not have hardware support
Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?
I wish to ask for suggestions of current and/or historic language practice in support of functions returning more than one by-value result. The notation should be convenient and intuitive, not convoluted or unaesthetic, and should not introduce superfluous variables.
If possible.
I mentioned that approach as a possible notation. However, it is ambiguous, because tuples are first-class objects, so it is not obvious whether the function is to return one tuple-type object or several individual by-value results. The difference is most evident when one of the tuple components has neither a copy nor a move constructor.
However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine-independent
Well, there are those who say it incorporates a PDP-11. I've programmed PDP-11s, and I will say the C part of C++ looks remarkably familiar.
and machine-specific extensions are outside the scope of this discussion list.
That's a reasonable answer, thank you. Can you suggest a forum which might be interested in such a discussion?
Consider a function F with two scalar results, and a function G accepting two arguments of the scalar type, where the intent is that the results of F will be the arguments of G - in the opposite order. I was hoping for something like:
G([lab:f()] lab$1, lab$0)
rather than the arm-waving required to use tuples. (Note that I am very much not proposing that syntax; my purpose here is to gather suggestions for the syntax.)
with each element of the tuple being able to be mapped to your return values? For a compiler targeting your architecture, is there anything that prevents the optimizer from decomposing the std::tuple into multiple, in-hardware return values?
Not that I know of, unless the function is bound at link time and not visible to the optimizer. In that case the distinction between returning a tuple-type object and returning multiple values becomes ambiguous. This problem is present whenever the optimizer is expected to finesse imputed types that don't actually exist in the type system. I'm hoping to actually extend the type system for function types to include genuine multi-result types. We expect to add whatever extension we come up with to clang and gcc; this is not an academic exercise.
I apologize, I'm still fuzzy on what your hardware support for multiple return values means. But what cases could not be covered by this solution, which doesn't need any language changes?
It is unclear to me that the tuple solution works when the result component types lack copy constructors.
Why was this "homework assignment" comment was necessary? Be civil.
The past few emails you've sent in this thread have not come off as civil.
On 2014–06–10, at 12:21 PM, Ivan Godard <iv...@MillComputing.com> wrote:
On 6/9/2014 7:43 PM, David Krauss wrote:
With a few historical exceptions, other CPUs do not have a call operation at all; they have a branch-and-link. On a Mill "f(a,b,c)" is one hardware operation, total. There is neither preamble in the caller other than the evaluation of the argument expressions, nor postamble within the callee. The call operation saves all state, exits the current frame and enters the new one, and passes all the arguments, in hardware, in one clock. Yes, it's a CISC.
On 2014–06–10, at 10:29 AM, Ivan Godard <iv...@MillComputing.com> wrote:
We know why other CPUs do not have hardware support
Other CPUs do support multiple returns, at the hardware level. How do they not? Can you name a single problem on a single architecture? On x64 specifically?
You’ve changed the topic from returns to calls, but your answer still reveals the truth: current architectures do not have function return operations at all; they have branch-to-link (and perhaps load some registers) operations. Whatever registers aren’t overwritten by such an instruction are eligible to return data back to the caller.
The entire notion of subroutines is a higher-level construct which helps to organize programs.
For what it’s worth, the notion of instructions is also somewhat arbitrary. As a CISC instruction is decoded, you could consider its constituent encodings to be instructions instead. Or, you could consider the sequence of RISC instructions accomplishing a function call to be one instruction. If you squint hard enough, all the ISAs look the same, and they can all be represented with something like LLVM IR.
I wish to ask for suggestions of current and/or historic language practice in support of functions returning more than one by-value result. The notation should be convenient and intuitive, not convoluted or unaesthetic, and should not introduce superfluous variables.
If possible.
Those criteria are subjective. If you’re serious about solving the problem, I urge you to consider the solution used by essentially all functional and procedural languages, which is to pack the multiple values into one tuple.
Data-types are also merely human constructs that help organize programs. Even if you invent another formalism that avoids calling the return values a tuple, it will still be equally valid to treat them as such, because the difference is only in notation.
I mentioned that approach as a possible notation. However, it is ambiguous, because tuples are first-class objects, so it is not obvious whether the function is to return one tuple-type object or several individual by-value results. The difference is most evident when one of the tuple components has neither a copy nor a move constructor.
Returning a value without a move or copy constructor is only partially supported by the language: the caller must bind the result to a local (non-member) reference. Personally I believe this limitation to be unjustified, and in the long run I’d like to see it eliminated. However, others believe differently.
As for current programming practice, it’s safe to assume that return values are moveable or copyable.
Note that a partial workaround does exist: piecewise construction; see std::piecewise_construct. This can place a non-moveable object inside a tuple, but all elements must be constructed simultaneously.
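To make both points concrete, here is a minimal sketch under C++14 rules (Pinned and make_pinned are invented names; std::piecewise_construct is shown with std::pair, where it applies):

#include <tuple>
#include <utility>

struct Pinned {                            // neither copyable nor movable
    Pinned(int) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

Pinned make_pinned() { return {7}; }       // list-initializes the result in place; no move needed

void demo() {
    Pinned&& r = make_pinned();            // OK: the caller binds the result to a local reference
    // Pinned p = make_pinned();           // error in C++14: would require a move or copy

    // Partial workaround: both elements are constructed in place, simultaneously.
    std::pair<Pinned, Pinned> pp(std::piecewise_construct,
                                 std::forward_as_tuple(1),
                                 std::forward_as_tuple(2));
    (void)r; (void)pp;                     // silence unused-variable warnings
}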
As for whether you might want a new mechanism, as opposed to more sugary tuples, are you prepared to sacrifice support or performance for programs that use tuples (or other aggregates) for this purpose?
However, that’s only syntactic sugar and there’s no bearing on machine architecture. C++ is entirely machine-independent
Well, there are those who say it incorporates a PDP-11. I've programmed PDP-11s, and I will say the C part of C++ looks remarkably familiar.
and machine-specific extensions are outside the scope of this discussion list.
That's a reasonable answer, thank you. Can you suggest a forum which might be interested in such a discussion?
The one on your website appears to be a natural choice. You should also discuss with your customers, who will be the first to write code using your extensions.
On 2014–06–10, at 12:51 PM, Ivan Godard <iv...@MillComputing.com> wrote:
Consider a function F with two scalar results, and a function G accepting two arguments of the scalar type, where the intent is that the results of F will be the arguments of G - in the opposite order. I was hoping for something like:
G([lab:f()] lab$1, lab$0)
rather than the arm-waving required to use tuples. (Note that I am very much not proposing that syntax; my purpose here is to gather suggestions for the syntax.)
The only way for the C++ programmer to route values into functions is to pass them individually. Some languages support unpacking primitive tuples as arguments (and C++ only requires adding a wrapper function), but in your example the programmer would still need to name the items individually.
However, programmer notation has no bearing on object code generation. If your hardware has some distinct addresses for multiple return values, which need to be coded in a subsequent CALL instruction, then your code generator will need to detect the way data is being routed from the intermediate-level program representation.
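For concreteness, such a wrapper can be hand-rolled today; a sketch assuming C++14, with apply_from_tuple as an invented name:

#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Call f with the elements of a tuple as individual arguments.
template <class F, class Tuple, std::size_t... I>
auto apply_impl(F&& f, Tuple&& t, std::index_sequence<I...>) {
    return std::forward<F>(f)(std::get<I>(std::forward<Tuple>(t))...);
}

template <class F, class Tuple>
auto apply_from_tuple(F&& f, Tuple&& t) {
    using Indices = std::make_index_sequence<
        std::tuple_size<std::decay_t<Tuple>>::value>;
    return apply_impl(std::forward<F>(f), std::forward<Tuple>(t), Indices{});
}

apply_from_tuple(g, f()) then unpacks the tuple in declaration order; the reversed order in the example above would still have to be named explicitly (or handled by a permuting variant).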
with each element of the tuple being able to be mapped to your return values? For a compiler targeting your architecture, is there anything that prevents the optimizer from decomposing the std::tuple into multiple, in-hardware return values?
Not that I know of, unless the function is bound at link time and not visible to the optimizer. In that case the distinction between returning a tuple-type object and returning multiple values becomes ambiguous. This problem is present whenever the optimizer is expected to finesse imputed types that don't actually exist in the type system. I'm hoping to actually extend the type system for function types to include genuine multi-result types. We expect to add whatever extension we come up with to clang and gcc; this is not an academic exercise.
The first task in a compiler port is to design an ABI, so I suggest you specify aggregate return values to go into multiple registers. Is there a reason this wouldn't work as the general case?
(And is there any other possible programmer intent in returning an aggregate type?)
The Mill is a new architecture. The return operation (also a hardware primitive) unwinds what the hardware call operation does. The Mill has no general registers, and values are SSA.
And it is up to the hardware designer to decide how close the hardware primitive should map to the language primitive.
We are in the process of porting our toolchain to LLVM, and have found that the IR makes some really egregious assumptions about the character of the eventual target. For example, it assumes that pointers are integers.
Data-types are also merely human constructs that help organize programs. Even if you invent another formalism that avoids calling the return values a tuple, it will still be equally valid to treat them as such, because the difference is only in notation.
Are you sure of that? Consider a function returning a value of type T, and a function returning a single-component tuple whose component is of type T. Is there truly no T for which the behavior of the two is different? I had thought there was.
As for whether you might want a new mechanism, as opposed to more sugary tuples, are you prepared to sacrifice support or performance for programs that use tuples (or other aggregates) for this purpose?
The intent is that the extensions (this case, and others in process, which I may bring here) are proprietary, although if clang/gcc accepts them as contributions then they will be available to others. As for sacrificing performance, we are considering alternatives precisely because the tuple approach so badly blows out the hardware performance.
Consider a function with two int results, and an expression that wants the sum of them. On a Mill, using assembly language, that is two hardware operations - a call and an add. I have tried the corresponding tuple code in various compilation systems and targets available to me, and have been unable to get below 12 operations; some results were over fifty. Your mileage may vary; if you have a platform that gets it to two then please post the source code used in the test, the target machine, and the compiler version and settings.
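For reference, the sort of source presumably being measured looks something like this (f here is a placeholder for the two-result function):

#include <tuple>

std::tuple<int, int> f();                     // placeholder: a function with two int results

int sum() {
    auto r = f();                             // one call...
    return std::get<0>(r) + std::get<1>(r);   // ...plus the add that wants both results
}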
As might be expected for a hardware company, our forums are mostly populated by those interested in architecture rather than languages. Hence I have come here.
On 2014–06–10, at 11:18 AM, Patrick Michael Niedzielski <patrickni...@gmail.com> wrote:
Why was this "homework assignment" comment was necessary? Be civil.
Homework is the only justification for ignoring a widely implemented solution for the sake of implementing something new.
There’s no shame in doing homework, but insisting on inventing something for novelty’s sake is Wrong. What we have here is a man in search of a solution that is itself in search of a problem. He doesn’t know what his extension is yet; he just knows he wants some kind of extension. C++ handles multiple returns essentially the same as any other language, but it’s a “kludge” because it’s not bespoke.
Presupposing that new solutions are needed is not the way an engineer should think. Prototypes are built using the parts on hand, and no evidence has so far been presented that the parts do not fit.
If we need more detail about multiple returns, it should be described within the immediate request and not left in the middle of an hours-long video series.
The past few emails you've sent in this thread have not come off as civil.
The first post immediately raised red flags. The assertion that a new CPU architecture is completely different from everything else is immediately suspicious, because after 60+ years of experimentation there are few completely novel ideas.
To communicate architectural details to a literate audience, it is expedient to reference literature.
Following the YouTube link and checking the website via archive.org reinforces my impression that the materials, by intention or not, are geared to an illiterate audience of potential investors. This rubs me the wrong way.
Anyone with the experience to do what Ivan says he did knows that multiple returns are commonplace in handwritten assembler for any conventional register-based architecture.
(Even more common for stack machines.) So the assertion that they’re unsupported is just disingenuous. Even variable length returns are easily done with a call stack, and there is precedent for putting the stack in rename registers. This has nothing to do with C++, by the way.
“DSP code, all those loops are software pipelined, and software pipelines have unbounded ILP.”
Even DSP filters often have dependencies over loop iterations.
“Rather few general-purpose loops can be software pipelined. They contain really nasty things like function calls, that are full of control.”
No, they cannot be pipelined because of Amdahl’s law: you can only go so far before hitting a dependent computation.
“Any of you who do do chip design know I’m lying through my teeth”
Well, I have done a smidgen of physical circuit design, and I’d like to see what magic network wiring can be a game-changer.
—
I’ll stop now and get back to work, but civility is not the best thing when it comes to scammers.
Big dreams are well and good, but asking for money for research that requires no materials, while misrepresenting the degree of originality, is unjustified.
Claiming that additional language support is needed when it’s not, when a requirement for such extensions has historically been a reason that projects just like this fail, is disingenuous. There’s no problem with Ivan pursuing personal projects, but the connections drawn in messages here between C++ notation and machine code suggest a lack of the familiarity with modern compilers that is essential to ISA design. Acting like a seasoned pro with a comp-arch business plan, but without familiarity with the software development stack, is fraud.
TL;DR: If he wants to succeed, he will avoid language extensions and use pure LLVM IR as source. If the project is for real, it can take my flak. If it’s not, I’m doing skeptics a valuable service with public criticism.
Consider a function F with two scalar results, and a function G accepting two arguments of the scalar type, where the intent is that the results of F will be the arguments of G - in the opposite order. I was hoping for something like:
G([lab:f()] lab$1, lab$0)
rather than the arm-waving required to use tuples. (Note that I am very much not proposing that syntax; my purpose here is to gather suggestions for the syntax.)
std::apply(g, permute<1, 0>(f()))
Three ways to write that with current syntax:
void x() {
    [](auto lab) { return g(std::get<1>(lab), std::get<0>(lab)); }(f());
}
void y() {
    [](decltype(f()) lab = f()) { return g(std::get<1>(lab), std::get<0>(lab)); }();
}
void z() {
    auto lab = f();
    g(std::get<1>(lab), std::get<0>(lab));
}
Clang compiles all three identically (with f returning std::tuple<int, double>):
pushq %rax
callq f()
movl %eax, %edi
popq %rax
jmp g(double, int) # TAILCALL
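The permute helper suggested earlier is not a standard facility (nor is std::apply in the current standard); one possible hand-rolled sketch, assuming C++14 and treating permute as an invented name:

#include <cstddef>
#include <tuple>
#include <utility>

// Build a new tuple from the selected (distinct) element indices, in the order given.
template <std::size_t... I, class Tuple>
auto permute(Tuple&& t) {
    return std::make_tuple(std::get<I>(std::forward<Tuple>(t))...);
}

// Intended use, per the suggestion above (with an apply-style invoker):
//   std::apply(g, permute<1, 0>(f()));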
From our point of view, we did everything right and the one at fault was you.
In any case, the Internet requires thick skin. So I am trying right now to clear up the misunderstanding so we can work together.
On Wed, Jun 11, 2014 at 9:55 AM, Thiago Macieira <thi...@macieira.org> wrote:
From our point of view, we did everything right and the one at fault was you.
I don't know, I think some comments went too far.
In any case, the Internet requires thick skin. So I am trying right now to clear up the misunderstanding so we can work together.
How about we just start over?
So I might paraphrase Ville slightly differently: "I already have 7 workarounds, I don't need it... that badly".
And I thank you for your gracious answer to a newbie. I'll go away now.
Tony
I don't have 7 work-arounds; I have 7 alternative ways to solve the problem, one of which (a named struct with named members, any invariants I choose, and any semantics I want to express) is vastly superior to multiple return values. The only case where I've found a real need for multiple return values, and especially the unpacking of those into single variables, is quick prototypes.
Well, C++ can't be a language for everything; perhaps it's sub-optimal for quick prototypes. I can't say that's a problem for me.