arm5, arm6, arm7 build tag?


minux

Feb 14, 2014, 12:16:03 AM
to golang-dev

Hi gophers,

Due to the liblink changes, the ARM compiler in Go 1.3 will use the GOARM variable at the compiler stage instead of only at the linker stage.

Can we introduce arm5, arm6 and arm7 (or, more technically correctly, armv5, armv6 and armv7) build tags so that we can have separate implementations for each variant?
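
To make the proposal concrete, here is a minimal sketch of how a legacy `// +build` line is evaluated against a set of active tags: space-separated options are ORed, comma-separated terms are ANDed, and `!` negates a term. The `armv7` tag is the one proposed in this message, not one the toolchain defines, and `matchBuildLine` is a hypothetical helper written for illustration, not the go/build implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// matchBuildLine reports whether a legacy "// +build" constraint line is
// satisfied by the given set of active tags. Space-separated options are
// ORed; comma-separated terms inside an option are ANDed; "!" negates a term.
func matchBuildLine(line string, tags map[string]bool) bool {
	const prefix = "// +build"
	if !strings.HasPrefix(line, prefix) {
		return false
	}
	for _, option := range strings.Fields(strings.TrimPrefix(line, prefix)) {
		ok := true
		for _, term := range strings.Split(option, ",") {
			negated := strings.HasPrefix(term, "!")
			name := strings.TrimPrefix(term, "!")
			// A plain term needs the tag present; a negated term needs it absent.
			if tags[name] == negated {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical tag set for a GOARM=7 build; "armv7" is the proposed tag.
	tags := map[string]bool{"linux": true, "arm": true, "armv7": true}
	fmt.Println(matchBuildLine("// +build arm,armv7", tags))  // true
	fmt.Println(matchBuildLine("// +build arm,!armv7", tags)) // false
}
```

With tags like this, a NEON-only file could carry the first constraint and its portable fallback the second, so exactly one of the two is compiled for any given GOARM.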

I want to optimize the atomic ops. used in the runtime, but I don't like the performance penalty of runtime dispatch.

Also, we can have NEON (only in armv7) optimized math, math/big and crypto, which will speed up things a lot.

What do you think? The only drawback I can think of is collision with existing user-defined build tags, but I don't know of anyone who does that yet. Examples welcome.

Sincerely,
minux

Nick Craig-Wood

Feb 14, 2014, 3:19:30 AM
to minux, golang-dev
There are certainly some ARM features it would be very useful to know
about at compile time.

In particular, when writing assembler for the crypto functions I would
have liked to know whether an unaligned load was allowed or not - this
would have sped up the code significantly on newer architectures.
This depends more on the OS than on the ARM version, though.

The NEON instructions are definitely available in armv7, might be
available in armv6, and are definitely not in armv5, so those categories are
a little broad in ARM terms.

Perhaps we should define exactly what we mean by each of those tags in
terms of the ARM features Go code can use:

armv7
  NEON
  FPv4-SP

armv6
  ?

armv5
  ?

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

minux

Feb 14, 2014, 4:00:16 AM
to Nick Craig-Wood, golang-dev


On Feb 14, 2014 3:19 AM, "Nick Craig-Wood" <ni...@craig-wood.com> wrote:
>
> On 14/02/14 05:16, minux wrote:
> > Due to the liblink changes, the ARM compiler in Go 1.3 will use the
> > GOARM variable at the compiler stage instead only at the linker stage.
> >
> > Can we introduce arm5, arm6 and arm7 (or, more technically correctly,
> > armv5, armv6 and armv7) build tags so that we can have separate
> > implementations for each variant?
> >
> > I want to optimize the atomic ops. used in the runtime, but I don't like
> > the performance penalty of runtime dispatch.
> >
> > Also, we can have NEON (only in armv7) optimized math, math/big and
> > crypto, which will speed up things a lot.
> >
> > What do you think? The only drawback I can think of is collision with
> > existing user defined build tags, but I didn't know any people do that
> > yet. Examples welcome.
>
> There are certainly some ARM features it would be very useful to know
> about a compile time.
>
> In particular when writing assembler for the crypto functions I would
> have like to know whether an unaligned load was allowed or not - this
> would have speeded up the code significantly on newer architectures.
> This depends more on the OS though than the arm version.
>
> The NEON instructions are definitely available in armv7, might be
> available in armv6 and definitely not in armv5. so those categories are
> a little broad in ARM terms.

Yes, our GOARM is perhaps misnamed. Basically, it has the following meaning:

  GOARM=5   ARM(v5) without VFP
  GOARM=6   ARM(v6) with VFPv1 or VFPv2
  GOARM=7   ARM(v7) with VFPv3 (and NEON)
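
The same mapping can be written down as a small lookup table. This is only a sketch of the table above: `goarmInfo`, `goarmTable` and `lookupGOARM` are hypothetical names, and the fields record nothing beyond what the table states.

```go
package main

import "fmt"

// goarmInfo records what a GOARM value implies, per the table above.
type goarmInfo struct {
	Arch string // architecture version
	FP   string // floating-point unit assumed by the toolchain
	NEON bool   // whether NEON is assumed to be available
}

// goarmTable is a hypothetical lookup table, not part of the Go toolchain.
var goarmTable = map[int]goarmInfo{
	5: {Arch: "ARMv5", FP: "no VFP", NEON: false},
	6: {Arch: "ARMv6", FP: "VFPv1 or VFPv2", NEON: false},
	7: {Arch: "ARMv7", FP: "VFPv3", NEON: true},
}

// lookupGOARM returns the entry for v, or an error for values outside 5..7.
func lookupGOARM(v int) (goarmInfo, error) {
	info, ok := goarmTable[v]
	if !ok {
		return goarmInfo{}, fmt.Errorf("unsupported GOARM value %d", v)
	}
	return info, nil
}

func main() {
	for _, v := range []int{5, 6, 7} {
		info, _ := lookupGOARM(v)
		fmt.Printf("GOARM=%d: %s, %s, NEON=%v\n", v, info.Arch, info.FP, info.NEON)
	}
}
```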

I've suggested in the past that we decouple the architecture version from the VFP version, but that was deemed unnecessary and complex.

So if your ARMv5 has VFP, Go can't use it. (Perhaps you can set GOARM=6, but then the compiler might also use ARMv6-specific instructions, such as LDREX.)


> Perhaps we should define exactly what we mean by each of those tags in
> terms of ARM features Go code can use
>
> armv7
>  NEON
>  FPv4-SP

That would be overly complex, IMO.
In my proposal, the build tags map directly onto the already-known GOARM variable.

> armv6

Until now, the compiler has not generated architecture-specific instructions (except for FP), but I expect that to change soon.

For example, Dmitry proposed adding atomic op. intrinsics to cc; in that case, using LDREX when GOARM allows it seems the correct way to go, as SWP is not recommended on newer architectures.
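
As a rough analogy in portable Go (using sync/atomic rather than ARM assembly), the sketch below contrasts the two shapes. A plain atomic exchange, which is all a SWP-style instruction offers, is enough to build a spinlock. A compare-and-swap retry loop, which a load-exclusive/store-exclusive pair supports directly, can update a counter without any lock. The names `swapLock`, `casAdd` and `countBoth` are invented for this illustration and say nothing about how the runtime or the proposed cc intrinsics would be written.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// swapLock is a spinlock built only from an atomic exchange.
type swapLock struct{ state uint32 }

func (l *swapLock) Lock() {
	for atomic.SwapUint32(&l.state, 1) == 1 {
		runtime.Gosched() // let the current holder make progress
	}
}

func (l *swapLock) Unlock() { atomic.StoreUint32(&l.state, 0) }

// casAdd adds delta using a compare-and-swap retry loop and returns the
// new value. No lock is held at any point.
func casAdd(addr *uint32, delta uint32) uint32 {
	for {
		old := atomic.LoadUint32(addr)
		if atomic.CompareAndSwapUint32(addr, old, old+delta) {
			return old + delta
		}
	}
}

// countBoth increments one counter under the swap-based lock and another
// with casAdd, from several goroutines at once.
func countBoth(workers, perWorker int) (locked, lockFree uint32) {
	var (
		wg sync.WaitGroup
		mu swapLock
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				mu.Lock()
				locked++
				mu.Unlock()
				casAdd(&lockFree, 1)
			}
		}()
	}
	wg.Wait()
	return locked, lockFree
}

func main() {
	locked, lockFree := countBoth(8, 1000)
	fmt.Println(locked, lockFree) // 8000 8000
}
```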

Nick Craig-Wood

Feb 14, 2014, 4:11:45 AM
to minux, golang-dev
On 14/02/14 09:00, minux wrote:
>
> On Feb 14, 2014 3:19 AM, "Nick Craig-Wood" <ni...@craig-wood.com> wrote:
>>
>> On 14/02/14 05:16, minux wrote:
>> > Due to the liblink changes, the ARM compiler in Go 1.3 will use the
>> > GOARM variable at the compiler stage instead only at the linker stage.
>> >
>> > Can we introduce arm5, arm6 and arm7 (or, more technically correctly,
>> > armv5, armv6 and armv7) build tags so that we can have separate
>> > implementations for each variant?
>> >
>> > I want to optimize the atomic ops. used in the runtime, but I don't like
>> > the performance penalty of runtime dispatch.
>> >
>> > Also, we can have NEON (only in armv7) optimized math, math/big and
>> > crypto, which will speed up things a lot.
>> >
>> > What do you think? The only drawback I can think of is collision with
>> > existing user defined build tags, but I didn't know any people do that
>> > yet. Examples welcome.
>>
>> There are certainly some ARM features it would be very useful to know
>> about a compile time.
>>
>> In particular when writing assembler for the crypto functions I would
>> have like to know whether an unaligned load was allowed or not - this
>> would have speeded up the code significantly on newer architectures.
>> This depends more on the OS though than the arm version.

Did you have any thoughts on unaligned loads? That is a feature it would
be really nice to rely on.

>> The NEON instructions are definitely available in armv7, might be
>> available in armv6 and definitely not in armv5. so those categories are
>> a little broad in ARM terms.
> yes, our GOARM is perhaps misnamed, basically it has the following meaning:
> 5 ARM(v5) without VFP
> 6 ARM(v6) with VFPv1 or 2
> 7 ARM(v7) with VFPv3 (and NEON).

Nice table!

Should something like that go in the wiki?

https://code.google.com/p/go-wiki/wiki/GoArm

> i've suggested in the past to decouple the architecture version with VFP
> version, but that deemed unnecessary and complex.
>
> so if your ARMv5 has VFP, Go can't use it. (perhaps you can set GOARM=6,
> but the compiler might as well use ARMv6 specific instructions, such as
> LDREX)
>> Perhaps we should define exactly what we mean by each of those tags in
>> terms of ARM features Go code can use
>>
>> armv7
>> NEON
>> FPv4-SP
> that will be overly complex, IMO.
> in my proposal, the build tag has directly relationship to the already
> known GOARM variable.

Yes, that is what I meant too. I'd just like to see an official
definition of what ARM features Go programmers can use for each version
of GOARM. If that means ignoring some opportunities for using advanced
features on older ARM, then that is fine.

> Before (and currently), we don't generate architecture specific
> instructions (except FP) in the compiler, but i expect it will be
> changed soon.
>
> for example, Dmitry proposed that we have atomic op. intrinsic in cc,
> and in that case, using LDREX if allowed by GOARM seems the correct way
> to go as SWP is not recommended in newer architectures.

Sounds like a good plan.

minux

Feb 14, 2014, 4:21:50 AM
to Nick Craig-Wood, golang-dev


On Feb 14, 2014 4:11 AM, "Nick Craig-Wood" <ni...@craig-wood.com> wrote:
> Did you have any thoughts on unaligned loads? That is a feature it would
> be really nice to rely on.

I also think it's an OS-dependent feature, so Go can't do anything about it.

For example, on Linux you can select how the kernel deals with (un)aligned accesses by writing to /proc/cpu/alignment.
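
If alignment behaviour cannot be assumed, the portable fallback is to assemble each word from individual bytes, which never issues a multi-byte load. The sketch below shows that fallback in Go; `loadLE32` is a hypothetical helper, and the cost of its four byte loads compared with a single word load is what a compile-time guarantee of unaligned access would save.

```go
package main

import "fmt"

// loadLE32 reads a little-endian 32-bit word one byte at a time, so it
// works at any offset regardless of how the CPU or OS treats unaligned
// word loads.
func loadLE32(b []byte) uint32 {
	_ = b[3] // single bounds check up front
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	buf := []byte{0xff, 0x78, 0x56, 0x34, 0x12}
	// Offset 1 is not 4-byte aligned relative to the start of the buffer.
	fmt.Printf("%#x\n", loadLE32(buf[1:])) // 0x12345678
}
```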

Dave Cheney

Feb 14, 2014, 4:29:43 AM
to minux, golang-dev
Just spitballing some ideas, some may be rubbish


On Fri, Feb 14, 2014 at 4:16 PM, minux <minu...@gmail.com> wrote:

> Hi gophers,
>
> Due to the liblink changes, the ARM compiler in Go 1.3 will use the GOARM variable at the compiler stage instead only at the linker stage.

I'm not that sad about this really. Could it be embedded into the runtime package as a constant?

> Can we introduce arm5, arm6 and arm7 (or, more technically correctly, armv5, armv6 and armv7) build tags so that we can have separate implementations for each variant?

If this is a road that everyone is happy to go down, I'd like to see it expanded to cover other architectures, i.e., replace GO386 and add tags for sse2, sse3, sse42, etc.

> I want to optimize the atomic ops. used in the runtime, but I don't like the performance penalty of runtime dispatch.

I agree. I think the same goes for the x86 variants, which also have to do a bunch of checks at runtime.

> Also, we can have NEON (only in armv7) optimized math, math/big and crypto, which will speed up things a lot.
>
> What do you think? The only drawback I can think of is collision with existing user defined build tags, but I didn't know any people do that yet. Examples welcome.

I think the chance of collision is minimal, but if it is too risky, maybe add a new // +arch alias for the build tag. Probably overthinking it.

Another possible issue: if runtimes start being produced with a wide variety of tags, how can we identify which is built with which? For example, go version wouldn't tell us that one person's distro was built on a machine without sse42, so it chose a slower arch (possibly because it was built inside a VM), and so forth.

> Sincerely,
> minux


 


Russ Cox

Feb 24, 2014, 10:52:09 AM
to Dave Cheney, minux, golang-dev
Let's discuss this again when we start talking about 1.4. It's too late to do this for Go 1.3. 

The analogy with GO386 is what bothers me the most. There are tons of varieties of x86 if we start enumerating all the different feature sets, and I am not sure we want to have such conditional builds.

Russ 

minux

Feb 25, 2014, 12:27:26 AM
to Russ Cox, Dave Cheney, golang-dev


On Feb 24, 2014 10:52 AM, "Russ Cox" <r...@golang.org> wrote:
>
> Let's discuss this again when we start talking about 1.4. It's too late to do this for Go 1.3. 
>
> The analogy with GO386 is what bothers me the most. There are tons of varieties of x86 if we start enumerating all the different feature sets, and I am not sure we want to have such conditional builds.

The situation is different for ARM here.

On x86, you can do cpuid checks easily, but ARM generally doesn't have such a facility, and runtime code dispatch will also be slower on ARM than on x86.

Keith Randall

Feb 25, 2014, 12:57:02 PM
to minux, Russ Cox, Dave Cheney, golang-dev
On x86 we call CPUID once on startup and store the result in a global variable.  We could/should do the same thing on arm.  Then dispatch is just a load & test.
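
A minimal sketch of that "load & test" dispatch, assuming a hypothetical global `hasNEON` that startup code would fill in once. `sumNEON` is only a placeholder for an assembly routine, and `neonCalls` exists purely so the example can show which path ran.

```go
package main

import "fmt"

// hasNEON stands in for the global that startup code would fill in once.
// The name is hypothetical; the thread does not define one.
var hasNEON bool

// neonCalls counts how often the fast path ran, purely for demonstration.
var neonCalls int

func sumGeneric(xs []float32) float32 {
	var total float32
	for _, x := range xs {
		total += x
	}
	return total
}

// sumNEON is a placeholder for a hand-written assembly routine. Here it
// reuses the generic loop so the example runs on any machine.
func sumNEON(xs []float32) float32 {
	neonCalls++
	return sumGeneric(xs)
}

// sum is the dispatch point: one load of the global and one test per call.
func sum(xs []float32) float32 {
	if hasNEON {
		return sumNEON(xs)
	}
	return sumGeneric(xs)
}

func main() {
	xs := []float32{1, 2, 3.5}
	hasNEON = false
	fmt.Println(sum(xs), neonCalls) // 6.5 0
	hasNEON = true
	fmt.Println(sum(xs), neonCalls) // 6.5 1
}
```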



minux

Feb 25, 2014, 6:34:46 PM
to Keith Randall, Russ Cox, Dave Cheney, golang-dev


On Feb 25, 2014 12:57 PM, "Keith Randall" <k...@google.com> wrote:
> On x86 we call CPUID once on startup and store the result in a global variable.  We could/should do the same thing on arm.  Then dispatch is just a load & test.

The problem is that such a facility is OS-dependent, as the ARM architecture version registers are not readable in user space, and earlier ARMs don't even have one.

On Linux, we have auxv. But what about the *BSDs?

For x86, even if we don't export cpuid, packages could do that themselves. For the linux/arm case, however, we must export the auxv from the runtime (or the program must read from /proc/self/auxv, and that's still not portable to other OSes).
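
For illustration, here is a sketch of what reading the feature bits from auxv involves on linux/arm. It assumes a 32-bit little-endian vector of (key, value) word pairs, as /proc/self/auxv would supply on such a system, and uses the Linux constants AT_HWCAP (16), HWCAP_VFP (bit 6) and HWCAP_NEON (bit 12). `hwcapFromAuxv` is a hypothetical helper, and the example feeds it a synthetic vector so that it runs on any machine.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	atNull   = 0  // AT_NULL: end of the vector
	atPageSz = 6  // AT_PAGESZ: system page size
	atHWCAP  = 16 // AT_HWCAP: bitmask of CPU features

	hwcapVFP  = 1 << 6  // HWCAP_VFP on linux/arm
	hwcapNEON = 1 << 12 // HWCAP_NEON on linux/arm
)

// hwcapFromAuxv scans a 32-bit little-endian auxiliary vector, laid out
// as (key, value) pairs of uint32, and returns the AT_HWCAP value.
func hwcapFromAuxv(auxv []byte) (uint32, bool) {
	for i := 0; i+8 <= len(auxv); i += 8 {
		key := binary.LittleEndian.Uint32(auxv[i:])
		val := binary.LittleEndian.Uint32(auxv[i+4:])
		switch key {
		case atNull:
			return 0, false
		case atHWCAP:
			return val, true
		}
	}
	return 0, false
}

func main() {
	// Synthetic vector: AT_PAGESZ=4096, AT_HWCAP=VFP|NEON, AT_NULL.
	raw := make([]byte, 24)
	binary.LittleEndian.PutUint32(raw[0:], atPageSz)
	binary.LittleEndian.PutUint32(raw[4:], 4096)
	binary.LittleEndian.PutUint32(raw[8:], atHWCAP)
	binary.LittleEndian.PutUint32(raw[12:], hwcapVFP|hwcapNEON)
	// The last 8 bytes stay zero and act as the AT_NULL terminator.

	hwcap, ok := hwcapFromAuxv(raw)
	fmt.Println(ok, hwcap&hwcapNEON != 0, hwcap&hwcapVFP != 0)
}
```

None of this helps on systems without auxv, which is the portability concern raised above.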

Brad Fitzpatrick

Feb 25, 2014, 6:38:46 PM
to minux, Russ Cox, Keith Randall, Dave Cheney, golang-dev

Can you just try different instructions and check expected output, catching faults as needed to prevent crashing?

(Clueless)


Dave Cheney

Feb 25, 2014, 7:01:54 PM
to Brad Fitzpatrick, minux, Russ Cox, Keith Randall, golang-dev
On Wed, Feb 26, 2014 at 10:38 AM, Brad Fitzpatrick <brad...@golang.org> wrote:

> Can you just try different instructions and check expected output, catching faults as needed to prevent crashing?

We sort of have that already. In cmd/dist on arm builds we try a few different instructions, see whether they result in a SIGILL, and feed that into GOARM. The process isn't foolproof: on freebsd/arm, at least, the probing can crash the OS (o_O) on some platforms.

We already have GOARM and GO386, which work at a coarse-grained level to enable or disable more efficient instruction sequences. There is an argument to be made that the granularity of these switches should be increased to allow the compiler to make even better choices, or to deal with the cornucopia of different arm systems. A really good example: armv7 does not define a DIV instruction, but if you have a chromebook, iphone5 or nexus 4+, then your armv7 chip does have a DIV instruction, and this could be a major saving in the runtime.

That said, the case I think minux is focusing on is the various .s files in the std lib. We currently have one per architecture, and these do not take advantage of GOARM/GO386. Instead we have to have a fast version guarded by a load and compare, and a slow version, which is probably under-tested.

The solution to this would appear to be to add more build tags, which would let cmd/go choose the best available implementation at compile time.

The downside is that the number of tags will probably be large, and their various intersections will be under-tested.

As for my position, I'm with minux: I'd like to see GOARM removed and replaced with more granular build tags.

minux

Feb 25, 2014, 7:04:27 PM
to Brad Fitzpatrick, Keith Randall, Russ Cox, Dave Cheney, golang-dev


On Feb 25, 2014 6:38 PM, "Brad Fitzpatrick" <brad...@golang.org> wrote:
> Can you just try different instructions and check expected output, catching faults as needed to prevent crashing?

But won't SIGILL crash the whole program? I haven't thought about extending SetPanicOnFault; that's probably the answer.

Aram Hăvărneanu

Feb 25, 2014, 7:21:29 PM
to minux, Brad Fitzpatrick, Keith Randall, Russ Cox, Dave Cheney, golang-dev
We can catch SIGILL.

--
Aram Hăvărneanu

Brad Fitzpatrick

Feb 25, 2014, 7:26:32 PM
to minux, Keith Randall, Russ Cox, Dave Cheney, golang-dev
Sure, by default, but install a signal handler for it and skip over the offending instruction.  Do this at start-up before any threads have been created.  Then set some process-wide CPU feature globals in package runtime and have each module that needs assembly check that in its init and change which code it runs.  Slightly bigger binaries but so much easier and fewer questions and confused people on mailing lists.

For builder coverage we can have some environment variables or build options that only package runtime needs to know about (and the people running the builders) that forces feature bits off for some builders.  (Advantage of environment variable is we can do the build once on a good ARM builder and re-run the tests with different environment variables to select different ARM code paths...)
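
A sketch of how these two ideas could fit together, under stated assumptions: `cpuFeatures` is a hypothetical set of process-wide feature bits, `GOARM_FEATURES_OFF` is an invented environment variable name for the builder override, and `dotNEON` is a placeholder for an assembly routine. The implementation is chosen once, the way a package's init might do it, and callers then use the selected function.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cpuFeatures is a hypothetical set of process-wide feature bits.
type cpuFeatures struct {
	NEON bool
	IDIV bool
}

// applyOverrides clears every feature named in a comma-separated list,
// so a builder can force the slow paths on capable hardware.
func applyOverrides(detected cpuFeatures, off string) cpuFeatures {
	for _, name := range strings.Split(off, ",") {
		switch strings.TrimSpace(name) {
		case "neon":
			detected.NEON = false
		case "idiv":
			detected.IDIV = false
		}
	}
	return detected
}

func dotGeneric(a, b []float32) float32 {
	var total float32
	for i := 0; i < len(a) && i < len(b); i++ {
		total += a[i] * b[i]
	}
	return total
}

// dotNEON is a placeholder for an assembly routine; it reuses the generic
// loop so the example runs anywhere.
func dotNEON(a, b []float32) float32 { return dotGeneric(a, b) }

// selectDot picks an implementation once from the feature bits.
func selectDot(f cpuFeatures) (string, func(a, b []float32) float32) {
	if f.NEON {
		return "neon", dotNEON
	}
	return "generic", dotGeneric
}

func main() {
	detected := cpuFeatures{NEON: true, IDIV: true} // pretend probe result
	// GOARM_FEATURES_OFF is an invented name used only in this sketch.
	features := applyOverrides(detected, os.Getenv("GOARM_FEATURES_OFF"))
	name, dot := selectDot(features)
	fmt.Println(name, dot([]float32{1, 2}, []float32{3, 4}))
}
```

Running the same binary with `GOARM_FEATURES_OFF=neon` would exercise the generic path on a NEON-capable builder, which is the testing advantage described above.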

I really don't want to see a dozen build tags in addition to GOARM which is already kinda gross.

Dave Cheney

Feb 25, 2014, 7:27:34 PM
to Aram Hăvărneanu, minux, Brad Fitzpatrick, Keith Randall, Russ Cox, golang-dev
Sure. But to bring it back to my example of IDIV on armv7, adding a runtime test and branch before each DIV instruction is really not a good outcome. A similar case exists with arm atomics. The ideal outcome for me would be for the linker to be presented with the best available implementation at link time.
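
To show the shape of the problem, here is a sketch in Go of a division that is guarded by a runtime feature test. `hasIDIV` is a hypothetical global, `softDiv` is a simple shift-and-subtract routine standing in for whatever software division the runtime uses, and Go's `/` operator stands in for a single hardware divide instruction. The test and branch in `div` are paid on every call, which is the overhead a link-time choice would avoid.

```go
package main

import "fmt"

// hasIDIV is a hypothetical global that would be set once at startup.
var hasIDIV bool

// softDiv performs unsigned division by shift-and-subtract, one quotient
// bit per iteration, as used when no hardware divide is available.
func softDiv(n, d uint32) uint32 {
	if d == 0 {
		panic("division by zero")
	}
	var q, r uint32
	for i := 31; i >= 0; i-- {
		// Bring down the next bit of the dividend into the remainder.
		r = r<<1 | (n>>uint(i))&1
		if r >= d {
			r -= d
			q |= 1 << uint(i)
		}
	}
	return q
}

// div pays for a load, a test and a branch on every call before it can
// use the fast path.
func div(n, d uint32) uint32 {
	if hasIDIV {
		return n / d // stands in for a single hardware divide instruction
	}
	return softDiv(n, d)
}

func main() {
	hasIDIV = false
	fmt.Println(div(100, 7)) // 14
	hasIDIV = true
	fmt.Println(div(100, 7)) // 14
}
```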

Dave Cheney

Feb 25, 2014, 7:29:31 PM
to Brad Fitzpatrick, minux, Keith Randall, Russ Cox, golang-dev
> For builder coverage we can have some environment variables or build options that only package runtime needs to know about (and the people running the builders) that forces feature bits off for some builders.  (Advantage of environment variable is we can do the build once on a good ARM builder and re-run the tests with different environment variables to select different ARM code paths...)

Good point.

> I really don't want to see a dozen build tags in addition to GOARM which is already kinda gross.

I don't want to see more env vars like GOARM/GO386. More tags does sound gross, but it's working with the pattern we already have today. I think it's important to point out that these tags, say // +build arm_idiv, would only occur in the runtime package, and possibly math; they would not be commonplace.

Brad Fitzpatrick

Feb 25, 2014, 7:38:53 PM
to Dave Cheney, minux, Keith Randall, Russ Cox, golang-dev
Once they're in go/build.Context.FooTags, we're stuck with them forever.  People use anything you make available.

"working with the pattern we have today" is an answer, but it might not be the best answer.  We have the ability to change things.

Josh Bleecher Snyder

Feb 25, 2014, 7:42:31 PM
to Dave Cheney, Aram Hăvărneanu, minux, Brad Fitzpatrick, Keith Randall, Russ Cox, golang-dev
> Sure. But to bring it back to my example of IDIV on armv7, adding a runtime
> test and branch before each DIV instruction is really not a good outcome. A
> similar case exists with arm atomics.

And with NEON, which could provide big performance wins.

-josh

Dave Cheney

Feb 25, 2014, 7:46:38 PM
to Brad Fitzpatrick, minux, Keith Randall, Russ Cox, golang-dev

> Once they're in go/build.Context.FooTags, we're stuck with them forever.  People use anything you make available.
>
> "working with the pattern we have today" is an answer, but it might not be the best answer.  We have the ability to change things.

Oh no argument there. One suggestion, probably crap, would be to add a second set of build style tags, ones that are explicitly not available for people to redefine.

// +arch armv7, neon

^  
strawman

Basically // +arch is a reserved namespace, we reserve the right to change it at will.

Lucio De Re

Feb 25, 2014, 11:25:06 PM
to Dave Cheney, Brad Fitzpatrick, minux, Keith Randall, Russ Cox, golang-dev
On 2/26/14, Dave Cheney <da...@cheney.net> wrote:
>
> Oh no argument there. One suggestion, probably crap, would be to add a
> second set of build style tags, ones that are explicitly not available for
> people to redefine.
>
> // +arch armv7, neon
>
> ^
> strawman
>
> Basically // +arch is a reserved namespace, we reserve the right to change
> it at will.
>
It's a fallacy that reserving a keyword or prefix or anything else
actually works. You need an official registry to ensure consistent
use across the public body of use. Think RFC-822's X- prefix. And
you're still stuck with historical usage.

It seems to me, unschooled as I am, that this particular issue is a
point in favour of run-time linkage, or dynamic link objects. I
really fear the day when I need to install different applications on
my (old) Xperia and my Galaxy phones. If the runtime could ensure
that the same application can run correctly and optimally in either
context, I would trade efficiency for that.

I see how others may prefer the opposite and I submit that the choice
should be with the user, while the developer needs all the help she
can get to deliver the best.

After all, what happens to Go's touted portable development if we make
compilation dependent on the platform context?

Lucio.

Dave Cheney

Feb 25, 2014, 11:35:50 PM
to Lucio De Re, Brad Fitzpatrick, minux, Keith Randall, Russ Cox, golang-dev
On Wed, Feb 26, 2014 at 3:25 PM, Lucio De Re <lucio...@gmail.com> wrote:
> On 2/26/14, Dave Cheney <da...@cheney.net> wrote:
> >
> > Oh no argument there. One suggestion, probably crap, would be to add a
> > second set of build style tags, ones that are explicitly not available for
> > people to redefine.
> >
> > // +arch armv7, neon
> >
> > ^
> > strawman
> >
> > Basically // +arch is a reserved namespace, we reserve the right to change
> > it at will.
> >
> It's a fallacy that reserving a keyword or prefix or anything else
> actually works.  You need an official registry to ensure consistent
> use across the public body of use.  Think RFC-822's X- prefix.  And
> you're still stuck with historical usage.

When I said 'reserved', I meant it was not available for general use, i.e., there would be no `go build -arch` flag, or at least not one advertised or supported.

> It seems to me, unschooled as I am, that this particular issue is a
> point in favour of run-time linkage, or dynamic link objects.  I
> really fear the day when I need to install different applications on
> my (old) Xperia and my Galaxy phones.  If the runtime could ensure
> that the same application can run correctly and optimally in either
> context, I would trade efficiency for that.

AFAIK all the flags and tweaks we are talking about adding would allow better code generation with newer processors. Thanks to the economic forces of backwards compatibility, code compiled for older hardware pretty much works on newer hardware. That is why most linux distributions compile with -march=pentium on intel, and in some cases -march=armv4t for arm.

And with respect to arm code compiled today, if you built go on an armv7 machine, that installation will generate armv7 code by default, which will not run on previous architectures. We even give you a nice warning on startup.

> I see how others may prefer the opposite and I submit that the choice
> should be with the user, while the developer needs all the help she
> can get to deliver the best.

Runtime dynamic loading of code is a huge distraction to this discussion. I don't want to talk about it here because it is such an enormous change vs the current debate, which is around adding (or possibly not adding) flags and switches to the current compilers.

> After all, what happens to Go's touted portable development if we make
> compilation dependent on the platform context?

I don't really understand what you meant there.

Dave

> Lucio.

Lucio De Re

Feb 26, 2014, 12:20:03 AM
to Dave Cheney, Brad Fitzpatrick, minux, Keith Randall, Russ Cox, golang-dev
On 2/26/14, Dave Cheney <da...@cheney.net> wrote:
>
> When I said 'reserved', I meant it was not available for use generally, ie,
> the would be no `go build -arch` flag, or at least not one advertised or
> supported.
>
Still, it will be available and cause maintenance headaches. I don't
have a better option, but perhaps it's the objective that needs to be
defined very carefully. Actually, in my arrogance, I think it's
engineers that need to be persuaded to do things more sensibly, but
that's a political argument :-)

> Runtime dynamic loading of code is a huge distraction to this discussion. I
> don't want to talk about it here because it is such an enormous change vs
> the current debate which is around adding (or possibly not adding) flags
> and switches to the current compilers.
>
I keep thinking that the philosophy of computing deserves its own
subject of study. Specifically, new hardware breaks away into
unconventional directions, to be followed, at great pains, by very
persistent software developers, only to repeat the cycle. Something
tells me engineers are trying to shake us off their trail and that
doesn't really make sense, or does it?

I do agree with you that I brought up a thorny subject, but I thought
it deserved mention, not necessarily exploration. That said, whatever
we pick as the interim solution, it had better not paint us into a
corner, which is why I asked the question.
>
>>
>> After all, what happens to Go's touted portable development if we make
>> compilation dependent on the platform context?
>>
>
> I don't really understand what you meant there.
>
It's possible that there isn't a question here, but just in case I'm
not crazy, my understanding is that if I'm developing for an armV7 on
a Linux/386 platform, I need to invoke the right compiler/linker to
get the desired result. But how do I specify which subtarget I'm
aiming at?

Cross-compilation is precisely the reason I prefer Go over all
available alternative development environments, at the same time, it
is one of the features I understand the least when I start digging
under the surface.

Lucio.

Russ Cox

Mar 3, 2014, 11:36:09 AM
to Lucio De Re, Dave Cheney, Brad Fitzpatrick, minux, Keith Randall, golang-dev
Not for Go 1.3. Let's talk about this when we're planning Go 1.4.
