Criteria for adding assembly implementation to a package

minux

Mar 10, 2016, 5:22:22 AM
to golang-dev
Hi gophers,

There have been some CLs floating around to speed up packages
by adding assembly implementations of key functions. While I can
see the benefit of adding assembly implementations, there are
also non-negligible drawbacks:

0. the maintenance burden of using assembly. Note also that once one
port introduces an assembly implementation, it will likely encourage
adding more assembly to the package for other ports;

1. some environments, for example App Engine, forbid using assembly;

2. it ties otherwise portable Go packages to a particular implementation
(the incompatibility of gc/gccgo assembly makes things worse);

3. it discourages future Go compiler optimization efforts;

4. it discourages general optimization of the portable Go version, which
means ports without an assembly implementation are likely to receive
less love (if the amd64 version is fast enough, why would you optimize
the pure Go version if it's not used on amd64?)


I'd like to raise this generic question:
what are the criteria for accepting assembly into a package?


My personal answer is that both of the following must be met for assembly to be acceptable:
1. the speedup must be dramatic (say, 2x or more);
2. it's almost impossible for a (near-)future optimizing compiler to reach
a comparable level of performance (for example, assembly that uses
very specialized instructions like AESNI, CRC32, PCMPESTRI, etc., might
meet this).


What do you think?

Related CLs:
https://golang.org/cl/20503 compress/flate: use CRC32 and PCMPESTRI


Thanks,
Minux

mart...@uos.de

Mar 10, 2016, 5:49:18 AM
to golang-dev
I would add:

3. The speedup must apply to a performance critical part that is used frequently.

There is no need to use assembly for a rarely used code path that can otherwise be implemented in pure Go.

Russ Cox

Mar 10, 2016, 10:10:53 AM
to minux, golang-dev
On Thu, Mar 10, 2016 at 5:21 AM, minux <mi...@golang.org> wrote:
> 0. maintenance burden of using assembly; And note the fact that
> one port introduced assembly implementation will likely encourage
> adding more assembly to the package for other ports.
>
> 1. some environment, for example, App Engine, forbid using assembly,
>
> 2. make otherwise portable Go packages tied to a particular implementation
> (the incompatibility of gc/gccgo assembly makes things worse.)
>
> 3. discourage future Go compiler optimization efforts;
>
> 4. discourage general optimization on the portable Go version, which
> means ports without assembly implementation will likely to receive
> less love (if amd64 version is fast enough, why would you optimize
> the pure Go version if it's not used on amd64?)
>
> I'd like to raise this generic question:
> what's the criteria for accepting assembly into a package?
>
> My personal answer is that both of the following must meet to be acceptable:
> 1. the speed up must be dramatic (say, 2x or more);
> 2. it's almost impossible for a (near-)future optimizing compilers to reach
> a comparable level of performance, (for example, assembly that uses
> very specialized instructions like AESNI, CRC32, PCMPESTRI, etc, might
> meet this.)
>
> What do you think?

I think you identified the important considerations, but I think the exact criteria necessarily vary. The decision also depends on how important the package is and how complex the assembly is, not just on what the speedup is.

Certainly a 50% execution time reduction (what you called 2x) with manageable assembly seems like it would be worth doing. But in an important library like compress/flate it might still be worth doing for as much as a 20% overall reduction if the amount of assembly is reasonable. There's no hard line. (Klaus's excellent work is going to make it that much more reasonable to start compressing our package archives.)
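
The two ways of stating a gain used above (a speedup factor such as "2x" and a percentage reduction in execution time) are easy to mix up. As a minimal sketch of the conversion, assuming the usual definition speedup = old time / new time (the helper name speedupFromReduction is hypothetical and not from this thread):

```go
package main

import "fmt"

// speedupFromReduction converts a fractional reduction in execution time
// (0.5 means the code takes 50% less time) into a speedup factor,
// assuming speedup = oldTime / newTime.
func speedupFromReduction(reduction float64) float64 {
	return 1 / (1 - reduction)
}

func main() {
	// 0.5 and 0.2 are the two figures mentioned in the message above.
	for _, r := range []float64{0.5, 0.2} {
		fmt.Printf("%.0f%% less time = %.2fx speedup\n", r*100, speedupFromReduction(r))
	}
}
```

Under that definition, a 50% reduction corresponds to 2x, while a 20% reduction corresponds to 1.25x.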

I don't think the hypothetical future compiler is something we should give much weight to. If an important function like gzip compression can get 20% faster today with minimal added code complexity, then there's not much point in saying "no the compiler will be better a year from now".

Do note that what matters is end-to-end benchmarks, like calling exported APIs, not microbenchmarks of individual functions.
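
As a rough illustration of what an end-to-end measurement could look like, the sketch below times compression through the exported compress/gzip API rather than timing an internal helper. The input data, its size, and the names compressAll and benchmarkGzip are assumptions made for this example; a real evaluation would use representative inputs.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"testing"
)

// compressAll gzips data through the exported compress/gzip API and
// returns the number of compressed bytes produced.
func compressAll(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

// benchmarkGzip measures the whole compression path that a caller sees.
func benchmarkGzip(b *testing.B) {
	// Hypothetical 64 KiB input; highly repetitive, so not representative.
	data := bytes.Repeat([]byte("hello, gophers! "), 1<<12)
	b.SetBytes(int64(len(data)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := compressAll(data); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	fmt.Println(testing.Benchmark(benchmarkGzip))
}
```

In a package's own test file the same function would normally be named BenchmarkGzip and run with "go test -bench"; comparing results before and after a change shows the gain a caller would actually observe.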

Russ

klau...@gmail.com

Mar 10, 2016, 2:34:49 PM
to golang-dev
On Thursday, 10 March 2016 11:22:22 UTC+1, minux wrote:

> 0. maintenance burden of using assembly; And note the fact that
> one port introduced assembly implementation will likely encourage
> adding more assembly to the package for other ports.

That is definitely a point. That said, if it is well made, it should also be easy to pull out/disable for testing, which is the impression I have gotten from the assembly in the Go standard packages.

For my own packages I also like to have a "noasm" tag, which must disable assembler. For me, that makes it easier to identify issues and benchmark, but I don't know if there is something that already does this.
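
As a sketch of how such a tag might behave, the standard go/build/constraint package can evaluate a build constraint against a set of active tags. The constraint line below is hypothetical (it is not copied from any package in this thread), and it uses the newer "//go:build" syntax; at the time of this thread the equivalent would have been written as a "// +build" line.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// asmEnabled reports whether a file guarded by the given build constraint
// line would be compiled when exactly the tags in active are set.
func asmEnabled(line string, active ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	set := make(map[string]bool, len(active))
	for _, t := range active {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	// Hypothetical guard for an amd64 assembly file.
	const line = "//go:build amd64 && !appengine && !noasm"
	for _, tags := range [][]string{
		{"amd64"},
		{"amd64", "noasm"},
		{"amd64", "appengine"},
		{"arm"},
	} {
		ok, err := asmEnabled(line, tags...)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v -> assembly file included: %v\n", tags, ok)
	}
}
```

With a guard like this, building with "-tags noasm" would leave the assembly file out, and a pure Go file carrying the opposite constraint would supply the implementation instead.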


> 1. some environment, for example, App Engine, forbid using assembly,

Could you elaborate - how is that a drawback? That is a choice that makes sense for the App Engine, but it doesn't really affect other platforms.



> 2. make otherwise portable Go packages tied to a particular implementation
> (the incompatibility of gc/gccgo assembly makes things worse.)

gccgo is definitely currently an issue, but is there anything stopping an eager spirit from adding assembler support to gccgo?


> 3. discourage future Go compiler optimization efforts;

I don't really see any relation. The vast majority of code will still be pure Go.


> 4. discourage general optimization on the portable Go version, which
> means ports without assembly implementation will likely to receive
> less love (if amd64 version is fast enough, why would you optimize
> the pure Go version if it's not used on amd64?)

Fair point, but for users the thing that matters is the end achieved speed - most will not care if it is Go or assembly. Of course they will be surprised when CRC32 is 10x slower on ARM, since it doesn't have the amd64 assembly, but I don't really see how not having the amd64 assembly would help that - it will still be slow on ARM.


> what's the criteria for accepting assembly into a package?
> [...]
> What do you think?

I think that, beyond the points above, the use case of a function is also important. I focus my optimization efforts on typical bottlenecks. deflate/gzip was important for me because it is such a common bottleneck: web servers are the obvious case, but Docker filesystem image compression has also come up from people who have contacted me. For these people even 20% faster code is a massive help, since it reduces their server expenses by maybe 10%, or enables them to have 10% higher throughput on a service. The same goes for the sha3 example.

That said, assembler is high risk. The most important part is that we are bypassing the memory safety of "pure" Go - which should not be taken lightly. *That* is the most significant drawback IMO. When reviewing code, I have argued both for and against assembler, and I would expect the same questions to be raised when I submit it.
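
One common way to reduce that risk, sketched here as an assumption rather than as something proposed in this thread, is to keep a simple portable implementation around and check the optimized one against it. The example below compares a bit-by-bit CRC-32C (the function crc32cGeneric is hypothetical and written for this illustration) with hash/crc32, which may or may not use an accelerated path depending on the platform.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// crc32cGeneric is a slow, bit-at-a-time CRC-32C (Castagnoli) reference
// written in pure Go. 0x82F63B78 is the reflected Castagnoli polynomial.
func crc32cGeneric(data []byte) uint32 {
	crc := ^uint32(0)
	for _, b := range data {
		crc ^= uint32(b)
		for i := 0; i < 8; i++ {
			if crc&1 == 1 {
				crc = crc>>1 ^ 0x82F63B78
			} else {
				crc >>= 1
			}
		}
	}
	return ^crc
}

func main() {
	table := crc32.MakeTable(crc32.Castagnoli)
	// Check a range of lengths, since optimized code often handles
	// short inputs and unaligned tails on separate paths.
	input := make([]byte, 300)
	for i := range input {
		input[i] = byte(i * 7)
	}
	for n := 0; n <= len(input); n++ {
		want := crc32cGeneric(input[:n])
		got := crc32.Checksum(input[:n], table)
		if got != want {
			fmt.Printf("mismatch at length %d: got %08x, want %08x\n", n, got, want)
			return
		}
	}
	fmt.Println("all lengths match the pure Go reference")
}
```

A check like this catches wrong results, not out-of-bounds reads or writes, so it complements careful review rather than replacing it.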

In relation to the CL you mention, I have postponed the assembler part and focused on the Go parts, so we can get a clearer picture of the actual gain. Luckily most of the gains are achieved in Go, and when we have all these in place we can add the assembler, if we agree it is a significant improvement. I look very much forward to your review - I will be sure to make the trade-offs clear :)


/Klaus 

Nigel Tao

Mar 10, 2016, 6:53:22 PM
to Klaus Post, golang-dev
On Fri, Mar 11, 2016 at 6:34 AM, <klau...@gmail.com> wrote:
> On Thursday, 10 March 2016 11:22:22 UTC+1, minux wrote:
>> 1. some environment, for example, App Engine, forbid using assembly,
>
> Could you elaborate - how is that a drawback? That is a choice that makes
> sense for the App Engine, but it doesn't really affect other platforms.

I think what Minux is trying to say is that, if package foo uses
assembly (and the author doesn't remember to use e.g. an appengine
build tag), then a Go App Engine app can't use package foo. That is
transitive - if I have a Go application that I'd like to run both on
and off App Engine, and I import package bar, which imports package
qux, which imports foo, then again, the magic of "go get" is broken.

Sure, the standard library is provided on App Engine, rather than
brung along with the app code, but it means that, if I can think of a
new feature, a bug fix, or optimization to make to flate, then I can't
simply fork the flate package and change an import path.


>> 2. make otherwise portable Go packages tied to a particular implementation
>> (the incompatibility of gc/gccgo assembly makes things worse.)
>
> gccgo is definitely currently an issue, but is there anything stopping an
> eager spirit from adding assembler support to gccgo?

It's certainly possible for an eager spirit (if one magically existed)
to either add gccgo support for Plan 9 style asm, or to port the Plan
9 style asm code to whatever asm format that GCC accepts. But apart
from the opportunity cost of that eager spirit's time, either way, we
now have more code to maintain, and as you noted, anything asm related
is riskier in general. One example is a safety bug being fixed in one
version (e.g. the gc compiler) but not another (possibly in a separate
repo), and now we're shipping vulnerable code.

Mikio Hara

Mar 10, 2016, 7:15:44 PM
to Nigel Tao, Klaus Post, golang-dev
On Fri, Mar 11, 2016 at 8:52 AM, Nigel Tao <nige...@golang.org> wrote:

> I think what Minux is trying to say is that, if package foo uses
> assembly (and the author doesn't remember to use e.g. an appengine
> build tag), then a Go App Engine app can't use package foo. That is
> transitive - if I have a Go application that I'd like to run both on
> and off App Engine, and I import package bar, which imports package
> qux, which imports foo, then again, the magic of "go get" is broken.

If build constraints can resolve the issue above, I'm fine with having
new tags for GOOS such as "appengine." The go command and standard
library already accept "android" or "nacl"-like blended GOOS tags.

Russ Cox

Mar 10, 2016, 8:23:23 PM
to Nigel Tao, Klaus Post, golang-dev
On Thu, Mar 10, 2016 at 6:52 PM, Nigel Tao <nige...@golang.org> wrote:
> On Fri, Mar 11, 2016 at 6:34 AM, <klau...@gmail.com> wrote:
>> On Thursday, 10 March 2016 11:22:22 UTC+1, minux wrote:
>>> 1. some environment, for example, App Engine, forbid using assembly,
>>
>> Could you elaborate - how is that a drawback? That is a choice that makes
>> sense for the App Engine, but it doesn't really affect other platforms.
>
> I think what Minux is trying to say is that, if package foo uses
> assembly (and the author doesn't remember to use e.g. an appengine
> build tag), then a Go App Engine app can't use package foo. That is
> transitive - if I have a Go application that I'd like to run both on
> and off App Engine, and I import package bar, which imports package
> qux, which imports foo, then again, the magic of "go get" is broken.
>
> Sure, the standard library is provided on App Engine, rather than
> brung along with the app code, but it means that, if I can think of a
> new feature, a bug fix, or optimization to make to flate, then I can't
> simply fork the flate package and change an import path.

I'm not at all worried about this. It's easy to add the appengine tags, like Klaus already has in his fork of compress/flate (https://github.com/klauspost/compress/blob/master/flate/crc32_amd64.s for example).

Russ

Giovanni Bajo

Mar 11, 2016, 8:47:41 AM
to golang-dev, klau...@gmail.com
On Thursday, March 10, 2016 at 8:34:49 PM UTC+1, klau...@gmail.com wrote:
> On Thursday, 10 March 2016 11:22:22 UTC+1, minux wrote:
>
>> 2. make otherwise portable Go packages tied to a particular implementation
>> (the incompatibility of gc/gccgo assembly makes things worse.)
>
> gccgo is definitely currently an issue, but is there anything stopping an eager spirit from adding assembler support to gccgo?

One issue might be ABI, for compatibility with gccgo, and for our own benefit of eventually improving the current quite inefficient ABI.

Are ABI improvements in any roadmap of the SSA team?

Giovanni Bajo

Keith Randall

Mar 11, 2016, 12:00:02 PM
to Giovanni Bajo, golang-dev, klau...@gmail.com
We've been thinking about changing the calling convention in various ways.  It's not going to happen any time soon, as we can't really do it until we've permanently disabled the old backend (we don't want to modify the old backend also).  The earliest possible would be 1.8 for amd64 and 1.9 for the other archs.

Changing the calling convention will affect all assembly in existence, so not something to do lightly.  There will be a proposal, design doc, ... at some point.  I don't think we know for sure whether it is even worth doing yet.
 

Giovanni Bajo

Mar 11, 2016, 12:12:41 PM
to Keith Randall, golang-dev, klau...@gmail.com
Well, thinking about it, one could also mark assembly functions with the ABI version somewhere, so that the compiler could generate calls to them using the correct ABI. It’s not necessary to break all existing assembly.

OK anyway, I see it’s not worth discussing in this context.
--
Giovanni Bajo

Russ Cox

Mar 11, 2016, 4:27:55 PM
to Giovanni Bajo, Keith Randall, golang-dev, klau...@gmail.com
Having per-function ABIs significantly complicates indirect function calls. It's basically not worth it. But again, this is not going to happen any time soon.
