go implemented shared libraries, thread local storage, and callbacks from non-go-created threads


Elias Naur

Feb 19, 2013, 4:13:59 PM
to golan...@googlegroups.com
Hi,

I'd like to move forward in implementing enough support for Go-implemented shared libraries to make direct use of Go on Android possible. Go now has enough support so that with these two CLs,

building Go shared libraries works in the basic cases. Of the two, the runtime changes are by far the most invasive.

However, two immediate problems are initialization and thread local storage. After seeing Russ's CL


two questions come to mind:

1. Can the non-Go thread callback support be extended to support Go initialization? After a shared library is initialized, it is expected to return control to the dynamic loader. This seems similar to the use case addressed by Russ's CL. If so, the invasive runtime CL would no longer be needed.

2. In ELF environments, thread-local storage for a shared library is not located at a known offset from the %fs/%gs registers, so the library must fall back to a slower method to locate its storage, as described here:

http://www.akkadia.org/drepper/tls.pdf (local dynamic tls model vs. local exec tls model)

In particular, a call to a runtime-supplied, architecture-dependent __tls_get_addr is needed to locate the TLS. However, this comment snippet from Russ's CL makes me worry that simply replacing direct TLS access with __tls_get_addr will make any Go program loaded as a shared library noticeably slower, since TLS is needed for every stack check. Any thoughts on how this performance hit can be avoided?



With this approach we cannot in
general assume a fixed location for m and g relative to %fs (or %gs,
depending on the operating system), which would make every
stack check more expensive
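
On question 1, the constraint that library initialization has to hand control back to the dynamic loader can be illustrated with a small C sketch. It assumes GCC or Clang, which provide the `constructor` function attribute; `runtime_ready`, `lib_init`, and `lib_is_ready` are hypothetical names used only for illustration and are not taken from either CL.

```c
#include <assert.h>

/* Hypothetical flag standing in for "the Go runtime has been set up". */
static int runtime_ready = 0;

/*
 * GCC and Clang run functions marked with the constructor attribute when
 * the object is loaded: before main() in an executable, or during dlopen()
 * for a shared library. The loader is waiting on this call, so it has to
 * return rather than enter a scheduler loop that never exits.
 */
__attribute__((constructor))
static void lib_init(void)
{
    assert(!runtime_ready);   /* the constructor runs once per load */
    runtime_ready = 1;        /* placeholder for real initialization work */
}

/* Exported entry point that callers use once loading has finished. */
int lib_is_ready(void)
{
    return runtime_ready;
}
```

Any work that needs a long-running thread of its own would have to be started from such a hook and left running in the background, since the hook itself must return.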
 

 - elias

Ian Lance Taylor

Feb 19, 2013, 8:47:07 PM
to Elias Naur, golan...@googlegroups.com
On Tue, Feb 19, 2013 at 1:13 PM, Elias Naur <elias...@gmail.com> wrote:
>
> 2. In ELF environments, thread-local storage for a shared library is not
> located at a known offset from the %fs/%gs registers, so the library must
> fall back to a slower method to locate its storage, as described here:
>
> http://www.akkadia.org/drepper/tls.pdf (local dynamic tls model vs. local
> exec tls model)
>
> In particular, a call to a runtime-supplied, architecture-dependent
> __tls_get_addr is needed to locate the TLS. However, this comment snippet
> from Russ's CL makes me worry that simply replacing direct TLS access with
> __tls_get_addr will make any Go program loaded as a shared library
> noticeably slower, since TLS is needed for every stack check. Any thoughts
> on how this performance hit can be avoided?

The performance hit is not too bad, as __tls_get_addr is fairly
efficient in the normal case--some 17 instructions on amd64 the second
and subsequent times the value is loaded. Still, 17 instructions is
obviously worse than 1.
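
To make the difference concrete, here is a minimal C sketch of the two access models, assuming GCC or Clang on an ELF target such as Linux/amd64. The `tls_model` variable attribute is a real compiler feature; `struct g`, `g_exec`, `g_dyn`, and the accessor functions are hypothetical stand-ins and not the Go runtime's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's per-thread g structure. */
struct g {
    int id;
};

/*
 * local-exec: the variable sits at an offset from the thread pointer
 * (%fs on Linux/amd64) that is fixed at link time, so a load is a single
 * %fs-relative instruction. Only valid in the main executable.
 */
static __thread struct g *g_exec __attribute__((tls_model("local-exec")));

/*
 * local-dynamic: the offset is not known at link time. In position-
 * independent code the compiler emits a call to __tls_get_addr to find
 * the module's TLS block and then adds the variable's offset.
 */
static __thread struct g *g_dyn __attribute__((tls_model("local-dynamic")));

/* Store the current thread's g pointer through both models. */
void set_g(struct g *p)
{
    assert(p != NULL);
    g_exec = p;
    g_dyn = p;
}

/* Fast path: direct thread-pointer-relative load. */
struct g *get_g_exec(void)
{
    return g_exec;
}

/* Slow path: may go through __tls_get_addr. */
struct g *get_g_dyn(void)
{
    return g_dyn;
}
```

When this is linked into an ordinary executable, the linker is allowed to relax the dynamic sequence into the faster form, so both getters may end up looking alike. Compiling to assembly only, for example with `gcc -O2 -fPIC -S`, should show the `__tls_get_addr` call in `get_g_dyn` next to the direct load in `get_g_exec`.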

For gccgo I actually did not use a TLS variable. Instead, I stole a
slot in the TCB, a slot that was allocated for transactional memory
support but was not used. gc for shared libraries could use that
slot, at the cost of not being able to interoperate with code compiled
by gccgo.

I don't have a really good solution.

Ian