__thread ios


jjw...@gmail.com

Mar 9, 2014, 7:34:02 PM
to capn...@googlegroups.com
Hi Kenton,

I've been working on getting Cap'n Proto to build and run on iOS. One stumbling block I came across is that the __thread extension is not supported when building for iOS. To work around this, I've modified the code to use pthread_setspecific and pthread_getspecific when KJ_USE_PTHREAD_TLS is defined.
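For illustration, a pthread-based fallback of this kind might look roughly like the sketch below. It is not the actual patch: the names `PthreadLocal`, `counter`, and `bumpTwice` are hypothetical, and the sketch assumes a POSIX platform. Each thread lazily gets its own heap-allocated copy of the value, which the key's destructor frees when the thread exits.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical wrapper (not the actual patch): every thread lazily gets its
// own heap-allocated T, stored in a slot identified by one shared key.
template <typename T>
class PthreadLocal {
public:
  PthreadLocal() {
    // Register destroy() so a thread's copy is freed when that thread exits.
    int rc = pthread_key_create(&key, &destroy);
    assert(rc == 0);
    (void)rc;
  }
  ~PthreadLocal() { pthread_key_delete(key); }

  PthreadLocal(const PthreadLocal&) = delete;
  PthreadLocal& operator=(const PthreadLocal&) = delete;

  // Returns the calling thread's copy, creating it on first access.
  T& get() {
    void* slot = pthread_getspecific(key);
    if (slot == nullptr) {
      slot = new T();
      int rc = pthread_setspecific(key, slot);
      assert(rc == 0);
      (void)rc;
    }
    return *static_cast<T*>(slot);
  }

private:
  static void destroy(void* slot) { delete static_cast<T*>(slot); }
  pthread_key_t key;
};

// A namespace-scope instance like this one needs a global constructor,
// because the key is created in PthreadLocal's constructor.
static PthreadLocal<int> counter;

// Thread entry point: increments this thread's copy twice and reports it.
static void* bumpTwice(void* result) {
  counter.get() += 1;
  counter.get() += 1;
  *static_cast<int*>(result) = counter.get();
  return nullptr;
}
```

Running `bumpTwice` on a second thread leaves the first thread's value untouched, which is the behaviour `__thread` would otherwise provide.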

With this change, all the kj and capnp unit tests are passing on the simulator, and all but 3 are passing on an actual device: 

- AsyncUnixTest.SignalsMultiListen - Never finishes
- AsyncUnixTest.SignalsMultiReceive - Never finishes
- AsyncUnixTest.SignalsNoWait - Fails, receivedSigurg is false


If you're happy with the changes, I'll send a pull request.

Cheers,
Jason

Kenton Varda

Mar 9, 2014, 9:33:09 PM
to jjw...@gmail.com, capnproto
Hi Jason,

Thanks for working on this!  I had no idea that __thread wasn't supported on iOS...  I figured GCC and Clang supported it everywhere.

The signal tests not working is probably not important since the implementation doesn't actually use any signals currently, though it would be nice to know what's going wrong there.

Regarding your code, it looks well-written, but I think we can actually go with something a little simpler here.  In all cases where thread-locals are used in KJ and Cap'n Proto, it's only to store an unowned pointer.  So, we don't really need to allocate a copy of the value on the heap; we can just pass it directly to pthread_setspecific().

It's also a bit sad that this class will require a global constructor.  Global constructors actually have a surprising effect on startup speeds since they force those pages of the executable to be paged in before main(), just to run the constructors.  Protobufs had a lot of global constructors and it has created real problems for e.g. Chrome.  I wonder if we should use a pthread_once, or maybe rely on GCC/Clang's thread-safe initialization of static local variables here, to make sure pthread_key_create doesn't happen until the first time the thread-local is actually used?
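A minimal sketch of the pthread_once variant, combined with storing the unowned pointer directly in the slot, could look like the following. The names `Context`, `contextKey`, `setCurrentContext`, and `currentContext` are hypothetical and chosen only for illustration. `PTHREAD_ONCE_INIT` is a static initializer, so nothing here runs before main().

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for whatever per-thread object is being tracked.
struct Context { int id; };

static pthread_key_t contextKey;
static pthread_once_t contextKeyOnce = PTHREAD_ONCE_INIT;

// Runs at most once, the first time any thread touches the thread-local.
static void createContextKey() {
  // No destructor is registered: the slot holds an unowned pointer.
  int rc = pthread_key_create(&contextKey, nullptr);
  assert(rc == 0);
  (void)rc;
}

// Stores the pointer itself in the slot; nothing is copied to the heap.
static void setCurrentContext(Context* context) {
  pthread_once(&contextKeyOnce, &createContextKey);
  pthread_setspecific(contextKey, context);
}

// Returns the calling thread's pointer, or nullptr if none was set.
static Context* currentContext() {
  pthread_once(&contextKeyOnce, &createContextKey);
  return static_cast<Context*>(pthread_getspecific(contextKey));
}
```

Because a never-set slot reads back as a null pointer, the getter behaves like a `__thread` pointer initialized to nullptr.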

Thoughts?

-Kenton



Andrew Lutomirski

Mar 9, 2014, 10:32:30 PM
to Kenton Varda, jjw...@gmail.com, capnproto
Worse: a global constructor means that code called from other global
constructors can't safely use whatever gets initialized there.

Jason Choy

Mar 10, 2014, 9:38:50 AM
to Andrew Lutomirski, Kenton Varda, capnproto
I've removed the global constructors and simplified it to support unowned pointers only. Here is the new diff:


Note, I had to change the macros to include the variable name - this is so that they can be initialised to nullptr when using __thread or thread_local storage specifiers, and default constructed when using ThreadLocal<T>.

One thing I'm not too happy with is that you only get one distinct thread local per template instantiation. Currently each of the usages of ThreadLocal is instantiated with a different type, so it works at the moment, but is fragile. I've added a second int template parameter which could be used to distinguish multiple thread locals of the same type - however this relies on the developer choosing unique ids for each declaration. Do you have any thoughts on how to improve this?
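To make the fragility concrete, here is a hedged reconstruction of the design described above. The diff is not shown, so the class layout, the default `id = 0`, and the variable names are assumptions. Because the key is a static member, every variable declared with the same template arguments silently shares one slot.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical reconstruction: the key and once-flag are static members, so
// there is exactly one key per (T, id) instantiation.
template <typename T, int id = 0>
class ThreadLocal {
public:
  void set(T* ptr) {
    pthread_once(&once, &createKey);
    pthread_setspecific(key, ptr);
  }
  T* get() const {
    pthread_once(&once, &createKey);
    return static_cast<T*>(pthread_getspecific(key));
  }

private:
  static void createKey() {
    int rc = pthread_key_create(&key, nullptr);
    assert(rc == 0);
    (void)rc;
  }
  static pthread_key_t key;
  static pthread_once_t once;
};

template <typename T, int id>
pthread_key_t ThreadLocal<T, id>::key;
template <typename T, int id>
pthread_once_t ThreadLocal<T, id>::once = PTHREAD_ONCE_INIT;

static ThreadLocal<int> first;     // ThreadLocal<int, 0>
static ThreadLocal<int> second;    // same instantiation: shares first's key
static ThreadLocal<int, 1> third;  // distinct id: gets its own key
```

Setting `second` overwrites what `first` reads back, while `third` is unaffected. Nothing stops two declarations from picking the same id by accident.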

Jason


Kenton Varda

Mar 10, 2014, 7:45:15 PM
to Jason Choy, Andrew Lutomirski, capnproto
Hi Jason,

I'm going to do a 0.4.1 release today and want to include this.  I have some more tweaks I want to make so I'll just do them myself.  Thanks for the contribution!

-Kenton

Andrew Lutomirski

Mar 10, 2014, 7:57:01 PM
to Kenton Varda, Jason Choy, capnproto
On Mon, Mar 10, 2014 at 4:45 PM, Kenton Varda <temp...@gmail.com> wrote:
> Hi Jason,
>
> I'm going to do a 0.4.1 release today and want to include this. I have some
> more tweaks I want to make so I'll just do them myself. Thanks for the
> contribution!

I don't suppose you want to make a doc tarball for 0.4.1 in addition
to the c++ tarball?

--Andy

Kenton Varda

Mar 10, 2014, 8:03:15 PM
to Andrew Lutomirski, Jason Choy, capnproto
On Mon, Mar 10, 2014 at 4:57 PM, Andrew Lutomirski <an...@luto.us> wrote:
> I don't suppose you want to make a doc tarball for 0.4.1 in addition
> to the c++ tarball?

What would this "doc tarball" contain?  The only docs are online and in comments...

Andrew Lutomirski

Mar 10, 2014, 8:08:37 PM
to Kenton Varda, Jason Choy, capnproto
The same thing that's online, so distros can package it as
capnproto-doc and stick it in /usr/share/doc :)

--Andy

Kenton Varda

Mar 10, 2014, 9:41:12 PM
to Andrew Lutomirski, Jason Choy, capnproto
On Mon, Mar 10, 2014 at 5:08 PM, Andrew Lutomirski <an...@luto.us> wrote:
> The same thing that's online, so distros can package it as
> capnproto-doc and stick it in /usr/share/doc :)

Who actually reads HTML out of /usr/share/doc?  :)

Kenton Varda

Mar 10, 2014, 9:49:59 PM
to Jason Choy, Andrew Lutomirski, capnproto
On Mon, Mar 10, 2014 at 6:38 AM, Jason Choy <jjw...@gmail.com> wrote:
> I've removed the global constructors and simplified it to support unowned pointers only. Here is the new diff:

Looking more into it, I think the use of pthread_once is unnecessary as a static local is required to be initialized in a thread-safe manner under C++11.  Unless the iOS compiler is flagrantly in violation of the standard here, there should be no need to muck with pthread_once.
 
> Note, I had to change the macros to include the variable name - this is so that they can be initialised to nullptr when using __thread or thread_local storage specifiers, and default constructed when using ThreadLocal<T>.

I think another solution is to declare the constructor-from-nullptr as "constexpr", in which case the compiler is smart enough not to emit a global constructor.  I'm going to try this.
 
> One thing I'm not too happy with is that you only get one distinct thread local per template instantiation. Currently each of the usages of ThreadLocal is instantiated with a different type, so it works at the moment, but is fragile. I've added a second int template parameter which could be used to distinguish multiple thread locals of the same type - however this relies on the developer choosing unique ids for each declaration. Do you have any thoughts on how to improve this?

I'm going to try a trick where I declare a one-off type just to use as a template parameter.
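Putting the three ideas together (a function-local static for the key, a constexpr constructor from nullptr, and a one-off tag type per variable), a rough sketch might look like this. It is an illustration under assumptions, not the code that shipped: the tag names, variable names, and `get`/`set` interface are hypothetical.

```cpp
#include <pthread.h>
#include <cassert>

template <typename T, typename Tag>
class ThreadLocalPtr {
public:
  // constexpr constructor: a namespace-scope instance can be
  // constant-initialized, so no global constructor should be required.
  constexpr ThreadLocalPtr(decltype(nullptr)) {}

  void set(T* ptr) { pthread_setspecific(key(), ptr); }
  T* get() const { return static_cast<T*>(pthread_getspecific(key())); }

private:
  static pthread_key_t key() {
    // C++11 requires this initializer to run exactly once, even if several
    // threads reach it concurrently, so no explicit pthread_once is used.
    static pthread_key_t k = create();
    return k;
  }
  static pthread_key_t create() {
    pthread_key_t k;
    int rc = pthread_key_create(&k, nullptr);
    assert(rc == 0);
    (void)rc;
    return k;
  }
};

// One-off tag types: each variable gets its own instantiation, and therefore
// its own key, even though both hold an int*.
namespace {
struct RequestTag {};
struct ResponseTag {};
}  // namespace

static ThreadLocalPtr<int, RequestTag> currentRequest(nullptr);
static ThreadLocalPtr<int, ResponseTag> currentResponse(nullptr);
```

A macro could generate the tag type alongside each declaration, so the developer never has to pick a unique id by hand.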

Jason Choy

Mar 10, 2014, 11:07:19 PM
to Kenton Varda, Andrew Lutomirski, capnproto
On 11 March 2014 01:49, Kenton Varda <temp...@gmail.com> wrote:
> On Mon, Mar 10, 2014 at 6:38 AM, Jason Choy <jjw...@gmail.com> wrote:
>> I've removed the global constructors and simplified it to support unowned pointers only. Here is the new diff:
>
> Looking more into it, I think the use of pthread_once is unnecessary as a static local is required to be initialized in a thread-safe manner under C++11.  Unless the iOS compiler is flagrantly in violation of the standard here, there should be no need to muck with pthread_once.

Won't you still need it to ensure that pthread_key_create is called only once?
 
 
>> Note, I had to change the macros to include the variable name - this is so that they can be initialised to nullptr when using __thread or thread_local storage specifiers, and default constructed when using ThreadLocal<T>.
>
> I think another solution is to declare the constructor-from-nullptr as "constexpr", in which case the compiler is smart enough not to emit a global constructor.  I'm going to try this.
>
>> One thing I'm not too happy with is that you only get one distinct thread local per template instantiation. Currently each of the usages of ThreadLocal is instantiated with a different type, so it works at the moment, but is fragile. I've added a second int template parameter which could be used to distinguish multiple thread locals of the same type - however this relies on the developer choosing unique ids for each declaration. Do you have any thoughts on how to improve this?
>
> I'm going to try a trick where I declare a one-off type just to use as a template parameter.

Sounds good.

Andrew Lutomirski

Mar 11, 2014, 12:05:49 AM
to Kenton Varda, Jason Choy, capnproto
I do, on occasion :)

See:

https://fedoraproject.org/wiki/Packaging:Guidelines#Documentation

The current situation is a bit odd, since the docs are in git but not
in the release tarball.

Kenton Varda

Mar 11, 2014, 5:52:52 PM
to Jason Choy, capnproto
On Mon, Mar 10, 2014 at 8:17 PM, Jason Choy <jjw...@gmail.com> wrote:

> Out of curiosity, what technique did you use to verify that the compiler didn't need to generate a global constructor?

Clang has -Wglobal-constructors.  :)  Although it occurs to me it's not covered in my test script; it's an option I currently only turn on when building with Ekam (which is how I do all my development work).  I'll have to make a note to improve that in 0.5.  For now I manually verified that if I remove the "constexpr" qualifier from the ThreadLocalPtr constructor, Clang starts complaining about global constructors.
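As a small illustration of that check, the sketch below declares two otherwise identical types, one with a constexpr constructor and one without. The type and variable names are hypothetical, as is the file name in the comment. Compiling with Clang and -Wglobal-constructors is expected to flag only the second variable.

```cpp
#include <cassert>

// Try, for example (file name is arbitrary):
//   clang++ -std=c++11 -Wglobal-constructors -c example.cpp

// constexpr constructor: a namespace-scope instance is constant-initialized.
struct WithConstexpr {
  constexpr WithConstexpr(decltype(nullptr)) : ptr(nullptr) {}
  void* ptr;
};

// Same layout, but the constructor is not constexpr, so the initialization
// is not a constant expression as far as the compiler's diagnostics go.
struct WithoutConstexpr {
  WithoutConstexpr(decltype(nullptr)) : ptr(nullptr) {}
  void* ptr;
};

static WithConstexpr quiet(nullptr);     // no warning expected
static WithoutConstexpr noisy(nullptr);  // expected: -Wglobal-constructors warning
```

Both variables end up holding a null pointer at runtime; the difference is only in whether startup code is needed to get there.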

I've put together a release candidate for 0.4.1.  Can you please verify that this works with iOS?  I'm not set up for testing against it.

Also, do you mind if I ask you to test future release candidates, when they happen?

-Kenton

Jason Choy

Mar 11, 2014, 6:40:47 PM
to Kenton Varda, capnproto
> I've put together a release candidate for 0.4.1.  Can you please verify that this works with iOS?  I'm not set up for testing against it.
 
I've tested commit 8caf313b on iOS 7 and everything works bar the three aforementioned async-unix tests.

> Also, do you mind if I ask you to test future release candidates, when they happen?

Sure, no problem.