On RDTSC (reading the timestamp counter)

Mark Seaborn

Jan 10, 2010, 1:19:29 PM

to native-cli...@googlegroups.com

Native Client currently allows the RDTSC x86 instruction to be used [1], but (as Brad mentioned to me) this instruction gets disabled by Linux's seccomp sandboxing mode [2], which could cause us problems if we want NaCl to be able to run under Chromium's seccomp-based Linux sandbox. How do we want to deal with this?

Is there any equivalent of RDTSC on ARM? If not, we might want to disallow RDTSC on the grounds that it is not portable. How would a program use RDTSC if it were compiled to a portable binary?

If we're going to allow access to the functionality of RDTSC it might be good to put it behind a NaCl syscall, similar to how we have talked about handling fetching the thread pointer on x86-64 for supporting TLS. In the case where the OS-level process can execute RDTSC, its use by NaCl does not have to incur the usual NaCl syscall overhead: we can place "rdtsc; naclret" directly in a syscall trampoline. The advantage of doing this is that it gives us the option to virtualise use of RDTSC if we need to. Otherwise, if we allow RDTSC to be used directly from the start, we might get into a situation where NaCl code depends on it and we cannot remove it without breaking compatibility.

It is possible to patch RDTSC in a binary to jump to some other implementation -- the seccomp sandbox rewrites syscall instructions to do this [3] -- but since RDTSC is only 2 bytes long, it involves moving the following instruction out of the way, which is tricky. (Maybe we could allow RDTSC but require it to be followed by 3 NOPs within the same instruction bundle so that it can be overwritten with a 5-byte jump?)
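As a rough illustration of the "RDTSC plus 3 NOPs" idea, the rewrite step could look something like the sketch below. It assumes 32-bit x86 addresses, single-byte NOPs (0x90), and a 32-byte instruction bundle; the function name patch_rdtsc and its parameters are hypothetical and not part of NaCl or the seccomp sandbox.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
  PATCH_LEN = 5,    /* rdtsc (2 bytes) + 3 one-byte NOPs == jmp rel32 */
  BUNDLE_SIZE = 32  /* assumed x86 bundle size */
};

/* Overwrite "rdtsc; nop; nop; nop" at code[offset] with "jmp rel32".
 * site_addr is the run-time address of the patch site and target_addr
 * is the replacement routine (both hypothetical inputs).
 * Returns 0 on success, -1 if the site is unsuitable. */
static int patch_rdtsc(uint8_t *code, size_t len, size_t offset,
                       uint32_t site_addr, uint32_t target_addr) {
  static const uint8_t pattern[PATCH_LEN] = {0x0f, 0x31, 0x90, 0x90, 0x90};
  uint32_t rel;

  assert(code != NULL);
  /* The five bytes must lie inside the buffer. */
  if (offset > len || len - offset < PATCH_LEN) return -1;
  /* Only patch the exact pattern the validator would have required. */
  if (memcmp(code + offset, pattern, PATCH_LEN) != 0) return -1;
  /* The five bytes must not straddle a bundle boundary. */
  if (site_addr % BUNDLE_SIZE + PATCH_LEN > BUNDLE_SIZE) return -1;

  /* rel32 is measured from the end of the jump instruction. */
  rel = target_addr - (site_addr + PATCH_LEN);
  code[offset] = 0xe9; /* jmp rel32 opcode */
  code[offset + 1] = (uint8_t)(rel & 0xff);
  code[offset + 2] = (uint8_t)((rel >> 8) & 0xff);
  code[offset + 3] = (uint8_t)((rel >> 16) & 0xff);
  code[offset + 4] = (uint8_t)((rel >> 24) & 0xff);
  return 0;
}
```

The sketch only covers the byte rewrite. How the replacement routine returns to the instruction after the patch site is a separate problem that it does not address.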

The seccomp sandbox currently catches the SIGSEGV signal that RDTSC produces and emulates it by forwarding the request to the trusted thread. There's a bug open for making the sandbox rewrite the RDTSC instruction instead, the same way it rewrites syscall instructions [4].

Mark

[1] http://code.google.com/p/nativeclient/issues/detail?id=84
[2] Actually, whether it gets disabled varies between kernel versions.
See http://blog.cr0.org/2009/05/time-stamp-counter-disabling-oddities.html
[3] http://code.google.com/p/seccompsandbox/source/browse/trunk/library.cc
[4] http://code.google.com/p/chromium/issues/detail?id=26524

Victor Khimenko

Jan 11, 2010, 3:23:10 AM

to native-cli...@googlegroups.com

I think the best approach is to provide a syscall, and to change the call to this syscall into rdtsc where that is allowed. Even if we replace rdtsc with a call (we'd need the 3 NOPs as discussed), it's still risky because naclret will not work and we'd need some more complex checks instead (for example, we could create 32 pseudo-syscalls and have the verifier/rewriter use one of them as needed). That would add significant complexity to the TCB, which is the last thing we need.


Ian Lewis

Jan 11, 2010, 11:35:32 AM

to native-cli...@googlegroups.com

+1 for a syscall.

RDTSC isn't trivially portable even across x86 platforms. Perhaps the situation has improved in the last couple of years, but when I was with DirectX we dealt with multiple RDTSC issues: out-of-order execution, multicore issues, and power management can all cause RDTSC to return misleading results. See the MSDN article on RDTSC by my buddy Chuck Walbourn.

Of course, hiding RDTSC shifts the burden to us--instead of exposing the instruction and letting folks shoot themselves in the foot, we take on the task of providing something portable. More work for us. But the portability benefits are likely to be significant.

Ian

Brad Chen

Jan 11, 2010, 11:35:47 AM

to native-cli...@googlegroups.com

What I had imagined, without thinking through all the details or documenting this, is that we would allow RDTSC on the x86 for now, and eventually replace it with a compiler intrinsic that would encourage people to write portable code.

RDTSC is definitely useful for things like audio that have hard real-time constraints. Supporting the instruction directly in the short term saves some people the trouble of rewriting code using RDTSC, although it's clearly a portability issue.

For seccomp, I was imagining that wouldn't work, and we'd have to use the alternative Chrome setuid sandbox. Neha or Bennet can run down the host of considerations for the various Linux sandboxing options, but I think the general sense is that the setuid sandbox is good enough.

Brad

Brad Chen

Jan 11, 2010, 11:57:18 AM

to native-cli...@googlegroups.com

How can we make it as easy as possible to port code that currently uses RDTSC to NaCl?

Brad

Mark Seaborn

Jan 11, 2010, 12:00:27 PM

to native-cli...@googlegroups.com

On Mon, Jan 11, 2010 at 4:57 PM, Brad Chen <brad...@google.com> wrote:

> How can we make it as easy as possible to port code that currently uses RDTSC to NaCl?

It would just be a matter of adding an #ifdef, e.g.:

#ifdef __nativeclient__
# include <nacl/gettsc.h>
# define get_tsc nacl_get_tsc
#else
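static long long get_tsc(void) {
  unsigned int lo, hi;
  /* RDTSC writes the low 32 bits to EAX and the high 32 bits to EDX,
     so both registers are declared as outputs. */
  __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
  return (long long)(((unsigned long long)hi << 32) | lo);
}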
#endif

Mark

Ian Lewis

Jan 11, 2010, 12:36:18 PM

to native-cli...@googlegroups.com

That sounds pretty similar to what Brad and I discussed this morning: make reading a timestamp into a NaCl syscall, but implement the trampoline such that, if possible, it will issue RDTSC and return without actually transitioning to trusted code. I think that's the right way to go.

The main modification I'd suggest to the code below is that even if RDTSC is available, we should massage the result. I have a feeling that many uses of RDTSC are not particularly robust in the face of multiprocessor architectures and/or frequency scaling, so exposing it directly seems like a good way to perpetuate bad behavior.

Ian
