NaCl read (load) sandboxing on x86-64 (Linux) platform

Tiancheng Yi

Jul 21, 2015, 12:47:17 PM
to native-cli...@googlegroups.com
Hi,

I am trying to understand the read sandboxing of NaCl x86-64 and I have a set of questions on it.

1. In this research paper, the authors state that "For data references, stores are sandboxed on both systems. Note that reads of secret data are generally not an issue as the address space barrier between the NaCl module and the browser protects browser resources such as cookies." However, it seems that early versions of NaCl did not have read sandboxing; it was added in more recent versions, as mentioned in this discussion. Are the "address space barrier" in the paper and the "read sandboxing" added in later versions of NaCl the same thing or not?

2. I wonder whether my understanding of read sandboxing in NaCl is correct:
1) an address is loaded into a general-purpose register
2) the most significant 32 bits are cleared
3) the value is added to the base of the sandbox's address space to get the true address
4) the read is performed from that address

I dumped the assembly code of a test program and found that it follows the idea above. However, when I modified the file SRC/gcc/gcc/config/i386.md so that the compiler no longer adds the 2nd and 3rd operations to the various "mov" instructions, reads were still sandboxed when I built an example program with the modified toolchain. So I wonder whether my understanding is correct.

Thanks in advance.
Tiancheng

Victor Khimenko

Jul 21, 2015, 2:10:31 PM
to Native Client Discuss
On Tue, Jul 21, 2015 at 7:47 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
> 1. In this research paper, the authors state that "For data references, stores are sandboxed on both systems. Note that reads of secret data are generally not an issue as the address space barrier between the NaCl module and the browser protects browser resources such as cookies." However, it seems that early versions of NaCl did not have read sandboxing; it was added in more recent versions, as mentioned in this discussion. Are the "address space barrier" in the paper and the "read sandboxing" added in later versions of NaCl the same thing or not?

No, these are different things. Chrome spawns a separate process to run a .nexe or .pexe. Data mappings differ between processes, which means that secret data such as cookies or passwords is not visible in the NaCl process, even to TCB code (T == trusted). This works fine on Linux because there each process only sees the data it put there itself (newer versions of Linux map the VDSO into each and every process, but it does not really contain sensitive data). On Windows it does not work as well, because many sensitive data structures are visible at the same addresses in all processes. The worst offenders are not even parts of Windows itself, but various application control tools, AV software, etc.
 
> 2. I wonder whether my understanding of read sandboxing in NaCl is correct:
> 1) an address is loaded into a general-purpose register
> 2) the most significant 32 bits are cleared
> 3) the value is added to the base of the sandbox's address space to get the true address
> 4) the read is performed from that address

Yes, the current version of NaCl requires the same sandboxing for both reads and writes.
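
The four steps quoted above can be modelled with plain integer arithmetic. This is only a sketch under stated assumptions: `SANDBOX_BASE` is a made-up, 4GiB-aligned value standing in for whatever %r15 holds, and `sandboxed_address` is a hypothetical helper name, not anything from the NaCl sources.

```python
MASK32 = 0xFFFFFFFF
FOUR_GIB = 1 << 32

# Assumption: an arbitrary 4GiB-aligned base; in NaCl x86-64 the real base is in %r15.
SANDBOX_BASE = 0x00007F2E00000000


def sandboxed_address(reg: int, base: int = SANDBOX_BASE) -> int:
    """Model steps 2-4: clear the upper 32 bits, then add the sandbox base."""
    offset = reg & MASK32   # step 2: what "mov %eax,%eax" does to %rax
    return base + offset    # step 3: what the (%r15,%rax) operand computes


if __name__ == "__main__":
    # Whatever the upper 32 bits of the register held, the result stays
    # inside [SANDBOX_BASE, SANDBOX_BASE + 4GiB).
    for reg in (0x10, 0xDEADBEEF12345678, 0xFFFFFFFFFFFFFFFF):
        addr = sandboxed_address(reg)
        print(hex(reg), "->", hex(addr), SANDBOX_BASE <= addr < SANDBOX_BASE + FOUR_GIB)
```

The point of the model is that the masking step makes the result independent of the upper half of the register, which is why it changes nothing for a program whose pointers are already 32 bits wide.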
 
> I dumped the assembly code of a test program and found that it follows the idea above. However, when I modified the file SRC/gcc/gcc/config/i386.md so that the compiler no longer adds the 2nd and 3rd operations to the various "mov" instructions, reads were still sandboxed when I built an example program with the modified toolchain. So I wonder whether my understanding is correct.

What you are observing is NOT sandboxing. Sandboxing is currently done with the "%nacl" pseudo-prefix, emitted by print_operand_address_parts in i386.c. If you look at the GCC output you'll see that the sandboxing is very often redundant: the register is cleared once with some kind of "mov", and then an explicit "%nacl" (later translated into "mov %reg,%reg") is added by print_operand_address_parts. It was done that way because, before .bundle_lock/.bundle_unlock were introduced, it was not possible to guarantee that two instructions would end up in the same bundle. Now we could [try to] remove that duplication, but it does not really affect speed much, and NaCl's GCC port is mostly abandoned...

Tiancheng Yi

Jul 23, 2015, 6:14:38 AM
to Native-Client-Discuss, kh...@chromium.org
Many thanks for the reply.

Is it possible to get an earlier version of the source code that contains only write sandboxing? It seems that I cannot simply remove read sandboxing by removing the "%nacl" prefix, because of the duplication you mentioned. Besides, write sandboxing would then be broken as well.

Cheers,
Tiancheng

Victor Khimenko

Jul 23, 2015, 10:36:36 AM
to Tiancheng Yi, Native-Client-Discuss
On Thu, Jul 23, 2015 at 1:14 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
> Many thanks for the reply.
>
> Is it possible to get an earlier version of the source code that contains only write sandboxing?

Such a version never existed. Well... a production-quality version capable of compiling at least the SPEC benchmarks never existed.

> It seems that I cannot simply remove read sandboxing by removing the "%nacl" prefix, because of the duplication you mentioned.

The most you can do is remove the %nacl prefix from instructions that only read memory and do not write to it. This will leave some instructions with "read sandboxing", but you'll see that they have it in the pure x32 case, too! That's not "sandboxing", that's just how the code works.

And we never had a production-quality compiler with read sandboxing disabled. We had a preliminary version that did that, but it was really buggy (half of the SPEC tests crashed), and closer to publication the decision was made to have read sandboxing in production, which meant that we only needed to remove read sandboxing for the paper. We did that with a sed script. One test was initially broken by our script (I don't remember which one; crafty? The name of a global variable matched the "nacl" regex and was removed along with the sandboxing), but other than that everything worked well enough to publish the paper.

> Besides, write sandboxing would then be broken as well.

They are [relatively] easy to distinguish in assembler: write sandboxing has a comma before the %nacl prefix (as in "mov %rax,%nacl:(%r15,%rbx)"), while read sandboxing does not (as in "mov %nacl:(%r15,%rax),%rbx"). That's enough to fix up the assembler code and get SPEC CPU results. It is not enough for a production compiler, but as I've explained, by that time we had decided to enable read sandboxing in production unconditionally.
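
As an illustration of the comma heuristic just described, here is a small sketch in Python. It is not the sed script mentioned above (that script is not shown in this thread); `strip_read_sandboxing` and both regular expressions are hypothetical. It matches the literal `%nacl:` prefix rather than any occurrence of "nacl", so identifiers that merely contain that word are left alone.

```python
import re

# "%nacl:" right after a comma: the memory operand is the destination (a store).
STORE_RE = re.compile(r",\s*%nacl:")
# The pseudo-prefix itself, anchored on "%" and ":" so symbol names are not touched.
PREFIX_RE = re.compile(r"%nacl:")


def strip_read_sandboxing(line: str) -> str:
    """Drop the %nacl prefix from lines that look like loads; keep stores as-is."""
    if STORE_RE.search(line):
        return line
    return PREFIX_RE.sub("", line)


if __name__ == "__main__":
    for asm in (
        "mov %nacl:(%r15,%rax),%rbx",   # load: prefix removed
        "mov %rax,%nacl:(%r15,%rbx)",   # store: left sandboxed
        "mov nacl_table(%rip),%eax",    # symbol containing "nacl": untouched
    ):
        print(strip_read_sandboxing(asm))
```

The heuristic is only as good as the description it is based on: a single-operand instruction that modifies memory has no comma before its operand and would be treated as a load here, which is one reason this kind of text rewriting is fine for a benchmark run and not for a production compiler.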

Derek Schuff

Jul 23, 2015, 1:09:09 PM
to native-cli...@googlegroups.com, Tiancheng Yi
Read sandboxing would probably be pretty straightforward to disable in PNaCl/nacl-clang's source code. If you're up for doing your own toolchain build I can point you to the source for that.

Tiancheng Yi

Jul 24, 2015, 7:37:28 AM
to Native-Client-Discuss, kh...@chromium.org
Many thanks for the great reply.

After I removed the prefix, I wrote a simple test and compiled and ran it with the customized toolchain. The program basically sends the address of a string in the host thread (trusted, which creates a NaClApp) to the sandbox thread. That number is printed correctly on the sandbox side, but I still cannot read the string from the sandbox side.

Say the string is stored at 00007F2ED1CFF010. After I cast it to a char pointer, the value becomes D1CFF010 (forced to 32 bits by the original toolchain). After I removed the prefix, I got 00000000D1CFF010, which is 64 bits wide but has lost the most significant 32 bits.

So I still cannot read outside the sandbox.

A piece of dumped assembly:

   20280:     48 8b 9c 24 90 00 00       mov    0x90(%rsp),%rbx
   20287:     00
   20288:     bf 1d 02 02 10             mov    $0x1002021d,%edi
   2028d:     31 c0                      xor    %eax,%eax
   2028f:     48 63 f3                   movslq %ebx,%rsi
   20292:     66 0f 1f 84 00 00 00       nopw   0x0(%rax,%rax,1)
   20299:     00 00
   2029b:     e8 a0 24 00 00             callq  22740 <printf>

---

   20280:     8b 9c 24 90 00 00 00       mov    0x90(%rsp),%ebx
   20287:     bf 1d 02 02 10             mov    $0x1002021d,%edi
   2028c:     31 c0                      xor    %eax,%eax
   2028e:     89 de                      mov    %ebx,%esi
   20290:     66 66 2e 0f 1f 84 00       data32 nopw %cs:0x0(%rax,%rax,1)
   20297:     00 00 00 00
   2029b:     e8 a0 24 00 00             callq  22740 <printf>



where the top part is the modified build and the bottom part is the original. I guess the problem is at the red line, because it seems that only the number in ebx (not rbx) is passed through. There is also a problem with the "mov" instruction, because the modified version uses a "mov" that copies 32 bits into 64 bits (movslq). Is there anything other than "read sandboxing" that prevents reads outside the sandbox?

Cheers,
Tiancheng

Tiancheng Yi

Jul 24, 2015, 7:41:18 AM
to Native-Client-Discuss, dsc...@google.com
Many thanks for the reply.

Yes, I am trying to remove the read sandboxing. I would be very grateful if you could offer some help with modifying the source code. Currently I am working with the GCC-based NaCl toolchain, but I think PNaCl would also be good for me to understand in depth.

Thanks in advance,
Tiancheng

Victor Khimenko

Jul 24, 2015, 9:02:30 AM
to Tiancheng Yi, Native-Client-Discuss
On Fri, Jul 24, 2015 at 2:37 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
> where the top part is the modified build and the bottom part is the original. I guess the problem is at the red line, because it seems that only the number in ebx (not rbx) is passed through. There is also a problem with the "mov" instruction, because the modified version uses a "mov" that copies 32 bits into 64 bits (movslq). Is there anything other than "read sandboxing" that prevents reads outside the sandbox?

No, of course not. I think you've misunderstood the whole thing. There are two points of view on compiled NaCl code: the compiler's model and NaCl's.

The compiler's POV is simple: NaCl uses the x32 ILP32 memory model ( https://ru.wikipedia.org/wiki/X32_ABI ) and is thus limited to a 4GiB address space range. There is no need for read sandboxing or write sandboxing: a correctly written program, in the absence of bugs in GCC and/or clang, will be limited to a 4GiB range: 0..4GiB for traditional x32 mode, %R15..%R15+4GiB for NaCl mode.

But we are talking about security. From a security POV the words "correctly written program" are not convincing. What happens if someone does something crazy? Say, takes a random number, converts it into a pointer to a function and calls it? Then (depending on whether you are lucky or unlucky) you'll be able to reach out of the sandbox.

To prevent these bugs (intentional or unintentional) from being exploitable, NaCl adds "write sandboxing" and/or "read sandboxing". Both DO NOTHING in a correctly written program.

If you want to reach beyond the 4GiB you'll need to use the LP64 model, which is something VERY different from what NaCl uses. You could do that (LONG_TYPE_SIZE and POINTER_SIZE are defined in nacl.h), but I'm not sure what you are planning to do after that. It will break everything: it will break the security model, it will expand the sizes of data structures (so the TCB will not be usable without changes), and it will mean that the IRT, NEWLIB and GLIBC need to be changed...

That is a huge amount of work with a totally unclear end goal. What are you planning to produce in the end? If you want to run a binary and don't care about sandboxing and security, there is already an easy way to do that: just use a regular compiler and dlopen!
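
The effect of the 32-bit pointer model on the address from the test above can be checked with a few lines of arithmetic. This is a sketch only: the helper names are hypothetical and both sandbox bases are invented, 4GiB-aligned values used purely for illustration.

```python
FOUR_GIB = 1 << 32


def as_ilp32_pointer(address: int) -> int:
    """What is left of a 64-bit address once it is stored in a 32-bit pointer."""
    return address & (FOUR_GIB - 1)


def reachable_from_sandbox(address: int, sandbox_base: int) -> bool:
    """True if the address lies in the window sandbox_base .. sandbox_base + 4GiB."""
    return sandbox_base <= address < sandbox_base + FOUR_GIB


if __name__ == "__main__":
    host_string = 0x00007F2ED1CFF010          # address quoted in the thread
    print(hex(as_ilp32_pointer(host_string)))  # the upper 32 bits are gone

    # Assumption: two example bases, one far from the string and one below it.
    for base in (0x00007F0000000000, 0x00007F2E00000000):
        print(hex(base), reachable_from_sandbox(host_string, base))
```

With a 32-bit pointer type the upper half of the host address is lost at the cast, before any load is issued, so removing the masking instruction alone cannot bring it back. The string is only nameable from untrusted C code if it happens to lie inside the 4GiB window that starts at the sandbox base.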

Tiancheng Yi

Jul 24, 2015, 9:45:04 AM
to Native-Client-Discuss, kh...@chromium.org


> But we are talking about security. From a security POV the words "correctly written program" are not convincing. What happens if someone does something crazy? Say, takes a random number, converts it into a pointer to a function and calls it? Then (depending on whether you are lucky or unlucky) you'll be able to reach out of the sandbox.

Sorry, I may not have explained clearly what I am trying to do. It is not me who wants to do the "crazy" things. My concern is this: if someone does something crazy inside the NaCl sandbox, can he or she read outside the sandbox (my credit card number, say)? Hence my earlier naive test: if I am the crazy person and I have the address of the sensitive data, can I read it from inside the sandbox?

> To prevent these bugs (intentional or unintentional) from being exploitable, NaCl adds "write sandboxing" and/or "read sandboxing". Both DO NOTHING in a correctly written program.

If I cannot read outside the sandbox even without read sandboxing, I would like to have a version with it removed. Previously you mentioned that read sandboxing is mainly aimed at Windows (have I misunderstood?), so I would like to test it on Linux.

Derek Schuff

Jul 24, 2015, 12:51:01 PM
to Tiancheng Yi, Native-Client-Discuss
In PNaCl's LLVM sources in lib/Target/X86/X86NaClRewritePass.cpp on line 522 in X86NaClRewritePass::ApplyMemorySFI there's a check:

if (!IsLoad(MI) && !IsStore(MI))

which decides whether an instruction should have its memory operands sandboxed. It should be sufficient to remove the IsLoad check, although I haven't tried it.
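
To make the effect of that change concrete, here is a truth-table model in Python. It rests on an assumption not stated in the message: that the quoted condition guards an early exit for instructions that are neither loads nor stores. The function names are hypothetical and the real code is C++ inside LLVM; this only models the boolean logic.

```python
def needs_sandbox_original(is_load: bool, is_store: bool) -> bool:
    # Assumed reading of `if (!IsLoad(MI) && !IsStore(MI))` as "skip this instruction".
    skip = (not is_load) and (not is_store)
    return not skip


def needs_sandbox_stores_only(is_load: bool, is_store: bool) -> bool:
    # Same check with the IsLoad term dropped, i.e. `if (!IsStore(MI))`.
    skip = not is_store
    return not skip


if __name__ == "__main__":
    print("load  store  original  stores-only")
    for is_load in (False, True):
        for is_store in (False, True):
            print(is_load, is_store,
                  needs_sandbox_original(is_load, is_store),
                  needs_sandbox_stores_only(is_load, is_store))
```

Under that assumption the only row that changes is the pure load: it is sandboxed before the change and skipped after it, while anything that stores is still handled.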

For getting the sources and building the PNaCl toolchain, refer to https://www.chromium.org/nativeclient/pnacl/developing-pnacl (in particular the "Toolchain Development" section)

In answer to your other question: if you remove the read sandboxing inserted by the compiler (or, more properly, if you skip validation or change the validator so that it accepts such programs), then the untrusted code will be able to read outside the sandbox. This means it can read anything in the NaCl process, including the trusted code's data and whatever the OS happens to inject into the process. Maybe that's acceptable to you, or maybe not, depending on what you're trying to do.

Victor Khimenko

Jul 24, 2015, 2:15:56 PM
to Tiancheng Yi, Native-Client-Discuss
On Fri, Jul 24, 2015 at 4:45 PM, Tiancheng Yi <tianche...@gmail.com> wrote:


>> But we are talking about security. From a security POV the words "correctly written program" are not convincing. What happens if someone does something crazy? Say, takes a random number, converts it into a pointer to a function and calls it? Then (depending on whether you are lucky or unlucky) you'll be able to reach out of the sandbox.
>
> Sorry, I may not have explained clearly what I am trying to do. It is not me who wants to do the "crazy" things. My concern is this: if someone does something crazy inside the NaCl sandbox, can he or she read outside the sandbox (my credit card number, say)? Hence my earlier naive test: if I am the crazy person and I have the address of the sensitive data, can I read it from inside the sandbox?

You could (if you remove the validation checks), but not if you use ANSI C. Go below that (e.g. add a couple of lines of inline assembler) and everything should work like a charm.

Note that there is currently no way to enforce a "write-sandboxing only" model: if you enable validation, your program without read sandboxing will not validate; if you disable it, you can easily go beyond the sandbox boundaries.

>> To prevent these bugs (intentional or unintentional) from being exploitable, NaCl adds "write sandboxing" and/or "read sandboxing". Both DO NOTHING in a correctly written program.
>
> If I cannot read outside the sandbox even without read sandboxing, I would like to have a version with it removed.

You can, just not with a proper ANSI C program. Go beyond ANSI C and you can easily do that. That's why read sandboxing is there and that's why it's enforced.

> Previously you mentioned that read sandboxing is mainly aimed at Windows (have I misunderstood?),

The NEED for read sandboxing is dictated by Windows, yes. You won't find many interesting things in the memory of the sel_ldr process on Linux: just some internal sel_ldr structures, no credit card numbers, no cookies, etc. On Windows you could find a lot of data besides that. PROBABLY no card numbers, but a lot of internal system-wide structures.

Tiancheng Yi

Jul 30, 2015, 9:04:33 AM
to Native-Client-Discuss, kh...@chromium.org
Now that I have turned off the masking by removing the %nacl prefix, I wonder whether I can also disable the "base+offset" addressing model (I presume these are the two things that add the overhead). I am curious about the performance gain without read sandboxing on Linux. According to your previous explanation, would this also break NaCl's ILP32 model?

Victor Khimenko

Jul 30, 2015, 11:24:12 AM
to Tiancheng Yi, Native-Client-Discuss
On Thu, Jul 30, 2015 at 4:04 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
> Now that I have turned off the masking by removing the %nacl prefix, I wonder whether I can also disable the "base+offset" addressing model (I presume these are the two things that add the overhead). I am curious about the performance gain without read sandboxing on Linux. According to your previous explanation, would this also break NaCl's ILP32 model?

If you removed the %nacl prefix then you've already broken the security sandbox, even if correct C programs cannot exploit it. It's harder (but not impossible) to remove the use of %r15, because all pointers in code compiled by x86_64-nacl-gcc contain only the low 32 bits of the address; the high 32 bits are added at the memory access.

It would be easier to give suggestions if you explained what you are trying to achieve. Right now it looks as if you are randomly changing bits and pieces of NaCl with no clear goal in sight. Since your first step made the result useless as a security sandbox, I'm not sure what you want to do in the next steps: if you don't need NaCl for security then... what other use could you imagine for it?

Bill Huang

Jul 30, 2015, 11:34:35 AM
to Native-Client-Discuss, tianche...@gmail.com, dsc...@google.com

Hi Derek:
  I am also interested in this topic and I have tried your suggestion. It works for removing the register masking when the instruction is a load (such as a move from memory to a register). However, I still have a problem actually accessing the physical memory address, since the instruction still uses 32-bit registers as the destination. As below:



  Then the program cannot run, because it accesses the wrong place.

  Could I ask where it would be possible to make a change so that, for example, rax is used instead, to get the full address?

Derek Schuff

Jul 30, 2015, 12:46:17 PM
to Bill Huang, Native-Client-Discuss, tianche...@gmail.com
Hi Bill,

You're right, that change won't quite do it.
Since the pointers are 32-bit we would still need to generate loads with r15 as a base register, like so:
mov 4(%r15, %rax), %esi
(This is what the RewritePass does).

The only difference would be that you wouldn't need the mov %eax,%eax before it.
To change that, look for HandleMemoryRefTruncation and/or its callers in lib/Target/X86/MCTargetDesc/X86MCNaCl.cpp. That would be a small and simple change.

In principle, without the sandboxing restriction for loads, you could, e.g. calculate a full 64-bit pointer and do arbitrary things like:
leaq (%r15, %rax), %rax
mov (%rax), %esi
mov 4(%rax), %edi

In practice it would be pretty nontrivial to get the compiler to do things like that (especially with loads and stores being asymmetric).
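
What dropping the `mov %eax,%eax` changes can be seen by modelling the operand arithmetic. A rough sketch: the helper names and the register values are invented for illustration, and the operand `disp(%r15,%rax)` is modelled as base plus index plus displacement, modulo 2**64.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def effective_address(base: int, index: int, disp: int = 0) -> int:
    """AT&T operand disp(base,index): base + index + disp, wrapped to 64 bits."""
    return (base + index + disp) & MASK64


def load_address(r15: int, rax: int, disp: int = 0, truncate: bool = True) -> int:
    """Address used by `mov disp(%r15,%rax),...`.

    truncate=True models a preceding `mov %eax,%eax`, which zeroes the
    upper 32 bits of %rax; truncate=False models the load without it.
    """
    if truncate:
        rax &= MASK32
    return effective_address(r15, rax, disp)


if __name__ == "__main__":
    r15 = 0x00007F0000000000   # assumption: hypothetical sandbox base
    clean = 0x0000000000000010  # upper half already zero
    stale = 0x0000000100000010  # garbage left in the upper half

    for rax in (clean, stale):
        print(hex(rax),
              hex(load_address(r15, rax, 4, truncate=True)),
              hex(load_address(r15, rax, 4, truncate=False)))
```

When the upper half of %rax is already zero the two variants agree, which matches the observation that the masking is invisible to a correct program. With stale upper bits the unmasked load lands 4GiB away from where the masked one does.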

Also in the example you posted, it looks to me like all of the loads are sandboxed in the usual way.

Victor Khimenko

Jul 30, 2015, 2:20:47 PM
to Native Client Discuss, Tiancheng Yi, Derek Schuff
On Thu, Jul 30, 2015 at 6:34 PM, Bill Huang <s9424...@gmail.com> wrote:

> Hi Derek:
>   I am also interested in this topic and I have tried your suggestion.

We have a topic? News to me. All I can see so far is a discussion about how to break NaCl in many different ways.

Basically the discussion goes like this:
- Where are the wheels attached on a Volkswagen Beetle?
- See the bolts near the bottom.
- I've removed them and nothing has changed!
- You need to remove the wheel, too.
- I've replaced the wheels with wheels from a Caterpillar 797 and now everything is broken!
- Well, duh: of course you can't just attach wheels from a Caterpillar 797 to a Volkswagen Beetle and get something usable. If you want to make a monster truck you'll need to change the suspension and the engine and make many other modifications before you get something that moves. A Volkswagen Beetle is just not designed for 4-meter-tall wheels!
 
> It works for removing the register masking when the instruction is a load (such as a move from memory to a register). However, I still have a problem actually accessing the physical memory address, since the instruction still uses 32-bit registers as the destination. As below:
>
>   Then the program cannot run, because it accesses the wrong place.

Of course. NaCl passes one part of the address in the %r15 register and the other part in %eax; you are using %rax here for the memory access.

>   Could I ask where it would be possible to make a change so that, for example, rax is used instead, to get the full address?

Somewhere in the NaCl code. I don't really know where. There are many places where the assumption that a UCB pointer is 32 bits is baked in. You'll need to find them all. Then you'll need to change the pointer size (in gcc/config/i386/nacl.h in the case of GCC) and also make sure that newlib, the IRT and other places support large pointers, too.

I think there is a fundamental misunderstanding here: you've read the article explaining how the NaCl *sandbox* works, seen that the only thing restricting a program to the 4GiB range is write sandboxing (plus read sandboxing in production), and assumed that it's THE ONLY thing that keeps a program tied to that 4GiB address space. But that's not true at all: we started with a certain security model ("untrusted code operates within 4GiB of address space between %r15 and %r15+4GiB", plus some other restrictions), and then we wrote a compiler designed for that mode, a standard library designed for that mode, a linker that works in that mode (the dynamic linker in glibc; the static linker is mostly mode-ignorant), an ABI designed for that mode, and so on. There are thousands of lines of code and hundreds of places in NaCl where the assumption that we are operating with 32-bit pointers in the area between %r15 and %r15+4GiB is embedded!

If you want to go beyond the sandbox, it's easy: remove the read/write sandboxing requirement and write a few instructions in raw assembler. Bam: done. But if you want to write C code that does the same, then you'll basically need to change all these pieces so that they no longer assume 32-bit pointers in the range %r15..%r15+4GiB, but deal with the whole 16EiB range!

Victor Khimenko

Jul 30, 2015, 2:26:54 PM
to Native Client Discuss, Bill Huang, Tiancheng Yi
On Thu, Jul 30, 2015 at 7:46 PM, 'Derek Schuff' via Native-Client-Discuss <native-cli...@googlegroups.com> wrote:

> Also in the example you posted, it looks to me like all of the loads are sandboxed in the usual way.

They are sandboxed on the left and not sandboxed on the right. %r15 is also removed from the address, so the whole thing blew apart as expected.
 

Derek Schuff

Jul 30, 2015, 2:29:20 PM
to native-cli...@googlegroups.com, Bill Huang, Tiancheng Yi
On Thu, Jul 30, 2015 at 11:26 AM Victor Khimenko <kh...@chromium.org> wrote:

>> Also in the example you posted, it looks to me like all of the loads are sandboxed in the usual way.
>
> They are sandboxed on the left and not sandboxed on the right. %r15 is also removed from the address, so the whole thing blew apart as expected.

Oh, wow, I completely failed to scroll to the right and never saw the right-hand set of instructions. So yeah, ignore that statement, but my previous statements about LLVM still stand.

Tiancheng Yi

Jul 31, 2015, 12:19:47 PM
to Native-Client-Discuss, tianche...@gmail.com, kh...@chromium.org
In a word, what I want to know is the difference in performance. Originally I thought NaCl without read sandboxing would be acceptable, since that is how it was published in the paper. With the masking and "base+offset" removed, perhaps performance would be better (although it would not be as safe as before). Also, I only need the NaCl sandbox on its own, not inside a browser.

Derek Schuff

Jul 31, 2015, 12:29:21 PM
to Native-Client-Discuss, tianche...@gmail.com, kh...@chromium.org
Probably the only way to find out is to try it.
What you can do is the following:
0. (I assume you've done this, but run some benchmarks first to see if the performance with LLVM is acceptable as it is today).
1. Make a simple modification to LLVM to remove the extra masking on reads. The generated code will still use the r15 base register, but removing the masking is an easy change to make in the compiler.
2. Run whatever benchmarks you want with the unmodified compiler, and with the modified compiler with the validator disabled. If the performance is unacceptable, you could try more invasive compiler changes, but it might be challenging. If the performance is acceptable, then...
3. Modify the validator to accept reads without masking.
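
For steps 0 and 2, a small timing harness is enough to compare the two builds. The following is only a minimal sketch in Python: the commands that launch each benchmark are passed in by the caller (how the modules are run is not specified in this thread), and the script name `compare.py` is hypothetical.

```python
import shlex
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd (a list of arguments) `runs` times; return wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def relative_change(baseline, candidate):
    """Median candidate time relative to median baseline time (-0.10 means 10% faster)."""
    base = statistics.median(baseline)
    return (statistics.median(candidate) - base) / base


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage:
    #   python compare.py "<command running the unmodified build>" "<command running the modified build>"
    before = time_command(shlex.split(sys.argv[1]))
    after = time_command(shlex.split(sys.argv[2]))
    print(f"median change: {relative_change(before, after):+.1%}")
```

Using the median of several runs rather than a single run keeps one noisy measurement from deciding whether the more invasive compiler changes are worth attempting.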


Victor Khimenko

Jul 31, 2015, 1:18:54 PM
to Tiancheng Yi, Native-Client-Discuss
On Fri, Jul 31, 2015 at 7:19 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
In short, what I want to measure is the performance difference.

If your goal is performance then why are you trying to access memory outside of sandbox? I'm confused.
 
Originally I thought NaCl without read sandboxing would not be unacceptable, since it was published that way in the paper.

It may be acceptable in some cases. And yes, read sandboxing is more expensive than write sandboxing (because it breaks various prefetch mechanisms in CPUs), but even without it you'll still have the other costs: bundles, no "ret" (which breaks call/ret optimizations on contemporary CPUs), etc.
 
With the masking and "base+offset" removed, performance might be better (although it would not be as safe as before).

If you remove masking from both reads and writes (and the structure of NaCl gcc makes it hard to remove only read sandboxing without removing write sandboxing, too), then you are opening the sandbox wide open. I think you'll want to play with LLVM instead if you want to see numbers.
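
The masking and "base+offset" under discussion can be modeled in a few lines of Python. This is a rough sketch for illustration only: the base value is made up (any 4 GiB-aligned value would do), the function names are hypothetical, and the model ignores how displacements near the edge of the region are handled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
SANDBOX_SIZE = 1 << 32  # 32-bit offsets give 4 GiB of untrusted address space

# Hypothetical sandbox base held in %r15 (assumption: 4 GiB aligned).
R15_BASE = 0x7F00 << 32


def sandboxed_address(reg, disp=0, base=R15_BASE):
    """Clear the upper 32 bits of the register, then add the sandbox base."""
    return base + (reg & MASK32) + disp


def raw_address(reg, disp=0):
    """Masking and base removed: the full 64-bit register is used as-is."""
    return (reg + disp) & MASK64


def in_sandbox(addr, base=R15_BASE):
    """True if addr falls inside the 4 GiB region starting at base."""
    return base <= addr < base + SANDBOX_SIZE


hostile = 0xFFFF800012345678  # a register value chosen by untrusted code
print(hex(sandboxed_address(hostile)), in_sandbox(sandboxed_address(hostile)))
print(hex(raw_address(hostile)), in_sandbox(raw_address(hostile)))
```

With the mask in place, whatever the register holds, only its low 32 bits contribute, so the access stays relative to the base. Without it, the untrusted code picks the whole 64-bit address.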

Bill Huang

Aug 3, 2015, 9:16:59 AM
to Native-Client-Discuss, tianche...@gmail.com, kh...@chromium.org, dsc...@google.com
Hi all: 
  Thanks for the replies. I have one more minor question. Basically, I followed the instructions here to rebuild and install the pnacl toolchain. As in the discussion above, I only changed a file or two in LLVM, but even with an incremental build it still takes a long time and seems to check every directory.
  Is there another approach that lets me build just the llvm directory? Or do all the other related toolchain components always have to be rebuilt together?

Cheers,

Bill
To unsubscribe from this group and stop receiving emails from it, send an email to native-client-discuss+unsub...@googlegroups.com.

Tiancheng Yi

Aug 3, 2015, 10:38:51 AM
to Native-Client-Discuss, tianche...@gmail.com


On Friday, 31 July 2015 18:18:54 UTC+1, khim wrote:

On Fri, Jul 31, 2015 at 7:19 PM, Tiancheng Yi <tianche...@gmail.com> wrote:
In short, what I want to measure is the performance difference.

If your goal is performance then why are you trying to access memory outside of sandbox? I'm confused.

I am sorry for the confusion. I do not actually want to access memory outside the sandbox, but I would like to verify that I am able to, to make sure I really have a version without read sandboxing to play with.

Victor Khimenko

Aug 3, 2015, 10:50:00 AM
to Native Client Discuss, Tiancheng Yi
There are much simpler ways to do that: just use "x86_64-nacl-gcc -S -dP" and look at the output. If "%nacl" is gone from the expected places then you are done. "-dP" will prepend each assembler instruction with the RTL representation which produced that instruction. If you are seeing something like
  mov     %edi, %edi      / 3     zero_extendsidi2_rex64/1
then it's obvious that it's not read/write sandboxing! Read/write sandboxing looks like this:
  movl    %nacl:4(%r15,%rdi), $0  / 7     *movsi_1/2

Note that when you remove read sandboxing from these two instructions the end result STILL looks like it's sandboxed:
  mov     %edi, %edi      / 3     zero_extendsidi2_rex64/1
  movl    4(%r15,%rdi), $0  / 7     *movsi_1/2
but that code is there to make the code correct, not to make it validatable!
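
The "look for %nacl" check can also be automated over a whole assembly listing. The following is a minimal sketch in Python, not a validator: the function name and the regular expression are illustrative assumptions, and it assumes that lines or trailing text starting with "#" are comments (for example RTL dumps) that should be ignored.

```python
import re

# A memory operand such as 4(%r15,%rdi), optionally carrying the %nacl: prefix.
MEM_OPERAND = re.compile(r"(?P<prefix>%nacl:)?(?P<disp>[-+\w.$]*)\((?P<regs>[^)]*)\)")


def memory_operands(asm_text):
    """Yield (line_number, operand_text, has_nacl_prefix) for each memory operand."""
    for number, line in enumerate(asm_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # assumption: "#" starts a comment
        for match in MEM_OPERAND.finditer(code):
            yield number, match.group(0), match.group("prefix") is not None


listing = """\
  mov     %edi, %edi      / 3     zero_extendsidi2_rex64/1
  movl    4(%r15,%rdi), $0  / 7     *movsi_1/2
"""
for number, operand, sandboxed in memory_operands(listing):
    print(number, operand, "sandboxed" if sandboxed else "NOT sandboxed")
```

Fed the two-instruction example above, this reports the operand on line 2 as not sandboxed, even though the zero-extension and the %r15 base are still present.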


Derek Schuff

Aug 3, 2015, 11:49:17 AM
to Bill Huang, Native-Client-Discuss, tianche...@gmail.com, kh...@chromium.org
If you run toolchain_build_pnacl.py without giving it a particular build target (e.g. llvm_x86_64_linux) then it will build not only LLVM, but all of the tools (LLVM, binutils) and target libraries (newlib, libc++, etc) for all of the supported architectures (x86, arm, etc), which is indeed a lot of stuff.

What I usually do when I'm just hacking on LLVM is to run the whole thing once to ensure everything is set up, then make changes to llvm and just run make (or ninja, if using --cmake) from the toolchain_build_out/llvm_x86_64_linux_work directory (that's the build directory for LLVM). Then you can run the regression tests, feed source files to clang to see the output, etc. Then when you think you might have something working you can run the whole toolchain build again, and you'll have something that has all the target libraries built with your changes, and if you install it, you'll have something that can actually link programs.
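
The incremental step in that workflow could be wrapped in a small helper. This is a sketch under assumptions: the helper names are hypothetical, and the location of toolchain_build_out relative to your checkout is not stated here, so the parent directory is left as a parameter.

```python
import subprocess
from pathlib import Path

# Build directory for LLVM, as named in the message above.
LLVM_WORK_DIR = Path("toolchain_build_out") / "llvm_x86_64_linux_work"


def llvm_rebuild_command(use_ninja=False):
    """Use ninja if the toolchain was configured with --cmake, otherwise make."""
    return ["ninja"] if use_ninja else ["make"]


def rebuild_llvm(parent_dir, use_ninja=False, dry_run=True):
    """Rebuild only LLVM inside its work directory; returns (command, directory).

    parent_dir is the directory that contains toolchain_build_out
    (an assumption: adjust it to your checkout layout).
    """
    work_dir = Path(parent_dir) / LLVM_WORK_DIR
    cmd = llvm_rebuild_command(use_ninja)
    if not dry_run:
        subprocess.run(cmd, cwd=work_dir, check=True)
    return cmd, work_dir
```

Calling rebuild_llvm(".", dry_run=False) from the directory that holds toolchain_build_out would then rebuild only LLVM, leaving the full toolchain build for when the change is ready to be linked against.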


Bill Huang

Aug 4, 2015, 3:05:40 PM
to Native-Client-Discuss, s9424...@gmail.com, tianche...@gmail.com, kh...@chromium.org, dsc...@google.com
Hi Derek: 
  Thanks for the tip.
  By the way, I am experimenting with modifying the assembly files (.s) produced by the nacl-newlib toolchain, and I would like to try the same thing with pnacl. However, the clang tool does not seem to accept a .s file as input to produce a .bc or .o file, right? Are there any other options that would let a modified .s file work with the pnacl-llvm toolchain?
  More specifically, in the .s file produced by pnacl-clang (with the -S option), I saw some functions called with an i32 prefix. Can I change that, for example to call the i64 versions of those functions instead?

Cheers,

Bill

Derek Schuff

Aug 18, 2015, 1:29:38 PM
to Bill Huang, Native-Client-Discuss, tianche...@gmail.com, kh...@chromium.org
Hi Bill,
Sorry for the delay here.
The short answer to this question is that you should try out x86_64-nacl-clang instead of pnacl-clang; it behaves much more like a traditional compiler than pnacl, and it supports the use of assembly files, etc. By default it uses x86_64-nacl-as (GNU as) as its assembler, but if you use the -integrated-as flag it will use the LLVM assembler, and this is also supported.