How are instructions disassembled for validation on AMD64?


lael.c...@gmail.com

Sep 9, 2016, 2:51:16 PM
to Native-Client-Discuss
Hi,

I couldn’t find which functions handle instruction validation…

So how are instructions validated on AMD64? Does NaCl start disassembling at each 4-byte boundary, or does it disassemble starting at each jump target?

Victor Khimenko

Sep 9, 2016, 3:04:15 PM
to Native Client Discuss
On Fri, Sep 9, 2016 at 8:51 PM, <lael.c...@gmail.com> wrote:
> Hi,
>
> I couldn’t find which functions handle instruction validation…

Hmm... The functions are called ValidateChunkIA32 and ValidateChunkAMD64, which kind of gives a clue about what they do.

> So how are instructions validated on AMD64? Does NaCl start disassembling at each 4-byte boundary, or does it disassemble starting at each jump target?

It disassembles each 16-byte chunk separately. It remembers the instruction boundaries and then checks jump targets.

The whole thing is described in some detail in src/trusted/validator_ragel/docs/validator_internals.html
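A minimal sketch of the two steps described in this reply (remember instruction boundaries, then check jump targets). It is not the ValidateChunkAMD64 code: instruction lengths are given directly instead of being decoded, and the function name and tuple layout are hypothetical.

```python
# Toy model of the two passes: instruction lengths are supplied directly
# instead of being decoded from x86 bytes (an assumption of this sketch).

def validate(instructions):
    """instructions: list of (length, direct_jump_target_or_None).

    Returns True when every direct jump target is an instruction start.
    """
    starts = set()      # offsets where an instruction begins
    targets = []        # direct jump targets seen while scanning
    offset = 0
    # Pass 1: walk the code and remember every instruction boundary.
    for length, target in instructions:
        starts.add(offset)
        if target is not None:
            targets.append(target)
        offset += length
    # Pass 2: each direct jump must land on a remembered boundary.
    return all(t in starts for t in targets)
```

For example, `validate([(5, None), (2, 0), (9, 5)])` accepts the code because offsets 0 and 5 are both instruction starts, while a jump to offset 3 would be rejected.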

lael.c...@gmail.com

Sep 9, 2016, 5:03:02 PM
to Native-Client-Discuss

On Friday, September 9, 2016 at 21:04:15 UTC+2, khim wrote:

> It disassembles each 16-byte chunk separately. It remembers the instruction boundaries and then checks jump targets.
But isn’t the jump-target boundary 4 bytes on both IA-32 and AMD64?

> The whole thing is described in some detail in src/trusted/validator_ragel/docs/validator_internals.html
But in which project? Could you post a link, please?

Victor Khimenko

Sep 9, 2016, 6:43:36 PM
to Native Client Discuss
On Fri, Sep 9, 2016 at 11:03 PM, <lael.c...@gmail.com> wrote:

> On Friday, September 9, 2016 at 21:04:15 UTC+2, khim wrote:
>
> > It disassembles each 16-byte chunk separately. It remembers the instruction boundaries and then checks jump targets.
> But isn’t the jump-target boundary 4 bytes on both IA-32 and AMD64?

No. It must be either a 16-byte boundary or, for a static jump, an instruction boundary. The validator considers both possibilities.
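The rule in this answer can be written as a small predicate. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, the bundle size defaults to the 16 bytes quoted here, and the set of instruction starts is assumed to come from an earlier decoding pass.

```python
def jump_target_ok(target, is_direct, instruction_starts, bundle=16):
    """Check one jump target against the rule quoted above.

    target             -- offset the jump would land on
    is_direct          -- True for a static (direct) jump, False for indirect
    instruction_starts -- offsets recorded as instruction boundaries
    bundle             -- bundle size in bytes (assumed value for this sketch)
    """
    if target % bundle == 0:
        # Bundle-aligned targets are acceptable for both kinds of jump,
        # because validated code always starts an instruction there.
        return True
    # Unaligned targets are only acceptable for direct jumps, and only
    # when they land exactly on an instruction boundary.
    return is_direct and target in instruction_starts
```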
 

> > The whole thing is described in some detail in src/trusted/validator_ragel/docs/validator_internals.html
> But in which project? Could you post a link, please?

I don't have a link, sorry. It's part of the Native Client project, so you can find the source here:

But it's not easy to read that way.

It's better to pull the Native Client code and go from there:

lael.c...@gmail.com

Sep 9, 2016, 7:27:25 PM
to Native-Client-Discuss
On Saturday, September 10, 2016 at 00:43:36 UTC+2, khim wrote:

> No. It must be either a 16-byte boundary or, for a static jump, an instruction boundary. The validator considers both possibilities.
By an instruction boundary, do you mean the end of the previous instruction?

Do you also mean that a static jump can target both an instruction boundary and a 16-byte boundary? Or can a static jump only target an instruction boundary?

Victor Khimenko

Sep 9, 2016, 8:57:39 PM
to Native Client Discuss
On Sat, Sep 10, 2016 at 1:27 AM, <lael.c...@gmail.com> wrote:
> On Saturday, September 10, 2016 at 00:43:36 UTC+2, khim wrote:
>
> > No. It must be either a 16-byte boundary or, for a static jump, an instruction boundary. The validator considers both possibilities.
> By an instruction boundary, do you mean the end of the previous instruction?

More or less. There are "superinstructions", which are indivisible sequences of instructions; you can't jump inside such a sequence.

> Do you also mean that a static jump can target both an instruction boundary and a 16-byte boundary? Or can a static jump only target an instruction boundary?

Code must be split into 16-byte bundles, and every 16th byte must be the start of an instruction, so every jump to a 16-byte-aligned address is valid (static or dynamic). nacl-as will add "nops" as required to enforce that. Static jumps can also target an address that is not 16-byte aligned if it is a point between two instructions.
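The padding behaviour attributed to nacl-as can be sketched as follows. This is an illustration, not the assembler's actual algorithm: instruction lengths are given directly, the `layout` function and the "nop-pad" label are hypothetical, and the bundle size defaults to the 16 bytes quoted here.

```python
def layout(lengths, bundle=16):
    """Place instructions of the given lengths, padding with nops so that
    no instruction straddles a bundle boundary.

    Returns a list of (offset, length, kind) tuples.
    """
    out = []
    offset = 0
    for length in lengths:
        if length > bundle:
            raise ValueError("instruction longer than a bundle")
        room = bundle - offset % bundle   # bytes left in the current bundle
        if length > room:
            # Not enough room: fill the rest of the bundle with nops so the
            # instruction starts at the next bundle boundary.
            out.append((offset, room, "nop-pad"))
            offset += room
        out.append((offset, length, "insn"))
        offset += length
    return out
```

With lengths 7, 7 and 5, the third instruction would occupy bytes 14 to 18, so two bytes of padding are emitted at offset 14 and the instruction is placed at offset 16.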

lael.c...@gmail.com

Sep 9, 2016, 9:39:55 PM
to Native-Client-Discuss
On Saturday, September 10, 2016 at 02:57:39 UTC+2, khim wrote:

> > Do you also mean that a static jump can target both an instruction boundary and a 16-byte boundary? Or can a static jump only target an instruction boundary?
>
> Code must be split into 16-byte bundles, and every 16th byte must be the start of an instruction, so every jump to a 16-byte-aligned address is valid (static or dynamic). nacl-as will add "nops" as required to enforce that. Static jumps can also target an address that is not 16-byte aligned if it is a point between two instructions.
And conversely, I think it’s impossible to make a static jump to a 16-byte-aligned address if it’s in the middle of an instruction, isn’t it?

Otherwise, I guess finding the 0xCD80 (int 0x80) hex sequence at r15+0x164f170 would be a red flag (with 0x164f170 being in a range with execute permissions).

Derek Schuff

Sep 9, 2016, 11:37:15 PM
to Native-Client-Discuss
A 16-byte aligned address can never be in the middle of an instruction. Otherwise it could be jumped to by an indirect jump.
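The reasoning here rests on indirect jumps being able to reach any bundle-aligned address. A minimal sketch, assuming the target of an indirect jump is forced onto a bundle boundary by clearing its low bits before the jump; the function name is hypothetical and the bundle size is a parameter of the sketch.

```python
def mask_indirect_target(addr, bundle=16):
    """Clear the low bits of addr so it lands on a bundle boundary.

    bundle must be a power of two; 16 is the value quoted in this thread
    at this point and is only a default for the sketch.
    """
    return addr & ~(bundle - 1)
```

Whatever value untrusted code computes, the masked result is bundle-aligned, which is why every bundle-aligned address has to be the start of an instruction.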

--
You received this message because you are subscribed to the Google Groups "Native-Client-Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to native-client-di...@googlegroups.com.
To post to this group, send email to native-cli...@googlegroups.com.
Visit this group at https://groups.google.com/group/native-client-discuss.
For more options, visit https://groups.google.com/d/optout.

Victor Khimenko

Sep 10, 2016, 6:40:47 AM
to Native Client Discuss
On Sat, Sep 10, 2016 at 3:39 AM, <lael.c...@gmail.com> wrote:
> On Saturday, September 10, 2016 at 02:57:39 UTC+2, khim wrote:
>
> > > Do you also mean that a static jump can target both an instruction boundary and a 16-byte boundary? Or can a static jump only target an instruction boundary?
> >
> > Code must be split into 16-byte bundles, and every 16th byte must be the start of an instruction, so every jump to a 16-byte-aligned address is valid (static or dynamic). nacl-as will add "nops" as required to enforce that. Static jumps can also target an address that is not 16-byte aligned if it is a point between two instructions.
> And conversely, I think it’s impossible to make a static jump to a 16-byte-aligned address if it’s in the middle of an instruction, isn’t it?

If any instruction (or indivisible instruction sequence) crosses a 16-byte boundary, then the whole file is declared "invalid" and execution does not even start, because, as Derek pointed out, an indirect jump would be able to jump there too.
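The rejection rule above amounts to a check that no instruction's first and last byte fall in different bundles. A minimal sketch under stated assumptions: instruction lengths are given rather than decoded, the function name is hypothetical, and the bundle size defaults to the 16 bytes quoted in this reply.

```python
def first_crossing(lengths, bundle=16):
    """Return the offset of the first instruction that crosses a bundle
    boundary, or None when the whole sequence is acceptable.

    Each length is assumed to be at least 1.
    """
    offset = 0
    for length in lengths:
        last = offset + length - 1          # offset of the final byte
        if offset // bundle != last // bundle:
            return offset                   # starts and ends in different bundles
        offset += length
    return None
```

A non-None result corresponds to the whole file being declared invalid.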
 
> Otherwise, I guess finding the 0xCD80 (int 0x80) hex sequence at r15+0x164f170 would be a red flag (with 0x164f170 being in a range with execute permissions).

On x86_64 we also use the NX bit, but yes, that's really suspicious. If your loader allows such a configuration, then your sandbox is broken.

lael.c...@gmail.com

Sep 10, 2016, 8:13:36 AM
to Native-Client-Discuss


On Saturday, September 10, 2016 at 12:40:47 UTC+2, khim wrote:

> If any instruction (or indivisible instruction sequence) crosses a 16-byte boundary, then the whole file is declared "invalid" and execution does not even start, because, as Derek pointed out, an indirect jump would be able to jump there too.
No, I mean the case where the address would fall inside an instruction but still be aligned on a 16-byte boundary.

> > Otherwise, I guess finding the 0xCD80 (int 0x80) hex sequence at r15+0x164f170 would be a red flag (with 0x164f170 being in a range with PROT_EXEC permissions).
>
> On x86_64 we also use the NX bit, but yes, that's really suspicious. If your loader allows such a configuration, then your sandbox is broken.
Normally, NX is disabled where there are PROT_EXEC permissions. I just think everything is normal because it’s in the middle of an opcode, so a static jump to 0x164f170 is disallowed in that case.

And anyway, on return, it would fall through to the next instruction, which is probably invalid.

Victor Khimenko

Sep 10, 2016, 1:42:27 PM
to Native Client Discuss
A static jump to such an address is always permitted, because you could always use a dynamic jump instead. If your system allows such code, then it is most likely insecure and could be broken into.

> And anyway, on return, it would fall through to the next instruction, which is probably invalid.

But that's not the Native Client security model. In Native Client we ENSURE that it's IMPOSSIBLE to jump into the middle of an instruction. Code that makes such things possible is flat-out rejected. This ensures the viability of the sandbox. If you run your code without validation enabled, then you are just doing ritual dances that bring you complexity but don't give you security.

lael.c...@gmail.com

Sep 10, 2016, 2:30:50 PM
to Native-Client-Discuss
On Saturday, September 10, 2016 at 19:42:27 UTC+2, khim wrote:

> But that's not the Native Client security model. In Native Client we ENSURE that it's IMPOSSIBLE to jump into the middle of an instruction. Code that makes such things possible is flat-out rejected. This ensures the viability of the sandbox. If you run your code without validation enabled, then you are just doing ritual dances that bring you complexity but don't give you security.
OK, so finding 0xCD80 at r15+0x164f170, executable and validated, should definitely be considered a flag that requires investigating both the sandbox and the SDK, shouldn’t it?

By the way, the binary was compiled by Google, and yes, it has the nexe extension.

lael.c...@gmail.com

Sep 12, 2016, 3:46:42 PM
to Native-Client-Discuss
On Saturday, September 10, 2016 at 00:43:36 UTC+2, khim wrote:
> No. It must be either a 16-byte boundary or, for a static jump, an instruction boundary. The validator considers both possibilities.
OK, I think you might have gotten it wrong: https://developer.chrome.com/native-client/reference/sandbox_internals/x86-64-sandbox states that the jump boundary is 32 bytes. Indeed, no function has an address that is a multiple of 16 but not of 32.
That might explain why I found sysenter at an executable 16-byte boundary, mightn’t it?

Bennet Yee (余仕斌)

Sep 12, 2016, 5:36:06 PM
to Native Client Discuss
see sel_config.h NACL_BLOCK_SHIFT.  the block size is 16 for arm, 32 for x86-{32,64}.  https://chromium.googlesource.com/native_client/src/native_client.git/+/master/src/trusted/service_runtime/nacl_config.h
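A minimal sketch of how a block shift translates into a bundle size and an alignment test. The shift values below are simply log2 of the sizes quoted in this message (an assumption of the sketch, not copied from the header), and the helper names are hypothetical.

```python
# log2 of the block sizes quoted above: 16 bytes on ARM, 32 bytes on x86.
BLOCK_SHIFT = {"arm": 4, "x86-32": 5, "x86-64": 5}

def block_size(arch):
    """Bundle size in bytes for the given architecture."""
    return 1 << BLOCK_SHIFT[arch]

def is_bundle_aligned(addr, arch):
    """True when addr is a multiple of the architecture's bundle size."""
    return addr & (block_size(arch) - 1) == 0
```

Under these values the offset 0x164f170 discussed earlier in the thread is 16-byte aligned but not 32-byte aligned, so it is not a bundle boundary on x86-64.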

--
bennet s yee
i usually don't capitalize due to mild tendonitis

lael.c...@gmail.com

Sep 12, 2016, 5:41:55 PM
to Native-Client-Discuss
On Monday, September 12, 2016 at 23:36:06 UTC+2, Bennet Yee wrote:
> see sel_config.h NACL_BLOCK_SHIFT.  the block size is 16 for arm, 32 for x86-{32,64}.  https://chromium.googlesource.com/native_client/src/native_client.git/+/master/src/trusted/service_runtime/nacl_config.h
This explains everything. So khim is wrong.

However, I don’t see the point of enforcing alignment on ARM, since the hardware already does it.

Last question: do NaCl pseudo-instructions have their own opcodes, or are they just combined instructions?

Bennet Yee (余仕斌)

Sep 12, 2016, 6:36:28 PM
to Native Client Discuss
thumb mode.  see how code switches between arm and thumb instruction sets.

not familiar with assembler changes; i think there are addressing notation that automatically adds the pseudo-atomic masking operations, and there is a mode where it could automatically pad on x86-64.  but this, like the compiler, is not part of the TCB.  just a convenience.  from the validator's view point, it's all just an instruction stream, and there are no weird addressing modes nor weird pseudo-atomic instructions.


Victor Khimenko

Sep 13, 2016, 12:02:49 PM
to Native Client Discuss
On Mon, Sep 12, 2016 at 11:41 PM, <lael.c...@gmail.com> wrote:
> On Monday, September 12, 2016 at 23:36:06 UTC+2, Bennet Yee wrote:
> > see sel_config.h NACL_BLOCK_SHIFT.  the block size is 16 for arm, 32 for x86-{32,64}.  https://chromium.googlesource.com/native_client/src/native_client.git/+/master/src/trusted/service_runtime/nacl_config.h
> This explains everything. So khim is wrong.

Sorry about that. In the early days this was configurable, and I had forgotten that 32 was chosen, not 16.

> However, I don’t see the point of enforcing alignment on ARM, since the hardware already does it.

Not exactly. Individual instructions are aligned, sure, but there are some "indivisible instruction sequences" which must be either executed as a whole or not at all (a memory access is two operations: the first one verifies that the address is in the first 1 GiB, the second one actually accesses memory). Thus ARM requires "bundles" too.
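Why the two-operation sequence must be indivisible can be shown with a small simulation. This is a sketch under stated assumptions: the first operation is modelled as masking the address into the first 1 GiB, memory is a plain dictionary, and all names are hypothetical.

```python
SANDBOX_MASK = (1 << 30) - 1   # keeps an address within the first 1 GiB

def mask_address(addr):
    """Operation 1: confine the address to the sandbox (modelled as a mask)."""
    return addr & SANDBOX_MASK

def load(memory, addr):
    """Operation 2: the actual memory access, with no check of its own."""
    return memory.get(addr)

def sandboxed_load(memory, addr):
    """Both operations executed together, as the validator requires."""
    return load(memory, mask_address(addr))

# One value inside the sandbox, one outside it.
memory = {0x10: "inside", 0x40000010: "outside"}
```

Here `sandboxed_load(memory, 0x40000010)` returns "inside", while calling `load` alone with the same address returns "outside". A jump that lands between the two operations corresponds to the second case, which is why the pair has to stay inside one bundle and cannot be a jump target.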
 
> Last question: do NaCl pseudo-instructions have their own opcodes, or are they just combined instructions?

Both methods are implemented by the NaCl assembler:

There is also an "automatic sandboxing assembler" implemented in PNaCl, but I don't know the details. Look in the native-client-discuss archive.

P.S. Note that the list of instructions was expanded after these documents were written. Some AVX instructions are allowed today, too.

lael.c...@gmail.com

Oct 26, 2016, 7:20:12 PM
to Native-Client-Discuss
Hello,

Sorry to resurrect this.

On Saturday, September 10, 2016 at 12:40:47 UTC+2, khim wrote:
> If any instruction (or indivisible instruction sequence) crosses a 16-byte boundary, then the whole file is declared "invalid" and execution does not even start, because, as Derek pointed out, an indirect jump would be able to jump there too.
How does it know that an instruction crosses a boundary? Does it start disassembling beyond the current boundary in order to check, or is each block disassembled separately, so that an instruction crossing a boundary would be considered truncated and invalid?

Bennet Yee (余仕斌)

Oct 26, 2016, 7:27:43 PM
to Native Client Discuss


lael.c...@gmail.com

Oct 26, 2016, 8:05:35 PM
to Native-Client-Discuss
On Thursday, October 27, 2016 at 01:27:43 UTC+2, Bennet Yee wrote:
Does it apply to 64 bits, or is it segmentation based?

Bennet Yee (余仕斌)

Oct 26, 2016, 8:14:44 PM
to Native Client Discuss


lael.c...@gmail.com

Oct 26, 2016, 8:25:59 PM
to Native-Client-Discuss
On Thursday, October 27, 2016 at 01:27:43 UTC+2, Bennet Yee wrote:
I see nothing in that section on how disassembly parsing is performed; do you mean only the first reference matters?

lael.c...@gmail.com

Oct 26, 2016, 8:26:59 PM
to Native-Client-Discuss
On Thursday, October 27, 2016 at 02:14:44 UTC+2, Bennet Yee wrote:

Bennet Yee (余仕斌)

Oct 26, 2016, 9:54:26 PM
to Native Client Discuss
disassembly is done using a DFA.  see native_client/src/trusted/validator_ragel/ after you


despite the README in src/trusted/validator_ragel/ saying it's not in use, the BUILD.gn in src/ builds it.  for x86 both 32 and 64 bits, rdfa_validator is used.
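To illustrate what decoding with a DFA means, here is a toy automaton for a four-opcode subset of x86 (nop, jmp rel8, jmp rel32, mov r32, imm32). It is a hand-written sketch, not the generated validator_ragel automaton; the state names and helper functions are hypothetical.

```python
# States: START expects an opcode; IMMn expects n more immediate bytes.
START, IMM1, IMM2, IMM3, IMM4 = range(5)

def step(state, byte):
    """One DFA transition. Returns (next_state, accepted).

    next_state is None when the byte is not allowed in this state.
    """
    if state == START:
        if byte == 0x90:                          # nop
            return START, True
        if byte == 0xEB:                          # jmp rel8
            return IMM1, False
        if byte == 0xE9 or 0xB8 <= byte <= 0xBF:  # jmp rel32 / mov r32, imm32
            return IMM4, False
        return None, False                        # opcode outside the toy subset
    if state == IMM1:
        return START, True                        # last immediate byte consumed
    return state - 1, False                       # IMM4 -> IMM3 -> IMM2 -> IMM1

def instruction_starts(code):
    """Return the offsets of instruction starts, or None if code is rejected."""
    starts, state, begin = [], START, 0
    for i, byte in enumerate(code):
        state, accepted = step(state, byte)
        if state is None:
            return None
        if accepted:
            starts.append(begin)
            begin = i + 1
    # Ending in any state other than START means a truncated instruction.
    return starts if state == START else None
```

The automaton consumes one byte at a time and never backtracks. Bytes outside its accepted language, such as `CD 80` in this toy subset, cause rejection, and so does an instruction cut short at the end of the buffer.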


lael.c...@gmail.com

Oct 27, 2016, 10:08:39 AM
to Native-Client-Discuss
On Thursday, October 27, 2016 at 03:54:26 UTC+2, Bennet Yee wrote:
> disassembly is done using a DFA.  see native_client/src/trusted/validator_ragel/ after you
>
>
> despite the README in src/trusted/validator_ragel/ saying it's not in use, the BUILD.gn in src/ builds it.  for x86 both 32 and 64 bits, rdfa_validator is used.
It’s unclear to me how operands are disassembled.

However, I have a second question: in the x86 instruction set, there’s no limit on instruction length (though it has been capped at 15 bytes since the Pentium). But because of NaCl, I suppose it’s impossible to repeat a prefix enough times to make a single instruction fill a 32-byte block.
So what is the maximum valid NaCl instruction size for AMD64?