On 18 September 2018 at 21:43, Michael Clark <michae...@mac.com> wrote:
> Hi Folks,
>
> I guess this is probably appropriate for the security or crypto working
> groups but I thought it might be worthwhile to promote some public
> discussion on “encrypting the computer” vs “computing encryption”, the
> rationale being to add to the pool of methods, whether in formal academic
> papers or methods and ideas published on mailing lists or short essays.
>
> Background: My particular interest that led me to RISC-V was piqued by the
> lowRISC project, particularly the work on tagged pointers, as part of
> self-study investigating security vulnerabilities and mitigations, combined
> with some cryptography research.
>
> I’m emailing today because of the recent release of Arm’s authenticated
> pointers, which are now in production on iOS 12 devices. I believe the
> lowRISC work, and a general body of work in this area, predates Arm’s
> authenticated pointers. There are several papers that publish well-known
> schemes for in-memory permutations of binary images, control flow integrity
> via keys in landing pads, and many other techniques to thwart modern attacks
> such as ROP.
>
> I wrote a short essay, possibly inspired in part by the lowRISC project,
> previous experience, and reading in cryptography. The essay is high-level
> and mentions storing keys in the unused address bits of canonical pointers,
> which we all know are presently all zeros or all ones in the conventional
> canonical pointer scheme. The essay is titled Crypto Binary Translation:
>
> https://github.com/michaeljclark/rv8/blob/master/doc/src/bintrans.md

Hi Michael, thanks for starting this discussion and sharing your thoughts.

Regarding obfuscating binaries, it would be worth looking at prior
work on "software diversification". This paper seems to provide a good
overview of work in this area
<https://www.ics.uci.edu/~perl/automated_software_diversity.pdf>. As
work like blind ROP shows
<http://www.scs.stanford.edu/brop/bittau-brop.pdf>, attackers may
still be able to extract sufficient information about the binary to
craft a successful exploit even without access to the target
executable. You might also be interested in looking up work on
instruction set randomisation, which attacks the same problem from a
slightly different angle.

For encrypted pointers I think you'd be interested in reading
Cryptographically Enforced Control Flow Integrity
<https://arxiv.org/pdf/1408.1451.pdf>. It notes that with a 48-bit
address space, the remaining 16 bits of a 64-bit pointer could be used
for other metadata, though I think it only actually uses a single bit,
to differentiate function pointers from return addresses.
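
To make the spare-bits idea concrete, here is a rough sketch in Python
that treats pointers as plain 64-bit integers. It assumes a 48-bit
virtual address space with canonical (sign-extended) addresses. The
helper names (pack, unpack) and the KIND_RETURN_ADDRESS bit are
hypothetical and only illustrate the layout; they are not taken from
the paper.

```python
ADDR_BITS = 48                      # assumed virtual address width
ADDR_MASK = (1 << ADDR_BITS) - 1    # low 48 bits hold the address
META_BITS = 64 - ADDR_BITS          # 16 spare bits above the address
META_MASK = (1 << META_BITS) - 1


def pack(addr, meta):
    """Place up to 16 bits of metadata above a 48-bit address."""
    if meta & ~META_MASK:
        raise ValueError("metadata does not fit in 16 bits")
    return (meta << ADDR_BITS) | (addr & ADDR_MASK)


def unpack(tagged):
    """Return (canonical_address, metadata) for a packed pointer."""
    meta = (tagged >> ADDR_BITS) & META_MASK
    addr = tagged & ADDR_MASK
    # Restore canonical form by copying bit 47 into bits 48..63.
    if addr & (1 << (ADDR_BITS - 1)):
        addr |= META_MASK << ADDR_BITS
    return addr, meta


# Hypothetical use: a single bit marking the pointer as a return address.
KIND_RETURN_ADDRESS = 0x1

tagged = pack(0x0000_7F12_3456_7890, KIND_RETURN_ADDRESS)
addr, meta = unpack(tagged)
print(hex(tagged), hex(addr), meta)
```

The point of unpack is that the metadata has to be stripped and the
canonical form restored before the value is used as an address, unless
the hardware is told to ignore those bits.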

Obviously embedding metadata within pointers isn't a new idea. A
128-bit capability representation was proposed in an MIT tech report
in 2000 (co-authored by Andrew 'bunnie' Huang)
<http://www.ai.mit.edu/projects/aries/Documents/Memos/ARIES-05.pdf>.
It observed that a 64-bit encoding would be practical, using 48 bits
for the address and the other 16 bits to encode bounds information.
This idea was revisited much later with "low-fat pointers"
<http://www.crash-safe.org/assets/fatptr_ccs2013.pdf>. The CHERI
project of course also explores various capability representations.

Dynamic language runtimes also like to exploit unused address bits. As
this bug report shows
<https://bugzilla.mozilla.org/show_bug.cgi?id=1143022>, this can lead
to portability issues if there is no kernel co-operation. For a long
time, I've thought it would be neat to have a configurable mask to
control which bits of a virtual address are ignored, and we discussed
this a bit on isa-dev a couple of years ago
<https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/Pm9mso-urM8/nRsoKBSsDAAJ>.
Cesar Eduardo Barros suggested a modification of the idea that has
less flexibility in hardware, but provides the same interface to the
process by having the kernel set up extra page mappings
<https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/Pm9mso-urM8/Ji8uLyrCDAAJ>.
I think such a mechanism alongside one or more tag bits to mark the
integrity of the embedded pointer metadata is an interesting and
potentially powerful combination.
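
As a rough illustration of the configurable-mask idea, the Python
sketch below models a per-process mask of virtual address bits that
are dropped before translation. It is only a model under stated
assumptions: the names make_ignore_mask and effective_address are
hypothetical, no such control is implied to exist in any RISC-V
specification, and the ignored bits are simply cleared, which assumes
user-space addresses in the lower half of the address space.

```python
MASK64 = (1 << 64) - 1


def make_ignore_mask(high_bit, low_bit):
    """Mask selecting bits low_bit..high_bit (inclusive) of an address."""
    width = high_bit - low_bit + 1
    return ((1 << width) - 1) << low_bit


def effective_address(vaddr, ignore_mask):
    """Clear the ignored bits before translation.

    Assumes lower-half (user-space) addresses, so the ignored bits
    would be zero in the untagged pointer.
    """
    return vaddr & ~ignore_mask & MASK64


# Hypothetical per-process setting: ignore the top 16 bits.
ignore_mask = make_ignore_mask(63, 48)

plain = 0x0000_7F12_3456_7890
tagged = plain | (0xBEEF << 48)   # a runtime stores a 16-bit tag up top

# With the mask set, the tagged pointer resolves to the same address.
assert effective_address(tagged, ignore_mask) == plain
# With an empty mask, the same value names a different address.
assert effective_address(tagged, 0) != plain
```

A runtime that assumes the top bits are free only works when the mask
(or an equivalent set of kernel page mappings) agrees with that
assumption, which is where the portability problems come from.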

Best,

Alex