* reads from unprotected memory (pkey 0) should be rare and need to be handled as untrusted data
Hi,tl;dr: would it make sense to add support for a pkey-protected partition to PA?Some background:We're working on CFI for v8 (go/v8-cfi) and one of the bigger challenges is making sure that our compilers don't generate code that leads to arbitrary control-flow even assuming that the attacker has an arbitrary read/write primitive in the renderer address space.We're planning to add a verification step to the compiler that performs certain checks on the generated code before copying it into executable memory. This will include for example that the code can't include syscall instructions and that we can track the control flow. We explicitly don't care if the code will corrupt memory since we assume that the attacker can already do this.We want to do this verification step in-process with the Intel Memory Protection Keys. I.e. we will create an isolated mode that you can switch in/out of where all data is protected with a pkey:* a pkey-protected stack* some pkey-protected globals* a pkey-protected heap (or heaps)When we're in this isolated mode, we have some stricter requirements about what the code should do.* reads from unprotected memory (pkey 0) should be rare and need to be handled as untrusted data* in particular, we can't write to a pointer coming from the unprotected memorySo as mentioned above, we need a pkey-protected heap and I'm trying to figure out if it makes sense to add pkey-support to PA.I imagine it could look like this:* a separate PartitionAllocator with a new PartitionOption to control pkey support* when changing page permissions, also set the requested pkey* all allocator metadata should also live in pkey-protected memoryI implemented a quick proof-of-concept of this in crrev.com/c/3879104.With some hacks, I can completely remove access to pkey 0 in that example and do an Alloc() and a Free() call.Some issues I ran into:* there is global state that is shared between PartitionAllocators and that I had to pkey_mprotect:* sentinel_slot_span_ (this sounds like it could be const / read-only?)* PartitionAlloc calls out to external libraries, at least in debug mode. There's no guarantee that the code won't touch pkey 0 memory.* DebugMemset calls memset which on my machine reads from global variables (this case seems benign for our use case, but will break if disabling read access to pkey 0)* Lock debug code calls pthread_self() which reads from the TLS.So to make this work, it would require:* moving the required global state into the partition (or making it read-only after initialization)* avoiding calls to shared libraries when in pkey-mode (in theory we could allow some benign-looking calls, but then we have no guarantee that they don't introduce CFI bypasses)What do you think? Does this sound like a feature that PartitionAlloc could support?Cheers,Stephen
--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architect...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/platform-architecture-dev/CABg10jwi9h6MZbfcSsfm1Y3Hewp%3D2Uzp2%2B9AWZmwh1rENB6v6Q%40mail.gmail.com.
Are pkeys already used in any code in the renderer already? (I see V8 does use them) Just curious how many pkeys are left to be allocated, could the "one per-partition" be problematic?
dave.On Wed, Sep 7, 2022 at 11:48 AM 'Kentaro Hara' via platform-architecture-dev <platform-arc...@chromium.org> wrote:I strongly support this proposal.At the moment, it's not clear to me how widely pkey will be available in the next few years and I was wondering if we should support pkey in PA now or we should revisit in the future. If there is a use case in V8 and you are willing to implement it, why not? :D I believe that the per-partition pkey support can be implemented without adding code complexity.A few questions:* reads from unprotected memory (pkey 0) should be rare and need to be handled as untrusted dataDo you really need to allow read from pkey 0? Will the code refactoring in PA make it unnecessary?
Why are you adding the pkey support to PA instead of V8's allocator if your intended use case is a V8 internal thing? I'm just curious -- technically it's possible to export PA's pkey supported partition to V8.
I implemented a quick proof-of-concept of this in crrev.com/c/3879104.
With some hacks, I can completely remove access to pkey 0 in that example and do an Alloc() and a Free() call.
Some issues I ran into:
* there is global state that is shared between PartitionAllocators and that I had to pkey_mprotect:
* sentinel_slot_span_ (this sounds like it could be const / read-only?)
* reservation_offset_tables_
* PartitionAlloc calls out to external libraries, at least in debug mode. There's no guarantee that the code won't touch pkey 0 memory.
* DebugMemset calls memset which on my machine reads from global variables (this case seems benign for our use case, but will break if disabling read access to pkey 0)
* Lock debug code calls pthread_self() which reads from the TLS.
Sorry for the late reply...I implemented a quick proof-of-concept of this in crrev.com/c/3879104.
With some hacks, I can completely remove access to pkey 0 in that example and do an Alloc() and a Free() call.Some issues I ran into:
* there is global state that is shared between PartitionAllocators and that I had to pkey_mprotect:Sorry, I don't see you doing this in your CL. Anyway, am I reading it correctly that this is merely meant as a hack and we need a proper solution to split the below global state per-partition?
* sentinel_slot_span_ (this sounds like it could be const / read-only?)Indeed read-only, and should be able to easily make it per-partition anyway.* AddressPoolManager::singleton_* PartitionAddressSpace::setup_These for tracking virtual address pool usage. Multiple partitions can allocate from the same pool. I don't see what we could do about this on 32-bit systems (are pkeys even available there?), but on 64-bit systems we could create a separate pool just for this. Note, that we can't pre-reserve an arbitrary amount of memory on Android so that may not fly there, unless the pool is meant to be small?
Note also that by switching pools you'll lose BRP protection. If you want to keep it, we'd have to add an additional check to IsManagedByPartitionAllocBRPPool() that is on a very hot path.
* reservation_offset_tables_On 64-bit, these are per pool, so if we split the pools the problem will go away. Otherwise, we could make a per-root copy... it's 16kB.* PartitionAlloc calls out to external libraries, at least in debug mode. There's no guarantee that the code won't touch pkey 0 memory.
* DebugMemset calls memset which on my machine reads from global variables (this case seems benign for our use case, but will break if disabling read access to pkey 0)
* Lock debug code calls pthread_self() which reads from the TLS.Not sure what to do about those.
Your mention of TLS reminded me you'll be losing thread-cache support for this partition, which helps performance. Are these allocations currently handled by malloc/new? If not, then nothing to worry about.
There is also a PartitionRoot object itself. You'd have to make sure to allocate it from a pkey-protected memory, so I guess get your own page and use placement `new`.
Oh, and one other thing. The 64-bit-only comment is only about what *kernel* is running. It's perfectly supported to run a 32-bit userspace application on a 64-bit kernel and use pkeys. The kernel even has a selftest/ for both 32-bit and 64-bit binaries to make sure we don't break either.On Friday, September 23, 2022 at 9:30:19 AM UTC-7 Dave Hansen wrote:Hi Folks,I don't know of any Intel CPUs that are 32-bit only and support PKU. Second, the two Intel 32-bit paging modes ("32-bit Paging" and "PAE Paging") don't even have space defined in their page tables to store pkeys. Third, the Linux support is 64-bit only (depends on X86_64 at compile-time).I think it's pretty safe to assume that pkeys are a 64-bit-only thing on x86. It's _possible_ that AMD did something different with their CPUs, but I really doubt it.On Friday, September 23, 2022 at 4:54:17 AM UTC-7 Kentaro Hara wrote:I'd suggest you run blink_perf.bindings.* and Speedometer2.