It's quite messy here. What was originally a NaCl-only edge case has become a requirement for v8's virtual address space management.
Right now, we're internally quite inconsistent about when and how we round size when creating shared memory.
On POSIX, we do not align size in //base. gin::ArrayBufferSharedMemoryMapper does align though, so strictly speaking, we end up mmap()ing more memory than we originally asked for. But this works out on all supported platforms, because POSIX manages things on page boundaries: even if we only ask for 100 bytes, it's OK to try to mmap() 4096 bytes later.
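For illustration, here is a minimal sketch of the POSIX case, not the actual //base or gin code. `AlignUp` and `MapRoundedUp` are hypothetical names, and a temporary file stands in for the shared memory region.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdio>

// Rounds `size` up to the next multiple of `alignment` (a power of two).
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: sizes a backing file to exactly `requested` bytes
// (no alignment, like //base on POSIX), then maps it with the size rounded
// up to the page size (like the mapper does). Returns true if the mapping
// succeeds and the last requested byte is writable and readable.
bool MapRoundedUp(std::size_t requested) {
  const std::size_t page_size =
      static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  std::FILE* file = std::tmpfile();  // Stand-in for a shared memory region.
  if (!file) return false;
  const int fd = fileno(file);

  bool ok = false;
  if (ftruncate(fd, static_cast<off_t>(requested)) == 0) {
    const std::size_t mapped_size = AlignUp(requested, page_size);
    void* addr = mmap(nullptr, mapped_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (addr != MAP_FAILED) {
      char* bytes = static_cast<char*>(addr);
      bytes[requested - 1] = 'x';
      ok = bytes[requested - 1] == 'x';
      munmap(addr, mapped_size);
    }
  }
  std::fclose(file);
  return ok;
}
```

With a 4096-byte page, `MapRoundedUp(100)` backs the file with 100 bytes and maps 4096, which is the "ask for 100, map 4096 later" case described above.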
On Windows, we currently align to 64K boundaries. But this isn't actually required by any of the Windows APIs, as far as I can tell: CreateFileMapping(), MapViewOfFile(), and friends don't make any mention of alignment requirements for `size`. If I delete the alignment in //base, pretty much all the tests still pass, except the ones that go through v8's custom shared memory mapper.
The root cause for this turns out to be related to the MSDN comment about mapping a view:
The number of bytes of a file mapping to map to the view. All bytes must be within the maximum size specified by CreateFileMapping. If this parameter is 0 (zero), the mapping extends from the specified offset to the end of the file mapping.
If we do not align in //base, but we still align to 64K in v8's shmem mapper, Windows does *not* like it if we ask for 100 bytes initially but then try to map in 65536 bytes later: IIRC, a call to `VirtualAlloc()` (or maybe `VirtualAlloc2()`?) fails and complains about invalid parameters.
What does work, even without //base alignment, is if v8's shmem mapper only aligns to page size, i.e. 4K.
To me, this is an (indirect) indication that Windows can manage memory on a page size boundary; it's just that the system allocator itself hands out things at 64K boundaries.
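The arithmetic behind the two Windows cases can be sketched as follows. This is only a model of the behavior observed above, not documented Windows semantics. `RoundUpTo` and `ViewFitsInSection` are hypothetical names. The constants are assumed typical values; real code would read `dwPageSize` and `dwAllocationGranularity` from `GetSystemInfo()`.

```cpp
#include <cstddef>

// Assumed typical values on Windows; query GetSystemInfo() in real code.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kAllocationGranularity = 65536;

// Rounds `size` up to the next multiple of `alignment` (a power of two).
constexpr std::size_t RoundUpTo(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical model: assume the section is backed in whole pages, so a view
// fits only if it does not extend past the section size rounded up to a page.
constexpr bool ViewFitsInSection(std::size_t section_size,
                                 std::size_t view_size) {
  return view_size <= RoundUpTo(section_size, kPageSize);
}

// A 100-byte section (no //base alignment), mapped with page-size alignment.
static_assert(ViewFitsInSection(100, RoundUpTo(100, kPageSize)),
              "page-aligned view fits");
// The same section, mapped with allocation-granularity alignment.
static_assert(!ViewFitsInSection(100, RoundUpTo(100, kAllocationGranularity)),
              "64K-aligned view exceeds the section");
```

Under this model, the 64K request fails only because it overshoots the page-rounded section size. That matches the MSDN requirement that all mapped bytes lie within the maximum size given to CreateFileMapping.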
What I can't figure out is if v8 has a hard dependency on aligning to allocation granularity (i.e. there's some Windows API I missed somewhere that requires this), or if we can simplify the code and remove the need to consider allocation granularity at all.