Reducing dependencies on glibc internals in sandbox/linux/services/namespace_sandbox.cc, ForkWithFlags

152 views
Skip to first unread message

Florian Weimer

unread,
Jan 3, 2025, 9:09:25 PMJan 3
to chromi...@chromium.org
With these glibc changes

commit 7c22dcda27743658b6b8ea479283b384ad56bd5a
Author: Florian Weimer <fwe...@redhat.com>
Date: Tue Dec 17 09:20:20 2024 +0100

nptl: More useful padding in struct pthread

The previous use of padding within a union made it impossible to
re-use the padding for GLIBC_PRIVATE ABI preservation because
tcbhead_t could use up all of the padding (as was historically the
case on x86-64). Allocating padding unconditionally addresses this
issue.

Reviewed-by: Adhemerval Zanella <adhemerva...@linaro.org>

commit 30d3fd7f4f4bc8f767d73ad4e4b005c1bd234310
Author: Florian Weimer <fwe...@redhat.com>
Date: Thu Dec 19 20:56:44 2024 +0100

x86_64: Remove unused padding from tcbhead_t

This padding is difficult to use for preserving the internal
GLIBC_PRIVATE ABI. The comment is misleading. Current Address
Sanitizer uses heuristics to determine struct pthread size.
It does not depend on its precise layout. It merely scans for
pointers allocated using malloc.

Due to the removal of the padding, the assert for its start
is no longer required.

Reviewed-by: Noah Goldstein <goldst...@gmail.com>

the glibc thread descriptor layout changes significantly, and the code
in MaybeUpdateGlibcTidCache no longer works and results in crashes.

I think we can update the TID cache in the TCB in clone if !CLONE_VM on
the glibc side. This means that the TID cache check via the recursive
mutex succeeds, and the patching is skipped. This should store
compatibility with existing Chromium binaries. A first patch is here:

x86_64: Reset TID in TCB on clone that does not use CLONE_VM
<https://inbox.sourceware.org/libc-alpha/87jzbbl...@oldenburg.str.redhat.com/>

However, the longjmp out of the clone callback remains. I'm not sure if
this is something we will be able to support in the long term,
particularly not with shadow stacks. If we provide a clone_fork function
that does not involve a callback, would you be able to use that?

Are there any other dependencies on glibc internals in this area?

Thanks,
Florian

Tom Anderson

unread,
Jan 7, 2025, 2:48:57 PMJan 7
to fwe...@redhat.com, chromi...@chromium.org, Lei Zhang
Hi Florian, thanks for making the CLONE_VM change to unbreak Chromium.

I'm assuming the longjmp() you're referring to is the one in base/process/launch_posix.cc.  If you provide clone_fork(), we can definitely use it when available.

I would expect most glibc-specific code to be guarded with defined(LIBC_GLIBC).  It appears outside of testing, it occurs in these places:
base/stack_canary_linux.cc
base/process/memory_linux.cc
sandbox/linux/services/namespace_sandbox.cc
sandbox/linux/seccomp-bpf-helpers/syscall_parameters_restrictions.cc

--
--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
    http://groups.google.com/a/chromium.org/group/chromium-dev
---
You received this message because you are subscribed to the Google Groups "Chromium-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chromium-dev...@chromium.org.
To view this discussion visit https://groups.google.com/a/chromium.org/d/msgid/chromium-dev/87bjwnlw8x.fsf%40oldenburg.str.redhat.com.

Florian Weimer

unread,
Jan 8, 2025, 11:08:07 AMJan 8
to Tom Anderson, chromi...@chromium.org, Lei Zhang
* Tom Anderson:

> Hi Florian, thanks for making the CLONE_VM change to unbreak Chromium.

We have reverted the change for the upcoming release because my
workaround was specific to x86-64. We'll bring this back in a more
comprehensive form later this year (still with the workaround/TCB
patching in clone to benefit unchanged Chromium).

> I'm assuming the longjmp() you're referring to is the one in
> base/process/launch_posix.cc. If you provide clone_fork(), we can
> definitely use it when available.

Correct. Good to know that you could use it if it is available
(probably probing for it using dlsym or weak symbols).

> I would expect most glibc-specific code to be guarded with defined(LIBC_GLIBC). It appears outside of
> testing, it occurs in these places:
> base/stack_canary_linux.cc
> base/process/memory_linux.cc
> sandbox/linux/services/namespace_sandbox.cc
> sandbox/linux/seccomp-bpf-helpers/syscall_parameters_restrictions.cc
> Refs in code search: https://source.chromium.org/search?q=%22defined
> (libc_glibc)%22&sq=&ss=chromium

The one that sticks out is the stack protector canary patching:

NO_STACK_PROTECTOR void ResetStackCanaryIfPossible() {
uintptr_t canary;
base::RandBytes(base::byte_span_from_ref(canary));
// First byte should be the null byte for string functions.
canary &= ~static_cast<uintptr_t>(0xff);

// The x86/x64 offsets should work for musl too.
#if defined(ARCH_CPU_X86_64)
asm volatile("movq %q0,%%fs:%P1" : : "er"(canary), "i"(0x28));
#elif defined(ARCH_CPU_X86)
asm volatile("movl %0,%%gs:%P1" : : "ir"(canary), "i"(0x14));
#elif defined(ARCH_CPU_ARM_FAMILY)
// ARM's stack canary is held on a relro page. So, we'll need to make the page
// writable, change the stack canary, and then make the page ro again.
// We want to be single-threaded when changing page permissions, since it's
// reasonable for other threads to assume that page permissions for global
// variables don't change.
size_t page_size = base::GetPageSize();
uintptr_t __stack_chk_guard_page = base::bits::AlignDown(
reinterpret_cast<uintptr_t>(&__stack_chk_guard), page_size);
PCHECK(0 == mprotect(reinterpret_cast<void*>(__stack_chk_guard_page),
page_size, PROT_READ | PROT_WRITE));
__stack_chk_guard = canary;
PCHECK(0 == mprotect(reinterpret_cast<void*>(__stack_chk_guard_page),
page_size, PROT_READ));
#endif
}

The TCB updates are probably fine because the canary location is
obviously part of the ABI (even if undocumented), and we have
application-writable (also undocumented) ABI fields next to it on x86.

However, the Arm part is really iffy, for two reasons. Once we turn on
memory sealing for ld.so, the first mprotect call will fall, crashing
the process. It's also missing a check whether the memory is writable
before the patching. Until recently, some AArch64 binaries relied on
the dynamic linker rounding down the end of the RELRO area, so even if
glibc is built with RELRO active to the extend that the canary is placed
in a RELRO section (which is certainly the case for most builds), it's
conceivable that it ends in that tail that does not actually get
protected by the dynamic linker, on a page shared with things that must
not be made read-only.

The third category I saw is some code for dealing with the historic
glibc malloc hooks and __libc_malloc. I don't quite understand what
this is supposed to do. The comment about redirecting internal glibc
malloc calls seems to have been written with the a.out object code
format in mind.

Thanks,
Florian

Reply all
Reply to author
Forward
0 new messages