mksnapshot fails on windows with is_official_build = true, is_component_build = true, use_custom_libcxx = false

Jean-Claude Monnin

Aug 3, 2023, 12:54:46 PM
to v8-u...@googlegroups.com
Hi,

On Windows, the V8 11.5 build fails when generating the snapshot, with the following error:

C:/Users/jean-claude/Documents/src/google/depot_tools/bootstrap-2@3_8_10_chromium_26_bin/python3/bin/python3.exe ../../tools/run.py ./mksnapshot --turbo_instruction_scheduling --target_os=win --target_arch=x64 --embedded_src gen/embedded.S --embedded_variant Default --random-seed 314159265 --startup_blob snapshot_blob.bin --no-native-code-counters
Return code is 2147483651
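
For anyone interpreting the return code: written in hexadecimal, 2147483651 is 0x80000003, which is the value Windows defines for STATUS_BREAKPOINT. That would be consistent with the process hitting a failed check rather than exiting normally, although this reading is an assumption and is not confirmed by the log itself. A minimal sketch of the conversion (the helper `ToHex` is a made-up name for illustration):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Value reported by tools/run.py in the failing build step.
constexpr std::uint32_t kReturnCode = 2147483651u;

// NTSTATUS value that Windows documents as STATUS_BREAKPOINT.
constexpr std::uint32_t kStatusBreakpoint = 0x80000003u;

static_assert(kReturnCode == kStatusBreakpoint,
              "the decimal return code equals 0x80000003");

// Hypothetical helper: formats a 32-bit code as 0xXXXXXXXX.
std::string ToHex(std::uint32_t code) {
  char buffer[11];  // "0x" + 8 hex digits + terminating NUL
  std::snprintf(buffer, sizeof(buffer), "0x%08X", static_cast<unsigned>(code));
  return buffer;
}
```

Calling `ToHex(kReturnCode)` yields the hexadecimal form that can be looked up in the NTSTATUS tables.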

These are the options used (args.gn):
is_official_build = true
target_cpu = "x64"
is_component_build = true
use_custom_libcxx = false
chrome_pgo_phase = false
treat_warnings_as_errors = false
fatal_linker_warnings = false
symbol_level = 0

When using `is_debug = false` instead of `is_official_build = true`, it builds fine, but it comes with performance regressions compared to the older 9.3 build with `is_official_build = true`.

If I use either `is_component_build = false` or `use_custom_libcxx = true`, it builds fine too. However, that's not really an option: I need a DLL build, and I need to use Microsoft's C++ standard library because third-party dependencies prevent us from using libc++.

I also tried version 11.4 and 11.6 and they give the same error.

Any hints on how to diagnose or fix this would be appreciated.

Auxiliary question: Is any big project using `use_custom_libcxx = false` (i.e. Microsoft's C++ standard library), or is this untested? Do Chrome, Node, and Deno all use libc++?

Best regards,
Jean-Claude

Jakob Gruber

Aug 8, 2023, 6:14:09 AM
to v8-u...@googlegroups.com
Hi Jean-Claude,

no, we don't have a lot of test coverage for `use_custom_libcxx=false`, this mode is only supported on a best-effort basis.

For debugging: a backtrace and symbols would be useful. Does running `mksnapshot` in a debugger give more information? Also, a bisect to find the culprit change would be very helpful.

Jean-Claude Monnin

Aug 8, 2023, 9:19:58 AM
to v8-u...@googlegroups.com
Hi Jakob,

Thanks for your reply.
It looks like using Microsoft's C++ library instead of libc++ is somewhat exotic for v8. Unfortunately there are cases where it's almost impossible to switch to libc++.

Since I have a chance to get some feedback here on how to address this issue, I'm going to try to give you as much info as possible.

`mksnapshot.exe` aborts in `VirtualMemoryCage::InitReservation` at the following check:
  CHECK(IsAligned(params.reservation_size, allocate_page_size));

When adding the following print on the line before
  i::PrintF(stdout, "VirtualMemoryCage::InitReservation %u %u\n", params.reservation_size, allocate_page_size);
it prints
  VirtualMemoryCage::InitReservation 3356617664 65536
It looks like the supplied `params.reservation_size` is not aligned.
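
To make the failing condition concrete, here is a small sketch of the alignment test applied to the two printed values. `IsAlignedSketch` is a stand-in written for this example, not V8's `IsAligned`, and it assumes the alignment is a power of two. One caveat: `%u` formats only 32 bits, so on a 64-bit build the real `size_t` may have had further high bits set. The low 16 bits are enough to show the misalignment either way.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for an alignment check; assumes `alignment` is a power of two.
constexpr bool IsAlignedSketch(std::uint64_t value, std::uint64_t alignment) {
  return (value & (alignment - 1)) == 0;
}

// Both values are taken from the printed output.
constexpr std::uint64_t kPrintedReservationSize = 3356617664ull;
constexpr std::uint64_t kAllocatePageSize = 65536ull;

// 3356617664 is 0xC811EBC0; the low 16 bits (0xEBC0) are non-zero.
static_assert(kPrintedReservationSize == 0xC811EBC0ull,
              "hexadecimal form of the printed value");
static_assert(!IsAlignedSketch(kPrintedReservationSize, kAllocatePageSize),
              "the printed size is not a multiple of 64 KiB");

// For comparison, a round size such as 512 MiB is 64 KiB aligned.
static_assert(IsAlignedSketch(512ull * 1024 * 1024, kAllocatePageSize),
              "512 MiB is a multiple of 64 KiB");
```

The remainder `3356617664 % 65536` is 60352, so the value is not even close to a page multiple, which points towards a garbage value rather than an off-by-one rounding problem.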

The full call stack is in the screenshot below (sorry for the screenshot, I couldn't find a way to copy text from WinDbg).
I'm happy to investigate further, but wanted to send this out in case there is anything specific that would be helpful.

Jean-Claude

Jean-Claude Monnin

Aug 8, 2023, 5:21:07 PM
to v8-u...@googlegroups.com
Hi Jakob,

The issue started with 11.0 (10.9 is good). More precisely it's commit 26bc8bb4 (see [1]).
Unfortunately, without knowledge of the v8 internals, it's really hard to find a direct link in this commit to the place mksnapshot exits and debug further. It would be fantastic if I could get some more hints or a patch to try (even if untested).

Thanks,
Jean-Claude

--------------------------
[1] git bisect output

C:\Users\jean-claude\Documents\src\google\v8>git bisect good
26bc8bb4013a984d9e7a3e8feff8b1058458f349 is the first bad commit
commit 26bc8bb4013a984d9e7a3e8feff8b1058458f349
Author: Leszek Swirski <les...@chromium.org>
Date:   Wed Nov 23 15:06:55 2022 +0100

    [ext-code-space] Make process-wide code range leaky

    Make the process-wide code range a once-initialised leaky object, rather
    than having a global weak_ptr + per-heap shared pointers and allowing it
    to be collected when all Isolates die.

    These weak pointers add locking overhead when accessing the code range,
    which shows up in GC and deoptimization traces when attempting to
    calculate Code objects from PCs. The process-wide pointer compression
    cage is already leaky, so it makes sense for the code range to be
    similar.

    Bug: v8:11460

    Change-Id: Ibebd468ebad9eafe8aec49f575cdbf604e4b6cc0
    Reviewed-by: Igor Sheludko <ish...@chromium.org>
    Reviewed-by: Michael Lippautz <mlip...@chromium.org>
    Commit-Queue: Leszek Swirski <les...@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#84462}

src/execution/isolate.cc               |  3 +-
src/heap/code-range.cc                 | 61 ++++++++++++++++------------------
src/heap/code-range.h                  |  7 ++--
src/heap/heap.cc                       | 28 ++++++++--------
src/heap/heap.h                        | 14 ++++++--
src/init/isolate-allocator.cc          |  3 +-
src/objects/code.cc                    |  2 +-
src/snapshot/embedded/embedded-data.cc |  2 +-
src/snapshot/embedded/embedded-data.h  |  2 +-
9 files changed, 64 insertions(+), 58 deletions(-)

Jakob Gruber

Aug 9, 2023, 2:04:11 AM
to v8-u...@googlegroups.com
On Tue, Aug 8, 2023 at 3:20 PM Jean-Claude Monnin <jc_m...@emailplus.org> wrote:
> Hi Jakob,
>
> Thanks for your reply.
> It looks like using Microsoft's C++ library instead of libc++ is somewhat exotic for v8. Unfortunately there are cases where it's almost impossible to switch to libc++.
>
> Since I have a chance to get some feedback here of how to address this issue, I'm going to try to give you as much info as possible.
>
> `mksnapshot.exe` aborts at `VirtualMemoryCage::InitReservation` at following check:
>   CHECK(IsAligned(params.reservation_size, allocate_page_size));
>
> When adding following print on the line before
>   i::PrintF(stdout, "VirtualMemoryCage::InitReservation %u %u\n", params.reservation_size, allocate_page_size);
> it prints
>   VirtualMemoryCage::InitReservation 3356617664 65536
> It looks like the supplied `params.reservation_size` is not aligned.

Thanks for the investigation, very helpful. I wonder where that reservation_size comes from. It doesn't look like any value we'd set in V8. Corrupted? Uninitialized?

I'd expect it to be set by mksnapshot here and picked up by isolate initialization here. There it should either be some reasonable aligned value, or 0 and we'd fall back to kMaximalCodeRangeSize.
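
The expected behaviour described here could be sketched roughly as follows. This only illustrates the described fallback and is not V8's code: `ResolveCodeRangeSize` and the value of `kAssumedMaximalCodeRangeSize` are assumptions made for the example, and the real `kMaximalCodeRangeSize` depends on platform and configuration.

```cpp
#include <cassert>
#include <cstddef>

// Placeholder only; the real kMaximalCodeRangeSize is platform dependent.
constexpr std::size_t kAssumedMaximalCodeRangeSize =
    std::size_t{128} * 1024 * 1024;

// Hypothetical helper: a requested size of 0 means "use the default".
constexpr std::size_t ResolveCodeRangeSize(std::size_t requested) {
  return requested == 0 ? kAssumedMaximalCodeRangeSize : requested;
}

static_assert(ResolveCodeRangeSize(0) == kAssumedMaximalCodeRangeSize,
              "zero falls back to the maximum");
```

Under this model, neither branch can produce an unaligned value such as the one printed, which is why corruption or an uninitialized read looks more likely than a bad configuration.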
 

Jean-Claude Monnin

Aug 9, 2023, 4:05:12 AM
to v8-u...@googlegroups.com
I've tried to figure out a bit more what is going on by adding prints along the call stack. It looks like the `base::CallOnce` in `code-range.cc`, introduced in commit 26bc8bb4, is the problem. Here is the code with the added prints:
V8_DECLARE_ONCE(init_code_range_once);
void InitProcessWideCodeRange(v8::PageAllocator* page_allocator,
                              size_t requested_size) {
  i::PrintF(stdout, "InitProcessWideCodeRange %u\n", requested_size);
  CodeRange* code_range = new CodeRange();
  if (!code_range->InitReservation(page_allocator, requested_size)) {
    V8::FatalProcessOutOfMemory(
        nullptr, "Failed to reserve virtual memory for CodeRange");
  }
  process_wide_code_range_ = code_range;
#ifdef V8_EXTERNAL_CODE_SPACE
#ifdef V8_COMPRESS_POINTERS_IN_SHARED_CAGE
  ExternalCodeCompressionScheme::InitBase(
      ExternalCodeCompressionScheme::PrepareCageBaseAddress(
          code_range->base()));
#endif  // V8_COMPRESS_POINTERS_IN_SHARED_CAGE
#endif  // V8_EXTERNAL_CODE_SPACE
}
}  // namespace

// static
CodeRange* CodeRange::EnsureProcessWideCodeRange(
    v8::PageAllocator* page_allocator, size_t requested_size) {
  i::PrintF(stdout, "CodeRange::EnsureProcessWideCodeRange %u\n", requested_size);
  base::CallOnce(&init_code_range_once, InitProcessWideCodeRange,
                 page_allocator, requested_size);
  return process_wide_code_range_;
}

It outputs:
CodeRange::EnsureProcessWideCodeRange 536870912
InitProcessWideCodeRange 2034756544

It looks like the `requested_size` isn't forwarded correctly in `base::CallOnce`.
I'm not sure I understand the CallOnce implementation, but I wonder whether calling a `std::function<void()>` with `init_func(args...)` is undefined behavior. I'm not sure how to fix or work around it.

Jakob Gruber

Aug 9, 2023, 4:32:39 AM
to v8-u...@googlegroups.com
Which part would be undefined behavior? From a quick glance, the CallOnce implementation looks reasonable to me. It sounds like something around the lambda capture-by-copy is buggy (accessing the wrong stack slot?). Perhaps you can verify with disassembly of InitProcessWideCodeRange, or step through and find out what it's actually copying.
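
For readers following along, the pattern under discussion can be sketched like this: a variadic overload copies its arguments into a lambda, and the lambda is type-erased into a `std::function<void()>` before being handed to a non-template implementation function. This is a simplified stand-in, not V8's `once.h`: the namespace `once_sketch`, the `OnceType` struct and the `RecordSize` initializer are all invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>

namespace once_sketch {

// Simplified once-state: a done flag plus a mutex for the slow path.
struct OnceType {
  std::atomic<bool> done{false};
  std::mutex mutex;
};

// Non-template implementation; in a component build a function like this
// is the part that would live behind the DLL boundary.
void CallOnceImpl(OnceType* once, std::function<void()> init_func) {
  if (once->done.load(std::memory_order_acquire)) return;
  std::lock_guard<std::mutex> lock(once->mutex);
  if (!once->done.load(std::memory_order_relaxed)) {
    init_func();
    once->done.store(true, std::memory_order_release);
  }
}

// Keeps Args out of deduction for the function-pointer parameter, so the
// pack is deduced from the trailing arguments only.
template <typename... Args>
struct FunctionWithArgs {
  using type = void (*)(Args...);
};

// Variadic overload: captures the arguments by copy and type-erases them.
template <typename... Args>
void CallOnce(OnceType* once,
              typename FunctionWithArgs<Args...>::type init_func,
              Args... args) {
  CallOnceImpl(once, [=]() { init_func(args...); });
}

// Example initializer that records what it was called with.
std::size_t g_seen_size = 0;
int g_init_calls = 0;
void RecordSize(std::size_t requested_size) {
  g_seen_size = requested_size;
  ++g_init_calls;
}

}  // namespace once_sketch
```

Within a single binary, calling `CallOnce(&once, RecordSize, std::size_t{536870912})` should record exactly that value, and a second call should be a no-op. A different value arriving in the initializer therefore suggests that the `std::function` object itself is being misread somewhere between caller and callee.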
 

Jean-Claude Monnin

Aug 9, 2023, 12:05:27 PM
to v8-u...@googlegroups.com
Hi Jakob,

Sorry, I should have looked at the code more carefully before sending my reply. The lambda capture-by-copy of the variadic template arguments being buggy would be one explanation, but it looks like it's something else.

I tried the following workaround:
CodeRange* CodeRange::EnsureProcessWideCodeRange(
    v8::PageAllocator* page_allocator, size_t requested_size) {
  base::CallOnce(&init_code_range_once, [&]() {
    InitProcessWideCodeRange(page_allocator, requested_size);
  });
  return process_wide_code_range_;
}
This uses the overload taking the `std::function<void()>` directly. It crashes in `CallOnceImpl` at the call to `init_func()`.
I also tried calling `CallOnceImpl` directly, with the same crash.

It looks like the `std::function` is corrupted in `CallOnceImpl`. It works when supplying a static function with no arguments, but any lambda crashes. Using simpler test code, it seems to be linked to passing a `std::function` across a DLL boundary. It works when calling a template/inline function, but calling a function exported by the DLL doesn't work (see [1]).
I can't see why passing a `std::function` to a DLL call is problematic. I tested the same test code on another project that uses the MSVC compiler and it works fine there. It's not impossible that it's a clang issue. Unfortunately, this looks too complicated for me to look into further (I'm not familiar with assembly and such low-level stuff).

I tried this ugly workaround, since I can pass static functions, which allows the build to pass. d8.exe launches, but tests fail.
v8::PageAllocator* g_page_allocator;
size_t g_requested_size;

static void InitProcessWideCodeRangeNoArgs() {
  InitProcessWideCodeRange(g_page_allocator, g_requested_size);
}

// static
CodeRange* CodeRange::EnsureProcessWideCodeRange(
    v8::PageAllocator* page_allocator, size_t requested_size) {
  static base::Mutex mx;
  base::MutexGuard guard(&mx);
  g_page_allocator = page_allocator;
  g_requested_size = requested_size;
  base::CallOnce(&init_code_range_once, InitProcessWideCodeRangeNoArgs);
  return process_wide_code_range_;
}
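
A variation on the same idea that avoids the mutable globals is to pass a plain function pointer together with a `void*` context, so that no `std::function` object crosses the exported API. Everything below is hypothetical (`ctx_sketch`, `CallOnceWithContext`, `CodeRangeArgs`), and it has not been tested against V8 or across a real DLL boundary; the sketch only shows the calling convention inside a single translation unit.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

namespace ctx_sketch {

// Plain C-style callback: only raw pointers are passed by value.
using InitCallback = void (*)(void* context);

struct OnceState {
  std::atomic<bool> done{false};
  std::mutex mutex;
};

// Imagined exported entry point. Runs the callback at most once.
void CallOnceWithContext(OnceState* once, InitCallback callback,
                         void* context) {
  if (once->done.load(std::memory_order_acquire)) return;
  std::lock_guard<std::mutex> lock(once->mutex);
  if (!once->done.load(std::memory_order_relaxed)) {
    callback(context);
    once->done.store(true, std::memory_order_release);
  }
}

// Caller side: bundle the arguments instead of storing them in globals.
struct CodeRangeArgs {
  void* page_allocator;        // stands in for v8::PageAllocator*
  std::size_t requested_size;
  std::size_t observed_size;   // written by the callback, for demonstration
};

// Static function with a fixed signature, unpacking the context.
void InitFromContext(void* context) {
  auto* args = static_cast<CodeRangeArgs*>(context);
  args->observed_size = args->requested_size;
}

}  // namespace ctx_sketch
```

The context only needs to stay alive for the duration of the call, so a stack-allocated `CodeRangeArgs` in the caller is sufficient and the extra mutex around the globals is no longer needed.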

Regards,
Jean-Claude

-------------------
[1] Test to reproduce the issue of passing a std::function across dll boundaries:

src/base/once.h
inline void Test1(std::function<void()> init_func) {
  printf("Test1 before init_func\n");
  init_func();
  printf("Test1 after init_func\n");
}

V8_BASE_EXPORT void Test2(std::function<void()> init_func);

src/base/once.cc
void Test2(std::function<void()> init_func) {
  printf("Test2 before init_func\n");
  init_func();
  printf("Test2 after init_func\n");
}

mksnapshot.cc
void TestFunction() {
  printf("  called TestFunction\n");
}

int main(int argc, char** argv) {
  v8::base::Test1([]() {
    printf("  called CallOnceFake1 lambda\n");
  });

  v8::base::Test2(TestFunction);

  v8::base::Test2([]() {
    printf("  called CallOnceFake2 lambda\n");
  });

The output is
Test1 before init_func
  called CallOnceFake1 lambda
Test1 after init_func
Test2 before init_func
  called TestFunction
Test2 after init_func
Test2 before init_func
The call to `Test2` with a lambda crashes before the lambda body runs.

Jakob Gruber

Aug 10, 2023, 2:10:41 AM
to v8-u...@googlegroups.com
Interesting.. Sorry, I don't have any more suggestions either. Depending on how much time you still have to spend on this, please consider either filing a bug at crbug.com/v8/new to document your current findings, or trying to build a minimal repro (without V8). I'd still like to understand if this is a bug in V8 or in the MSVC stdlib.

Jean-Claude Monnin

Aug 10, 2023, 5:32:35 AM
to v8-u...@googlegroups.com
Hi Jakob,

I made a minimal repro, but I can't reproduce the issue with it (using V8's clang-cl compiler to compile it).
I can't spend more time on it right now; we will be sticking to 10.9 for now. I might do further investigations at a later time.

Thanks for your input.

Regards,
Jean-Claude