Sharing Memory between Host and WebAssembly Module

Immanuel Haffner

Jan 9, 2020, 6:56:49 AM
to v8-dev
Hi everyone,

I am JIT-compiling programs written in a high-level language to WebAssembly modules, and I want to run them in an embedded V8. I have succeeded in embedding V8, creating an isolate, and, with some JS glue code, instantiating a WebAssembly module and invoking an exported function. Now I am facing the problem that the WebAssembly code must access memory of the host. For example, after the host allocates an integer array, I want to map the underlying memory of the array into the linear memory of the WebAssembly module so that the module can directly access and modify it. The broader idea is that the host manages the data structures and does the memory management, and the WASM modules use these data structures.

Here is an excerpt of what I have done so far:

    /* Create WASM module. */
    auto v8_wasm_module = v8::WasmModuleObject::DeserializeOrCompile(
        /* isolate=           */ isolate_,
        /* serialized_module= */ { nullptr, 0 },
        /* wire_bytes=        */ v8::MemorySpan<const uint8_t>(binary_addr, binary_size)
    ).ToLocalChecked();

    /* Get the global WebAssembly namespace object. */
    auto web_assembly_class = context->Global()->Get(context, V8STR("WebAssembly")).ToLocalChecked().As<v8::Object>();

    /* Create a WebAssembly memory object. */
    auto wasm_memory_class = web_assembly_class->Get(context, V8STR("Memory")).ToLocalChecked().As<v8::Object>();
    auto memory_params_object = v8::Object::New(isolate_);
    memory_params_object->Set(context, V8STR("initial"), v8::Int32::New(isolate_, 1));
    memory_params_object->Set(context, V8STR("maximum"), v8::Int32::New(isolate_, 256));
    v8::Local<v8::Value> memory_args[] = { memory_params_object };
    auto wasm_memory = wasm_memory_class->CallAsConstructor(context, 1, memory_args).ToLocalChecked().As<v8::Object>();

    /* Allocate and initialize host memory. */
    auto host_memory = new int32_t[1024];
    for (int32_t i = 0; i != 1024; ++i)
        host_memory[i] = i;
    auto store = v8::ArrayBuffer::NewBackingStore(host_memory, 1024 * sizeof(int32_t), [](void* data, size_t, void*) { delete[] reinterpret_cast<int32_t*>(data); }, nullptr);
    auto buffer = v8::ArrayBuffer::New(isolate_, std::move(store));

    /* Replace the WebAssembly.Memory's underlying ArrayBuffer. FIXME: Not working as intended! */
    wasm_memory->Set(context, V8STR("buffer"), buffer);

    /* Create the import object for instantiating the WebAssembly module. */
    auto host_object = v8::Object::New(isolate_);
    host_object->Set(context, V8STR("mem"), wasm_memory);
    auto import_object = v8::Object::New(isolate_);
    import_object->Set(context, V8STR("env"), host_object);

    /* Instantiate the WebAssembly module (assumed step; the excerpt does not show it). */
    auto wasm_instance_class = web_assembly_class->Get(context, V8STR("Instance")).ToLocalChecked().As<v8::Object>();
    v8::Local<v8::Value> instance_args[] = { v8_wasm_module, import_object };
    auto instance = wasm_instance_class->CallAsConstructor(context, 2, instance_args).ToLocalChecked().As<v8::Object>();

    /* Get the exports of the created WebAssembly instance. */
    auto exports = instance->Get(context, V8STR("exports")).ToLocalChecked().As<v8::Object>();

    /* Get exported function `run` from the exports. */
    auto run = exports->Get(context, V8STR("run")).ToLocalChecked().As<v8::Function>();

    /* Invoke the exported function `run` of the module without arguments. */
    auto result = run->Call(context, context->Global(), 0, nullptr).ToLocalChecked().As<v8::Object>();

The FIXME shows what I intend to do: create a WebAssembly.Memory instance and then replace its underlying ArrayBuffer with one that wraps memory in the host (using NewBackingStore()). This WebAssembly.Memory object is then imported into the module when it is instantiated.

Is there any way of mapping memory of the host into a WebAssembly linear memory? Copying data is not an option.

Thanks in advance.
Kind regards,
Immanuel

Clemens Backes

Jan 15, 2020, 10:34:48 AM
to v8-dev
Hi Immanuel,

You cannot create a WebAssembly.Memory from an arbitrary ArrayBuffer, or replace the underlying ArrayBuffer later, because the allocation backing the wasm memory has special requirements (guard regions around it to catch out-of-bounds accesses from wasm).
What you can do instead is access the wasm memory buffer from C++ directly, i.e. do it the other way around. You get the ArrayBuffer backing the wasm memory via the "buffer" accessor, and then get a pointer to the underlying backing store via v8::ArrayBuffer::GetContents() -> Contents::Data().

Not sure if that helps, since you mentioned that copying data is not an option.

Cheers,
Clemens


--
--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/f2ddf715-f8a6-47a8-b872-f77312b0e4c5%40googlegroups.com.


--

Clemens Backes

Software Engineer

clem...@google.com


Google Germany GmbH

Erika-Mann-Straße 33

80636 München


Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg


Diese E-Mail ist vertraulich. Falls sie diese fälschlicherweise erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes weiter, löschen Sie alle Kopien und Anhänge davon und lassen Sie mich bitte wissen, dass die E-Mail an die falsche Person gesendet wurde.


This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.




Immanuel Haffner

Jan 15, 2020, 11:04:57 AM
to v8-dev
Hi Clemens,

Thanks for your answer. I already had that thought. But copying data is just not an option.

To make my scenario more concrete:
I have data in main memory, easily tens or hundreds of gigabytes. Programs fly in that require access to the data. Some programs are short-lived, some will take a long time to compute. Some programs are executed only once, some are executed repeatedly. So I will have to JIT-compile many programs to WASM modules. Copying the data into the memory of each module would totally kill performance.

Another idea that comes to my mind is to create a single WASM memory object and implement a designated allocator to allocate memory for host objects in that WASM memory. Then I could import this memory object into the modules... Would that be possible? How would the guard pages affect this approach? (I am thinking of jemalloc on WASM memory.)

I understand that the problem I am proposing is somewhat in contradiction with the WebAssembly idea of sandboxed modules. Still, I believe solving this would make WebAssembly an attractive choice for JIT-compiling general-purpose languages.

Regards,
Immanuel

Clemens Backes

Jan 15, 2020, 11:37:26 AM
to v8-dev
On Wed, Jan 15, 2020 at 5:05 PM Immanuel Haffner <haffner....@gmail.com> wrote:
> Another idea that comes to my mind is to create a single WASM memory object and implement a designated allocator to allocate memory for host objects in that WASM memory. Then I could import this memory object into the modules... Would that be possible? How would the guard pages affect this approach? (I am thinking of jemalloc on WASM memory.)

Sharing wasm memory across modules works generally. You could have one module that provides some kind of library to the other programs for allocating and deallocating memory, and accessing shared state.
You would have to trust all programs though not to mess with the shared memory. A memory bug in one program could easily cause failures in others.
Also, most programs use parts of the memory for their stack and for global variables. You have to make sure then to malloc all of this memory so there are no conflicts between programs.
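To make the shared-memory idea concrete, here is a minimal sketch of an allocator that hands out offsets into a single linear memory and keeps a prefix reserved for stacks and globals. It is plain C++ with no V8 involved; the class name `LinearMemoryAllocator`, the `reserved` prefix, and the bump-pointer strategy are assumptions made for illustration (the thread itself only mentions jemalloc as one possibility).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical bump allocator over one shared linear memory.
// It returns offsets (wasm "addresses"), not host pointers.
class LinearMemoryAllocator {
public:
    // `capacity` is the size of the linear memory in bytes.
    // The first `reserved` bytes are never handed out, so that stacks and
    // globals placed there cannot collide with allocated host objects.
    LinearMemoryAllocator(std::size_t capacity, std::size_t reserved)
        : capacity_(capacity), next_(reserved) {}

    // Returns the offset of `size` bytes aligned to `align` (a power of two),
    // or std::nullopt if the linear memory is exhausted.
    std::optional<std::uint32_t> allocate(std::size_t size, std::size_t align = 8) {
        const std::size_t start = (next_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start)
            return std::nullopt;
        next_ = start + size;
        return static_cast<std::uint32_t>(start);
    }

private:
    std::size_t capacity_;
    std::size_t next_;  // first byte that has not been handed out yet
};
```

With a 64 KiB memory and a 1 KiB reserved prefix, the first allocation starts at offset 1024. A bump allocator never frees; anything that must release host objects would need a real allocator on top of the same idea.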

If you don't mind patching v8 a bit for your work, you could of course add a way to make a wasm instance use a given array buffer for its memory. Things like growing the memory would not work then, and you either have to enable explicit bounds checks (by disabling trap handlers, --no-wasm-trap-handler flag), or you need to trust your programs never to access memory out of bounds.
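As a rough illustration of what "explicit bounds checks" means here: without guard regions, every access has to be compared against the memory size before it touches memory. The sketch below is not V8's generated code; `checked_load_i32` is a hypothetical stand-in, and it returns an empty optional where real wasm would trap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical stand-in for an explicitly bounds-checked 32-bit load from a
// linear memory that starts at `mem_start` and is `mem_size` bytes long.
std::optional<std::int32_t> checked_load_i32(const std::uint8_t* mem_start,
                                             std::size_t mem_size,
                                             std::uint64_t offset) {
    // Reject the access unless all four bytes lie inside the memory.
    // Written as a subtraction so that a huge `offset` cannot overflow.
    if (mem_size < sizeof(std::int32_t) || offset > mem_size - sizeof(std::int32_t))
        return std::nullopt;  // real wasm would trap here
    std::int32_t value;
    std::memcpy(&value, mem_start + offset, sizeof(value));
    return value;
}
```

The check costs a compare and a branch per access, which is the overhead the guard-region scheme normally avoids.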

These might be viable options if
a) this is a research project / prototype, or
b) you control / trust the programs you execute.

Cheers,
Clemens



Immanuel Haffner

Jan 16, 2020, 3:24:33 AM
to v8-dev
What I am working on is (a) a research project and (b) the programs are all trustworthy and harmless.
Patching V8 sounds like an option. I would highly appreciate some guidance on where to start. So far I am only superficially familiar with the public API.

Clemens Backes

Jan 16, 2020, 3:55:03 AM
to v8-dev
One quick and dirty way to do this would be:

1. If you need to catch memory out-of-bounds errors, disable the --wasm-trap-handler flag (in flag-definitions.h), because there will be no guard regions to reliably trigger the signal.
2. Add an API method to directly patch the MemoryStart and MemorySize fields of a given WasmInstance. These fields are used for memory accesses and bounds checks in generated code (see Liftoff and TurboFan).
3. Make your program instantiate a wasm instance, then call the new API method, then execute the code.

Let me know if you need more specific hints.



Ben Noordhuis

Jan 16, 2020, 6:18:42 AM
to v8-...@googlegroups.com
On Wed, Jan 15, 2020 at 5:05 PM Immanuel Haffner
<haffner....@gmail.com> wrote:
> To make my scenario more concrete:
> I have data in main memory, easily tens or hundreds of gigabytes. Programs fly in that require access to the data. Some programs are short-lived, some will take a long time to compute. Some programs are executed only once, some are executed repeatedly. So I will have to JIT-compile many programs to WASM modules. Copying the data into the memory of each module would totally kill performance.

Totally a hack but you can munmap() the memory that V8 allocated for
the ArrayBuffer's backing store and mmap() your own at the same
address with MAP_FIXED. You can use a memfd[0] to create anonymous
memory that's mapped at multiple addresses.

[0] http://man7.org/linux/man-pages/man2/memfd_create.2.html
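The remapping trick can be tried outside of V8. The following Linux-only sketch assumes glibc 2.27 or newer for `memfd_create`; the helper names `remap_over` and `demo_shared_mapping` are made up for illustration, and an anonymous mapping stands in for the region V8 allocated. Since `mmap()` with `MAP_FIXED` discards whatever was mapped at the target address, the sketch does not call `munmap()` first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Maps the first `size` bytes of the memfd `fd` over the existing mapping at
// `addr`. Both `addr` and `size` must be page-aligned. MAP_FIXED replaces the
// pages that were mapped there before.
bool remap_over(void* addr, std::size_t size, int fd) {
    void* p = mmap(addr, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
    return p == addr;
}

// Returns true if a write through the host's mapping becomes visible through
// the remapped region, i.e. both addresses alias the same physical memory.
bool demo_shared_mapping() {
    const std::size_t size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

    // Anonymous memory file that can be mapped at several addresses.
    int fd = memfd_create("host-data", 0);
    if (fd < 0 || ftruncate(fd, static_cast<off_t>(size)) != 0) return false;

    // The host's own view of the data.
    void* host = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    // Stand-in for the region that V8 allocated for the backing store.
    void* region = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (host == MAP_FAILED || region == MAP_FAILED) return false;

    if (!remap_over(region, size, fd)) return false;

    static_cast<std::int32_t*>(host)[0] = 42;
    const bool aliased = static_cast<std::int32_t*>(region)[0] == 42;

    munmap(region, size);
    munmap(host, size);
    close(fd);
    return aliased;
}
```

Whether this is safe against a real V8 backing store depends on how V8 reserved and protected that address range, which this sketch does not model.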

Immanuel Haffner

Jan 16, 2020, 7:15:21 AM
to v8-dev
@Clemens That sounds doable. I will try this!

@Ben Thanks, I thought about that already, but I don't know whether it would work so easily. If I remap the memory, would a WASM load at address 0 really load from the remapped memory at offset 0? I will keep this in mind as a last resort :D

Immanuel Haffner

Jan 16, 2020, 9:11:22 AM
to v8-dev
@Clemens Regarding point 2, WasmInstanceObject::SetRawMemory() seems to do exactly what I want, right? How can I access this, given a v8::Object of a WebAssembly.Instance?

Clemens Backes

Jan 16, 2020, 9:38:17 AM
to v8-dev
Ack, SetRawMemory does exactly the right thing.

If you add an API method that takes a Local<Object> (holding the wasm instance), you can implement that in api.cc via something like:
  i::Handle<Object> obj = Utils::OpenHandle(this);
  i::WasmInstanceObject::cast(*obj)->SetRawMemory(...);



Immanuel Haffner

Jan 17, 2020, 3:36:37 AM
to v8-dev
It worked! Thanks a million :)

Here is the patch:
diff --git a/include/v8.h b/include/v8.h
index 199fc6ae09..01bc30b031 100644
--- a/include/v8.h
+++ b/include/v8.h
@@ -11776,6 +11776,7 @@ size_t SnapshotCreator::AddData(Local<T> object) {
  * \example process.cc
  */
 
+void SetWasmInstanceRawMemory(Local<Object> wasmInstance, uint8_t *mem_start, size_t mem_size);
 
 }  // namespace v8
 
diff --git a/src/api/api.cc b/src/api/api.cc
index ffc89d7dc8..795cc110b8 100644
--- a/src/api/api.cc
+++ b/src/api/api.cc
@@ -340,6 +340,11 @@ class CallDepthScope {
 
 }  // namespace
 
+void SetWasmInstanceRawMemory(Local<Object> wasmInstance, uint8_t *mem_start, size_t mem_size) {
+  i::Handle<i::Object> obj = Utils::OpenHandle(*wasmInstance);
+  i::WasmInstanceObject::cast(*obj).SetRawMemory(mem_start, mem_size);
+}
+
 static ScriptOrigin GetScriptOriginForScript(i::Isolate* isolate,
                                              i::Handle<i::Script> script) {
   i::Handle<i::Object> scriptName(script->GetNameOrSourceURL(), isolate);
diff --git a/src/flags/flag-definitions.h b/src/flags/flag-definitions.h
index 2b68204af5..e10a583f63 100644
--- a/src/flags/flag-definitions.h
+++ b/src/flags/flag-definitions.h
@@ -746,7 +746,7 @@ DEFINE_BOOL(wasm_no_stack_checks, false,
 DEFINE_BOOL(wasm_math_intrinsics, true,
             "intrinsify some Math imports into wasm")
 
-DEFINE_BOOL(wasm_trap_handler, true,
+DEFINE_BOOL(wasm_trap_handler, false,
             "use signal handlers to catch out of bounds memory access in wasm"
             " (currently Linux x86_64 only)")
 DEFINE_BOOL(wasm_fuzzer_gen_test, false,

Now I can do the following in the host:
    /* Create host memory. */
    auto host_memory = new int32_t[1024];
    for (int32_t i = 0; i != 1024; ++i)
        host_memory[i] = i;

    v8::SetWasmInstanceRawMemory(instance, reinterpret_cast<uint8_t*>(host_memory), 4096);

And from within the module I can issue a load and retrieve a value from the array. I have yet to experiment with overflows and memory growth...

As I already said, I want to work with really large allocations, maybe tens or hundreds of gigabytes. Because WebAssembly currently only has a 2GiB address space, I need some mechanism to feed the host allocations to the WASM module in chunks. I am thinking of installing a callback from the WASM module to the host, via a WASM module import, that lets the module signal to the host that the current chunk has been processed and the next chunk is requested. So roughly every 2GiB of processed data, the WASM module calls back into the host. When that callback returns, the module can assume the next chunk is mapped into the linear memory and proceed.
Does this sound like a good approach? Any concerns here (regarding correctness or performance)?
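The control flow of that chunking scheme can be sketched in plain C++, with no V8 or wasm involved. Everything named here is hypothetical: `Host::next_chunk` plays the role of the imported callback, `ChunkWindow` stands for the part of the data currently visible in linear memory, and `sum_all` plays the role of the module's exported `run`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The window of host data that is currently "mapped" for the module.
struct ChunkWindow {
    const std::int32_t* data = nullptr;  // start of the current chunk
    std::size_t count = 0;               // number of elements in the chunk
};

// Hypothetical host side: owns the cursor into the full data set.
class Host {
public:
    // `all` must outlive the Host. A chunk size of 0 is treated as 1.
    Host(const std::vector<std::int32_t>& all, std::size_t chunk_elems)
        : all_(all), chunk_(std::max<std::size_t>(1, chunk_elems)) {}

    // Plays the role of the imported callback: exposes the next chunk in `w`.
    // Returns false once all data has been handed out.
    bool next_chunk(ChunkWindow& w) {
        if (pos_ >= all_.size()) return false;
        w.count = std::min(chunk_, all_.size() - pos_);
        w.data = all_.data() + pos_;
        pos_ += w.count;
        return true;
    }

private:
    const std::vector<std::int32_t>& all_;
    std::size_t chunk_;
    std::size_t pos_ = 0;
};

// Plays the role of the module's `run`: processes one chunk, then asks the
// host for the next one, until the host reports that no data is left.
std::int64_t sum_all(Host& host) {
    ChunkWindow w;
    std::int64_t sum = 0;
    while (host.next_chunk(w))
        for (std::size_t i = 0; i != w.count; ++i)
            sum += w.data[i];
    return sum;
}
```

In the real setting the callback would remap memory rather than move a pointer, so its cost per chunk, and the fact that the module must not hold on to addresses from the previous chunk, are the points to watch.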

Clemens Backes

Jan 17, 2020, 4:37:55 AM
to v8-dev
Great to hear that it works!

The approach with a re-mapping callback sounds very reasonable. You would still have to be careful about the memory requirements of the compiled programs. Would they not need any memory for modeling their own stack, or for storing any other state?


Immanuel Haffner

Jan 20, 2020, 1:50:54 AM
to v8-dev
So far, programs consist only of one function, no recursion. I think local variables suffice to represent the state - together with the mmap'd memory. I don't think I need an explicit stack frame. I will perform more experiments and progress in small steps. I will let you know of progress and problems I encounter ;)