Catching out-of-bounds memory accesses

John Dallman

Sep 3, 2025, 9:09:53 AM
to emscripte...@googlegroups.com
I'm porting a math library and its test harness to WebAssembly. The test harness is allowed to be specific to Node.js, but the library is not. Both are compiled from C and C++ code, which is my comfort zone. JavaScript is a language I don't know well and haven't done anything difficult in.  

I'm aware that WASI doesn't support traditional signals. At present, when I intentionally set off an access violation, I get "RuntimeError: memory access out of bounds" and a traceback as Node.js exits.

Is there a way to catch these errors and prevent Node.js exiting? Ideally, I'd be able to notify the test harness in some way that this had happened. If this involves JavaScript, please explain slowly and gently: I'm from the C world and new to web applications. 

The reason I'm asking this is that I will have to provide support to customers when the WebAssembly version of the library is released, and prefer to have my answers ready ahead of time. My employer wants to maintain their good reputation for customer service, and goes as far as having tests for deliberately-set-off runtime errors as part of routine testing, so that we can document what happens and how to handle them.

Thanks very much,

John 

Brooke Vibber

Sep 3, 2025, 2:32:32 PM
to emscripte...@googlegroups.com
On Wed, Sep 3, 2025 at 6:09 AM John Dallman <jgdats...@gmail.com> wrote:
> I'm aware that WASI doesn't support traditional signals. At present, when I intentionally set off an access violation, I get "RuntimeError: memory access out of bounds" and a traceback as Node.js exits.
>
> Is there a way to catch these errors and prevent Node.js exiting? Ideally, I'd be able to notify the test harness in some way that this had happened. If this involves JavaScript, please explain slowly and gently: I'm from the C world and new to web applications.

These can be caught like a standard JavaScript exception by the surrounding test harness JS, something like:

try {
  Module.run_my_c_code();
} catch (e) {
  // Probably out of bounds access, or divide by zero etc
  console.log("Error while running run_my_c_code: " + e);
}

The stack trace attached to the exception may not be very useful, but this gives your test harness a chance to process the failure at least. :D

Note that this will only catch WebAssembly accesses outside of linear memory -- a NULL dereference, read or write, will *not* trigger a runtime error, because address 0 is a valid location inside linear memory -- but any actually out-of-bounds accesses, or other operations that trap, such as integer divide by zero, can be caught this way even on a fully optimized production build.
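
For anyone who wants to see that distinction without an Emscripten build, here is a minimal sketch for Node.js. It hand-assembles a tiny WebAssembly module (this is not Emscripten output; the export name `load` and the helper `tryLoad` are made up for illustration) with one 64 KiB page of memory and a function that reads a 32-bit value from a given address. Reading address 0 succeeds, which is why a NULL dereference goes unnoticed, while reading past the end of memory traps and arrives in JS as a `WebAssembly.RuntimeError`.

```javascript
// Hand-assembled module, equivalent to:
//   (module (memory 1)
//     (func (export "load") (param i32) (result i32)
//       local.get 0
//       i32.load))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory: 1 page (64 KiB)
  0x07, 0x08, 0x01, 0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00, // export "load"
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code: 1 body, no locals
  0x20, 0x00, 0x28, 0x02, 0x00, 0x0b,             // local.get 0; i32.load; end
]);
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));

// Hypothetical helper: call into Wasm and turn a trap into a plain result.
function tryLoad(address) {
  try {
    return { ok: true, value: instance.exports.load(address) };
  } catch (e) {
    if (e instanceof WebAssembly.RuntimeError) {
      return { ok: false, message: e.message };
    }
    throw e; // anything else is not a trap, so let it propagate
  }
}

const nullRead = tryLoad(0);     // "NULL": inside linear memory, no trap
const oobRead = tryLoad(65536);  // first byte past the single page: traps
console.log(nullRead, oobRead);
```

The exact wording of the trap message is engine-specific, so a harness that inspects it should probably match loosely rather than compare whole strings.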


You might also look into the SAFE_HEAP build option in emscripten, which runs all memory accesses through a double-check for out-of-bounds or NULL dereference and logs it. Check emscripten's src/settings.js for comments documenting this and other build-time options.

-- brooke

Sam Clegg

Sep 3, 2025, 7:52:10 PM
to emscripte...@googlegroups.com
On Wed, Sep 3, 2025 at 6:09 AM John Dallman <jgdats...@gmail.com> wrote:
> I'm porting a math library and its test harness to WebAssembly. The test harness is allowed to be specific to Node.js, but the library is not. Both are compiled from C and C++ code, which is my comfort zone. JavaScript is a language I don't know well and haven't done anything difficult in.
>
> I'm aware that WASI doesn't support traditional signals. At present, when I intentionally set off an access violation, I get "RuntimeError: memory access out of bounds" and a traceback as Node.js exits.
>
> Is there a way to catch these errors and prevent Node.js exiting? Ideally, I'd be able to notify the test harness in some way that this had happened. If this involves JavaScript, please explain slowly and gently: I'm from the C world and new to web applications.

Are the test harness and the library-under-test designed to be compiled into the same executable? I.e., on other platforms does it somehow catch and recover from segfaults?

From the JS side you basically have two choices:

1. Wrap your calls in a JS try/catch and inspect the exception you caught and then (somehow?) continue with the test suite.  (This is what Brooke suggested already)
2. Install a global `onerror` handler that will catch all exceptions (much like a global signal handler on Linux). In Node.js this is the `uncaughtException` event on `process`; see https://nodejs.org/api/process.html#event-uncaughtexception.

I suppose in either case the tricky part is going to be continuing the test suite where you left off.   How does that work in the native case?

 

--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/emscripten-discuss/CAH1xqgnMY0%2BoOLMs51mo2x%2BBUhWdZq%3DkxFp-En7AX3OsVqcoQQ%40mail.gmail.com.

John Dallman

Sep 8, 2025, 6:40:28 AM
to emscripte...@googlegroups.com
> Is the test harness and the library-under-test designed to be compiled into the 
> same executable? 

Yes. We prefer to have the library-under-test be a shared object or Windows DLL, on platforms where that's possible, but we can have the harness and the library linked together, and that's what I'm planning to do for WebAssembly. I'm trying to avoid producing a JS wrapper for an API with hundreds of functions, hundreds of structs, and thousands of constants. It also passes lots of pointers to code and data through the interface. My customers who want a WebAssembly version of the library already have C/C++ or Swift code that calls it and want to use it that way.  

> i.e. on other platforms does it somehow catch and recover from segfaults?

Yes. On platforms with signals, those are turned on for segmentation faults (and for some other signals, depending on the platform). The code is C, which sets regular checkpoints with setjmp(), and the signal handling function longjmp()s to the latest checkpoint with a "test aborted" value. That's the basic idea, though it's rather more complicated in practice.

The JS code I'm going to use will be very minimal. The test harness already has its own scripting language (a LISP dialect) built into it, and we have hundreds of thousands (not hyperbole) of test cases already written in it. So I'm not going to try to organise and control the testing from JS, because that would be a duplication of work already done. I'm just going to start the test harness and let it do its thing. 

I am not an experienced JS coder - I learned some of the language for the first time for this project - and the product is not targeted to the general web application market.

> 1. Wrap your calls in a JS try/catch and inspect the exception you caught 
> and then (somehow?) continue with the test suite.  (This is what Brooke 
> suggested already)

I have a couple of questions about that:

When the catch gets called, is it called by the same thread as did the segmentation violation? 

Is that thread's stack still intact? Has it been unwound? 

If it is the same thread and the stack is intact, then I should be able to call the function within the library that does signal handling on other platforms, and have it do its longjmp()s. If those conditions don't apply, then things are going to get more difficult.   

> 2. Install a global `onerror` handler that will catch all exceptions (much like the 
> global signal handler on linux).  See https://nodejs.org/api/process.html#event-uncaughtexception.

That says it's unsafe to resume, so that probably also applies to the case above? 

Thanks,

John

Brooke Vibber

Sep 8, 2025, 1:52:44 PM
to emscripte...@googlegroups.com
On Mon, Sep 8, 2025 at 3:40 AM John Dallman <jgdats...@gmail.com> wrote:
>> 1. Wrap your calls in a JS try/catch and inspect the exception you caught
>> and then (somehow?) continue with the test suite.  (This is what Brooke
>> suggested already)
>
> I have a couple of questions about that:
>
> When the catch gets called, is it called by the same thread as did the segmentation violation?

Yes, the try/catch stays in the same thread, it's just up the call stack. This means if you trap on a pthread, you would need a try/catch on the pthread setup, not in your test harness that calls things on the main thread. I don't know how to set this up offhand.

> Is that thread's stack still intact? Has it been unwound?
>
> If it is the same thread and the stack is intact, then I should be able to call the function within the library that does signal handling on other platforms, and have it do its longjmp()s. If those conditions don't apply, then things are going to get more difficult.

The WASM code will stop at the point of the trap, and the WASM stack will be unwound back to the call point where the JS catch can grab it. However the *C stack* is not unwound, nor are C++ exception handlers called, because the WASM runtime knows nothing about these (they are creations of the C ABI and not inherent parts of WASM).

So you can expect to end up with linear memory in an inconsistent state. You certainly can't recover execution from the next instruction or anything like that. You know an error took place during the call, and that the module is likely inconsistent and unusable now.
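
To make the "inconsistent state" point concrete, here is a minimal sketch for Node.js, using a hand-assembled module rather than Emscripten output (the export names `mem` and `run` are invented for the example). The function stores 42 at address 0 and then executes `unreachable`, standing in for library code that traps halfway through an update. After the JS `catch`, the store is still visible: nothing was rolled back. As far as I understand, the same goes for the C stack pointer, which Emscripten keeps as part of the module's own state.

```javascript
// Hand-assembled module, equivalent to:
//   (module (memory (export "mem") 1)
//     (func (export "run")
//       i32.const 0
//       i32.const 42
//       i32.store
//       unreachable))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type: () -> ()
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory: 1 page
  0x07, 0x0d, 0x02,                               // two exports:
  0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00,             //   "mem" -> memory 0
  0x03, 0x72, 0x75, 0x6e, 0x00, 0x00,             //   "run" -> function 0
  0x0a, 0x0c, 0x01, 0x0a, 0x00,                   // code: 1 body, no locals
  0x41, 0x00, 0x41, 0x2a, 0x36, 0x02, 0x00,       // store 42 at address 0
  0x00, 0x0b,                                     // unreachable; end
]);
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const view = new DataView(instance.exports.mem.buffer);

const before = view.getUint32(0, true); // Wasm memory is little-endian
let trapped = false;
try {
  instance.exports.run();
} catch (e) {
  trapped = e instanceof WebAssembly.RuntimeError;
}
const after = view.getUint32(0, true);  // the half-finished update remains

console.log({ before, trapped, after });
```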
 

>> 2. Install a global `onerror` handler that will catch all exceptions (much like the
>> global signal handler on linux).  See https://nodejs.org/api/process.html#event-uncaughtexception.
>
> That says it's unsafe to resume, so that probably also applies to the case above?

Correct.

-- brooke 

Sam Clegg

Sep 8, 2025, 7:18:06 PM
to emscripte...@googlegroups.com
On Mon, Sep 8, 2025 at 3:40 AM John Dallman <jgdats...@gmail.com> wrote:
>> Is the test harness and the library-under-test designed to be compiled into the
>> same executable?
>
> Yes. We prefer to have the library-under-test be a shared object or Windows DLL, on platforms where that's possible, but we can have the harness and the library linked together, and that's what I'm planning to do for WebAssembly. I'm trying to avoid producing a JS wrapper for an API with hundreds of functions, hundreds of structs, and thousands of constants. It also passes lots of pointers to code and data through the interface. My customers who want a WebAssembly version of the library already have C/C++ or Swift code that calls it and want to use it that way.
>
>> i.e. on other platforms does it somehow catch and recover from segfaults?
>
> Yes. On platforms with signals, those are turned on for segmentation faults (and for some other signals, depending on the platform). The code is C, which sets regular checkpoints with setjmp(), and the signal handling function longjmp()s to the latest checkpoint with a "test aborted" value. That's the basic idea, though it's rather more complicated in practice.

Oh wow, `longjmp` out of your signal handler sounds pretty gnarly.    It's going to be even more gnarly trying to make that work with emscripten-generated code, but maybe not impossible?
 
Are the segfault tests limited in number? I.e., would it be possible to choose a different approach when running on the web (just for these few tests)?

Sam Clegg

Sep 8, 2025, 7:31:31 PM
to emscripte...@googlegroups.com, Heejin Ahn
If you want to use `longjmp` in emscripten to get back to the start of the failing test, we have two setjmp/longjmp mechanisms: (1) the old emscripten method and (2) the method using wasm exception handling.

However, I believe that in both cases the target of the long jump has to be above the caller on the stack.  That is, once you unwind the stack all of the way it will no longer be possible to `longjmp` to the target in question since it's no longer on the stack.   @Heejin Ahn can you confirm this?

If that is correct then you will need some kind of alternative mechanism when running in emscripten.  Something like this maybe:

```
void run_death_test(death_test_fn_t fn) {
#ifdef __EMSCRIPTEN__
  // Pseudocode: call fn from JS so that a trap surfaces as a JS exception.
  EM_ASM({
    try {
      ...call_fn_from_js..
      report_failure_to_die();
    } catch (e) {
      report_success_if_e_looks_good(e);
    }
  });
#else
  setup_longjmp_target();
  fn();
#endif
}
```

Heejin Ahn

Sep 9, 2025, 2:37:17 AM
to emscripte...@googlegroups.com, Heejin Ahn
Correct. Both Emscripten and Wasm SjLj handling require the setjmp point to be "lower" than the longjmp point, that is, the function that called setjmp must still be on the call stack when longjmp runs, because both use exceptions to simulate setjmp-longjmp.
So this works:
```
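#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf buf;

void bar() {
  longjmp(buf, 1);  // main's frame is still on the stack, so this is fine
}

int main() {
  int jmpval = setjmp(buf);
  if (jmpval == 0) {
    printf("first call\n");
  } else {
    printf("second call\n");
    exit(0);
  }
  bar();
  return 0;
}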

But this does NOT work:
```
static jmp_buf buf;

void foo() {
  int jmpval = setjmp(buf);
  if (jmpval == 0) {
    printf("first call\n");
  } else {
    printf("second call\n");
    exit(0);
  }
}

void bar() {
  longjmp(buf, 1);
}

int main() {
  foo();
  bar();
  return 0;
}
```

This fails because by the time longjmp is called, foo has already returned and its stack frame has been destroyed.
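
For readers more at home with exceptions than with SjLj internals, the restriction can be pictured with ordinary JavaScript `throw`/`catch`. This is only an analogy sketched for Node.js, not how Emscripten actually implements setjmp/longjmp; `LongJump`, `withCheckpoint` and `bar` are names invented for the illustration. A thrown value can only be caught by a `try` block whose function is still running, which mirrors the two C cases above.

```javascript
// A "longjmp" is modelled as a thrown object carrying the jump value.
class LongJump {
  constructor(value) {
    this.value = value;
  }
}

// "setjmp" is modelled as a try block around the code that may jump.
// Returns 0 if body finished normally, else the value carried by the jump.
function withCheckpoint(body) {
  try {
    body();
    return 0;
  } catch (e) {
    if (e instanceof LongJump) return e.value;
    throw e;
  }
}

function bar() {
  throw new LongJump(1);
}

// Case 1: the checkpoint's frame is still on the stack when bar() jumps,
// so the jump is delivered to it.
const insideResult = withCheckpoint(() => bar());

// Case 2: the checkpoint has already returned before bar() runs, so there
// is no frame left to receive the jump and it escapes to the outer caller.
let escaped = false;
withCheckpoint(() => {});
try {
  bar();
} catch (e) {
  escaped = e instanceof LongJump;
}

console.log({ insideResult, escaped });
```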

John Dallman

Sep 11, 2025, 10:00:30 AM
to emscripte...@googlegroups.com
OK, what I want to do isn't possible in Emscripten. 

Thanks, everyone. 

John

John Dallman

Sep 12, 2025, 11:03:13 AM
to emscripte...@googlegroups.com
> OK, what I want to do isn't possible in Emscripten. 

It looks like it is, actually, but it's a bit complicated. Here's the scheme: am I requiring anything that doesn't exist?

I have a great big math library that I want to make available in WebAssembly form, as a commercial product. It is written in C and C++, and runs on Android, iOS, Linux, macOS and Windows. Its heritage is from 1980s and 1990s technical computing, but it is still going strong as a commercial product. The immediate customers for it want to call it from C/C++ code that they already use on other platforms (mostly Linux and Windows), rather than from JavaScript. 

This math library has a test harness, also written in C and C++. The test cases are not written in C/C++, but in interpreted LISP, because that's quite suitable for handling the data that the library works with, plus it was easy to write our own portable implementation of the language. 

When you run the test harness, it looks for a LISP file to run and starts interpreting it, which usually starts by reading a lot more LISP files to set up a working environment. It then starts reading test scripts and making the calls to the math library that they request. The test harness does not exit until it has finished running everything it was given, or it hits an unrecoverable error. On its existing platforms, it can recover from SIGSEGV, SIGFPE, and other run-time errors, aborting the test that caused them and continuing with the subsequent test(s).

That last part is the problem with a naive WebAssembly implementation, where the library and the test harness are statically linked together. That means there's no opportunity for a JavaScript error handler to get involved before exceptions caused by access violations come flying out of the top of the test harness. By then, the call stack is gone, and there's no way to recover from the exception and continue with the next test. 

That's where I'd got to when I wrote " . . . what I want to do isn't possible in Emscripten." I've had what seem like better ideas since then. 

WebAssembly "modules" are, as far as I understand, kind of like Windows DLLs or Linux shared libraries. There's a difference in that you can't link a module against other modules: inter-module calls have to go via JavaScript. Is that correct? 

I was slow to realise that I can put JavaScript error handlers into any JavaScript layer. To take advantage of that, I would link my math library as a module, and its test harness as the main program. The test harness would call JavaScript code, which would call the library from within a try/catch block, catch exceptions, call the library's "tidy up" functions, and return an error code to the test harness. That lets the test harness recover from exceptions and move on to the next test. It also means I need to create a JavaScript wrapper for the whole of the library's API, which is a bit of a big job, but makes the library more generally usable in WebAssembly code. 

While writing this mail, I realised a possible short-cut to let me get the test harness running more quickly, without having to generate a wrapper for hundreds of different functions, with thousands of associated data types and constants. It goes like this:

The calls to the library from the test harness are generated by C macros, and I can add code to those for the WebAssembly build. So it appears I could use EM_ASM() to create JavaScript error handlers at each of those call sites, as part of the test harness source. I could then statically link the library and the test harness for running tests. Is that plausible?

Thanks in advance,

John 



John Dallman

Sep 15, 2025, 4:41:28 AM
to emscripte...@googlegroups.com
> While writing this mail, I realised a possible short-cut to let me get the test harness
> running more quickly, without having to generate a wrapper for hundreds of different 
> functions, with thousands of associated data types and constants. It goes like this:  
>
> The calls to the library from the test harness are generated by C macros, and I can 
> add code to those for the WebAssembly build. So it appears I could use EM_ASM() 
> to create JavaScript error handlers at each of those call sites, as part of the test 
> harness source. I could then statically link the library and the test harness for running 
> tests. Is that plausible?

Over the weekend I realised the difficulty with that: I'd have to get all the arguments into JavaScript variables, pass them into the calls inside the try/catch blocks, and then get them back into Emscripten C/C++ variables. Since they contain loads of arrays, C structs and strings, this amounts to generating a wrapper anyway.

Alternatively, will catching exceptions with -fwasm-exceptions allow me to catch out-of-bounds exceptions in Emscripten-compiled C++? 

Thanks in advance,

John 

Sam Clegg

Sep 15, 2025, 5:52:21 PM
to emscripte...@googlegroups.com
On Fri, Sep 12, 2025 at 8:03 AM John Dallman <jgdats...@gmail.com> wrote:
> The calls to the library from the test harness are generated by C macros, and I can add code to those for the WebAssembly build. So it appears I could use EM_ASM() to create JavaScript error handlers at each of those call sites, as part of the test harness source. I could then statically link the library and the test harness for running tests. Is that plausible?

Yes, there is nothing stopping you wrapping your function calls in JS try/catch using EM_ASM or EM_JS.   That was what I was trying to show upthread in my EM_ASM example.   You don't need to do any kind of dynamic linking to make this work.

You could write a C function such as `void* call_with_catch_handler(my_c_function, my_catch_handler_caller)` and use EM_JS/EM_ASM to implement it.   Any fatal errors could then be piped to your `my_catch_handler_caller` function.   Is it safe to re-enter a Wasm module after a runtime trap?   I'm not sure if the spec specifies this but I think it should work for this kind of use case.

 

Heejin Ahn

Sep 15, 2025, 7:20:45 PM
to emscripten-discuss
Traps are not catchable by Wasm's try-catch or try_table. If you are going to catch them in JS that calls Wasm functions, you can (because traps surface as WebAssembly.RuntimeError), but I don't think you can 'reenter' the Wasm code that trapped. I'm not sure if I understand your plans correctly though.

John Dallman

Sep 17, 2025, 9:57:54 AM
to emscripte...@googlegroups.com
> Yes, there is nothing stopping you wrapping your function calls in JS try/catch using EM_ASM or EM_JS.  
> That was what I was trying to show up thread in my EM_ASM example. You don't need to do any kind of 
> dynamic linking to make this work.

That sounds good, but I have more questions. 

> You could write a C function such as 
> void* call_with_catch_handler( my_c_function,  my_catch_handler_caller) 
>
> and use EM_JS/EM_ASM to implement this.  Any fatal errors could then be piped to your 
> `my_catch_handler_caller` function.

Presumably, "my_catch_handler_caller" is a JavaScript function? I'm unfortunately having to deal with these issues with zero practical experience of using JavaScript for anything. My knowledge is all from the javascript.info tutorial. There are other people in the company who build and test WebAssembly stuff, but they don't even attempt to handle run-time errors. I'm porting a larger library to WebAssembly and need to do a bit better.

The practical difficulty with following your example is that I don't know how to pass the arguments into JavaScript to be used in the try block, or get values back out of JavaScript into WebAssembly. I presume this is necessary? 

I've found ccall() and cwrap() in the documentation: are they all there is, or are there higher-level functions? The bind.h documentation seems to indicate that it is not yet complete, and it's not immediately clear how much of it can be used for C data structures. Many of my C functions have arguments that are C structs, some are passed by value, and some have a mixture of types (I don't know yet if there are any that are both passed by value and have a mixture of types).

To make up an example:

#ifdef __EMSCRIPTEN__
AC_error_code_t Catcher_AC_FAN_set_speed( catch_handler, int speed, AC_FAN_speed_table_t table)
{
  EM_ASM(
  {
    try
    {
      // How to call AC_FAN_set_speed( speed, table) from JavaScript?
      // Also need to capture its return value and, for some functions,
      // changes in their arguments.
    }
    catch (e)
    {
      // If we caught an exception, set a flag
    }
  });
  // If exception flag set, return AC_ERROR_runtime_error.
  // else, return the return value of the C call.
}
#endif //__EMSCRIPTEN__

Thanks, 

John


Sam Clegg

Sep 17, 2025, 1:49:31 PM
to emscripte...@googlegroups.com
On Wed, Sep 17, 2025 at 6:57 AM John Dallman <jgdats...@gmail.com> wrote:
>> Yes, there is nothing stopping you wrapping your function calls in JS try/catch using EM_ASM or EM_JS.
>> That was what I was trying to show up thread in my EM_ASM example. You don't need to do any kind of
>> dynamic linking to make this work.
>
> That sounds good, but I have more questions.
>
>> You could write a C function such as
>> void* call_with_catch_handler( my_c_function,  my_catch_handler_caller)
>>
>> and use EM_JS/EM_ASM to implement this.  Any fatal errors could then be piped to your
>> `my_catch_handler_caller` function.
>
> Presumably, "my_catch_handler_caller" is a JavaScript function? I'm unfortunately having to deal with these issues with zero practical experience of using JavaScript for anything. My knowledge is all from the javascript.info tutorial. There are other people in the company who build and test WebAssembly stuff, but they don't even attempt to handle run-time errors. I'm porting a larger library to WebAssembly and need to do a bit better.

In this example `my_c_function` and `my_catch_handler_caller` would both be C function pointers.

The JS code that implements `call_with_catch_handler` (e.g. the EM_ASM block or the EM_JS block) can call back into native code using those function pointers.

You could also skip `my_catch_handler_caller` and instead signal failure via the return value.

To call into native (Wasm) code from JS, given a function pointer, you can use the `dynCall` function.  It takes a function signature (represented as a string), a function pointer and an array of arguments.   See `test/core/test_dyncall_pointers.c` for an example of how to use it.
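
For readers new to the JS side, here is a rough sketch of the underlying mechanism in plain Node.js, without Emscripten. In Wasm a C function pointer is an index into the module's function table, so JS can look the function up by number and call it inside try/catch, signalling failure via the return value. The module below is hand-assembled; `callWithCatch`, the `tbl` export and the two return codes are invented for illustration and are not Emscripten APIs. In a real Emscripten build the lookup would go through `dynCall` as described above.

```javascript
// Hand-assembled module, equivalent to:
//   (module
//     (table (export "tbl") 3 funcref)
//     (func $ok)                 ;; returns normally
//     (func $bad unreachable)    ;; always traps
//     (elem (i32.const 1) $ok $bad))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,             // type: () -> ()
  0x03, 0x03, 0x02, 0x00, 0x00,                   // two functions of type 0
  0x04, 0x04, 0x01, 0x70, 0x00, 0x03,             // table: 3 funcref slots
  0x07, 0x07, 0x01, 0x03, 0x74, 0x62, 0x6c, 0x01, 0x00, // export "tbl"
  0x09, 0x08, 0x01, 0x00, 0x41, 0x01, 0x0b, 0x02, 0x00, 0x01, // fill slots 1, 2
  0x0a, 0x08, 0x02,                               // code: 2 bodies
  0x02, 0x00, 0x0b,                               //   $ok: end
  0x03, 0x00, 0x00, 0x0b,                         //   $bad: unreachable; end
]);
const { tbl } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;

// Hypothetical return codes, standing in for the library's own error enum.
const OK = 0;
const RUNTIME_ERROR = 1;

// fnPtr is what C would pass for a `void (*)(void)`: a table index.
function callWithCatch(fnPtr) {
  try {
    tbl.get(fnPtr)();
    return OK;
  } catch (e) {
    if (e instanceof WebAssembly.RuntimeError) return RUNTIME_ERROR;
    throw e; // not a trap, so let it propagate
  }
}

// Slot 1 holds $ok, slot 2 holds $bad; slot 0 is left empty, like NULL.
const results = [callWithCatch(1), callWithCatch(2), callWithCatch(1)];
console.log(results);
```

The third call is there to show that the engine will still run the module's functions after a trap has been caught. Whether the library's own data is still in a usable state at that point is a separate question.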


> The practical difficulty with following your example is that I don't know how to pass the arguments into JavaScript to be used in the try block, or get values back out of JavaScript into WebAssembly. I presume this is necessary?
>
> I've found ccall() and cwrap() in the documentation: are they all there is, or are there higher-level functions? The bind.h documentation seems to indicate that it is not yet complete, and it's not immediately clear how much of it can be used for C data structures. Many of my C functions have arguments that are C structs, some are passed by value, and some have a mixture of types (I don't know yet if there are any that are both passed by value and have a mixture of types).


The EM_ASM block can take arguments, but they will always show up in JS as numbers (pointers are numbers too).
Since structs are always passed as pointers in Wasm, they will also show up as numbers on the JS side.
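
As a small illustration of what "a pointer is just a number" means on the JS side, the sketch below reads a C struct out of linear memory by address. It runs in plain Node.js with a standalone `WebAssembly.Memory` standing in for the module's heap. The struct layout, the address 1024 and the function `readFanSpeedEntry` are all made up (loosely following the `AC_FAN_speed_table_t` example above), and a real struct's offsets depend on its actual definition.

```javascript
// Assumed C layout (hypothetical), 16 bytes, little-endian like all Wasm:
//   struct entry { int32_t rpm; float duty; double watts; };
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page
const view = new DataView(memory.buffer);

// Pretend C code stored a struct at address 1024 and passed us that pointer.
const ptr = 1024;
view.setInt32(ptr + 0, 1200, true);
view.setFloat32(ptr + 4, 0.5, true);
view.setFloat64(ptr + 8, 36.25, true);

// JS receives only the number `ptr` and decodes the fields itself,
// using the byte offsets from the assumed layout above.
function readFanSpeedEntry(p) {
  return {
    rpm: view.getInt32(p + 0, true),
    duty: view.getFloat32(p + 4, true),
    watts: view.getFloat64(p + 8, true),
  };
}

const entry = readFanSpeedEntry(ptr);
console.log(entry);
```

In an Emscripten build the same reads would go through the runtime's typed-array views of the heap (`HEAP32`, `HEAPF32`, `HEAPF64` and so on) rather than a hand-made `DataView`. If the JS layer only forwards the pointer to another C function, it never needs to decode the struct at all.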
 