Segmentation faults in wasm workers

Dieter Weidenbrück

May 25, 2023, 10:28:37 AM

to emscripten-discuss

All,

I am experiencing segmentation faults when using wasm workers.
Overview:
I am working on a project with considerable 3D data sets. The code has been stable for a while when running in the main thread alone. Then I started using js workers (no shared memory), and again all was well.
Now I've switched to SharedArrayBuffers and wasm workers, and I keep running into random problems.
I have prepared the code such that I can run with 0 workers up to hardware.concurrency workers. All is well with 0 workers, but as soon as I use one or more workers, I keep getting segfaults because of invalid pointers, access out of bounds and similar.

What happens in main thread and what in the wasm workers:
I allocate all objects in the main thread when importing the 3D file. Then I fire off a function for each object that will do some serious calculations of the data, including allocating and disposing of memory. The workers allocate approx. 300 to 400 MB in addition to the main thread. All this happens in the same SharedArrayBuffer, of course.
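
One way to hand out "one function call per object" to several workers without giving the same object to two of them is a shared atomic counter. The thread does not show how the objects are actually dispatched, so the following is only a minimal sketch under that assumption; `WorkQueue`, `work_queue_init` and `work_queue_claim` are hypothetical names, not part of the original code or of Emscripten.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared work queue: every worker claims the next unprocessed
   object index with one atomic increment, so no index is handed out twice. */
typedef struct {
    atomic_size_t next;  /* index of the next unclaimed object */
    size_t count;        /* total number of objects */
} WorkQueue;

static void work_queue_init(WorkQueue *q, size_t count) {
    atomic_init(&q->next, 0);
    q->count = count;
}

/* Returns 1 and stores the claimed index in *index, or returns 0 once all
   objects have been handed out. */
static int work_queue_claim(WorkQueue *q, size_t *index) {
    size_t i = atomic_fetch_add(&q->next, 1);
    if (i >= q->count)
        return 0;
    *index = i;
    return 1;
}
```

Each worker would loop on `work_queue_claim` until it returns 0. This only serializes the hand-out of indices; any data the workers share beyond that still needs its own synchronization.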

Here is what I've tried so far:
- compiling with SAFE_HEAP=1:
  not a lot of helpful information
- compiling with -fsanitize=address:
  everything works without problems here!
- compiling with ASSERTIONS=2:
  gave me this information:

To me it looks like another resize call is executed while other workers keep working on the buffer, and then something gets into conflict.
To test this, I allocated 1.8 GB right after startup in the main thread and disposed the mem blocks again just to trigger heap resize. After that everything works like a charm.

Is there anything I am doing wrong?
Sorry for not providing a sample, but there is a lot of code involved, and it is not easy to simulate this behavior. Happy to answer questions.

All comments are appreciated.
Thanks,
Dieter

Dieter Weidenbrück

May 25, 2023, 11:06:55 AM

to emscripten-discuss

The joy was premature, even with pre-allocated heap size segfaults occur. :(

Sam Clegg

May 25, 2023, 11:55:41 AM

to emscripte...@googlegroups.com

Firstly, if you are allocating 1.8 GB you are likely pushing up against browser limits. Are you specifying a MAXIMUM_MEMORY larger than 2 GB?

Secondly, it looks like you are using wasm workers, which are still relatively new. Do you have a version of your code that uses pthreads instead? It might tell us if the issue is related to wasm workers.

cheers,

sam

--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/emscripten-discuss/80d56314-59d8-4332-bb2e-ebe00fe52ea3n%40googlegroups.com.

Dieter Weidenbrück

May 25, 2023, 12:06:27 PM

to emscripten-discuss

Hi Sam,

I noticed already that I am bumping against browser limits, especially with sanitizer switched on, so I reduced the pre-allocation calls.

It turns out that asan uses so much memory that I can't use it to analyze this case.

I use

-s ALLOW_MEMORY_GROWTH=1

but don't specify any MAXIMUM_MEMORY.

No pthreads version so far. I might try this next.

Cheers,

Dieter

Dieter Weidenbrück

May 25, 2023, 12:19:59 PM

to emscripten-discuss

This is a memory snapshot when using SAFE_HEAP. So here I am quite below the browser limits, still the segfault occurs in different places.
Ignore the first console line, it results from Norton Utilities I think.

Sam Clegg

May 25, 2023, 2:29:58 PM

to emscripte...@googlegroups.com

This looks like some kind of memory corruption, most likely due to the use of multithreading/wasm workers. Are you able to build a single-threaded version of your program, or one that uses normal pthreads rather than wasm workers?

Also, can you share the full link command you are using?

cheers,

sam

Dieter Weidenbrück

May 25, 2023, 4:27:04 PM

to emscripten-discuss

Hi Sam,

I can run the code in a single thread without problems, and I have done that for a while. So I assume that the code is stable.

Here is the command line I use in a .bat file:

emcc ./src/main.c ^
...

./src/w_com.c ^
-I ./include/ ^
-g3 ^
--source-map-base ./ ^
-gsource-map ^
-s ALLOW_MEMORY_GROWTH=1 ^
-s ENVIRONMENT=web,worker ^
--shell-file ./index_template.html ^
-s SUPPORT_ERRNO=0 ^
-s MODULARIZE=1 ^
-s ABORTING_MALLOC=0 ^
-sWASM_WORKERS ^
-s "EXPORT_NAME='wasmMod'" ^
-s EXPORTED_FUNCTIONS="['_malloc','_free','_main']" ^
-s EXPORTED_RUNTIME_METHODS="['cwrap','UTF16ToString','UTF8ToString','stringToUTF8','allocateUTF8']" ^
-o index.html

I will start familiarizing myself with pthreads to test whether that would work better.

BTW, as an old C programmer I am fascinated by emscripten and its possibilities. Excellent job!

Cheers,
Dieter

Sam Clegg

May 25, 2023, 5:20:33 PM

to emscripte...@googlegroups.com

Is there some reason you added `-sABORTING_MALLOC=0`? That looks a little suspicious, since it means the program can continue after malloc fails, which means that any call site that doesn't check the return value of malloc can lead to segfaults. If you remove that setting, does the behaviour change?

Dieter Weidenbrück

May 26, 2023, 6:25:08 AM

to emscripten-discuss

Hi Sam,

IIRC, when I started with Emscripten a while ago the program would abort in case of a memory error. As my app is comparable to a desktop app, this was not acceptable, so I set ABORTING_MALLOC to 0. I understand that this flag has a different meaning today. Here is how all my allocation calls work:

Error_T allocMemPtr(MemPtr_T *p, uint32_T size, boolean_T clear) {
    _MemPtr_T mp;

    if (clear)
        mp = (_MemPtr_T)calloc(1, size + sizeof(_Mem_T));
    else
        mp = (_MemPtr_T)malloc(size + sizeof(_Mem_T));
    if (mp) {
        mp->size = size;
        *p = (MemPtr_T)((char_T*)mp + sizeof(_Mem_T));
        return kErr_NoErr;
    }
    return kErr_MemErr;
}

Error_T setMemPtrSize(MemPtr_T *p, uint32_T size) {
    _MemPtr_T m = _MP(*p);
    MemPtr_T newPtr;

    newPtr = realloc(m, size + sizeof(_Mem_T));
    if (newPtr) {
        m = (_MemPtr_T)newPtr;
        m->size = size;
        *p = (MemPtr_T)((char_T*)m + sizeof(_Mem_T));
        return kErr_NoErr;
    }
    return kErr_MemErr;
}

So I should catch all errors. However, errors (i.e. return value == 0) are not reported by malloc or calloc during the problems I am experiencing. I added debug lines, but not a single failure was recorded.
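
For readers who want to try the header-prefixed allocation pattern used above in isolation, here is a self-contained sketch. The real `_Mem_T` layout is not shown in the thread, so `BlockHeader`, `block_alloc`, `block_size` and `block_free` are hypothetical stand-ins. The sketch also assumes two things the original may or may not do: it pads the header to `max_align_t` so the pointer handed to the caller keeps malloc's alignment, and it checks the size addition for wrap-around.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; the union pads it to max_align_t so that the
   payload address stays as aligned as the address malloc returned. */
typedef union {
    uint32_t size;
    max_align_t align_;
} BlockHeader;

/* Returns a pointer to `size` usable bytes, or NULL on failure. */
static void *block_alloc(uint32_t size, int clear) {
    size_t total = sizeof(BlockHeader) + (size_t)size;
    if (total < size)
        return NULL; /* the addition wrapped around (possible with 32-bit size_t) */
    BlockHeader *h = clear ? calloc(1, total) : malloc(total);
    if (h == NULL)
        return NULL;
    h->size = size;
    return (unsigned char *)h + sizeof(BlockHeader);
}

/* Reads the recorded payload size back from the header. */
static uint32_t block_size(const void *p) {
    const BlockHeader *h =
        (const BlockHeader *)((const unsigned char *)p - sizeof(BlockHeader));
    return h->size;
}

static void block_free(void *p) {
    if (p != NULL)
        free((unsigned char *)p - sizeof(BlockHeader));
}
```

If a header is smaller than the platform's maximum alignment, the payload pointer ends up less aligned than what malloc returned, which may matter for 8-byte or SIMD loads. Whether that applies to the code in this thread cannot be told from what was posted.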

Removing ABORTING_MALLOC did not result in any change of error outcome.

I see two different behaviors now:
- setting up workers and checking that they run by
static void startUpWorker(void) {
#ifdef __EMSCRIPTEN__
    int32_T w = emscripten_wasm_worker_self_id();
    if (! emscripten_current_thread_is_wasm_worker()){
        EM_ASM_({
            console.log("Error: No worker: " + $0);
        },w);
    }
#endif //__EMSCRIPTEN__
}

- then I do my stuff and receive about 10 of the "Uncaught RuntimeError: memory access out of bounds" errors.

- no failures of malloc/calloc recognized

The second behavior is

- in main() I call this routine:

static void memtest(void) {
#define NUM_CHUNKS 15
    const int CHUNK_SIZE = 100 * 1024 * 1024;
    void* p[NUM_CHUNKS] = { 0 }; // zero-initialized so the cleanup loop stops at the first unused slot
    Error_T err = kErr_NoErr;

    for (int i = 0; i < NUM_CHUNKS; i++) {
        err = allocMemPtr(&p[i],CHUNK_SIZE,FALSE); //see function above
        if (err != kErr_NoErr || p[i] == NULLPTR) {
            printf("Error chunk %d\n",i);
            break;
        }
    }
    for (int i = 0; i < NUM_CHUNKS; i++) {
        if (p[i] == NULLPTR)
            break;
        disposeMemPtr(p[i]);
    }
}

- then I start up the workers as described above
- then I do my stuff

- sometimes this results in error free behavior, but not always. If an error occurs, I only get one "Uncaught RuntimeError" message.

I am pretty confident that I handle memory allocation correctly, because my background is in development of desktop apps in C for 30+ years, and there you better not have any leaks and keep the app running whenever possible. So I must be doing something wrong when dealing with multiple threads.

I will try out pthreads next, because I have no idea anymore what the cause could be here.

Cheers,
Dieter

Sam Clegg

May 26, 2023, 4:47:12 PM

to emscripte...@googlegroups.com

Can I ask why you chose not to use pthreads to start with? I'd like to understand better why folks would choose wasm workers over pthreads.

Jukka Jylänki

May 26, 2023, 7:25:24 PM

to emscripte...@googlegroups.com

This is a good bug report about debuggability shortcomings of Wasm on two accounts:

1. when we had the old JS-based SAFE_HEAP feature, the feature said something along the lines of "segmentation fault: attempted to write i32 value x to memory address y", so one could deduce whether the memory address was zero, out of wasm memory bounds, or compare it against the memory map, e.g. if it is out of the dynamic malloc region memory bounds even if it might have been a dynamically allocated ptr. And one could verify the alignment as well, to use that as a heuristic to figure out if the ptr was stomped on and was garbled.

The wasm-based SAFE_HEAP feature does not provide any of the above, but only prints "Aborted(segmentation fault)".

2. Wasm browser debuggers should be able to show the line of code that does the segfault operation. Not sure if the browser debuggers are yet powerful enough to do that. Also currently we have no integration with those - maybe a SAFE_HEAP should generate a wasm trap instead of going out to JS to throw a JS exception? (although catching that exception should allow one to still manually inspect the callstack to find the offending line)
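
The kind of triage described in point 1 (null address, out of bounds, misalignment) can be sketched as a small helper. This is not SAFE_HEAP's implementation; `AccessCheck` and `classify_access` are hypothetical names, and the alignment test assumes natural alignment for power-of-two access widths.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ACCESS_OK,
    ACCESS_NULL,
    ACCESS_OUT_OF_BOUNDS,
    ACCESS_MISALIGNED
} AccessCheck;

/* Classifies an access of `width` bytes at byte offset `addr` into a linear
   memory of `mem_size` bytes. */
static AccessCheck classify_access(uint64_t addr, uint64_t width, uint64_t mem_size) {
    if (addr == 0)
        return ACCESS_NULL;
    /* Written so that addr + width cannot overflow. */
    if (width > mem_size || addr > mem_size - width)
        return ACCESS_OUT_OF_BOUNDS;
    if (width != 0 && addr % width != 0)
        return ACCESS_MISALIGNED;
    return ACCESS_OK;
}
```

A misaligned or wildly out-of-range address is a hint that the pointer itself was overwritten, which is the heuristic mentioned above.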

Given that there is an error print pointing into dlmalloc, it might be interesting to see how the error message changes if you build against emmalloc, by using the linker flag -sMALLOC=emmalloc. There also exist variants -sMALLOC=emmalloc-debug, -sMALLOC=emmalloc-memvalidate, and -sMALLOC=emmalloc-memvalidate-verbose with increasing levels of dynamic memory allocator internal consistency checks (and increasing levels of slowness and log spamming).

If nothing stands out, getting to a reduced repro test case would be helpful.

Dieter Weidenbrück

May 27, 2023, 11:43:19 AM

to emscripten-discuss

Certainly, I appreciate your interest.

I had to abandon a single-thread solution, because it would block the main thread for minutes.
Step1: Using js workers (emscripten_create_worker(wname))

Worked very well. Bonus was to have 2GB of RAM (if available) in each worker. Caveat: lots of copying back and forth. No messaging between workers.

Step2: Decision for wasm workers (emscripten_malloc_wasm_worker)

The requirement to move to a different kind of workers came with the usage of SharedArrayBuffers. I could allocate my data in the main thread, and then send off parts of it for processing to a list of workers, without the need for copying stuff around.

Not being familiar with pthreads nor wasm workers I followed the recommendation on this page:

https://emscripten.org/docs/api_reference/wasm_workers.html?highlight=wasm%20worker

" If an application is only developed to target WebAssembly, and portability is not a concern, then using Wasm Workers can provide great benefits in the form of simpler compiled output, less complexity, smaller code size and possibly better performance."

(see section " Pthreads vs Wasm Workers: Which One to Use?")

Other than that I had no particular reason to choose wasm workers, although I liked the idea of just a couple of bytes on disk for the wasm workers.

Cheers,
Dieter

Dieter Weidenbrück

May 27, 2023, 12:03:37 PM

to emscripten-discuss

JJ,

thanks for your recommendations. Here are some first results (I hope it is ok to post these screenshots here, if not I can post links):

SAFE_HEAP switched off.

Using emmalloc:

First try:

Second try:

Third try (try == same binary, just restarted and performed the same steps)

So the behavior is neither predictable nor reproducible.

Using emmalloc-debug:

First try:

Second try:

In both cases Chrome was left in a state using 98+% of the CPU without returning. I had to kill Chrome.

Using emmalloc-memvalidate:

Right after startup without any interaction from my side, the app aborted with this message.

Setting the number of wasm workers used to 0 (i.e. run as a single thread app), everything worked fine with emmalloc and emmalloc-debug. For emmalloc-memvalidate I had to #if 0 all calls to wasm workers and remove -sWASM_WORKERS, then the app would run without problems.

A comment about debugging in general:
source code debugging in a browser is a tedious task. Every line requires several clicks/keys to advance, and I have not found a way to inspect the values of variables.
If I have to run through a loop with a couple of hundred or thousand iterations this doesn't help a lot to raise my enthusiasm.
So what I do is, I have set up a project in Visual Studio with a slightly different main function as a cmdline-app. From there I can call specific functions and debug them with all bells and whistles. 
So I have some good confidence that the code works before entering the wasm/browser world.

Cheers,

Dieter

Dieter Weidenbrück

May 28, 2023, 4:32:55 AM

to emscripten-discuss

One more detail:

- use hello_wasm_worker.c from the test lib without changes

- use

emcc ./hello_wasm_worker.c ^
-g3 ^
--source-map-base ./ ^
-gsource-map ^
-s ALLOW_MEMORY_GROWTH=1 ^
-s ENVIRONMENT=web,worker ^
-s SUPPORT_ERRNO=0 ^
-sWASM_WORKERS ^
-s ASSERTIONS=2 ^
-sMALLOC=emmalloc-memvalidate ^
-o hello_wasm_worker.html

Result:

Tested on Dell XPS17, 12th Gen Intel(R) Core(TM) i9-12900HK 2.50 GHz, 64 GB RAM, Windows 11 64bit
Browsers: current Chrome and Firefox

Hope this helps,

Dieter

Jukka Jylänki

May 28, 2023, 8:51:57 AM

to emscripte...@googlegroups.com

Oops, that looks like an oversight with emmalloc + multithreading + memvalidate assertion checks implementation. Posted a fix PR to that at https://github.com/emscripten-core/emscripten/pull/19465

Dieter Weidenbrück

May 30, 2023, 3:10:38 PM

to emscripten-discuss

All,
with lots of debug lines I am able to isolate one error in my code, however, I have not been able yet to build a reduced test case for it.
I tested this with a single worker to avoid races.

The pointer in question got allocated in the worker first. Later I needed to enlarge the size. Using code like the following snippet I detected that the content of the reallocated block was different from the original one (considering the old size only, of course):

Assume p is the block that was allocated initially with size oldSize:

#define smallChunkSize  256
  uint8_T     test1[smallChunkSize],test2[smallChunkSize];
  uint8Ptr_T  p = (uint8Ptr_T)myP;
  uint8Ptr_T  newP  = 0L;
  int i,n;

  n = min(oldSize,smallChunkSize);
  memcpy(test1,p,n);
  newP = realloc(p,newSize);
  assert(newP != 0L);

  memcpy(test2,newP,n);

  for (i=0;i<n;i++){
    assert(test1[i] == test2[i]);
  }

The newP contained garbage.
I can't say when this happens, but I can reproduce it with my code. A wild guess is that it happens during a heap resize, but I don't have tools to verify this.
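
The check described above can be packaged as a standalone function for a reduced test case. This is only a sketch: `realloc_preserves_prefix` is a hypothetical helper, it compares the whole old block rather than just the first 256 bytes, and it assumes `new_size >= old_size`.

```c
#include <stdlib.h>
#include <string.h>

/* Grows *block from old_size to new_size with realloc and verifies that the
   first old_size bytes survived. Returns 1 if they are unchanged, 0 if the
   content differs, and -1 if an allocation failed (*block is then untouched). */
static int realloc_preserves_prefix(void **block, size_t old_size, size_t new_size) {
    unsigned char *before = malloc(old_size ? old_size : 1);
    if (before == NULL)
        return -1;
    memcpy(before, *block, old_size);   /* snapshot of the old content */

    void *grown = realloc(*block, new_size);
    if (grown == NULL) {
        free(before);
        return -1;
    }
    *block = grown;

    int same = memcmp(before, grown, old_size) == 0;
    free(before);
    return same;
}
```

Run from a single thread, a return value of 0 would point at the allocator or at memory growth. With other threads running, it can equally mean that something else wrote into the block between the snapshot and the comparison.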

Just as a reminder, if I set the number of workers to 0 and run this code in a single thread nothing happens.

As a workaround I tried to allocate a new block, copy everything from the old block to the new one, and dispose of the old block. That resolves this particular problem, but then other problems arise.
So I will keep searching.

Cheers,
Dieter

Sam Clegg

May 30, 2023, 3:59:31 PM

to emscripte...@googlegroups.com

On Tue, May 30, 2023 at 12:10 PM 'Dieter Weidenbrück' via emscripten-discuss <emscripte...@googlegroups.com> wrote:

> All,
> with lots of debug lines I am able to isolate one error in my code, however, I have not been able yet to build a reduced test case for it.
> I tested this with a single worker to avoid races.
>
> The pointer in question got allocated in the worker first. Later I needed to enlarge the size. Using code like the following snippet I detected that the content of the reallocated block was different from the original one (considering the old size only, of course):
> Assume p is the block that was allocated initially with size oldSize:
>
> #define smallChunkSize 256
>
> uint8_T test1[smallChunkSize],test2[smallChunkSize];
> uint8Ptr_T p = (uint8Ptr_T)myP;
> uint8Ptr_T newP = 0L;
> int i,n;
>
> n = min(oldSize,smallChunkSize);
> memcpy(test1,p,n);
> newP = realloc(p,newSize);
> assert(newP != 0L);
>
> memcpy(test2,newP,n);
>
> for (i=0;i<n;i++){
> assert(test1[i] == test2[i]);
> }

Is newSize always greater than n ?

Are there other workers (or the main thread) that might be accessing `myP` while this code is running?

Shlomi Fish

May 30, 2023, 9:32:29 PM

to 'Sam Clegg' via emscripten-discuss

Hi all,

On Tue, 30 May 2023 12:59:18 -0700

"'Sam Clegg' via emscripten-discuss" <emscripte...@googlegroups.com>
wrote:

> On Tue, May 30, 2023 at 12:10 PM 'Dieter Weidenbrück' via
> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>
> > All,
> > with lots of debug lines I am able to isolate one error in my code,
> > however, I have not been able yet to build a reduced test case for it.
> > I tested this with a single worker to avoid races.
> >
> > The pointer in question got allocated in the worker first. Later I needed
> > to enlarge the size. Using code like the following snippet I detected that
> > the content of the reallocated block was different from the original one
> > (considering the old size only, of course):
> > Assume p is the block that was allocated initially with size oldSize:
> > #define smallChunkSize 256
> > uint8_T test1[smallChunkSize],test2[smallChunkSize];
> > uint8Ptr_T p = (uint8Ptr_T)myP;
> > uint8Ptr_T newP = 0L;
> > int i,n;
> >
> > n = min(oldSize,smallChunkSize); memcpy(test1,p,n);
> > newP = realloc(p,newSize);
> > assert(newP != 0L);
> >
> > memcpy(test2,newP,n);
> >
> > for (i=0;i<n;i++){
> > assert(test1[i] == test2[i]);

Please trim the quoted parts of the emails:
https://en.wikipedia.org/wiki/Posting_style . 70-kilobyte emails are appalling.

--

Shlomi Fish https://www.shlomifish.org/
What Makes Software Apps High Quality - https://shlom.in/sw-quality

Rindolf is the Evil twin brother of Rudolph and Randolph, Santa’s goody-two-
shoes reindeer, who are among his arch-enemies. He is also one of the
cornerstones of the Evil Reindeer Evil World Domination Evil Conspiracy, which
aims to spread Evil in general and Reindeer Evil in particular around the world.

Please reply to list if it's a mailing list post - https://shlom.in/reply .

Dieter Weidenbrück

May 31, 2023, 2:52:53 AM

to emscripten-discuss

Sam,

there is only the main thread and one wasm worker. No other code than the one in the worker is accessing this particular pointer.
newSize is always greater than oldSize.

If I call the worker's code directly, i.e. work with the main thread only, all is fine.
There is always a possibility that some of my code may not be thread-safe. This is what I am examining now. As I am using my own allocMemPtr and resizeMemPtr functions (plus two other functions), there are no other calls in my code using malloc or realloc.

Using these to shield the memory calls seems to help somewhat (taken from emmalloc.c, MALLOC_ACQUIRE, but renamed to avoid potential conflicts):
static volatile uint8_T threadLock = 0;
#define LOCK_ACQUIRE() while(__sync_lock_test_and_set(&threadLock , 1)) {while(threadLock){ /*nop*/ }}
#define LOCK_RELEASE() __sync_lock_release(&threadLock )

However, still there is corruption, and LOCK_ACQUIRE ends up in an infinite loop in a simple malloc call.
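
For comparison, the same test-and-set spinlock can be written with C11 atomics. This is a sketch and not the code from emmalloc.c; `g_mem_lock`, `mem_try_lock`, `mem_lock` and `mem_unlock` are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the LOCK_ACQUIRE / LOCK_RELEASE macros. */
static atomic_flag g_mem_lock = ATOMIC_FLAG_INIT;

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int mem_try_lock(void) {
    return !atomic_flag_test_and_set_explicit(&g_mem_lock, memory_order_acquire);
}

static void mem_lock(void) {
    while (!mem_try_lock()) {
        /* spin until the current holder releases the lock */
    }
}

static void mem_unlock(void) {
    atomic_flag_clear_explicit(&g_mem_lock, memory_order_release);
}
```

A lock like this is not recursive. If a thread that already holds it calls `mem_lock` again, or a code path returns without calling `mem_unlock`, the next acquire spins forever. Whether that is what happens in the case above is not known; it is only one pattern that produces an endless loop in the acquire.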

It would be helpful to be able to have a watch function to keep an eye on a specific memory portion to see when and by whom it gets changed.

@Shlomi: sorry, will do.

Cheers, Dieter

Dieter Weidenbrück

Jun 3, 2023, 12:11:01 PM

to emscripten-discuss

All,

I think I have found the problem and a workaround. First, I got everything working using pthreads without problems. However, I wasn't too impressed with the time savings vs. single thread. So I went back and did some more debugging on the wasm workers.

Using the flags -sMALLOC=emmalloc -sWASM_WORKERS, no sanitizer, no safe-heap, I stepped into the realloc routine of emmalloc. All goes well there if the existing ptr can be enlarged, however, if a new ptr is allocated, there is a problem sometimes. The new ptr gets allocated, but the memcpy appears to go wrong. memcpy is not a part of emmalloc, of course.

So I changed my central resizing routine to allocate a new block and copy the content myself:

Error_T setMemPtrSize(MemPtr_T *p, uint32_T size) {
    _MemPtr_T m = _MP(*p), mNew;
    MemPtr_T newPtr;
    uint32_T n;
    Error_T err = kErr_NoErr;

    LOCK_ACQUIRE();
    //newPtr = realloc(m,size + sizeof(_Mem_T));

    newPtr = (MemPtr_T)calloc(1, size + sizeof(_Mem_T));
    if (newPtr != NULLPTR) {
        mNew = (_MemPtr_T)newPtr;
        n = min(m->size, size) + sizeof(_Mem_T);
        memcpy(newPtr, m, n);
        free(m);
        mNew->size = size;
        *p = (MemPtr_T)((char_T*)mNew + sizeof(_Mem_T));
    }
    else
        err = kErr_MemErr;
    LOCK_RELEASE();
    ASSERT_LOCK_IS_NOT_ACQUIRED();
    return err;
}

I cannot say what exactly goes wrong, because it is quite cumbersome to debug without having access to all the details. However, with this change, everything works beautifully now. Tested with both malloc and emmalloc, with different numbers of workers between 1 and 20, different file sizes up to 350 MB, and with both debug and release settings.

Some timings for processing the largest file:

- pthreads, debug mode (using queues): between 489.537000 and 521.209450 secs using 10 threads
- wasm workers, debug mode: 59.942190 secs using 10 workers, 54.108360 secs using 20 workers
- wasm workers, release mode: 21.243100 secs using 10 workers, 19.642640 secs using 20 workers
- single thread, release mode: 50.066905 secs
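
Taking only the release-mode figures above (the single-thread run was measured in release mode, so the debug numbers are not comparable), the speedup over the single-thread run works out as follows. This is plain arithmetic on the posted numbers, not an additional measurement.

```latex
\[
S_{10} = \frac{50.066905}{21.243100} \approx 2.36,
\qquad
S_{20} = \frac{50.066905}{19.642640} \approx 2.55
\]
```

That corresponds to a parallel efficiency of roughly 0.24 with 10 workers and 0.13 with 20 workers.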

Thanks for all your comments, they helped a lot.
Best, Dieter

Sam Clegg

Jun 4, 2023, 10:40:26 PM

to emscripte...@googlegroups.com

Your results for pthreads look rather odd, so something strange must be going on. In terms of performance there should be very little difference between wasm workers and pthreads if you implement the same algorithm since they are both based on the same shared memory primitives. Most of the differences are related to the code size.

Dieter Weidenbrück

Jun 5, 2023, 2:11:39 AM

to emscripten-discuss

s...@google.com wrote on Monday, June 5, 2023 at 04:40:26 UTC+2:

> Your results for pthreads look rather odd, so something strange must be going on. In terms of performance there should be very little difference between wasm workers and pthreads if you implement the same algorithm since they are both based on the same shared memory primitives. Most of the differences are related to the code size.

Agreed. Maybe there was still a flag like SAFE_HEAP enabled. I was more focussing on getting things to work.
That leaves the strange behavior when using realloc that I can not further test here.

Jukka Jylänki

Jun 5, 2023, 11:18:30 AM

to emscripte...@googlegroups.com

I explored this area in the repository by writing a stress test code for realloc + wasm workers + aborting malloc=0 + memory growth, which can be found in https://github.com/emscripten-core/emscripten/pull/19465/files#diff-02c8c2b081a2137723784e50ae5aedd7a2df1ac8dc3cfcb4ad6ac9eb8137f0fb , but was not able to coax out an issue.

If you are still keen, one approach might be to try to mutate that test case to look like your scenario, if you might be able to get the test to show the same issue you were seeing.

Dieter Weidenbrück

Jun 10, 2023, 8:45:51 AM

to emscripten-discuss

jj wrote on Monday, June 5, 2023 at 17:18:30 UTC+2:

> I explored this area in the repository by writing a stress test code for realloc + wasm workers + aborting malloc=0 + memory growth, which can be found in https://github.com/emscripten-core/emscripten/pull/19465/files#diff-02c8c2b081a2137723784e50ae5aedd7a2df1ac8dc3cfcb4ad6ac9eb8137f0fb , but was not able to coax out an issue.
>
> If you are still keen, one approach might be to try to mutate that test case to look like your scenario, if you might be able to get the test to show the same issue you were seeing.

jj,
thanks for using your time for this.
I am able to reproduce the error in my code in a consistent way, however, I have not been able to isolate it into standalone code. I saw your fixes to emmalloc-memvalidate, so I will wait for this to happen and then try again.

The alternative would be to give somebody access to the project, however, I can't post the code publicly.

Dieter Weidenbrück

Jul 17, 2023, 8:57:51 AM

to emscripten-discuss

All,
I think I have finally found the culprit. It looks like there was a race condition between an sbrk and a local copy of HEAP32. This led to an overwrite of a memory block (really hard to nail down the exact sequence of events with workers) which in turn led to a segmentation fault when accessing the garbage and interpreting it as pointers into memory.
The error is gone now as far as I can see.
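
The general hazard behind such a race, a view or pointer captured before the heap grows and used afterwards, can be illustrated in plain C. This is only an analogy and not the actual Emscripten JS mechanism or the fix that was applied; `GrowBuf` and its functions are hypothetical. The idea is to keep offsets and re-derive the address from the current base on every access.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for a heap that can be resized. */
typedef struct {
    unsigned char *base;
    size_t size;
} GrowBuf;

/* Grows the buffer. base may move, so every pointer derived from the old
   base must be treated as invalid afterwards. Returns 1 on success. */
static int growbuf_grow(GrowBuf *b, size_t new_size) {
    unsigned char *moved = realloc(b->base, new_size);
    if (moved == NULL)
        return 0;
    b->base = moved;
    b->size = new_size;
    return 1;
}

/* Accessors take an offset and look up the current base each time instead
   of caching a pointer across a possible growth. */
static int32_t growbuf_read_i32(const GrowBuf *b, size_t offset) {
    int32_t v;
    memcpy(&v, b->base + offset, sizeof v);
    return v;
}

static void growbuf_write_i32(GrowBuf *b, size_t offset, int32_t v) {
    memcpy(b->base + offset, &v, sizeof v);
}
```

With several threads, the growth and the accesses would additionally need to be ordered against each other. The sketch shows only the single-threaded half of the problem.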

The memvalidate option is still not usable with emmalloc, just as a reminder.

キャロウ　マーク

Jul 29, 2023, 5:21:42 AM

to emscripten-discuss Sam Clegg via

On May 31, 2023, at 10:32, Shlomi Fish <shl...@shlomifish.org> wrote:

> please trim the quoted parts of the emails:
> https://en.wikipedia.org/wiki/Posting_style . 70 kilobytes emails are appalling
