memmove / bulk memory operations proposal status


Lilit Darbinyan

Jul 12, 2019, 12:01:45 PM
to emscripten-discuss
I have a benchmark where I insert into the front of a vector in a loop 1k times, which causes the vector to grow continuously. 

This was pretty slow with the Fastcomp-generated Wasm binary (~80ms), and sure enough profiling showed that the hot path was memmove.
I have now switched to the new LLVM upstream backend, and it's much faster (~20ms), but not as fast as I expected, and memmove still shows up as the most time-consuming thing.

I have inspected the generated wasm binary and don't see any of the new bulk memory operations there, so my questions are:

- Does the new LLVM upstream backend support bulk memory operations?
- If not, then why am I seeing this speedup by switching to the LLVM backend? 

The benchmark code can be found here: https://github.com/ldarbi/wasm-scratchpad/tree/master/memmove

Thomas Lively

Jul 12, 2019, 12:12:45 PM
to emscripte...@googlegroups.com
A memory.copy instruction should be emitted if you pass -mbulk-memory while using the LLVM backend. I’d be very interested in how that affects your benchmark.
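For reference, the flag goes on the emcc command line, something like this (an illustrative build invocation; the source file name is a placeholder):

```shell
# Build with the LLVM upstream backend, enabling the bulk-memory feature
# so memcpy/memmove can lower to the wasm memory.copy instruction.
emcc -O3 -mbulk-memory memmove_bench.cpp -o memmove_bench.html
```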
--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/emscripten-discuss/eb5441fc-02d2-4894-8559-6d7ea5ea1e61%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Lilit Darbinyan

Jul 12, 2019, 12:38:26 PM
to emscripten-discuss
Oops, typo - it's twice as slow WITH the flag.

On Friday, July 12, 2019 at 5:37:52 PM UTC+1, Lilit Darbinyan wrote:
Thanks!

Interestingly it's twice as slow without the flag. It's now 40ms instead of the 20ms. To summarize:

Fastcomp                  80ms
LLVM                      20ms
LLVM with -mbulk-memory   40ms


The profiler now shows this as the hot path instead of memmove:

std::__2::vector<double, std::__2::allocator<double> >::insert(std::__2::__wrap_iter<double const*>, unsigned long, double const&)

Alon Zakai

Jul 12, 2019, 6:34:11 PM
to emscripte...@googlegroups.com
Interesting - yes, I also see the bulk memory path as slower, though only by a little after I changed the benchmark to run 10x more iterations in the html and added -O3 to the emcc command (hoping to reduce noise and enable maximum optimizations). I see a 2-5% slowdown.

Talking to tlively, the difference might be that with bulk memory we depend on the browser for doing memmove etc., and perhaps the implementation there is not yet as optimized as what the toolchain was emitting inside the wasm (which was heavily optimized over time). But we're not sure.

The wasm binary is around 2% smaller with bulk memory though, which is nice (20,839 bytes).



Thomas Lively

Jul 16, 2019, 9:31:47 AM
to emscripte...@googlegroups.com
Hmm, yeah, I was worried that might happen. We're going to have to do some science to figure out where it is actually beneficial to use the bulk memory instructions. Alternatively, we can just pester engine implementers to optimize them better.