std::vector seems to be very slow in memory (re)allocation calls


Sören König

May 17, 2016, 7:23:46 AM
to emscripten-discuss
(Re)allocating a large STL vector using the constructor, resize, or reserve, as in

std::vector<unsigned char> abc(1920000);

or

std::vector<unsigned char> abc;
abc.reserve(1920000);

seems to be extremely slow compared with a plain malloc of exactly the same size in Emscripten 1.35.

Coming from native C++, I would expect timings similar to those of malloc.
Are there any special optimization flags to enable? Or is this a browser-specific issue?

Alon Zakai

May 17, 2016, 12:56:11 PM
to emscripten-discuss
I would be surprised if there is a difference from 1.35; nothing there should have changed. Can you bisect to see where the issue begins? Or provide a minimal but complete testcase, including measurement and compilation flags, so we can test?

Comparing to native C++ though, there are expected differences, mainly since we don't have virtual memory. Reallocating memory natively can sometimes map pages and not do an actual copy, but if we need to move things, we always copy. Another issue is that in memory growth mode, allocating or reserving memory might cause memory to grow, which can be slow.
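Since every move of a buffer is a real copy here, one way to see how often a vector moves its storage is to watch its capacity while appending. The following is only a minimal sketch: the function name count_reallocations is made up for illustration, and the exact count without reserve depends on the standard library's growth policy.

```cpp
#include <cstddef>
#include <vector>

// Counts how often the vector's capacity changed (i.e. the buffer was
// reallocated) while appending n bytes one at a time.
// If reserve_first is true, the full capacity is requested up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<unsigned char> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation, no elements created
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<unsigned char>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(n, true) should report no reallocations, while count_reallocations(n, false) reports at least one, each of which implies copying the existing elements.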

--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sören König

May 18, 2016, 7:47:38 AM
to emscripten-discuss
Okay, here is some source code to reproduce the effect (see below). It was compiled with

emcc vectorspeed.cpp -std=c++14 -o index.html  -s TOTAL_MEMORY=67108864

Adding -O2 produces exceptions/hangs in Microsoft browsers.


results in Firefox 46.0.1:

malloc/free: 0.001s
vector ctor: 0.398s
vector resize: 0.192s
vector reserve: 0s

results in IE 11:
malloc/free: 0.001s
vector ctor: 2.063s       !!!!!
vector resize: 1.321s
vector reserve: 0s

results in Edge:

malloc/free: 0.002s
vector ctor: 1.634s     !!!!!
vector resize: 0.787s
vector reserve: 0.001s

Source vectorspeed.cpp:

#include <chrono>
#include <cstddef>
#include <cstdlib>   // malloc, free
#include <iostream>
#include <utility>   // std::forward
#include <vector>

// Duration in seconds, stored as a double.
using double_sec = std::chrono::duration<double, std::ratio<1>>;

// Runs f(args...) once and returns the elapsed wall-clock time in seconds.
template <typename Func, typename... Args>
double profile(Func f, Args&&... args)
{
  using namespace std::chrono;
  auto start = system_clock::now();

  f(std::forward<Args>(args)...);

  return duration_cast<double_sec>(system_clock::now() - start).count();
}

int main() {
  std::size_t n = 48000000;

  // Raw allocation: memory is not initialized.
  auto t1 = profile([](std::size_t n)
  {
    void* ptr = malloc(n);
    free(ptr);
  }, n);

  // Sized constructor: allocates and zero-fills n elements.
  auto t2 = profile([](std::size_t n)
  {
    std::vector<unsigned char> vec(n);
  }, n);

  // resize on an empty vector: allocates and zero-fills n elements.
  auto t3 = profile([](std::size_t n)
  {
    std::vector<unsigned char> vec;
    vec.resize(n);
  }, n);

  // reserve on an empty vector: allocates only, creates no elements.
  auto t4 = profile([](std::size_t n)
  {
    std::vector<unsigned char> vec;
    vec.reserve(n);
  }, n);

  std::cout << "malloc/free: " << t1 << "s" << std::endl;
  std::cout << "vector ctor: " << t2 << "s" << std::endl;
  std::cout << "vector resize: " << t3 << "s" << std::endl;
  std::cout << "vector reserve: " << t4 << "s" << std::endl;

  return 0;
}

AlainC

May 18, 2016, 8:14:07 AM
to emscripten-discuss
For malloc and reserve (on an initially empty vector), the elements are not initialized.
For the sized constructor and resize, each element is initialized individually (to 0), in a loop I think.
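The difference described above can be checked directly from the vector's size, capacity, and contents. This is a small sketch; the helper names are invented for illustration and are not part of the benchmark in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// True if every byte in the vector is zero.
bool all_zero(const std::vector<unsigned char>& v) {
    return std::all_of(v.begin(), v.end(),
                       [](unsigned char c) { return c == 0; });
}

// Sized constructor: creates n elements, each value-initialized to 0.
bool ctor_zero_fills(std::size_t n) {
    std::vector<unsigned char> v(n);
    return v.size() == n && all_zero(v);
}

// resize on an empty vector: also creates n zero-valued elements.
bool resize_zero_fills(std::size_t n) {
    std::vector<unsigned char> v;
    v.resize(n);
    return v.size() == n && all_zero(v);
}

// reserve: allocates room for n elements but creates none,
// so there is nothing to initialize.
bool reserve_creates_no_elements(std::size_t n) {
    std::vector<unsigned char> v;
    v.reserve(n);
    return v.size() == 0 && v.capacity() >= n;
}
```

All three helpers return true for any n, which matches the timings: only the constructor and resize have to touch every byte.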

Sören König

May 18, 2016, 10:32:03 AM
to emscripten-discuss
Hmm, this would explain the different timings. But 2 seconds to zero out a large block of memory?! Is the resulting JavaScript code really that slow?
Here are some timings from a native build on the same machine:

malloc/free: 0.0002104s
vector ctor: 0.019273s
vector resize: 0.0157814s
vector reserve: 0.0001271s

For the vector ctor, this is about 20 times faster than Firefox and about 100 times faster than IE.

Alon Zakai

May 18, 2016, 12:13:46 PM
to emscripten-discuss
Ok, looks like there are 2 issues here. First, IE and Edge might have a specific bug, worth filing that for them.

Second, native code can avoid initializing memory, as modern OSes have a way to just get an already-zero-initialized page. That means allocating zeroed memory can be essentially free. In JS, though, we have to actually zero out the memory.

There is actually one case where we don't: the first time memory is used, it is guaranteed to be zeroed out, since that's how typed arrays start. We might be able to add hooks into dlmalloc (or however it requests those pages), but I'm not sure offhand how easy that would be. It would only apply to the first use of the memory, though, so in a long-running app it would not help. I filed https://github.com/kripken/emscripten/issues/4334
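At the C level, the two ways of getting zeroed memory look roughly like this. It is a sketch only: the function names are made up, and whether calloc can actually skip the zeroing depends on the allocator and platform (a compiler may also merge malloc plus memset into a calloc call).

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Zeroed buffer via calloc: the allocator is allowed to return memory
// it already knows to be zero (for example fresh pages from the OS).
unsigned char* alloc_zeroed_calloc(std::size_t n) {
    return static_cast<unsigned char*>(std::calloc(n, 1));
}

// Zeroed buffer via malloc + memset: every byte is written explicitly.
unsigned char* alloc_zeroed_memset(std::size_t n) {
    void* p = std::malloc(n);
    if (p != nullptr) {
        std::memset(p, 0, n);
    }
    return static_cast<unsigned char*>(p);
}

// True if the first n bytes at p are all zero.
bool bytes_are_zero(const unsigned char* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] != 0) return false;
    }
    return true;
}
```

Both functions return a buffer that must be released with free. The results are identical; only the amount of work needed to get there can differ.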


Floh

May 19, 2016, 4:55:09 AM
to emscripten-discuss
Just my 2ct: I'm not sure it's worth the trouble to optimize for that specific case which is fairly exotic IMHO (a giant dynamic array of a small built-in type). As soon as the type is not a simple builtin type and has a constructor, it will be slow on native platforms as well since it needs to run the constructor 48 million times.

I did a quick test with clang on OSX, and there's indeed no initialization loop over the array elements, only a single allocator call, so I guess this does the 'get zero-initialized memory' trick. To me, though, this is an under-the-hood optimization of that specific platform that shouldn't be depended on (the C++ standard only says that the elements will be value-initialized, which in the case of built-in types yields 0, but implementations are free in how they achieve this).

As I said, just my 2ct :)
-Floh.
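If the zero-fill itself is unwanted, one known C++ idiom is an allocator adaptor that default-initializes elements instead of value-initializing them, so bytes are left untouched. The sketch below is not something proposed in this thread; the names default_init_allocator, raw_bytes and make_uninitialized are illustrative, and whether it removes the cost in a given Emscripten build would need to be measured.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Allocator adaptor (illustrative): elements the container creates
// without arguments are default-initialized, which for built-in types
// means "left uninitialized" rather than set to 0.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other = default_init_allocator<
            U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;

    // No-argument construction: default-initialize, no zeroing.
    template <typename U>
    void construct(U* ptr) {
        ::new (static_cast<void*>(ptr)) U;
    }

    // Any other construction is forwarded to the wrapped allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        traits::construct(static_cast<A&>(*this), ptr,
                          std::forward<Args>(args)...);
    }
};

using raw_bytes =
    std::vector<unsigned char, default_init_allocator<unsigned char>>;

// Returns a vector of n bytes whose contents are indeterminate;
// each byte must be written before it is read.
raw_bytes make_uninitialized(std::size_t n) {
    return raw_bytes(n);
}
```

A buffer from make_uninitialized(n) has size n like a normal vector, but reading an element before writing it is undefined behaviour, so this only suits buffers that are filled completely afterwards.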
