embind: confused by, and avoiding, C++ object copies

lukester1975

Jun 1, 2017, 6:53:59 AM
to emscripten-discuss
Hi

I think I must be doing something obviously wrong here, but embind seems to make temporary copies of C++ objects for reasons I don't understand.

Here's an example. JS code calling a C++ function that then calls back to JS.

em++ Simple.cpp --bind -o Simple.html --post-js test.js

Simple.cpp:
// em++ Simple.cpp --bind -o Simple.html --post-js test.js

#include <emscripten/bind.h>
#include <emscripten/val.h>
#include <emscripten/emscripten.h>
#include <cstdio>
#include <memory>

class Test1
{
public:
    Test1() :
        m_int(123)
    {
        printf("%s ctor %p\n", __FUNCTION__, this);
    }

    Test1(const Test1 &rhs) :
        m_int(rhs.m_int)
    {
        printf("%s copy ctor %p\n", __FUNCTION__, this);
    }

    ~Test1()
    {
        printf("%s dtor %p\n", __FUNCTION__, this);
    }

    int m_int;
};

class Test2
{
public:
    Test2()
    {
        printf("%s ctor %p\n", __FUNCTION__, this);
    }

    Test2(const Test2 &rhs) :
        m_test1(rhs.m_test1)
    {
        printf("%s copy ctor %p\n", __FUNCTION__, this);
    }

    ~Test2()
    {
        printf("%s dtor %p\n", __FUNCTION__, this);
    }

    int   m_pad;        // Just to get this != &this->m_test1
    Test1 m_test1;
};

Test2 g_test2;

void InvokeJs(emscripten::val callback)
{
    printf("In C++ function %s, calling back to JS\n", __FUNCTION__);

    callback(g_test2);
}

EMSCRIPTEN_BINDINGS(Main)
{
    using namespace emscripten;

    function("invokeJs", &InvokeJs);

    value_object<Test1>("Test1")
        .field("int", &Test1::m_int)
        ;

    value_object<Test2>("Test2")
        .field("test1", &Test2::m_test1)
        ;
}

void MainLoop()
{
}

int main(int argc, char *argv[])
{
    emscripten_set_main_loop(&MainLoop, 0, 1);

    return 0;
}


test.js:
function myInit() {

    var button = document.createElement("BUTTON");
    button.appendChild(document.createTextNode("Call C++"));

    button.onclick = function () {

        var myCallback = function (test2) {
            Module.print("In JS callback. test2.test1.int = ", test2.test1.int);

            // HACK for testing both value_object and class_ bindings.
            if (test2.hasOwnProperty("delete")) {
                test2.delete();
            }
        };

        Module.print("\nIn JS, calling C++ function");
        Module.invokeJs(myCallback);

        if (typeof Module.invokeJsRef === "function") {
            Module.print("\nIn JS, calling C++ function using refs");
            Module.invokeJsRef(myCallback);
        }

        Module.print("Done");
    };

    document.body.insertBefore(button, document.getElementById("canvas").parentElement);
}

myInit();

Hitting "Call C++" button gives:

Test1 ctor 0x16f4
Test2 ctor 0x16f0

In JS, calling C++ function
In C++ function InvokeJs, calling back to JS
Test1 copy ctor 0x501f14
Test2 copy ctor 0x501f10
Test1 copy ctor 0x501f20
~Test1 dtor 0x501f20
~Test2 dtor 0x501f10
~Test1 dtor 0x501f14
In JS callback. test2.test1.int =  123
Done

These temporaries are destroyed before they even make it into the JS code, so it seems like they could (and should) be elided?

I have worked around this by wrapping non-trivial types in a reference_wrapper-like type. Attached as Ref.cpp. I only need read-only access so the write side of things isn't covered, but it just seems like I'm missing something obvious here...

Thanks for any help!

Luke.

Ref.cpp

Jukka Jylänki

Jun 2, 2017, 8:53:33 AM
to emscripte...@googlegroups.com
This might be an effect of the serialization mechanism that embind uses. It has a concept of a "wire type", in which data is serialized to cross the JS <-> C++ boundary. I'm not sure how easy it would be to avoid these kinds of copies, but it sounds like you did find a good workaround?
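
To make the "wire type" idea concrete, here is a toy model in plain C++ (no emscripten involved). It assumes that a by-value type crosses the boundary as a heap-allocated copy, and that reading a nested field marshals that field the same way. The names Inner, Outer, toWire, marshalToPlainValue and the counters are all hypothetical, chosen only for illustration; this is a sketch of the suspected pattern, not embind's actual code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical counters, used only to make copies and destructions observable.
static int g_copies = 0;
static int g_dtors = 0;

struct Inner {
    int value = 123;
    Inner() = default;
    Inner(const Inner& rhs) : value(rhs.value) { ++g_copies; std::printf("Inner copy ctor\n"); }
    ~Inner() { ++g_dtors; std::printf("Inner dtor\n"); }
};

struct Outer {
    int pad = 0;
    Inner inner;
    Outer() = default;
    Outer(const Outer& rhs) : pad(rhs.pad), inner(rhs.inner) { ++g_copies; std::printf("Outer copy ctor\n"); }
    ~Outer() { ++g_dtors; std::printf("Outer dtor\n"); }
};

// Toy stand-in for a by-value wire type (assumption): the value is handed
// across the boundary as a pointer to a freshly allocated copy.
template <typename T>
T* toWire(const T& v) { return new T(v); }

// Toy stand-in for turning a value object into a plain value on the other
// side: copy the outer object, copy the nested field to read it, then free
// both copies before the result is ever used.
int marshalToPlainValue(const Outer& source) {
    Outer* outerWire = toWire(source);            // Inner copy ctor, Outer copy ctor
    Inner* innerWire = toWire(outerWire->inner);  // Inner copy ctor (nested field)
    int result = innerWire->value;
    delete innerWire;                             // Inner dtor
    delete outerWire;                             // Outer dtor, Inner dtor
    return result;
}
```

Calling marshalToPlainValue on an existing Outer makes three copies and destroys all three before returning, in the same order as the copy ctor / dtor lines in the output above. If embind behaves like this model, the copies are a consequence of marshalling by value rather than something the compiler is free to elide.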


Alexandre Perrot

Jun 2, 2017, 11:33:20 AM
to emscripten-discuss
Hi,

Did you also track the copies of your Ref objects?
It seems to me that they should be copied the same way, so this is not really a "workaround".
Also, what is the output when you use your Ref class?

Anyway, value_object should not be used with objects that are expensive to copy, since the data needs to be copied back and forth.

lukester1975

Jun 2, 2017, 12:13:38 PM
to emscripten-discuss
Yeah, should've mentioned I did trace it all the way through to GenericBindingType::toWireType...

lukester1975

Jun 2, 2017, 12:36:18 PM
to emscripten-discuss

Hello

In the Ref case, it's only copying a pointer to the original object. The expensive case of creating a copy of the underlying object is never used (it's just there to get it to compile; I imagine it would come into play when trying to write, which I haven't tried yet).

Sure, output of the Ref case is no copies:

Test1 ctor 0x1784
Test2 ctor 0x1780

In JS, calling C++ function using refs
In C++ function InvokeJsRef, calling back to JS using ref

In JS callback. test2.test1.int =  123
Done

The thing is the temporaries are destroyed before hitting the JS code, so there's seemingly no need for them. Something is needed to marshal those values to JS, but it can surely be the original objects...
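
For readers without the attachment, the Ref idea can be sketched in plain C++ roughly as follows. This is not the attached Ref.cpp: ConstRef, Big, copyForWire, readThroughRef and the counter are hypothetical names, and the by-value marshalling step is modelled as a simple heap copy (an assumption). The point is only that copying the wrapper copies one pointer and never runs the wrapped type's copy constructor.

```cpp
#include <cassert>

// Hypothetical counter for copies of the expensive type.
static int g_bigCopies = 0;

// Stand-in for a type that is expensive to copy.
struct Big {
    int value = 123;
    Big() = default;
    Big(const Big& rhs) : value(rhs.value) { ++g_bigCopies; }
};

// Hypothetical reference_wrapper-like type: it stores only a pointer, so
// copying it is cheap and leaves the target untouched. Read-only access.
template <typename T>
class ConstRef {
public:
    explicit ConstRef(const T& target) : m_ptr(&target) {}
    const T& get() const { return *m_ptr; }
private:
    const T* m_ptr;
};

// Toy stand-in for by-value marshalling (assumption): heap-allocate a copy
// of whatever is passed in.
template <typename T>
T* copyForWire(const T& v) { return new T(v); }

int readThroughRef(const Big& original) {
    ConstRef<Big> ref(original);
    ConstRef<Big>* wire = copyForWire(ref);  // copies the wrapper (one pointer)
    int result = wire->get().value;          // reads the original object
    delete wire;                             // frees the wrapper only
    return result;
}
```

The wrapper itself is still copied once per crossing, but Big's copy constructor is never called. The usual caveat for any non-owning wrapper applies: the original object has to outlive every use of the wrapper.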

The class_ binding case is slightly different. Copies are still taken, but obviously they aren't destroyed, as it's up to the JS code to .delete() the passed object. Interestingly though, there does seem to be a leak:

In JS, calling C++ function
In C++ function InvokeJs, calling back to JS
Test1 copy ctor 0x5020c4
Test2 copy ctor 0x5020c0
Test1 copy ctor 0x5020d0

In JS callback. test2.test1.int =  123
~Test2 dtor 0x5020c0
~Test1 dtor 0x5020c4

Looks like a missing ~Test1 there. I've not investigated this yet.

Thanks!

Luke.

lukester1975

Jun 7, 2017, 7:59:21 AM
to emscripten-discuss
Just to answer myself re. the potential leak in the class_ binding case. The leak comes from this line in the callback, where test2.test1 yields an object that is never deleted:

Module.print("In JS callback. test2.test1.int = ", test2.test1.int);

which needs to be:

var test1 = test2.test1;
Module.print("In JS callback. test2.test1.int = ", test1.int);
test1.delete();

Alexandre Perrot

Jun 9, 2017, 3:26:37 AM
to emscripten-discuss


On Friday, June 2, 2017 at 18:36:18 UTC+2, lukester1975 wrote:

Hello

In the Ref case, it's only copying a pointer to the original object. The expensive case of creating a copy of the underlying object is never used (it's just there to get it to compile; I imagine it would come in to play trying to write - not tried yet).

Sure, output of the Ref case is no copies:

Test1 ctor 0x1784
Test2 ctor 0x1780

In JS, calling C++ function using refs
In C++ function InvokeJsRef, calling back to JS using ref
In JS callback. test2.test1.int =  123
Done


Yes, but I'm pretty sure the Ref objects themselves are also copied. Sure, it is less expensive than copying a big object, but the issue is the same.
If you want to avoid copies, use pointers.