Extracting Builder/Reader from pycapnp objects

Scott Purdy

Jan 21, 2015, 12:40:23 PM
to capnproto
Hi capnpers,

I am trying to integrate pycapnp and Cap'n Proto into my hybrid Python/C++ project. This was proving a bit tricky so I created a simple hacked-together example to demonstrate what I am trying to do. You can see the project here:


Everything currently works there except that I haven't figured out how to extract the C++ Builder/Reader objects from the pycapnp Cythonized PyObject. See here:


Does anyone know how to do that?

Thanks,
Scott

Jason Paryani

Jan 21, 2015, 3:19:35 PM
to Scott Purdy, capnproto
Unfortunately the answer right now is that support for bindings other than Cython is poor to nonexistent. I'd highly recommend using Cython if you plan to integrate deeply with pycapnp. However, I understand if you're not able to just switch frameworks at the drop of a hat :)

If you really want to integrate with SWIG, then we're going to have to dive into Cython just a bit. If you want to follow along, run `python setup.py build` in pycapnp's directory, then open `capnp/lib/capnp.cpp` and look for the definition of the extension class _MessageBuilder. In my file, the name mangling turned it into __pyx_obj_5capnp_3lib_5capnp__MessageBuilder, but that may be different for you. Once you find it, you'll see something like:

struct __pyx_obj_5capnp_3lib_5capnp__MessageBuilder {
  PyObject_HEAD
  struct __pyx_vtabstruct_5capnp_3lib_5capnp__MessageBuilder *__pyx_vtab;
  ::capnp::MessageBuilder *thisptr;
};

What this says is that the _MessageBuilder PyObject has two extra fields: the Cython vtable pointer (which we don't care about) and a pointer to the Cap'n Proto MessageBuilder. The layout of this struct is guaranteed to be preserved, since this is how Cython accesses fields across pyx/pxd files. It's a bit sad that there's no way to get Cython to export this in a reusable header, but for our purposes you can just paste the following struct into your code:

struct pycapnp_MessageBuilder {
  PyObject_HEAD
  void *__pyx_vtab;
  ::capnp::MessageBuilder *thisptr;
};

Then cast to it whenever you receive a _MessageBuilder PyObject from pycapnp. The easiest way to obtain one is like so:

addresses = addressbook_capnp.AddressBook.new_message()
addresses._parent  # this is always a MessageBuilder when allocated from a new_message like above

Also be very sure to keep the _MessageBuilder's PyObject refcount above 0, otherwise it will be garbage collected and the underlying capnp::MessageBuilder deleted.
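
A minimal sketch of the mirror-struct cast described above. To keep it compilable without Python.h or the Cap'n Proto headers, it uses stand-in types: `FakeObjectHead` plays the role of PyObject_HEAD and `FakeMessageBuilder` the role of ::capnp::MessageBuilder. Every name in the block is hypothetical. The static_asserts show one way to catch a layout mismatch at compile time; they assume the generated struct definition is visible to the checking code, which in a real build it usually is not (that is why the mirror struct gets pasted in by hand).

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins (assumptions): real code would get PyObject_HEAD from <Python.h>
// and ::capnp::MessageBuilder from <capnp/message.h>.
struct FakeObjectHead {
  long ob_refcnt;
  void *ob_type;
};
struct FakeMessageBuilder {
  int id;
};

// Shape of the struct that Cython generates (mangled names shortened).
struct GeneratedMessageBuilder {
  FakeObjectHead head;
  struct GeneratedVtab *vtab;  // Cython vtable pointer, unused here
  FakeMessageBuilder *thisptr;
};

// Hand-written mirror that the binding code pastes in.
struct MirrorMessageBuilder {
  FakeObjectHead head;
  void *vtab;
  FakeMessageBuilder *thisptr;
};

// The cast is only meaningful if both layouts agree field for field.
static_assert(sizeof(GeneratedMessageBuilder) == sizeof(MirrorMessageBuilder),
              "mirror struct size differs from generated struct");
static_assert(offsetof(GeneratedMessageBuilder, thisptr) ==
                  offsetof(MirrorMessageBuilder, thisptr),
              "thisptr sits at a different offset in the mirror struct");

// Pull the builder pointer out of an opaque object pointer.
// The caller must keep the owning object alive while using the result.
FakeMessageBuilder *extractBuilder(void *pyObject) {
  return static_cast<MirrorMessageBuilder *>(pyObject)->thisptr;
}
```

In real code the argument would be the `PyObject *` behind `addresses._parent`, and the caller would need to hold a reference to it for as long as the returned pointer is in use.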

--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+...@googlegroups.com.
Visit this group at http://groups.google.com/group/capnproto.

Scott Purdy

Jan 21, 2015, 4:08:33 PM
to Jason Paryani, capnproto
This is very helpful, thanks Jason. I will try to get this working in my example and follow up if I get stuck.

Scott Purdy

Jan 21, 2015, 4:47:56 PM
to Jason Paryani, capnproto
Alright, I tried to do a similar translation to what you did with MessageBuilder for the DynamicStruct. I am a bit unclear on how the casting is supposed to work. I tried some different things but can only manage to get Python to crash.

I am hoping that if we can figure this out I can create SWIG macros or typecasts to take care of this stuff automatically.

Also, regarding ref counting, there will be a reference to the object in Python throughout the time that it is used in the C++ function call so I don't think I have to worry about it getting cleaned up. Correct me if I am missing something.

You can see the current state here:

Jason Paryani

Jan 21, 2015, 5:40:27 PM
to Scott Purdy, capnproto
First thing I noticed is that you can't cast DynamicStruct::Builder/Reader directly to another type. You usually need to use the `.as` function of DynamicStructs, but unfortunately it won't work in this case because it does internal type checking, and the schema used to make the original pycapnp struct is different from the compiled-in schema you're using in C++. There are a few workarounds for this:

1. Make the SchemaParser/Loader in pycapnp load your compiled-in schema. You will have to read quite a bit of the Cython code to figure out what it would take to replicate the current module that wraps the underlying schema. The advantage of this method is that `.as()` would work and you would get some basic runtime type checking from it.

2. In the Cap'n Proto C++ library, we could allow `DynamicStruct.as` to cast to AnyStruct, and then it would be simple to do a non type checked `.as` call from AnyStruct to your final struct.

3. Use MessageBuilder/MessageReader directly. This has the major disadvantages that your struct must be the root of the message and no type checking will be performed.

Beyond all that though, I tried running your example and even after commenting out the body of the write function in bindings.i, I still get the error:
Fatal Python error: PyThreadState_Get: no current thread

It seems to be happening on the `from example.bindings import Inner` line of hybrid.py

Scott Purdy

Jan 21, 2015, 6:37:28 PM
to Jason Paryani, capnproto
I pushed a couple bug fixes but I can't replicate the PyThreadState error. What was the command you ran that resulted in that?

Which option seems best to you?

#2 seems like the most flexible option to me since it isolates the C++ implementation details to just those components. I put my assumptions below in case I am missing something (please feel free to correct me).

1. Does this mean that you have to start constructing the message using a SWIG-wrapped schema? In other words, you can't start out using the Python imported schema and then convert it to the compiled in version? That seems like it would work but requires that the Python code know whether any sub-components are C++ objects and, if so, use the compiled-in schema throughout the Python code.

2. This seems like the only option that lets the outer Python code operate naively while subcomponents with C++ parts can switch the builder/reader from the imported version to the compiled in version as needed.

3. I don't think this will be very useful. If you are at the root then you might as well just create the root in C++ directly. Maybe there is a use case I don't see but there are certainly use cases this doesn't cover.

Jason Paryani

Jan 21, 2015, 7:20:49 PM
to Scott Purdy, capnproto
#2 is my favorite as well.

For #1, I didn't mean it that way, but you could definitely get away with always constructing the structs in C++. What I meant was replicating my SchemaParser wrapping (see https://github.com/jparyani/pycapnp/blob/2ee4498318dcfb0e0a5263b88f350f9d6ffe2c67/capnp/lib/capnp.pyx#L2936), but instead use SchemaLoader and load your compiled type in directly instead of parsing the file from disk. Admittedly, this would be quite a lot of work I'd imagine.

I'll take a crack at implementing the .as<AnyStruct> method for libcapnp. Once that's done, I'll wrap it from pycapnp.

Scott Purdy

Jan 21, 2015, 8:15:15 PM
to Jason Paryani, capnproto
K, thanks for all the help and let me know if there is anything I can do to help out.

Jason Paryani

Jan 21, 2015, 10:38:14 PM
to Scott Purdy, capnproto

Scott Purdy

Jan 22, 2015, 1:01:40 AM
to Jason Paryani, capnproto
Thanks for getting to this so fast! And it works!

Here is what I did:

1. Build this new version and compile it into the SWIG shared library that Python loads
2. Update the bindings.i file in the example to do the following:
    ::capnp::DynamicStruct::Builder builder = ...;
    // Unchecked cast: DynamicStruct -> AnyStruct -> compiled-in type.
    InnerProto::Builder inner = builder.as<capnp::AnyStruct>().as<InnerProto>();

And now I can import a .capnp file from pycapnp, start building a message/outer struct, and then pass an inner struct through SWIG and have it converted to the right C++ struct.

I have pushed the working code to my sample project.

Kenton Varda

Jan 23, 2015, 1:16:47 PM
to Scott Purdy, Jason Paryani, capnproto
FWIW, casting through AnyStruct of course means you're opting out of type checking. If you can get access to pycapnp's SchemaParser, then all you need to do is:

    schemaParser.loadCompiledTypeAndDependencies<InnerProto>();

From then on, you will be able to use .as<InnerProto>() on DynamicStructs to cast directly, with a type check (will throw an exception on mismatch). You can call loadCompiledTypeAndDependencies() at any time -- before or after parsing the schema, or even when a DynamicStruct of the type already exists.

-Kenton

Scott Purdy

Jan 23, 2015, 3:07:09 PM
to Kenton Varda, Jason Paryani, capnproto
Kenton, it would be great to keep the type checking. I dug around a little and see how to create a new SchemaParser in pycapnp, and I see how to get the underlying C++ schema parser from the Python object. But it sounds like I need the SchemaParser that was used when importing the .capnp files, and I am not sure how to get that one.

Jason, if you know how I can get a reference to the schema parser in Python, I can try it out. I started coding this in my example but don't know how to get the current schema parser:

Jason Paryani

Jan 23, 2015, 3:13:32 PM
to Scott Purdy, Kenton Varda, capnproto
There's a global schema parser that is used by default for all imports. See https://github.com/jparyani/pycapnp/blob/develop/capnp/lib/capnp.pyx#L3587. You can access it like so:

import capnp
import capnp.lib.capnp as lcapnp
import myschema_capnp # the _global_schema_parser is lazy loaded, so it won't be accessible until after an import/load

lcapnp._global_schema_parser

Scott Purdy

Jan 23, 2015, 4:30:21 PM
to Jason Paryani, Kenton Varda, capnproto
Thanks Kenton and Jason, I got that in place in my example project but am hung up on the current error. I have capnp/schema-parser.h included and you can see that I have -lcapnp specified to the linker but I'm getting a missing symbol still. Any ideas?

Code:

Output:

clang++ -lpython -lkj -lcapnp -shared bindings_wrap.o -o example/_bindings.so
Undefined symbols for architecture x86_64:
  "capnp::SchemaParser::getLoader()", referenced from:
      void capnp::SchemaParser::loadCompiledTypeAndDependencies<InnerProto>() in bindings_wrap.o
  "capnp::schemas::s_ea420346ed5080d1", referenced from:
      capnp::_::RawSchema const& capnp::_::rawSchema<InnerProto, InnerProto::_capnpPrivate, false>() in bindings_wrap.o
ld: symbol(s) not found for architecture x86_64

Kenton Varda

Jan 23, 2015, 4:53:17 PM
to Scott Purdy, Jason Paryani, capnproto
Hi Scott,

You'll need to link -lcapnpc (note the final 'c'), which is the library containing the Cap'n Proto compiler implementation (i.e. the schema parser).

-Kenton

Scott Purdy

Jan 23, 2015, 5:28:52 PM
to Kenton Varda, Jason Paryani, capnproto
Ahh got it, that did the trick. Looks like it is all working now and I don't need to convert to AnyStruct anymore. Thanks for the help guys! I will try to clean up my example and make a more reusable SWIG solution so hopefully my example repo will be useful to others trying to do the same thing.

Scott Purdy

Jan 23, 2015, 5:58:32 PM
to Kenton Varda, Jason Paryani, capnproto
Alright, I created a simple header file with templated getReader<T> and getBuilder<T> to make it easy. Just point people here if they ask about SWIG and pycapnp:


Again, thanks to both of you for all the help!

Scott Purdy

May 30, 2015, 1:48:39 PM
to Kenton Varda, Jason Paryani, capnproto
Reviving an old thread here. This has worked really well for converting from the pycapnp format into C++ builders/readers. Now I need to go the other way. I am trying to work this out but am curious whether you (Jason) know of existing code that will do this. I want the opposite of getBuilder and getReader: templated functions that take custom builders/readers and return pycapnp PyObject instances that I can interact with in Python.

Jason Paryani

May 30, 2015, 2:40:27 PM
to Scott Purdy, Kenton Varda, capnproto
You're going to want to use the Cython API. I whipped up a quick example at https://github.com/jparyani/pycapnp_init_from_c. Take a look at the pyx file specifically to see how to use the Cython API. The one confusing bit is that, in order to get strong memory safety guarantees, you need to pass in the parent _MessageBuilder/Reader pycapnp object to ensure that the message's memory is never freed while the message is in use (the parent is accessible as `_parent` from any Reader/Builder).

Also worth noting, you can get away with not adding Cython as a build dependency for everyone by checking in the generated cpp/h files if you want.

Scott Purdy

May 30, 2015, 2:57:46 PM
to Jason Paryani, Kenton Varda, capnproto

Awesome thanks for the quick response! I'll try integrating it in the next few days.

chet...@gmail.com

Oct 13, 2015, 6:29:14 PM
to Cap'n Proto, sc...@fer.io, ken...@sandstorm.io, jpar...@sandstorm.io
Hi Jason,

Thank you for this example. I've been helping Scott with this project, and I tried out your example. I found that before calling `create_builder()` I have to call `initcreate_example()` or I get a segfault. Do you know when in my C++ code I should call `initcreate_example()`, and whether I can call it multiple times? If I can safely call it multiple times I was thinking of just calling it within `create_builder()`.

Thanks,
Chetan

Chetan Surpur

Oct 13, 2015, 6:44:28 PM
to Cap'n Proto, sc...@fer.io, ken...@sandstorm.io, jpar...@sandstorm.io
Also, I forgot to mention in my email, if I do call that function, it all works! Thanks for your help!

Jason Paryani

Oct 15, 2015, 7:59:59 PM
to chet...@gmail.com, Cap'n Proto, Scott Purdy, Kenton Varda
Sorry about that, I forgot to mention `initcreate_example`. It should only be called once. I think it's actually idempotent, but it has a lot of setup code that would be inefficient to do again and again.
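
One way to get "called exactly once" while still calling it from inside the wrapper is a function-local static, which C++11 initializes once, in a thread-safe manner. The sketch below is an assumption-laden illustration, not code from the example repo: `initcreate_example_standin`, `ensureModuleInitialized`, and `createBuilderGuarded` are hypothetical names, and the stand-in only counts calls so that the block compiles without Python. In real code the guarded call would be `initcreate_example()`, made after the Python interpreter has been initialized.

```cpp
#include <cassert>

// Stand-in (assumption) for the Cython-generated initcreate_example();
// it only counts how many times it has been invoked.
int initCallCount = 0;
void initcreate_example_standin() { ++initCallCount; }

// Runs the module init the first time this function is called and never
// again; C++11 guarantees the static initializer executes exactly once.
void ensureModuleInitialized() {
  static const bool initialized = [] {
    initcreate_example_standin();
    return true;
  }();
  (void)initialized;  // silence unused-variable warnings
}

// Hypothetical wrapper: real code would call create_builder() after the
// guard; here it returns the init count so the effect is observable.
int createBuilderGuarded() {
  ensureModuleInitialized();
  return initCallCount;
}
```

With this shape, callers can invoke the wrapper as often as they like and the expensive setup still happens only on the first call.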

Chetan Surpur

Oct 16, 2015, 6:50:39 PM
to Jason Paryani, Cap'n Proto, Scott Purdy, Kenton Varda
Great, thanks Jason!