Need help making pycapnp/capnproto work across python and extension boundaries


vitaly numenta

Feb 14, 2017, 7:47:00 PM
to Cap'n Proto
I am experiencing binary compatibility issues trying to get pycapnp serialization/deserialization working with C extensions. There appear to be ABI compatibility issues when passing C++ structs compiled in pycapnp into our C extensions that are compiled in a different environment.

When serializing an instance of a class that's implemented in NuPIC, we create a message builder via pycapnp and pass it to the corresponding instance's write method, which in turn invokes write methods of its own contained members. This works fine for members whose classes are implemented in python, but doesn't always work for those implemented in the nupic.bindings extension due to ABI issues.

For example, when serializing the TemporalMemory class, we might employ the following sequence:

from nupic.proto import TemporalMemoryProto_capnp

builder = TemporalMemoryProto_capnp.TemporalMemoryProto.new_message()

temporal_memory.write(builder)

Inside TemporalMemory.write(builder), we have something along these lines:

class TemporalMemory(object):
  def write(self, builder):
    builder.columnDimensions = list(self.columnDimensions)
    self.connections.write(builder.connections) # pure python
    self._random.write(builder.random) # C++ Random class from extension


The Random class that's implemented inside the nupic.bindings extension needs to rely on our own build of capnproto that's linked into the extension, but this doesn't seem to be compatible with the object constructed in pycapnp.

We learned the hard way, after much trial and error, that we can't simply pass the underlying message builders instantiated by pycapnp's capnp.so module to our own build of capnproto contained in the nupic.bindings extension. This was particularly evident when working on the manylinux wheel for nupic.bindings, which must be compiled with the toolchain and C/C++ runtimes from CentOS-6. The result was ABI incompatibility: when the capnproto code compiled into the extension operates on a message builder constructed by pycapnp's build of capnp.so, the builder appears corrupted.


Is there any recommendation for handling this dual Python/C-extension scenario that avoids the ABI compatibility problem with C++ objects?

Kenton Varda

Feb 15, 2017, 5:59:56 PM
to vitaly numenta, Cap'n Proto
Hi Vitaly,

For ABI compatibility, you'd need pycapnp built against exactly the same version of Cap'n Proto which you're using elsewhere in the process. Ideally both would link against the same libcapnp.so, although I *think* loading two copies of the library should not create problems as long as they are the same version. (This differs from libprotobuf, which definitely can't handle being loaded multiple times in the same process.)

You may also need to make sure both copies are built with the same compiler. We're aware of at least one ABI incompatibility issue between Clang and GCC that affects Cap'n Proto.

Of course, if you can't make anything work, you can always fall back to transferring byte buffers, at the expense of possibly needing to make a copy to merge the sub-messages into one overall message.

-Kenton

--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+unsubscribe@googlegroups.com.
Visit this group at https://groups.google.com/group/capnproto.

vitaly numenta

Feb 15, 2017, 9:18:13 PM
to Cap'n Proto, vitaly.kru...@gmail.com
Unfortunately, we don't have any control over the toolchain, its version, or the compilation/linking flags that pycapnp uses, since it's a third-party project for us. We also have no control over which version of capnproto a given version of pycapnp uses or which patches it might apply. pycapnp evolves completely independently from our software. Furthermore, pycapnp is built automatically upon installation from PyPI, using whatever version and type of compiler the user happens to have. E.g., pip install pycapnp==0.5.8 automatically downloads and builds pycapnp, including a build of the specific version of capnproto bundled with that version of pycapnp.

Our own software is the nupic.bindings package, distributed as binary wheels for Windows, OS X, and Linux. The Linux wheel in particular is a "manylinux" wheel per PEP 513, which is by definition built on an old CentOS system with an old toolchain.

As you can see, there is no way to guarantee ABI compatibility between the compiled version of capnproto in pycapnp and the one used by the Python extensions in nupic.bindings. It's well known that C++ is not a robust interface boundary, since, unlike C, it does not have a stable ABI. Variability is introduced by the ABI version supported by a given toolchain as well as by the type of toolchain (e.g., clang vs. g++). Compatibility is also affected by build flags. Finally, the version of the capnproto sources in pycapnp is outside our control and could easily differ from the version compiled/linked into the nupic.bindings extension.

Scott Purdy

Feb 16, 2017, 3:51:56 PM
to Cap'n Proto, vitaly.kru...@gmail.com
Kenton, thanks for helping bring some clarity to this. It sounds like our two options are:

1. Require pycapnp and our extensions to be compiled in the same environment. We could potentially do this. We could make the install process easy for end users by forking pycapnp and putting wheels up on PyPI but we'd like to avoid that if possible.
2. Pass the byte buffers, incurring a memory copy for anything that we pass across the boundary.

I'd like to explore #2 a bit more. Would this involve extracting the segments from the pycapnp builder/reader, passing that to our extension, and constructing a new builder/reader around the byte buffer? Or would we have to construct a new message in the extension, pass the segments from that back and find a way to copy that buffer into the pycapnp message builder/reader?

We are also happy to put together a little demo project once we figure this out so others that want to do something similar have a starting place.

Kenton Varda

Feb 16, 2017, 5:29:23 PM
to Scott Purdy, Cap'n Proto, vitaly numenta
On Thu, Feb 16, 2017 at 12:51 PM, Scott Purdy <sc...@fer.io> wrote:
Kenton, thanks for helping bring some clarity to this. It sounds like our two options are:

1. Require pycapnp and our extensions to be compiled in the same environment. We could potentially do this. We could make the install process easy for end users by forking pycapnp and putting wheels up on PyPI but we'd like to avoid that if possible.

I would argue that pycapnp should somehow export its version of libcapnp so that other Python extensions that also use libcapnp are able to reuse the same one. It makes sense for any Python extension that uses libcapnp.so to declare a dependency on pycapnp, I would think.

But I have no idea what this looks like logistically.

2. Pass the byte buffers, incurring a memory copy for anything that we pass across the boundary.

I'd like to explore #2 a bit more. Would this involve extracting the segments from the pycapnp builder/reader, passing that to our extension, and constructing a new builder/reader around the byte buffer? Or would we have to construct a new message in the extension, pass the segments from that back and find a way to copy that buffer into the pycapnp message builder/reader?

There's no good way to share builders, since there would be no way for them to synchronize memory allocation. So, once a buffer has been passed, it needs to be read-only.

If you are trying to build a message in Python code but have one branch of the message be built in C++ code, I think what you'll need to do is create a brand new MessageBuilder in C++, build just the C++ branch of the message there, and then pass this message to Python. In Python, you could read the message with a MessageReader and then copy the contents into the branch of the final message. This is where the copy is incurred -- when moving data from one message into another message. Presumably you can transmit individual messages between languages without any copies.

-Kenton

vitaly numenta

Feb 16, 2017, 7:27:15 PM
to Cap'n Proto, sc...@fer.io, vitaly.kru...@gmail.com
Hi Kenton - thank you for the insights. Regarding

In Python, you could read the message with a MessageReader and then copy the contents into the branch of the final message.

So, the Python side would have a MessageBuilder. To read the branch message with a MessageReader in Python, we would use the from_segments method, which makes use of the SegmentReader under the covers. However, I am a bit stumped by the last part, namely how to copy the contents of the MessageReader into the branch of the final message. I couldn't find a way to do that in either pycapnp or capnproto. How would you copy the contents of a MessageReader into a branch of a MessageBuilder in capnproto in C++?

Many thanks!

Vitaly

Scott Purdy

Feb 16, 2017, 7:30:54 PM
to Kenton Varda, Cap'n Proto, vitaly numenta
On Thu, Feb 16, 2017 at 2:29 PM, Kenton Varda <ken...@sandstorm.io> wrote:
On Thu, Feb 16, 2017 at 12:51 PM, Scott Purdy <sc...@fer.io> wrote:
Kenton, thanks for helping bring some clarity to this. It sounds like our two options are:

1. Require pycapnp and our extensions to be compiled in the same environment. We could potentially do this. We could make the install process easy for end users by forking pycapnp and putting wheels up on PyPI but we'd like to avoid that if possible.

I would argue that pycapnp should somehow export its version of libcapnp so that other Python extensions that also use libcapnp are able to reuse the same one. It makes sense for any Python extension that uses libcapnp.so to declare a dependency on pycapnp, I would think.

But I have no idea what this looks like logistically.

I think it would be great to have pycapnp export a pure-C interface. NumPy, for instance, has a Python function call that returns the paths needed to build extensions against NumPy. But I'm pretty sure this has to be pure C to avoid ABI issues if you don't want to require that everything be built with the exact same toolchain. Do you think that is possible with capnp? My understanding was that a C interface wasn't on the roadmap, but perhaps a more limited interface for this specific use case wouldn't be quite as much work.


Kenton Varda

Feb 27, 2017, 6:24:08 PM
to vitaly numenta, Cap'n Proto, Scott Purdy
Does a regular assignment work?

    parent_struct_builder.some_field = some_struct_reader

In C++, we generate a "setFoo()" method for each struct-typed field that takes a Reader as the parameter and makes a copy.

-Kenton

vitaly numenta

Mar 1, 2017, 4:14:30 PM
to Cap'n Proto, vitaly.kru...@gmail.com, sc...@fer.io
Hi Kenton, thank you for the answer concerning Builder field assignment from a Reader. I will check whether this works in pycapnp out of the box, but knowing that it's already supported in C++ is very encouraging.


vkrug...@numenta.com

Apr 25, 2017, 2:17:41 PM
to Cap'n Proto, vitaly.kru...@gmail.com, sc...@fer.io
Hi Kenton, I thought I was almost there, but got stuck here:

One of the use cases involves a "Network" class in the C++ Python extension that needs to serialize several subordinate "region" instances, some implemented in C++ and some in Python. I am having a problem with the latter. To demonstrate the specific problem, I defined the following schemas:

struct NetworkProto {
  region @2 : RegionProto;
}

struct RegionProto {
  # This stores the data for the RegionImpl. This will be a PyRegionProto
  # instance if it is a PyRegion.
  regionImpl @0 :AnyPointer;
}

struct PyRegionProto {
  regionImpl @0 :AnyPointer;
}

As you recommended, we're passing byte buffers between the Python and C++ layers. In this case, the extension has a C++ method `PyRegionProto::Reader Network::_writePyRegion()` that calls into the Python layer and converts the bytes returned by Python into a `PyRegionProto::Reader`.

Then, the following higher-level method attempts to stuff the result of `Network::_writePyRegion` into `RegionProto::regionImpl`, but compilation fails with "error: no member named 'setRegionImpl' in 'RegionProto::Builder'":

void Network::write(NetworkProto::Builder& proto) const
{
  // Serialize the python region
  auto regionProto = proto.initRegion();
  regionProto.setRegionImpl(_writePyRegion()); // copy
}

Kenton Varda

Apr 25, 2017, 2:29:12 PM
to vkrug...@numenta.com, Cap'n Proto, vitaly.kru...@gmail.com, sc...@fer.io
Hi,

Since regionImpl is an AnyPointer, it doesn't have a direct setter. Instead, do:

    regionProto.getRegionImpl().setAs<PyRegionProto>(_writePyRegion());

-Kenton

vkrug...@numenta.com

Apr 25, 2017, 2:46:36 PM
to Cap'n Proto, vkrug...@numenta.com, vitaly.kru...@gmail.com, sc...@fer.io
And we have compilation with `regionProto.getRegionImpl().setAs<PyRegionProto>(_writePyRegion());` !

Many thanks Kenton!

vkrug...@numenta.com

Apr 26, 2017, 5:35:02 PM
to Cap'n Proto, vkrug...@numenta.com, vitaly.kru...@gmail.com, sc...@fer.io
Hi Kenton, I have good news: my basic prototype of serializing across C++/Python boundaries (in both directions) via capnp byte-buffer passing is working. I am now shifting to optimizing memory utilization. In NuPIC, our core machine learning algorithm objects may get huge (upwards of gigabytes), and we run many of them on the same machine, serializing periodically. So, memory utilization is critical for minimizing compute resource cost.

Presently, I am focusing on the C++ extension => Python deserialization control flow. In this scenario, the C++ extension layer has a message reader that contains a Python object's encoding. We need to extract the byte buffer representing the Python-native object in the C++ code in order to pass it to the Python layer. This is what the relevant C++ code looks like:

PyObject* Network::_readPyRegion(const std::string& moduleName,
                                 const std::string& className,
                                 const RegionProto::Reader& proto)
{
  // Extract data bytes from reader to pass to python layer
  capnp::MallocMessageBuilder builder;

  builder.setRoot(pyRegionImplProto); // copy

  auto array = capnp::messageToFlatArray(builder); // copy

  // Copy from array to PyObject so that we can pass it to the Python layer
  py::String pyRegionImplBytes((const char *)array.begin(),
                               sizeof(capnp::word)*array.size()); // copy

}

As you can see, this involves a lot of copying of potentially huge amounts of data. The Python layer will then reconstruct a reader from those bytes using pycapnp (yet another copy).

Ideally, I would like to extract the data segment(s) directly from RegionProto::Reader, but that doesn't appear to be supported. I think that we need to find/create some way to handle this efficiently in order to support serialization/deserialization across C++/Python boundaries.

Thank you,
Vitaly

vkrug...@numenta.com

Apr 26, 2017, 5:40:09 PM
to Cap'n Proto, vkrug...@numenta.com, vitaly.kru...@gmail.com, sc...@fer.io
Here is a more complete C++ code snippet for my prior post:

PyObject* Network::_readPyRegion(const std::string& moduleName,
                                 const std::string& className,
                                 const RegionProto::Reader& proto)
{
  capnp::AnyPointer::Reader implProto = proto.getRegionImpl();

  PyRegionProto::Reader pyRegionImplProto = implProto.getAs<PyRegionProto>(); // no copy here, right?

Kenton Varda

Apr 26, 2017, 6:44:58 PM
to vkrug...@numenta.com, Cap'n Proto, vitaly numenta, Scott Purdy
Hi,

I think what you want here is for pycapnp to be extended with some API that other Python extensions can use to interact with it in order to wrap and unwrap builders. pycapnp builders are actually wrapping a capnp::DynamicStruct::Builder under the hood, which is easy to cast back and forth to your native builder type. You just need pycapnp to give you access somehow.

I unfortunately do not know very much about how pycapnp and cython work, so I'm not sure I can help. This may be a question for Jason Paryani.

By the way, if you guys are in the Bay Area, you should come to our Cap'n Proto 0.6 release party on May 18 at Cloudflare: https://www.meetup.com/Sandstorm-SF-Bay-Area/events/239341254/

-Kenton


Hedge Hog

May 3, 2017, 8:00:15 PM
to Cap'n Proto, sc...@fer.io, vitaly.kru...@gmail.com
Hi,
I'm contemplating working on the Ruby binding.  It seems reasonable to anticipate that I or others will strike this same issue. Some further questions below...


On Friday, 17 February 2017 09:29:23 UTC+11, Kenton Varda wrote:
On Thu, Feb 16, 2017 at 12:51 PM, Scott Purdy <sc...@fer.io> wrote:
Kenton, thanks for helping bring some clarity to this. It sounds like our two options are:

1. Require pycapnp and our extensions to be compiled in the same environment. We could potentially do this. We could make the install process easy for end users by forking pycapnp and putting wheels up on PyPI but we'd like to avoid that if possible.

I would argue that pycapnp should somehow export its version of libcapnp so that other Python extensions that also use libcapnp are able to reuse the same one. It makes sense for any Python extension that uses libcapnp.so to declare a dependency on pycapnp, I would think.

I'm pretty sure I don't understand this correctly ;)

Is it correct that this issue only applies to CP's struct types (the case cited in the OP)?
So when using all the other CP types, are we good to go across different environments?
I recall from the distant past some sensitivity around ABI compatibility and `enum` types.
I'm not sure whether the enum in CP's schema language maps that closely to the compiler's `enum`, and whether enums too will expose the issue raised here.

I know it is a lot to ask, but could the doc here [1] be updated to warn users of these issues for each of CP's types?

Is guidance to users as simple as 'use only the built in types in your messages to minimise ABI compatibility risks/issues'?
i.e. are `List`, `Data` and `Text` subject to this same issue?

[1]: https://capnproto.org/language.html#interfaces
 
Best wishes

Kenton Varda

May 4, 2017, 1:21:24 AM
to Hedge Hog, Cap'n Proto, Scott Purdy, vitaly numenta
Hi,

I'm not sure I understand your message.

The Cap'n Proto encoding is binary-compatible across all implementations (it wouldn't be a very good serialization format otherwise).

The ABI issue we're discussing here is that of the libcapnp library -- that is, the C++ interfaces. pycapnp is implemented as a wrapper around libcapnp. Vitaly was discussing a case where there is a second Python extension loaded into the same program which *also* uses libcapnp and wishes to interact with pycapnp as well. Hence they would be passing C++ objects (not just serialized messages) back and forth, which requires C++ ABI compatibility (not just binary message encoding compatibility).

-Kenton


Hedge Hog

May 4, 2017, 4:12:45 AM
to Kenton Varda, Cap'n Proto, Scott Purdy, vitaly numenta
Thanks, you're right I had misunderstood where the issue was.
Best wishes.
Hedge
πόλλ' οἶδ ἀλώπηξ, ἀλλ' ἐχῖνος ἓν μέγα
[The fox knows many things, but the hedgehog knows one big thing.]
Archilochus, Greek poet (c. 680 BC – c. 645 BC)
http://hedgehogshiatus.com

vitaly numenta

May 8, 2017, 7:38:16 PM
to Cap'n Proto, vkrug...@numenta.com, vitaly.kru...@gmail.com, sc...@fer.io
pycapnp builders are actually wrapping a capnp::DynamicStruct::Builder under the hood, which is easy to cast back and forth to your native builder type. You just need pycapnp to give you access somehow.

Dear Kenton, regarding the above: we're working with your earlier suggestion to pass byte buffers across the Python and C++ extension environments. We believe this results in a more robust and portable implementation, since we have no control over which version of pycapnp the user chooses, which version of capnproto that pycapnp includes, or the compiler toolchain and build flags used to build pycapnp's capnproto .so on the user's machine, all of which may differ from the build of capnproto in our own binary wheel containing our Python extension.

To this end, we often need to convert between capnproto readers/builders and flat-array encodings (from messageToFlatArray) encapsulated as Python byte strings. Since our machine learning models may be huge (GBs), the multiple levels of copying are prohibitively expensive in memory (and possibly in time), so it's pertinent to eliminate as many levels of copying as possible. Presently, pycapnp only exposes `to_bytes`, a method that extracts data bytes from a builder via `capnp::messageToFlatArray` and then copies them into a Python byte string. Unfortunately, capnproto doesn't provide `capnp::messageToFlatArray` for readers, so when a reader is involved, yet another copy is needed to convert the reader to a builder before applying `capnp::messageToFlatArray`.

I believe this problem is not unique to our extension; anyone attempting to implement this type of binding would run up against these issues, especially if they are cognizant of the memory and performance implications.

Ideally, I think it would be great to be able to use something like `capnp::messageToFlatArray` on readers as well as builders, and also to have it copy the output efficiently into a user-provided byte-aligned buffer instead of returning `kj::Array<capnp::word>`. This way, several levels of copying would be eliminated, and instantaneous memory utilization would be cut several-fold.

Kenton Varda

Jun 4, 2017, 6:28:17 PM
to vitaly numenta, Cap'n Proto, Vitaly Kruglikov, Scott Purdy
Hi Vitaly,

You can direct Cap'n Proto to write bytes to an arbitrary target by creating a custom subclass of kj::OutputStream which does whatever you need, then pass that to capnp::writeMessage().

You can also use MessageBuilder::getSegmentsForOutput() to get direct pointers to the message content without any copies. You can construct a SegmentArrayMessageReader from these segments elsewhere to read them.

It sounds like the limitations here are on the Python side, which I don't know very much about.

-Kenton
