
Boost serialization of intrusive_ptr?


Bradley Wilson

Aug 28, 2009, 12:59:00 AM
A few days ago I stumbled upon the fact that the boost serialization
library does not support intrusive_ptr, and no one else seems to have
done this. I've spent days poking around google to no avail.

Has anyone done this? I've looked at the code for serializing
shared_ptr in the hopes I'd be able to write my own wrapper, but it
looks like I'm not quite up to par yet. (Recent C -> C++ convert for a
personal project.)

Extra credit: I also need to... uh... *hangs head in shame*...
serialize boost memory pools.

Yeah. Really.

If I can't find a way to do that I'm going to be bogged down in
translating my containers of memory-pool-allocated, intrusive_ptr-using
objects to regularly-allocated objects for serialization, and I'm not
looking forward to debugging that kind of shenanigans.

Yannick Tremblay

Aug 28, 2009, 7:42:35 AM
In article <2009082721590075249-themindwalrus@gmailcom>,

Could you explain what you expect to happen when you serialise an
intrusive pointer and then at a later point in time, somewhere else,
you deserialise it?

What do you intend to do with the serialised data? Do you send it or
store it? What receives it?

Yannick

Bradley Wilson

Aug 29, 2009, 3:53:44 AM
> Could you explain what you expect to happen when you serialise an
> intrusive pointer and then at a later point in time, somewhere else,
> you deserialise it?

I have some fairly complex state that will be constantly running. I need
to be able to save all this state and load it back up for processing at a
later time. (This is for some experimental game AI I'm playing around
with.)

Profiling my scratch code showed that my major bottlenecks were a
dynamic cast (which I can get around by using a type variable and a
static cast) and malloc, which is where the memory pools come into play.

Smart pointers allocate space for the reference count, so I'm using
intrusive_ptr to avoid those. I have many fragments of data, and to
save memory (and processing time) I'm using the intrusive pointers
to avoid replicating data that isn't changing.

There will be several threads running and locking/unlocking for memory
allocation was also killing performance. Since none of the threads ever
use memory from the other threads, each thread will have its own set of
memory pools, allowing me to avoid locking and contention hits.

This particular section of code is the very core of my AI and needs to run
as fast as possible or I wouldn't be jumping through all these hoops. (I'm
aiming for a minimum target population of 1000 free-thinking autonomous
characters.)

> What do you intend to do with the serialised data? Do you send it or
> store it? What receives it?

I just need to save this entire block of state (containers of intrusive_ptrs
that point to memory allocated via memory pools) to disc ("save game"),
and load it later ("load game").

Yannick Tremblay

Sep 1, 2009, 10:47:02 AM
In article <2009082900534416807-themindwalrus@gmailcom>,

Bradley Wilson <the.min...@gmail.com> wrote:
>> Could you explain what you expect to happen when you serialise an
>> intrusive pointer and then at a later point in time, somewhere else,
>> you deserialise it?
>
>I have some fairly complex state that will be constantly running. I need
>to be able to save all this state and load it back up for processing at a
>later time. (This is for some experimental game AI I'm playing around
>with.)
>
>Profiling my scratch code showed that my major bottlenecks were a
>dynamic cast (which I can get around by using a type variable and a
>static cast) and malloc, which is where the memory pools come into play.
>
>Smart pointers allocate space for the reference count, so I'm using
>intrusive_ptr to avoid those. I have many fragments of data, and to
>save memory (and processing time) I'm using the intrusive pointers
>to avoid replicating data that isn't changing.
>
>There will be several threads running and locking/unlocking for memory
>allocation was also killing performance. Since none of the threads ever
>use memory from the other threads, each thread will have its own set of
>memory pools, allowing me to avoid locking and contention hits.

Yes, that all sounds reasonable, albeit fairly complex. But
intrusive pointers are used as a way to access data efficiently. What
is important to be saved and restored is the data, not the pointers.

So you missed the point of my question: do you want to save the data
or the pointers? If you pass an intrusive pointer to a save method,
shouldn't it be the data pointed to by it that is saved rather than the
pointer value and the reference count on the data?

If you reload a previously serialised intrusive pointer, would you
expect it to point to the same address or to point to the same data?

>This particular section of code is the very core of my AI and needs to run
>as fast as possible or I wouldn't be jumping through all these hoops. (I'm
>aiming for a minimum target population of 1000 free-thinking autonomous
>characters.)

Yes, the normal operation needs to be fast. But serialising to disk
hopefully is not in the critical path.

>> What do you intend to do with the serialised data? Do you send it or
>> store it? What receives it?
>
>I just need to save this entire block of state (containers of intrusive_ptrs
>that point to memory allocated via memory pools) to disc ("save game"),
>and load it later ("load game").

But you don't want to save the pointer values, do you? You want to
save the data that is pointed to.

Unless your "save game" saves game data that will only ever be valid
for the currently running instance (process) of the game, saving the
value of the intrusive_ptr and the current reference count value will
not be very useful since the next time you run the game, the memory
pool for one thread might be at a totally different address than it
was the previous time.
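
A minimal sketch of that point, using only the standard library rather than Boost: the reference count lives inside the object but is left out of the saved form, and it rebuilds itself as handles are attached on load. `Node`, `Handle`, `save_node` and `load_node` are hypothetical names for illustration, not part of any library.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <utility>

// Hypothetical intrusively counted object: the count is a member.
struct Node {
    int value = 0;
    int refs = 0;  // runtime bookkeeping only, never written to disk
};

// Hypothetical stand-in for an intrusive smart pointer.
class Handle {
public:
    Handle() = default;
    explicit Handle(Node* n) : n_(n) { if (n_) ++n_->refs; }
    Handle(const Handle& o) : n_(o.n_) { if (n_) ++n_->refs; }
    Handle& operator=(Handle o) { std::swap(n_, o.n_); return *this; }
    ~Handle() { if (n_ && --n_->refs == 0) delete n_; }
    Node* get() const { return n_; }
private:
    Node* n_ = nullptr;
};

// Save only the payload: neither the address nor the count is written.
void save_node(std::ostream& out, const Handle& h) {
    assert(h.get() != nullptr);
    out << h.get()->value << '\n';
}

// Load into a fresh object; the count starts at zero and the returned
// handle brings it to one.
Handle load_node(std::istream& in) {
    Node* n = new Node;
    in >> n->value;
    return Handle(n);
}
```

However many handles shared the object when it was saved, the loaded copy starts with exactly the references that the loading code creates.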

So it sounds to me that you need to save the content of your memory
pools to disk, and Boost.Serialization is a useful framework for
serialising that data. Then you need to read the data from disk and load
it into your memory pools. The data saved on disk doesn't need to know
about memory pools; only the load data method needs to know where to
place the data it reads from disk.

I guess it would be possible to save the entire process memory state
and completely restore it at a later date or in a different run
instance; the complete restore would restore all intrusive pointers
and all memory pools correctly. But saving one particular intrusive
pointer would require a lot of hoop jumping.

Yan


Bradley Wilson

Sep 1, 2009, 9:42:07 PM
> If you reload a previously serialised intrusive pointer, would you
> expect it to point to the same address or to point to the same data?

They have to point to the same data. What I want to do is take this
whole structure and save it all to disk.


> Yes, the normal operation needs to be fast. But serialising to disk
> hopefully is not in the critical path.

Correct. The trouble is a byproduct of all this crazy stuff I have to do
to make the AI operations fast.


>> I just need to save this entire block of state (containers of intrusive_ptrs
>> that point to memory allocated via memory pools) to disc ("save game"),
>> and load it later ("load game").
>
> But you don't want to save the pointers value, do you? You want to
> save the data that is pointed to.

Of course!


> So it sounds to me that you need to save the content of your memory
> pools to disk and Boost.Serialization is a useful framework to serialise
> the data, then you need to read the data from disk and load it into
> your memory pools. The data saved on disk doesn't need to know about
> memory pools, only the load data method needs to know where to place
> the data it reads from disk.

I am using the Boost.Serialization framework. However: A) it doesn't by
default support intrusive pointers (although I'm starting to see that
adding serialization support should be fairly easy), and:

B) The real trouble is that the data pointed to by the intrusive pointers
is allocated from a memory pool, and memory pools don't appear to be
serializable.


Here's the issue:

When saving, Boost.Serialization remembers which data has been written by
storing the pointer to that data in some kind of container (I'm
guessing a map), so if a pointer to that data is encountered again, it just
marks which data is being pointed to instead of writing out the data again.
Pointer swizzling, I think it's called.

But... a memory pool is a block of memory that other pointers point into --
so if the entire block of data is saved via a single pointer that points to
the beginning of the memory pool block, and then another pointer is
encountered that points to memory _inside_ that block, the serialization
framework has no way of knowing that that particular data has already been
saved: the pointers don't match since the entire memory block was saved through
a pointer to the beginning of the block. The interior pointer won't be in
Boost.Serialization's pointer-swizzling map.

Now... let's say I pull some major trickery and after the memory pool's block
has been saved, I iterate through its list of allocated chunks and store those
interior pointers by adding them directly to the swizzle map container. When my
intrusive pointers are encountered, those pointers will already be in the
swizzle map so the data won't be written more than once.

This is all fine and dandy, but it won't work for deserialization.
There's no way to tell it that a particular incoming pointer actually
points to the middle of an allocated block of memory that was previously
loaded.

This also means you can't serialize pointers that point into an array:

    my_object *a = new my_object[10];   // Array
    my_object *b = a + 5;               // Pointer to middle of array

    // Open archive for saving
    ...
    archive << a;   // Assume we store a+5 in the swizzle map directly in
                    // my_object's serialize routine

    archive << b;   // Data is not written, due to previous swizzle trickery

    // Open archive for loading
    my_object *c, *d;

    archive >> c;   // Should work correctly, entire array is loaded

    archive >> d;   // Will not work as advertised - new memory is
                    // allocated because archive doesn't know that d
                    // points into the interior of c.

So unless some wizard out there has already figured out a way around
this problem, I'm going to have to alter my approach to how to save
state. (I have an idea, but it's madness.)

Francesco

Sep 1, 2009, 10:41:13 PM
On 2 Sep, 03:42, Bradley Wilson <the.mind.wal...@gmail.com> wrote:

[snip]

> So unless some wizard out there has already figured out a way around
> this problem, I'm going to have to alter my approach to how to save
> state. (I have an idea, but it's madness.)

What about a table lookup?

Caveat: this is just a theoretical rant, I never used boost or
anything like a memory pool.

Add a unique ID to all of your objects and, when serializing, save to
disk the IDs of the objects along with their owners and along with the
objects that link to that(those) owner(s).

When unserializing, create a table to lookup the ID->new_pointer pairs
then set the pointers accordingly as you proceed.

This should work fine as long as you respect the order of
serialization-unserialization (owners first, linkers last), and this
becomes really tricky during the unserialization if the objects link
to each other in a cyclic manner.

Just half a cent, I know. I could explain my idea better with an
example, if you want, but my example would only take care of the
serialization-unserialization process itself; it would then be your task
to fit this technique into your program.
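
A minimal sketch of the table lookup described above, using only the standard library. `Item`, `save_items` and `load_items` are hypothetical names, and the text format is an assumption made for illustration. Loading runs in two passes (create every object first, then resolve links through the ID table), which is one way to sidestep the ordering and cycle problems.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical object with a unique, nonzero ID and one non-owning link.
struct Item {
    int id = 0;
    int value = 0;
    Item* link = nullptr;  // may point to any other Item, cycles included
};

// Write each object's data plus the ID (not the address) of its link.
void save_items(std::ostream& out,
                const std::vector<std::unique_ptr<Item>>& items) {
    out << items.size() << '\n';
    for (const auto& it : items) {
        out << it->id << ' ' << it->value << ' '
            << (it->link ? it->link->id : 0) << '\n';  // 0 means "no link"
    }
}

std::vector<std::unique_ptr<Item>> load_items(std::istream& in) {
    std::size_t count = 0;
    in >> count;

    std::vector<std::unique_ptr<Item>> items;
    std::vector<int> link_ids;      // saved link ID of each loaded item
    std::map<int, Item*> by_id;     // the ID -> new pointer table

    // Pass 1: create all objects and fill the lookup table.
    for (std::size_t i = 0; i < count; ++i) {
        auto item = std::make_unique<Item>();
        int link_id = 0;
        in >> item->id >> item->value >> link_id;
        by_id[item->id] = item.get();
        link_ids.push_back(link_id);
        items.push_back(std::move(item));
    }

    // Pass 2: every target now exists, so links resolve in any order.
    for (std::size_t i = 0; i < count; ++i) {
        if (link_ids[i] != 0) items[i]->link = by_id.at(link_ids[i]);
    }
    return items;
}
```

Because no pointer is set until every object exists, two objects that link to each other are restored without any special ordering.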

Hope that helps, anyway.

Best regards,
Francesco

Francesco

Sep 2, 2009, 5:37:14 AM

On second thought (yesterday I missed the point), I suppose you have to
store to disk both the objects' data and the pointers' data (the
addresses they point to, which can actually be used as IDs on disk).

The main problem with the serialization you are currently doing is
that some pointers aren't serialized at all (namely, the "b" pointer
of your latest post, whose data-dumping gets skipped because of the
swizzling). Once you don't serialize the data of that pointer (that
is, the address it points to) there is no chance to recover the links
later.
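
A minimal sketch of keeping that link, assuming the interior pointer is written as an index relative to the start of its block rather than skipped. It uses only the standard library; `Snapshot`, `save_snapshot` and `load_snapshot` are hypothetical names, and a `std::vector<int>` stands in for the array or pool block.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical state: one block plus a pointer into the middle of it.
struct Snapshot {
    std::vector<int> block;   // stands in for the array / pool block
    int* interior = nullptr;  // points at an element of block, or is null
};

void save_snapshot(std::ostream& out, const Snapshot& s) {
    out << s.block.size() << '\n';
    for (int v : s.block) out << v << ' ';
    // The pointer itself is always written, as an index; -1 means null.
    long index = s.interior
        ? static_cast<long>(s.interior - s.block.data())
        : -1;
    out << '\n' << index << '\n';
}

Snapshot load_snapshot(std::istream& in) {
    Snapshot s;
    std::size_t n = 0;
    in >> n;
    s.block.resize(n);
    for (int& v : s.block) in >> v;

    long index = -1;
    in >> index;
    if (index >= 0) {
        assert(static_cast<std::size_t>(index) < s.block.size());
        // Rebuild the pointer relative to wherever the block lives now.
        s.interior = s.block.data() + index;
    }
    return s;
}
```

The restored pointer refers to the same element as before even though the block sits at a different address. Copying a `Snapshot` afterwards would leave `interior` pointing into the original, so a real version would need to handle that.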

Another half a cent, I know. I should get my hands on Boost to get a
better grip on your problem.

Hope that helps - again, anyway.

Cheers,
Francesco

Bart van Ingen Schenau

Sep 2, 2009, 7:06:26 AM
On Sep 2, 3:42 am, Bradley Wilson <the.mind.wal...@gmail.com> wrote:

> Yannick Tremblay wrote:
> > So it sounds to me that you need to save the content of your memory
> > pools to disk and Boost.Serialization is a useful framework to serialise
> > the data, then you need to read the data from disk and load it into
> > your memory pools.  The data saved on disk doesn't need to know about
> > memory pools, only the load data method needs to know where to place
> > the data it reads from disk.
>
> I am using the Boost.Serialization framework. However: A) it doesn't by
> default support intrusive pointers (although I'm starting to see that
> adding serialization support should be fairly easy), and:
>
> B) The real trouble is that the data pointed to by the intrusive pointers
> is allocated from a memory pool, and memory pools don't appear to be
> serializable.

But why would you want to serialise a complete memory pool?
Would it not be sufficient if, on serialisation, the individual
objects are serialised and, on deserialisation, it is ensured that the
new objects are created within a suitable memory pool?
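
A minimal sketch of that split, using only the standard library: the saved form contains object data only, and the load function is the one place that decides which pool the new objects come from. `Fact`, `FactPool`, `save_facts` and `load_facts` are hypothetical names; the fixed-capacity `FactPool` is a stand-in for a real memory pool, not Boost.Pool's API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct Fact { int subject = 0; int weight = 0; };

// Hypothetical fixed-capacity pool standing in for a real memory pool.
class FactPool {
public:
    explicit FactPool(std::size_t capacity) : storage_(capacity) {}
    Fact* create() {
        assert(used_ < storage_.size());
        return &storage_[used_++];
    }
    // True if f points into this pool's storage.
    bool owns(const Fact* f) const {
        std::less<const Fact*> lt;
        const Fact* begin = storage_.data();
        return !lt(f, begin) && lt(f, begin + storage_.size());
    }
private:
    std::vector<Fact> storage_;
    std::size_t used_ = 0;
};

// The saved form knows nothing about pools: just a count and the fields.
void save_facts(std::ostream& out, const std::vector<const Fact*>& facts) {
    out << facts.size() << '\n';
    for (const Fact* f : facts) out << f->subject << ' ' << f->weight << '\n';
}

// The caller chooses the pool at load time, so objects can land in a
// different pool than the one they were saved from.
std::vector<Fact*> load_facts(std::istream& in, FactPool& pool) {
    std::size_t count = 0;
    in >> count;
    std::vector<Fact*> facts;
    for (std::size_t i = 0; i < count; ++i) {
        Fact* f = pool.create();
        in >> f->subject >> f->weight;
        facts.push_back(f);
    }
    return facts;
}
```

With this shape, moving an object type to another pool only changes which pool is passed to the loader; old save files stay readable.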

>
> Here's the issue:
>
<snip>


> This is all fine and dandy, but it won't work for deserialization.
> There's no way to tell it that a particular incoming pointer actually
> points to the middle of an allocated block of memory that was previously
> loaded.
>
> This also means you can't serialize pointers that point into an array:
>
>     my_object *a = new my_object[10];   // Array
>     my_object *b = a + 5;               // Pointer to middle of array
>
>     // Open archive for saving
>     ...
>     archive << a;   // Assume we store a+5 in the swizzle map directly in
>                     // my_object's serialize routine
>
>     archive << b;   // Data is not written, due to previous swizzle trickery
>
>     // Open archive for loading
>     my_object *c, *d;
>
>     archive >> c;   // Should work correctly, entire array is loaded
>
>     archive >> d;   // Will not work as advertised - new memory is
>                     // allocated because archive doesn't know that d
>                     // points into the interior of c.

Have you actually verified that this does not work in the way
described here?
Because I would expect that the swizzle trickery (if implemented
correctly) writes out a marker to indicate: on deserialisation, ensure
this pointer refers to the object that was reconstructed from the data
'over there'.

>
> So unless some wizard out there has already figured out a way around
> this problem, I'm going to have to alter my approach to how to save
> state. (I have an idea, but it's madness.)

I would say that trying to serialise a memory pool is a sure way to
madness. And it is not needed.
The exact way in which the various objects are allocated within the
memory pools should not be part of the state of your game. The fact
that objects A and B are located in pool P and object C is in pool Q
is information that should be known statically at the places where the
objects are created, and if you change your mind and move B to pool Q
as well, then reloading an old state (from before the change) should
reflect the new reality.

Bart v Ingen Schenau
