Can I access H2's MVStore API in a plugin serializer?


Matthew Phillips

Dec 21, 2017, 4:42:29 AM
to H2 Database
I want to use H2's MVStore with a large number of objects (maps), many of which are derived from each other, often differing in a single property.

I'd like to write a plugin serializer that can write an object as a reference to another object plus a delta from that object. This would reduce storage but, more importantly, since I'm using Clojure's persistent data structures, it would mean a derived object would share most of its structure with its parent, greatly reducing heap usage.
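A minimal sketch of the "reference plus delta" encoding described above, using plain `java.util` maps rather than Clojure's persistent maps. `MapDelta`, `between` and `applyTo` are hypothetical names for illustration only; nothing here is part of H2's API, and the parent reference itself is left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical delta record: what a derived object changes relative to its parent.
class MapDelta {
    final Map<String, Object> changed = new HashMap<>(); // added or modified properties
    final Set<String> removed = new HashSet<>();         // properties dropped from the parent

    // Compute the delta that turns `parent` into `child`.
    static MapDelta between(Map<String, Object> parent, Map<String, Object> child) {
        MapDelta d = new MapDelta();
        for (Map.Entry<String, Object> e : child.entrySet()) {
            String k = e.getKey();
            if (!parent.containsKey(k) || !Objects.equals(parent.get(k), e.getValue())) {
                d.changed.put(k, e.getValue());
            }
        }
        for (String k : parent.keySet()) {
            if (!child.containsKey(k)) d.removed.add(k);
        }
        return d;
    }

    // Rebuild the child from the parent plus this delta (copies the parent map).
    Map<String, Object> applyTo(Map<String, Object> parent) {
        Map<String, Object> out = new HashMap<>(parent);
        out.keySet().removeAll(removed);
        out.putAll(changed);
        return out;
    }
}
```

With a persistent map the copy in `applyTo` would be replaced by structure-sharing updates, which is where the heap saving comes from.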

The serializer would need therefore to potentially access the H2 MVStore while reading an object, calling back 'up' the stack as it were to read parent object(s). This strikes me as an unexpected thing to do, and I wonder if H2 would have a problem with that?

Another approach might be to store derived objects in a different map, so the serializer's access to it looks like just another external access.

Any feedback?

Cheers,

Matt.

Noel Grandin

Dec 21, 2017, 4:59:58 AM
to h2-da...@googlegroups.com, Matthew Phillips


On 2017/12/21 11:42 AM, Matthew Phillips wrote:
>
> The serializer would need therefore to potentially access the H2 MVStore while reading an object, calling back 'up' the
> stack as it were to read parent object(s). This strikes me as an unexpected thing to do, and I wonder if H2 would have a
> problem with that?
>
I think that is quite likely to lead to infinite loops.

You're probably better off building some kind of serialisation layer on top of MVStore.
That layer can load an object, check if it needs to load a parent object, and do so.
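A rough sketch of such a layer, assuming each stored record carries an optional parent key plus the changed properties. `DeltaLayer` and `Stored` are hypothetical names; the `store` field is an ordinary `java.util.Map`, on the assumption that an MVStore map (with a suitable data type for `Stored`) could be passed in its place. The `seen` set guards against the infinite loop mentioned above, and the small cache is only a placeholder.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical layer above the store; none of these names come from H2.
class DeltaLayer {
    // Stored form of one object: optional parent key plus the properties that differ.
    static final class Stored {
        final String parentKey;              // null for a root object
        final Map<String, Object> changes;
        Stored(String parentKey, Map<String, Object> changes) {
            this.parentKey = parentKey;
            this.changes = changes;
        }
    }

    private final Map<String, Stored> store; // assumption: stand-in for the MVStore map
    private final Map<String, Map<String, Object>> cache = new HashMap<>();

    DeltaLayer(Map<String, Stored> store) { this.store = store; }

    // Resolve an object by walking up the parent chain, then applying deltas root-first.
    // Callers must treat the returned map as read-only because it is also cached.
    Map<String, Object> load(String key) {
        Deque<Stored> chain = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        Map<String, Object> base = null;
        for (String k = key; k != null; ) {
            base = cache.get(k);
            if (base != null) break;         // an ancestor is already resolved
            if (!seen.add(k)) throw new IllegalStateException("cycle at " + k);
            Stored s = store.get(k);
            if (s == null) throw new NoSuchElementException("missing " + k);
            chain.push(s);
            k = s.parentKey;
        }
        Map<String, Object> result = base == null ? new HashMap<>() : new HashMap<>(base);
        while (!chain.isEmpty()) result.putAll(chain.pop().changes);
        cache.put(key, result);
        return result;
    }
}
```

Because the parent lookup happens outside any serializer, the store is only ever read through its normal external interface.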

Strikes me as a very slow way to store data, your latency is likely to be terrible.

Personally, I would just waste the space - disk is cheap until you get past the terabyte mark.

Matthew Phillips

Dec 21, 2017, 5:55:58 AM
to Noel Grandin, h2-da...@googlegroups.com

> On 21 Dec 2017, at 8:29 pm, Noel Grandin <noelg...@gmail.com> wrote:
>
> On 2017/12/21 11:42 AM, Matthew Phillips wrote:
>> The serializer would need therefore to potentially access the H2 MVStore while reading an object, calling back 'up' the stack as it were to read parent object(s). This strikes me as an unexpected thing to do, and I wonder if H2 would have a problem with that?
> I think that is quite likely to lead to infinite loops.
>
> You're probably better off building some kind of serialisation layer on top of MVStore.
> That layer can load an object, check if it needs to load a parent object, and do so.

Yes, that’s what I originally planned, but then I realised I’d also have to add my own caching layer, so I’m looking for a way to be lazy and use H2’s.

> Strikes me as a very slow way to store data, your latency is likely to be terrible.
>
> Personally, I would just waste the space - disk is cheap until you get past the terabyte mark.

You’re right, but it’s not so much about the disk space, as the memory. It’s also far faster to diff Clojure maps when they’re derived this way (rather than being entirely separate copies) because you can skip diffing any values that are identical reference-wise. The reduced memory usage and fast diffs are why I think storing the data this way should be a net win for the application I have.
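To illustrate the reference-wise shortcut, here is a small sketch in Java rather than Clojure. `RefDiff.changedKeys` is a hypothetical helper and assumes the maps hold no null values. Values that are the same reference on both sides are skipped without calling `equals`, which is the saving a derived map gets for every property it shares with its parent.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: keys whose values differ between two maps (no null values assumed).
class RefDiff {
    static Set<String> changedKeys(Map<String, ?> a, Map<String, ?> b) {
        Set<String> changed = new TreeSet<>();
        if (a == b) return changed;          // same map object: nothing to diff
        for (Map.Entry<String, ?> e : a.entrySet()) {
            Object va = e.getValue();
            Object vb = b.get(e.getKey());
            if (va == vb) continue;          // identical reference: skip the deep comparison
            if (vb == null || !va.equals(vb)) changed.add(e.getKey());
        }
        for (String k : b.keySet()) {
            if (!a.containsKey(k)) changed.add(k); // key only present in b
        }
        return changed;
    }
}
```

For two entirely separate copies every value falls through to `equals`, so the cost grows with the size of the values rather than with the size of the change.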

Thanks very much for your feedback. Will keep thinking.

Cheers,

Matt.