I started working on NOM (cf. "NOHM"), and had a chat with soveran
(author of Ohm for Ruby) about something interesting.
In Ohm, the level of abstraction is (at least in my understanding)
raised from individual keys and values to "objects", which are
aggregations of keys and values in Redis. The attractive part of this
concept to me was that it should offer what I call "object-level
atomicity": either all pending changes to an object are committed or
none are, depending on whether the user-provided object-level
validation passes or fails.
Unfortunately, that's not how Ohm works. In Ohm, changes to scalar
object properties are buffered, while list/set operations are
immediate-mode. That is, when one changes a :string property to
"foobar", that change does not immediately result in a call to SET in
Redis, but when one appends a value to a :list property, that _does_
immediately result in a call to RPUSH. When one calls save on an
object, the scalar properties are updated in one operation via MSET
(or in multiple operations via SET if you have an old version of
Redis). Thus, should your object's pre-save validation fail, your
lists and sets are not rolled back. You are neither here nor there.
For NOM, I went the route of buffering _all_ object changes client
side to provide object-level atomicity. Thus, appending an item to a
list does not immediately commit that operation in Redis. Instead,
the Redis-equivalent operation is performed client side on an Array,
and the corresponding Redis command needed to realize the change is
buffered in the same order. This strategy allows a pre-save
validation routine to validate what an object *will* look like when
saved to Redis. If what the object will look like is invalid to the
user (pre-save validation fails), none of the pending changes to that
object are committed. This provides object-level atomicity.
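The buffering strategy can be sketched roughly like this (a
simplified illustration with hypothetical names, not NOM's actual
API):

```javascript
// Minimal sketch of client-side buffering (hypothetical names, not
// NOM's actual API). Each mutation is applied to a local Array *and*
// recorded as the Redis command that would realize it, so pre-save
// validation can inspect the final state before anything is sent.
function BufferedList(initial) {
  this.items = initial ? initial.slice() : [];
  this.pending = [];                        // buffered Redis commands, in order
}

BufferedList.prototype.push = function (value) {
  this.items.push(value);                   // local mirror of RPUSH
  this.pending.push(['RPUSH', value]);      // command to replay on save
};

BufferedList.prototype.save = function (validate, send) {
  if (!validate(this.items)) return false;  // invalid: commit nothing
  this.pending.forEach(function (cmd) {     // replay buffered commands
    send(cmd);
  });
  this.pending = [];
  return true;
};

// The validator sees what the list *will* look like after save.
var sent = [];
var list = new BufferedList(['a']);
list.push('b');
var ok = list.save(
  function (items) { return items.length <= 2; },
  function (cmd) { sent.push(cmd); }
);
```

Here the validator inspects the final local state, and the buffered
RPUSH is only replayed because validation passed; had it failed,
nothing would have been sent.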
This does imply something: NOM is essentially re-implementing Redis
semantics on supported value types. E.g., LREM removes the first or
last N occurrences of a given value from a list; this has been
re-implemented on the client side in the method
List.prototype.removeFirstN, etc. Where I stopped was zsets, or
sorted sets. These are implemented as skip lists in Redis. Skip
lists are not difficult to implement, but I started to question my
strategy.
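Mirroring a command like LREM client side means reproducing its count
semantics exactly; a rough sketch on a plain Array (the lrem helper
here is illustrative, not NOM's actual List method):

```javascript
// Sketch of LREM semantics on a plain Array (illustrative only).
// LREM key count value removes occurrences of `value`:
//   count > 0: the first `count` occurrences, scanning head to tail
//   count < 0: the first |count| occurrences, scanning tail to head
//   count = 0: all occurrences
function lrem(items, count, value) {
  var remaining = count === 0 ? Infinity : Math.abs(count);
  var fromTail = count < 0;
  var source = fromTail ? items.slice().reverse() : items.slice();
  var result = [];
  source.forEach(function (item) {
    if (item === value && remaining > 0) {
      remaining -= 1;                       // drop this occurrence
    } else {
      result.push(item);
    }
  });
  return fromTail ? result.reverse() : result;
}
```

For example, lrem(['a', 'b', 'a', 'c', 'a'], 2, 'a') yields
['b', 'c', 'a']. Every mirrored command needs this kind of care,
which is part of what gave me pause.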
Why? Well, it implies that the user first pulls down the full object
to the client side, makes local modifications, then pushes the
buffered commands back to Redis to realize the (valid) changes. This
means it will not be appropriate for "large" objects (e.g. objects
that contain a list property whose value has 100K elements) due to
the bandwidth overhead.
There are a number of paths to follow that I can see.
1. Drop the concept of validations. Instead, just route object
properties to the equivalent key, value, and value type in Redis.
This implies that no Redis commands need be buffered client side.
2. Continue with the full object atomicity by finishing skip lists and
mirroring Redis value type semantics on the client, checking that the
local state is valid, and if so, sending/replaying buffered Redis
commands needed to make the changes remotely in Redis.
3. Do not re-implement lists, sets, and zsets (and HASH soon
thereafter) on the client side in Javascript (too late for lists and
sets). Instead, simply buffer the commands to perform the same
operations in Redis. We're already on this path. The difference
would be that the client wouldn't really have the ability to inspect
the full collection for validation. But, how often do you do this
anyway? Never, right? You tend to perhaps do an LTRIM after an RPUSH
but that's probably it. I bet no one says "oh, I have to have the
whole list because it can only be saved if the stddev is less than 1.5
across all elements" or similar. Thus, perhaps it does not make much
sense to mirror the Redis value type semantics locally but instead
just buffer the commands. When the rest of the object is validated,
send the buffered commands. Thus, you get object-level atomicity and
avoid the overhead of mirroring what Redis does.
4. Store the JSON representation of objects in string values in Redis
instead of mapping object properties to Redis keys, values, and value
types (list, set, etc.). Why do this? One, NOM could provide
equivalent data types on the client side for Redis data types to
start. Since the point would be to provide object-level atomicity, we
wouldn't be doing the work twice, and wouldn't have to ensure that
the semantics of Redis are mirrored client side. Also note that in
the near future Redis will support paging (à la virtual memory),
which will only happen for large-ish values. Storing objects as JSON
would work better with this scheme than individual properties as
lists/sets/strings/numbers, since the whole-object-as-JSON would be
larger. That said, you don't always want to fetch the whole object to
make a change (I'm looking at you, Riak). So there's a tradeoff there
too.
Number 4 would no longer be an object mapper but really an object
storage "engine". I hereby call "dibs" on NOM and NOSE (Node Object
Storage Engine) until the dust settles :)
I'm leaning towards #3. It provides object-level atomicity and
doesn't require a significant maintenance overhead to match Redis'
value type semantics.
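A rough sketch of what #3 could look like (PendingObject and the
recording send function are hypothetical names, not NOM's API):
collection operations are buffered as commands only, with no local
mirror to inspect, and replayed once the scalar validation passes,
wrapped in MULTI/EXEC so the replay commits atomically.

```javascript
// Sketch of option #3 (hypothetical API): collection operations are
// only buffered as Redis commands -- there is no client-side mirror --
// and the buffer is replayed in one MULTI/EXEC block once the object's
// scalar properties pass validation.
function PendingObject(key) {
  this.key = key;
  this.props = {};                          // scalar properties (validated)
  this.pending = [];                        // buffered collection commands
}

PendingObject.prototype.listPush = function (prop, value) {
  this.pending.push(['RPUSH', this.key + ':' + prop, value]);
};

PendingObject.prototype.save = function (validate, send) {
  if (!validate(this.props)) return false;  // invalid: send nothing
  send(['MULTI']);                          // group the replay atomically
  this.pending.forEach(function (cmd) {
    send(cmd);
  });
  send(['EXEC']);
  this.pending = [];
  return true;
};

// A fake sender that records commands instead of talking to Redis.
var commands = [];
var user = new PendingObject('user:1');
user.props.name = 'brian';
user.listPush('events', 'signed-up');
var saved = user.save(
  function (props) { return !!props.name; },
  function (cmd) { commands.push(cmd); }
);
```

After save, commands holds MULTI, the buffered RPUSH, and EXEC; had
validation failed, nothing at all would have been sent to Redis.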
Thanks,
Brian