Best way to store complex objects in Redis?

michael he

Feb 14, 2012, 10:40:18 PM
to Redis DB
Hi,

I need to store tens of millions of complex objects, with an average
size of 3-5k each, in Redis, so memory consumption is a big concern.
The straightforward approach, I think, is to serialize each object into
a byte[] and compress it.
I have considered using a hash, but the objects are very complex and have
many attributes that are themselves nested objects, so that may not be a
good idea.
Can anyone share some best practices for dealing with this?

Thanks in advance.

Markus Pilman

Feb 15, 2012, 5:08:46 AM
to redi...@googlegroups.com
Hi,

I would use Google Protocol Buffers to serialize the object (you can
configure it so that it does not store things like introspection
information, which makes the objects smaller). Afterwards you can compress
the data. I would use snappy (http://code.google.com/p/snappy/), which
is extremely fast. If speed is not a big issue and only memory
consumption matters, I would use a different compression library. Several
projects out there (for example BigTable or leveldb) use this kind of
technique.

Hope this helps
Markus
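
A minimal sketch of the serialize-then-compress idea described above. It is only an illustration: JSON and zlib from the Python standard library stand in for Protocol Buffers and snappy (both third-party), and the `player` object and the `pack`/`unpack` helpers are made up for this example.

```python
import json
import zlib


def pack(obj: dict) -> bytes:
    """Serialize to compact JSON, then compress the bytes."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 6)


def unpack(blob: bytes) -> dict:
    """Reverse of pack(): decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical nested object, made up for this example.
player = {
    "id": 42,
    "name": "alice",
    "inventory": [{"item": "sword", "level": 3, "equipped": False}] * 50,
}

raw_size = len(json.dumps(player).encode("utf-8"))
blob = pack(player)  # this is the value you would store under a Redis key
print(raw_size, len(blob))
```

How much compression saves depends heavily on how repetitive the real objects are, so it is worth measuring on a sample of production data before committing to a codec.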

Tim Lossen

Feb 15, 2012, 5:35:56 AM
to redi...@googlegroups.com
messagepack might be another good option:

http://msgpack.org/

--
http://tim.lossen.de

Salvatore Sanfilippo

Feb 15, 2012, 5:42:47 AM
to redi...@googlegroups.com
When size really matters, your own encoding may be the best pick if
the other well-known serialization formats are not good enough,
so could you please give us some information about your objects?

Redis 2.6 supports Lua scripting, which includes the "struct" extension
for Lua. In short, you can do interesting things with binary-encoded
objects.

Salvatore
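
To illustrate what a hand-rolled binary encoding can look like, here is a small sketch using Python's `struct` module, which is similar in spirit to the Lua struct extension mentioned above (this is Python on the client side, not Redis Lua). The record layout and the function names are assumptions made up for the example.

```python
import struct

# Hypothetical fixed layout: uint32 id, uint16 level, float64 ratio,
# little-endian with no padding ("<" disables alignment).
RECORD = struct.Struct("<IHd")


def encode_record(obj_id: int, level: int, ratio: float) -> bytes:
    """Pack the three fields into a fixed-size binary record."""
    return RECORD.pack(obj_id, level, ratio)


def decode_record(blob: bytes) -> tuple:
    """Unpack a record produced by encode_record()."""
    return RECORD.unpack(blob)


blob = encode_record(123456, 7, 0.5)
print(len(blob))  # 14 bytes: 4 + 2 + 8
```

A fixed layout like this carries no field names or type tags, which is where the space saving over a self-describing format comes from; the price is that reader and writer must agree on the layout out of band.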

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele

Haoran Chen

Feb 16, 2012, 3:29:02 AM
to Redis DB
MessagePack plus compression (such as gzip) will remarkably reduce the
amount of data transferred over the network.

Felix Gallo

Feb 16, 2012, 6:36:54 AM
to redi...@googlegroups.com
Much also depends on the languages, use case and speed requirements on both sides of the encoding.

If you are using Ruby, for example, it's common to use symbols as keys in hashes.  MessagePack, JSON and some other popular encodings don't have direct support for these and end up converting them into strings; this can be a nuisance, especially with deeply nested structures.  Other languages suffer from various other shortcomings peculiar to those languages.  And almost all encodings except JSON trade readability for speed; there's no way to visually understand objects in redis-cli if you're trying to save space or time.

Speaking particularly for the case of Ruby data structures, I found that the Marshal module in the standard library offered my Facebook social game (large number of objects; very high write rate; objects between 1 and 40k in size) the best price/performance/usability: it preserves symbols, it has a pretty compact representation for my objects, it's pretty fast, and I don't need to read objects in redis-cli.

F.
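
The same kind of type loss can be shown in Python, as an analogy to the Ruby symbol case above rather than a reproduction of it: JSON turns non-string keys into strings and tuples into lists, while the language-native `pickle` format (roughly Python's counterpart to Ruby's Marshal) round-trips them unchanged. The `original` object is made up for the example.

```python
import json
import pickle

# Hypothetical object with an integer key and a tuple value.
original = {1: "one", "pos": (3, 4)}

via_json = json.loads(json.dumps(original))
via_pickle = pickle.loads(pickle.dumps(original))

print(via_json)    # the integer key became "1" and the tuple became a list
print(via_pickle)  # identical to the original
```

Language-native formats tie the stored data to one language and should only be used to load data you trust.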

Salvatore Sanfilippo

Feb 16, 2012, 6:51:53 AM
to redi...@googlegroups.com
On Thu, Feb 16, 2012 at 12:36 PM, Felix Gallo <felix...@gmail.com> wrote:

> there's no way to visually understand objects in redis-cli if you're trying
> to save space or time.

With scripting this may start to be a bit simpler at least. For
debugging one can load a script and then dump keys with: EVALSHA
<sha1> <key> or something like that.

Salvatore
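
As a rough sketch of how a client would address such a debugging script: EVALSHA identifies a script by the SHA1 hex digest of its source, and the full command also takes the number of keys before the key names. The script body and the key name below are placeholders made up for this example; a real debugging script would decode the stored blob before returning it.

```python
import hashlib

# Hypothetical debugging script; a real one would decode the stored blob
# before returning it.
script = "return redis.call('GET', KEYS[1])"

sha1 = hashlib.sha1(script.encode("utf-8")).hexdigest()

# Full command form: EVALSHA <sha1> <numkeys> <key> [<key> ...] [<arg> ...]
command = ["EVALSHA", sha1, "1", "some:key"]
print(" ".join(command))
```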

Felix Gallo

Feb 16, 2012, 7:02:49 AM
to redi...@googlegroups.com
Yes...if only there were plans for arbitrary libraries to be loadable in the scripting implementation... :)

catwell

Feb 16, 2012, 10:33:05 AM
to Redis DB
On Feb 16, 1:02 pm, Felix Gallo <felixga...@gmail.com> wrote:

> Yes...if only there were plans for arbitrary libraries to be loadable in
> the scripting implementation... :)

For those (like me) who store MessagePack-encoded objects in Redis,
Kengo Nakajima has ported my LuaJIT implementation of MessagePack to
plain Lua (https://github.com/kengonakajima/lua-msgpack), so it should
actually be possible to write a (large) Lua script that can decode
MessagePack blobs and use it with EVALSHA. Obviously this would not be
efficient but maybe it can help for debugging.

Salvatore Sanfilippo

Feb 16, 2012, 10:35:02 AM
to redi...@googlegroups.com
If there is a clean, C coded implementation of MessagePack for Lua
that is believed to be stable, we can include it in Redis before the 2.6
release.

Salvatore

catwell

Feb 16, 2012, 11:13:03 AM
to Redis DB
On Feb 16, 4:35 pm, Salvatore Sanfilippo <anti...@gmail.com> wrote:
> If there is a clean, C coded implementation of MessagePack for Lua,
> that is believed to be stable, we can include it in Redis before 2.6
> release.

Sadly there is no such thing as far as I know.

Javier Guerra Giraldez

Feb 16, 2012, 4:05:17 PM
to redi...@googlegroups.com
On Thu, Feb 16, 2012 at 10:35 AM, Salvatore Sanfilippo
<ant...@gmail.com> wrote:
> If there is a clean, C coded implementation of MessagePack for Lua,
> that is believed to be stable, we can include it in Redis before 2.6
> release.

wouldn't that somehow 'bless' MessagePack? (well, JSON is there already)

maybe a little more generic would be to include some (de)compression;
that would enhance both 'simple binary' (struct) and 'readable' (JSON)
encodings

--
Javier

Salvatore Sanfilippo

Feb 16, 2012, 4:26:33 PM
to redi...@googlegroups.com
On Thu, Feb 16, 2012 at 10:05 PM, Javier Guerra Giraldez
<jav...@guerrag.com> wrote:

> wouldn't that somehow 'bless' MessagePack?  (well, JSON is there already)

Well... JSON is already here, but we are talking about Redis, an *in
memory* data store.
I think we are in the sphere of use cases where a more compact
representation can be very valuable, and apparently, message pack is
the thing nearest to a standard that we have.

I don't know it, but I can (probably) code a decent Lua message pack
implementation in little time, so I have the following plan: I'll look
at the specification; if it is simple, beautiful, and cool, I'll
implement a C binding for Lua in the Redis spirit of minimality and
speed, and I will ship it with 2.6. If I don't like it we can try to
re-evaluate it in N months.

Cheers,
Salvatore

Salvatore Sanfilippo

Feb 16, 2012, 4:37:20 PM
to redi...@googlegroups.com
Checked the specification, I think it is very good. It's also very
similar to a few things Redis already does in ziplists, RDB format,
and so forth. I think I'll give it a try.

I have a question: this specification has direct support for
different kinds of integers, and I'm not sure how this should be handled in
Lua, where there are actually only floats... I think this is one of the
biggest Lua mistakes btw.

Basically my feeling is that it's not possible to handle the
conversion in a liberal way, but that a format specifier should be
provided to the library, in a similar form to what the Lua struct
extension does, so that I can specify something like:

messagepack.encode("u16i32",foo,bar) and so forth.

Cheers,
Salvatore

Salvatore Sanfilippo

Feb 16, 2012, 4:39:32 PM
to redi...@googlegroups.com
P.S. When decoding, this is suboptimal: we should probably return a Lua
table, so that both the format of the decoded object and the fields are
provided.

This means that encoding should work the same way, so that the
object is the same and I can decode, alter a field, and re-encode with
minimal effort.

Yes, I think this is the best interface we can provide.
Do you agree?

Salvatore

Javier Guerra Giraldez

Feb 16, 2012, 4:44:16 PM
to redi...@googlegroups.com
On Thu, Feb 16, 2012 at 4:26 PM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Thu, Feb 16, 2012 at 10:05 PM, Javier Guerra Giraldez
> <jav...@guerrag.com> wrote:
>
>> wouldn't that somehow 'bless' MessagePack?  (well, JSON is there already)
>
> Well... JSON is already here, but we are talking about Redis, an *in
> memory* data store.

I meant that JSON is already 'blessed' in Redis' Lua, not as in "why
MP, there's JSON" but as in "why not? nobody complained when JSON was
blessed"

it's a habit of mine to include counterarguments to what i'm trying
to say, usually in parentheses. (confusing, i know...)


--
Javier

Salvatore Sanfilippo

Feb 17, 2012, 4:47:58 AM
to redi...@googlegroups.com
On Thu, Feb 16, 2012 at 10:44 PM, Javier Guerra Giraldez
<jav...@guerrag.com> wrote:

> it's a habit of mine to include counterarguments to what i'm trying
> to say, usually in parentheses.  (confusing, i know...)

Sorry for the misunderstanding ;)

catwell

Feb 17, 2012, 4:52:56 AM
to Redis DB
On Feb 16, 10:37 pm, Salvatore Sanfilippo <anti...@gmail.com> wrote:
> Checked the specification, I think it is very good. It's also very
> similar to a few things Redis already does in ziplists, RDB format,
> and so forth. I think I'll give it a try.

Yes, having implemented it I can confirm the specification is simple.

> I've a question: in this specification there is direct support for
> different kind of integers, I'm not sure how this should be handled in
> Lua where there are actually only floats... I think this is one of the
> biggest Lua mistakes btw.

This is not completely true: Lua has a single number type that happens
to be double by default, but lots of people use it with an integer
number type (especially in the embedded world).

> Basically my feeling is that it's not possible to handle the
> conversion in a liberal way, but that a format specifier should be
> provided to the library, in a similar form to what the Lua struct
> extension does, so that I can specify something like:
>
> messagepack.encode("u16i32",foo,bar) and so forth.


Usually MessagePack encoders encode to the smallest possible number
type, determined automatically depending on the value of the number.
Check out msgpack_pack_real_int64 in pack_template.h in the original
MessagePack implementation (C/C++). Note: you need to *build*
MessagePack first to get this file if you pull it from Git.

My own Lua version of this algorithm is here:
https://github.com/catwell/luajit-msgpack-pure/blob/master/luajit-msgpack-pure.lua#L102
This costs CPU but ensures minimal use of RAM and a simple interface.

By the way just making a (good) lua binding on top of this
implementation is another solution, however this would add a
dependency on a C++ compiler.
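
A sketch of the "smallest possible number type" idea for integers, written in Python for illustration. The format bytes follow the MessagePack specification (big-endian payloads), but the function name is made up here and real encoders may make different choices, for instance preferring signed formats for small positive values.

```python
import struct


def encode_int(n: int) -> bytes:
    """Encode an integer using the shortest MessagePack integer format."""
    if 0 <= n <= 0x7F:
        return struct.pack("B", n)  # positive fixint, one byte
    if -32 <= n < 0:
        return struct.pack("b", n)  # negative fixint, one byte
    if n > 0:
        if n <= 0xFF:
            return b"\xcc" + struct.pack(">B", n)  # uint 8
        if n <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", n)  # uint 16
        if n <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", n)  # uint 32
        if n <= 0xFFFFFFFFFFFFFFFF:
            return b"\xcf" + struct.pack(">Q", n)  # uint 64
    else:
        if n >= -0x80:
            return b"\xd0" + struct.pack(">b", n)  # int 8
        if n >= -0x8000:
            return b"\xd1" + struct.pack(">h", n)  # int 16
        if n >= -0x80000000:
            return b"\xd2" + struct.pack(">i", n)  # int 32
        if n >= -0x8000000000000000:
            return b"\xd3" + struct.pack(">q", n)  # int 64
    raise OverflowError("integer does not fit in 64 bits")


for sample in (5, -1, 200, 300, -100, 70000):
    print(sample, encode_int(sample).hex())
```

The chain of comparisons is the CPU cost mentioned above; the payoff is that small values, which are usually the common case, take one to three bytes instead of nine.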

Salvatore Sanfilippo

Feb 17, 2012, 4:54:39 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 10:52 AM, catwell <catwell...@catwell.info> wrote:
> By the way just making a (good) lua binding on top of this
> implementation is another solution, however this would add a
> dependency on a C++ compiler.

I would rather die :)

Tim Lossen

Feb 17, 2012, 5:04:50 AM
to redi...@googlegroups.com
On 2012-02-17, at 10:52 , catwell wrote:
> On Feb 16, 10:37 pm, Salvatore Sanfilippo <anti...@gmail.com> wrote:
>> Checked the specification, I think it is very good. It's also very
>> similar to a few things Redis already does in ziplists, RDB format,
>> and so forth. I think I'll give it a try.
>
> Yes, having implemented it I can confirm the specification is simple.

actually, when i was hacking on the redis dump file parser
a few weeks ago, i was kinda wishing that redis were using
messagepack instead of a proprietary format for dumps, as there
are so many messagepack implementations out there ... but maybe
it's too late now to introduce such a breaking change.

tim

--
http://tim.lossen.de

Salvatore Sanfilippo

Feb 17, 2012, 5:06:17 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 10:52 AM, catwell <catwell...@catwell.info> wrote:

> This is not completely true: Lua has a single number type that happens
> to be double by default, but lots of people use it with an integer
> number type (especially in the embedded world).

Yes, and that's even worse ;)

Btw the Redis implementation uses the default, double. So that's the problem:

-- "mp" stands for the MessagePack library being discussed
local foo = mp.decode("...")
foo = foo + 10
local new = mp.encode(foo)

foo used to be a 32-bit integer, but when encoding we need to check
whether foo can be translated back into an integer without losing precision.
When that's not possible we'll have a type switch, which is not very
cool.
In a language that has both floats and integers this is not going to
be an issue, of course.

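Our best approach could probably be the following:

#include <math.h>
#include <stdio.h>

int main(void) {
    double foo = 10.0000001;

    /* For finite values, a double is integral exactly when it equals its floor. */
    if (foo == floor(foo)) {
        printf("Encode as int\n");
    } else {
        printf("Encode as double\n");
    }
    return 0;
}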

If this is good enough we can just have a simple mp.encode / decode
without additional complexities, and maybe an *optional* interface for
fine control of the encoding process.
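
The floor test above could be wired into an encoder roughly as follows. This is a simplified Python sketch, not the eventual Redis implementation: the function name is made up, and integral values are only emitted as MessagePack int 32 (format byte 0xd2), with everything else falling back to float 64 (format byte 0xcb).

```python
import math
import struct


def encode_number(x: float) -> bytes:
    """Encode as int 32 when x is integral and fits, else as float 64."""
    if math.isfinite(x) and x == math.floor(x) and -2**31 <= x < 2**31:
        return b"\xd2" + struct.pack(">i", int(x))  # MessagePack int 32
    return b"\xcb" + struct.pack(">d", x)  # MessagePack float 64


print(encode_number(10.0).hex())
print(encode_number(10.0000001).hex())
```

Note the type switch discussed above: a value decoded as an integer comes back as a double as soon as arithmetic makes it non-integral.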

Salvatore Sanfilippo

Feb 17, 2012, 5:09:37 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 11:04 AM, Tim Lossen <t...@lossen.de> wrote:

> actually, when i was hacking on the redis dump file parser
> a few weeks ago, i was kinda wishing that redis were using
> messagepack instead of a proprietary format for dumps, as there
> are so many messagepack implementations out there ... but maybe
> too late now to introduce such a breaking change.

Sooner or later we may actually do it, because we break the RDB format
whenever needed, just by bumping the RDB version.
But anyway we'll need to implement a *modified* message pack since we
also support LZF compression (there are free opcodes in the
specification). This means that a generic parser still won't be able to
read the Redis RDB dump, but reusing a specification as much as
possible, given the extreme similarities we already have, can be a good
idea.

Now we are full of other priorities of course ;) But IMHO it's something to
keep in the back of our minds for the next RDB refactoring.

catwell

Feb 17, 2012, 5:15:14 AM
to Redis DB
On Feb 17, 11:09 am, Salvatore Sanfilippo <anti...@gmail.com> wrote:

> Soon or later we may do it actually because we break RDB format all
> the times is needed, just modifying the RDB version.
> But anyway we'll need to implement a *modified* message pack since we
> also support LZF compression (there are free opcodes in the
> specification). However this will not mean that a random parser can
> read the Redis RDB dump, but still, reusing a specification as much as
> possible given the extreme similarities we already have, can be a good
> idea.
>
> Now we are full of other priorities of course ;) But IMHO something to
> take in the back of the mind for the next RDB refactoring.

I don't know if this could be a problem but there is something I
really dislike in the MessagePack specification: endianness. We live
in a world where most CPUs are little endian but the MessagePack spec
is big endian.

Salvatore Sanfilippo

Feb 17, 2012, 5:51:23 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 11:15 AM, catwell <catwell...@catwell.info> wrote:
> I don't know if this could be a problem but there is something I
> really dislike in the MessagePack specification: endianness. We live
> in a world where most CPUs are little endian but the MessagePack spec
> is big endian.

+1. If you noticed, every in-memory data structure in Redis is now
little endian, and we convert between big endian and little endian in
endian.[ch] if the arch is big endian. On little endian archs the
macros expand to nothing.
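
To make the byte-order difference concrete, here is a small Python sketch showing the same 16-bit value in both layouts. The helper name is made up for the example.

```python
import struct
import sys

value = 0x0102

big = struct.pack(">H", value)     # network byte order, as MessagePack uses
little = struct.pack("<H", value)  # byte order of x86 and most ARM setups

print(big.hex(), little.hex(), sys.byteorder)


def to_host_u16(wire: bytes) -> int:
    """Decode a big-endian uint16 regardless of the host byte order."""
    return int.from_bytes(wire, "big")
```

On a little-endian host every multi-byte field read from a big-endian format implies a byte swap, which is the cost being discussed here; a little-endian on-disk or in-memory format moves that cost to big-endian hosts instead.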

catwell

Feb 17, 2012, 8:16:07 AM
to Redis DB
On Feb 17, 11:51 am, Salvatore Sanfilippo <anti...@gmail.com> wrote:

> +1, if you noticed every in-memory data structure into Redis is now
> little endian and we convert from big endian to little endian in
> endian.[ch] if the arch is bigendian. Otherwise macros expand to
> nothing in little endian arch.

Yes, whereas with MessagePack you have to do the opposite. I prefer
the Redis way.

Salvatore Sanfilippo

Feb 17, 2012, 8:36:25 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 2:16 PM, catwell <catwell...@catwell.info> wrote:

> Yes, whereas with MessagePack you have to do the opposite. I prefer
> the Redis way.

Yep indeed... I started an implementation and hope to find more time to
work on it today and during the weekend. Now I have to go pick up my
wife's test results; apparently we are going to fork(2) in 8 months
from now :)

catwell

Feb 17, 2012, 8:42:12 AM
to Redis DB
On Feb 17, 2:36 pm, Salvatore Sanfilippo <anti...@gmail.com> wrote:

> now have to go to retire the
> analysis of my wife, apparently we are going to fork(2) in 8 months
> from now :)

Related to the YouPorn news? ;)

Congrats!

Tim Lossen

Feb 17, 2012, 9:27:32 AM
to redi...@googlegroups.com
On 2012-02-17, at 14:36 , Salvatore Sanfilippo wrote:

> [...] apparently we are going to fork(2) in 8 months
> from now :)

wow, congratulations!

--
http://tim.lossen.de

Javier Guerra Giraldez

Feb 17, 2012, 10:11:22 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 5:06 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> Our best approach could be probably the following:
>
>    double foo = 10.0000001;
>
>    if (foo == floor(foo)) {
>        printf("Encode as int\n");
>    } else {
>        printf("Encode as double\n");
>    }

Lua does a similar test in some places; mostly for the array
optimization of tables. (a table with reasonably dense integer keys
stores those values in an array, so every table access first checks if
the key is an integer within a per-table range)


--
Javier

Josiah Carlson

Feb 17, 2012, 11:31:10 AM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 5:36 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Fri, Feb 17, 2012 at 2:16 PM, catwell <catwell...@catwell.info> wrote:
>
>> Yes, whereas with MessagePack you have to do the opposite. I prefer
>> the Redis way.
>
> Yep indeed... I started an implementation, hope to find more time to
> work on it today and during the WE, now have to go to retire the
> analysis of my wife, apparently we are going to fork(2) in 8 months
> from now :)

Congratulations on your fork success with fsck :)

- Josiah

Marc Byrd

Feb 17, 2012, 12:24:39 PM
to redi...@googlegroups.com
Congratulations Salvatore!

I guess you could say your wife is going to do a pull request in 8 months? ;)

Cheers,


m

Felix Gallo

Feb 17, 2012, 12:28:13 PM
to redi...@googlegroups.com
Can I suggest that when discussing fork(2) with your wife you do not use the word 'COW'...she may take it the wrong way...

On the topic of MessagePack-for-clients -- the fact that no library is completely satisfactory (json is slow and large, messagepack is big-endian and doesn't have representations for base types in ruby/erlang/etc.) is the primary reason why I think having user-loadable libraries in the scripting branch is a good idea -- while ProtoBuf or JSON may be good for one use case, it may be that another calls for BSON or YAML or MP, and that the redis implementor could weigh their own selection criteria against the available Lua, Lua-bound-C, or C libraries...

F.

Jay A. Kreibich

Feb 17, 2012, 2:38:07 PM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 02:15:14AM -0800, catwell scratched on the wall:

> I don't know if this could be a problem but there is something I
> really dislike in the MessagePack specification: endianness. We live
> in a world where most CPUs are little endian but the MessagePack spec
> is big endian.

All networking protocols (TCP, IP, UDP, etc.) are in "network byte
order", which happens to be big-endian. While there may be no
specific need for an application-level protocol to be in network-byte
order, it is a deeply ingrained best practice for any standardized
protocol that regularly flows over an IP network to use network byte
order for any binary objects.

It may not be ideal for today's popular desktop CPUs, but remember
that there are a lot of other devices out there on the Internet.
While the application environments of Android and iOS are also
little-endian, almost all game systems are big-endian, including all
the popular consoles (Xbox 360, PlayStation 3, Wii). Along
similar lines, many low-power embedded processors are also
big-endian-- the type of thing you'd expect to show up in
environmental monitoring, WiFi enabled light switches and other
"throw away" devices we are likely to start seeing in our homes
and offices by the dozens. Plenty of network nodes are not little endian.

From a memory, speed, and processing standpoint, it makes all the
sense in the world for these low-power types of devices to favor
formats like MessagePack over JSON. And if endian translation must
be done, I'd rather do it on my beefy server than on an ultra-low-power
embedded device with a few kilobytes of memory.


Of course, that doesn't help when you're trying to transfer
multi-gigabyte data sets between two high performance servers, but
that type of environment also has a lot more flexibility. At the end
of the day, you need to pick SOME default, and for good or for bad
that networking-centric default was picked many years ago-- and
isn't as completely irrelevant as one might first assume.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson

Alexander Gladysh

Feb 17, 2012, 6:34:13 PM
to redi...@googlegroups.com
On Fri, Feb 17, 2012 at 14:06, Salvatore Sanfilippo <ant...@gmail.com> wrote:
> On Fri, Feb 17, 2012 at 10:52 AM, catwell <catwell...@catwell.info> wrote:
>
>> This is not completely true: Lua has a single number type that happens
>> to be double by default, but lots of people use it with an integer
>> number type (especially in the embedded world).
>
> Yes, and that's even worse ;)
>
> Btw the Redis implementation uses the default, double. So that's the problem:
>
> foo = mp.decode("...");
> foo += 10;
> new = mp.encode(foo);
>
> foo used to be as 32 bit integer, but when encoding we need to check
> if foo can be translated back into integer without losing precision.
> When that's not possible we'll have a type switch, that's not very
> cool.
> In a language where you have floats and integers this is not going to
> be an issue of course.
>
> Our best approach could be probably the following:
>
>    double foo = 10.0000001;
>
>    if (foo == floor(foo)) {

What's wrong with good old foo % 1 == 0? (Assuming Lua code,
not C, of course.)

>        printf("Encode as int\n");
>    } else {
>        printf("Encode as double\n");
>    }
>
> If this is good enough we can just have a simple mp.encode / decode
> without additional complexities, and maybe an *optional* interface for
> fine control of the encoding process.

FWIW, I would prefer the fixed format approach — always use double
encoding so that I would always know the resulting format beforehand
(or, at least a run-time option to tune this). Finer control, as you
say, to be provided via format string or something like that.

My 2c.

Alexander.

Josiah Carlson

Feb 17, 2012, 6:48:37 PM
to redi...@googlegroups.com
Incidentally, one of the features of doubles is that integers within
a certain range are represented exactly. That is, you will never
have a precision issue such that 10 is represented as 10.000001 (that
range is +/- 2**53: 52 stored mantissa bits plus a 53rd bit that is
assumed to be 1 thanks to the float format). Beyond that range of integers
you will get truncation, but never trailing digits in the fractional
region.

Regards,
- Josiah
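
A quick way to check the exact-integer property of doubles, sketched in Python (whose `float` is an IEEE 754 double, carrying 53 bits of precision). The constant and the helper name are made up for the example.

```python
import math

LIMIT = 2 ** 53  # doubles have 53 bits of precision


def is_exact_integer(x: float) -> bool:
    """True if x is integral and inside the range where every integer is exact."""
    return math.isfinite(x) and x == math.floor(x) and abs(x) <= LIMIT


print(float(LIMIT - 1) + 1.0 == float(LIMIT))  # still exact below the limit
print(float(LIMIT) + 1.0 == float(LIMIT))      # 2**53 + 1 cannot be represented
print(is_exact_integer(10.0), is_exact_integer(10.0000001))
```

For an encoder this means the floor test is reliable for any integer that started out as a 32-bit value, since those sit far inside the exact range.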

Salvatore Sanfilippo

Feb 19, 2012, 11:23:55 AM
to redi...@googlegroups.com
Thanks Josiah, it's always good to understand which things are
guaranteed to happen with floats ;)

Btw I just released my message pack implementation for Lua:

https://github.com/antirez/lua-msgpack

In the README I documented the tradeoffs I made regarding integer
types and tables, which are the two "interesting" topics about message
pack and Lua. I'll include this implementation in the Redis unstable
branch tomorrow.

I wrote tests for the lib, but I currently consider it beta quality
because it lacks thorough testing and usage in a real project.

Cheers,
Salvatore

Tim Lossen

Feb 19, 2012, 11:34:25 AM
to redi...@googlegroups.com
wow, cool .... the rate at which you crank out quality code really
amazes me, salvatore. you are like a machine! :)

tim

--
http://tim.lossen.de

Salvatore Sanfilippo

Feb 19, 2012, 11:55:36 AM
to redi...@googlegroups.com
Thanks Tim, those little < 1000 lines of code projects are sometimes
really fun to code :) What's good about weekend coding is that I have
other stuff to do with family / friends, so maybe I stay just two
hours in front of the computer, but those are the "right" two hours:
mentally rested, and eager to write some code. On normal days it is
impossible for me to sustain this speed for 8 hours, so the amount
of code produced per hour in weekend fun projects can really be the
best!

Salvatore

Tim Lossen

Feb 20, 2012, 8:51:56 AM
to redi...@googlegroups.com
On 2012-02-17, at 11:09 , Salvatore Sanfilippo wrote:

> On Fri, Feb 17, 2012 at 11:04 AM, Tim Lossen <t...@lossen.de> wrote:
>
>> actually, when i was hacking on the redis dump file parser
>> a few weeks ago, i was kinda wishing that redis were using
>> messagepack instead of a proprietary format for dumps, as there
>> are so many messagepack implementations out there ... but maybe
>> too late now to introduce such a breaking change.
>

> But anyway we'll need to implement a *modified* message pack since we
> also support LZF compression (there are free opcodes in the
> specification). However this will not mean that a random parser can
> read the Redis RDB dump

it could also be an interesting option to compress complex values
(sets, hashes etc.) instead of single strings, like in the current
implementation -- i.e. first convert to binary, then compress.

this would be both easier to parse, and result in equal (or possibly
even bigger?) savings i think.

--
http://tim.lossen.de
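
The intuition that compressing a whole value beats compressing its members one at a time can be checked with a small experiment. This Python sketch uses zlib from the standard library as a stand-in (Redis itself uses LZF), and the member strings are made up for the example.

```python
import zlib

# Hypothetical set members that share a lot of structure.
members = [f"user:{i}:session".encode("utf-8") for i in range(200)]

one_by_one = sum(len(zlib.compress(m)) for m in members)
as_one_blob = len(zlib.compress(b"\x00".join(members)))

print(one_by_one, as_one_blob)
```

Short strings barely compress on their own because each one pays the fixed stream overhead and gives the compressor no repetition to exploit; concatenated, the shared prefixes and suffixes become visible to it.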

Salvatore Sanfilippo

Feb 20, 2012, 8:58:24 AM
to redi...@googlegroups.com
On Mon, Feb 20, 2012 at 2:51 PM, Tim Lossen <t...@lossen.de> wrote:
> it could also be an interesting option to compress complex values
> (sets, hashes etc.) instead of single strings, like in the current
> implementation -- i.e. first convert to binary, then compress.
>
> this would be both easier to parse, and result in equal (or possibly
> even bigger?) savings i think.

We do that now, finally. Basically, if something is
intset/zipmap/ziplist-encoded in memory, it gets dumped as is into
RDB files. This made some saves/loads 10x faster in Redis
2.4.

Example:

redis 127.0.0.1:6379> flushall
OK
redis 127.0.0.1:6379> sadd set 1 2 3 4 5 6 7 8 9 10
(integer) 10
redis 127.0.0.1:6379> save
OK
redis 127.0.0.1:6379> quit
bash-3.2$ hexdump -C dump.rdb
00000000 52 45 44 49 53 30 30 30 33 fe 00 0b 03 73 65 74 |REDIS0003....set|
00000010 1c 02 00 00 00 0a 00 00 00 01 00 02 00 03 00 04 |................|
00000020 00 05 00 06 00 07 00 08 00 09 00 0a 00 ff |..............|
0000002e
bash-3.2$ ls -l dump.rdb
-rw-r--r-- 1 antirez staff 46 Feb 20 14:57 dump.rdb

We encoded the RDB header, the key name, type, and 10 integers in 46 bytes.

Salvatore
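
For readers who want to follow the hexdump byte by byte, the 28 bytes (0x1c) that come after the key name can be decoded with a short Python sketch. It assumes the intset layout of a little-endian uint32 element width, a little-endian uint32 element count, then the elements themselves; the function name is made up for the example and this is not a general RDB parser.

```python
import struct

# The 28-byte string payload copied from the hexdump above.
payload = bytes.fromhex(
    "02000000"  # encoding: 2 bytes per element
    "0a000000"  # length: 10 elements
    "0100020003000400050006000700080009000a00"  # the elements
)


def decode_intset(blob: bytes) -> list:
    """Decode an intset blob into a list of Python integers."""
    encoding, length = struct.unpack_from("<II", blob, 0)
    fmt = {2: "h", 4: "i", 8: "q"}[encoding]  # int16, int32 or int64
    return list(struct.unpack_from(f"<{length}{fmt}", blob, 8))


print(len(payload), decode_intset(payload))
```

The remaining bytes of the 46-byte file are the "REDIS0003" magic and version, the database selector, the value type, the length-prefixed key name, and the trailing 0xff end-of-file marker.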

Dvir Volk

Feb 20, 2012, 9:55:11 AM
to redi...@googlegroups.com
that is such an awesome idea, maybe as a transitional stage it can be introduced as an optional thing.
 

--
Dvir Volk
System Architect, The Everything Project (formerly DoAT)

Jak Sprats

Feb 21, 2012, 9:21:20 AM
to Redis DB

Salvatore,

that was a pretty cool demo using redis-cli and then hexdump to show
that the SET and the binary dump were the same :)

-jak