I would use Google Protocol Buffers to serialize the object (you can
configure it in a way that it does not store stuff like introspection
information to make the objects smaller). Afterwards you can compress
the data - I would use snappy (http://code.google.com/p/snappy/) which
is extremely fast. If speed is not a big issue but only memory
consumption, I would use another library. There are several projects
out there (for example BigTable or leveldb) which use this kind of
technique.
Hope this helps
Markus
> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>
Redis 2.6 supports Lua scripting, and the scripting engine includes the
"struct" extension for Lua. In short, you can do interesting things with
binary encoded objects.
Salvatore
On Wed, Feb 15, 2012 at 4:40 AM, michael he <php...@gmail.com> wrote:
--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotele
> there's no way to visually understand objects in redis-cli if you're trying
> to save space or time.
With scripting this may start to be a bit simpler at least. For
debugging one can load a script and then dump keys with: EVALSHA
<sha1> 1 <key> (1 being the number of keys) or something like that.
Salvatore
wouldn't that somehow 'bless' MessagePack? (well, JSON is there already)
maybe a little more generic would be to include some (de)compression;
that would enhance both 'simple binary' (struct) and 'readable' (JSON)
encodings
--
Javier
> wouldn't that somehow 'bless' MessagePack? (well, JSON is there already)
Well... JSON is already here, but we are talking about Redis, an *in
memory* data store.
I think we are in the sphere of use cases where a more compact
representation can be very valuable, and apparently, message pack is
the thing nearest to a standard that we have.
I don't know it, but I can (probably) code a decent Lua message pack
implementation in little time, so I've the following plan: I'll look
at the specification, if it is simple, beautiful, and cool, I'll
implement a C binding for Lua in the Redis spirit of minimality and
speed, and I will ship it with 2.6. If I don't like it we can try to
re-evaluate it in N months.
Cheers,
Salvatore
I have a question: this specification has direct support for
different kinds of integers, and I'm not sure how this should be handled in
Lua, where there are actually only floats... I think this is one of the
biggest Lua mistakes btw.
Basically my feeling is that it's not possible to handle the
conversion in a liberal way, but that a format specifier should be
provided to the library, in a similar form to what the Lua struct
extension does, so that I can specify something like:
messagepack.encode("u16i32",foo,bar) and so forth.
Cheers,
Salvatore
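A format-driven encoder along the lines of messagepack.encode("u16i32",foo,bar) could look roughly like the C sketch below. The function name mp_encode_fmt and the choice to support only the "u16" and "i32" tokens are assumptions made for illustration; this is not the interface of any real library. The type bytes 0xcd (uint 16) and 0xd2 (int 32) and the big endian payload come from the MessagePack format.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of any real library: encodes the variadic
 * int arguments according to a format string made of "u16" and "i32"
 * tokens. Returns the number of bytes written, or 0 on an unknown token
 * or when the output buffer is too small. */
size_t mp_encode_fmt(unsigned char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    size_t n = 0;

    va_start(ap, fmt);
    while (*fmt != '\0') {
        if (strncmp(fmt, "u16", 3) == 0) {
            unsigned v = (unsigned)va_arg(ap, int);
            if (cap - n < 3) { n = 0; break; }
            out[n++] = 0xcd;                              /* uint 16 marker */
            out[n++] = (unsigned char)((v >> 8) & 0xff);  /* big endian */
            out[n++] = (unsigned char)(v & 0xff);
        } else if (strncmp(fmt, "i32", 3) == 0) {
            uint32_t v = (uint32_t)va_arg(ap, int);
            if (cap - n < 5) { n = 0; break; }
            out[n++] = 0xd2;                              /* int 32 marker */
            out[n++] = (unsigned char)((v >> 24) & 0xff);
            out[n++] = (unsigned char)((v >> 16) & 0xff);
            out[n++] = (unsigned char)((v >> 8) & 0xff);
            out[n++] = (unsigned char)(v & 0xff);
        } else {
            n = 0;                                        /* unknown token */
            break;
        }
        fmt += 3;
    }
    va_end(ap);
    return n;
}
```

With this sketch, mp_encode_fmt(buf, sizeof buf, "u16i32", 513, -2) writes 8 bytes: a 3 byte uint 16 followed by a 5 byte int 32. The caller, not the library, decides the integer width, which is the point of the format string.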
This means that the encoding should work likewise, so that the
object is the same, and I can decode, alter a field, and re-encode with
minimal effort.
Yes, I think this is the best interface we can provide.
Do you agree?
Salvatore
On Thu, Feb 16, 2012 at 10:37 PM, Salvatore Sanfilippo
I meant that JSON is already 'blessed' in Redis' Lua, not as in "why
MP, there's JSON" but as in "why not? nobody complained when JSON was
blessed"
it's a habit of mine to include counterarguments to what i'm trying
to say, usually in parentheses. (confusing, i know...)
--
Javier
> it's an habit of mine to include counterarguments to what i'm trying
> to say, usually in parenthesis. (confusing, i know...)
Sorry for the misunderstanding ;)
I could rather die :)
actually, when i was hacking on the redis dump file parser
a few weeks ago, i was kinda wishing that redis were using
messagepack instead of a proprietary format for dumps, as there
are so many messagepack implementations out there ... but maybe
too late now to introduce such a breaking change.
tim
> This is not completely true: Lua has a single number type that happens
> to be double by default, but lots of people use it with an integer
> number type (especially in the embedded world).
Yes, and that's even worse ;)
Btw the Redis implementation uses the default, double. So that's the problem:
local foo = mp.decode("...")
foo = foo + 10   -- Lua has no += operator
local new = mp.encode(foo)
foo used to be a 32 bit integer, but when encoding we need to check
if foo can be translated back into an integer without losing precision.
When that's not possible we'll have a type switch, which is not very
cool.
In a language where you have floats and integers this is not going to
be an issue of course.
Our best approach could probably be the following:
    double foo = 10.0000001;
    if (foo == floor(foo)) {
        printf("Encode as int\n");
    } else {
        printf("Encode as double\n");
    }
If this is good enough we can just have a simple mp.encode / decode
without additional complexities, and maybe an *optional* interface for
fine control of the encoding process.
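One detail the test above leaves open is range: a double such as 1e300 is integer-valued but does not fit any MessagePack integer type. The sketch below adds a range check and then applies the same integrality test, written as a cast round trip instead of floor() so that the range check can come first. The names mp_kind and mp_classify are hypothetical and are not taken from any real implementation.

```c
#include <stdint.h>

typedef enum { MP_AS_INT, MP_AS_DOUBLE } mp_kind;  /* hypothetical names */

/* Decide how a Lua number (a C double) could be encoded. On MP_AS_INT the
 * integer value is stored in *out; otherwise *out is left untouched. */
mp_kind mp_classify(double d, int64_t *out) {
    /* -2^63 is exactly representable as a double; 2^63 is the first value
     * that no longer fits in int64_t, hence the strict upper bound.
     * NaN fails both comparisons and falls through to MP_AS_DOUBLE. */
    if (d >= -9223372036854775808.0 && d < 9223372036854775808.0) {
        int64_t i = (int64_t)d;      /* safe: d is inside the int64 range */
        if ((double)i == d) {        /* no fractional part was dropped */
            *out = i;
            return MP_AS_INT;
        }
    }
    return MP_AS_DOUBLE;
}
```

Under this rule 20.0 is classified as an integer while 10.0000001 and 1e300 stay doubles. Note that negative zero is classified as the integer 0, so its sign is lost; whether that matters is a design choice for the encoder.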
> actually, when i was hacking on the redis dump file parser
> a few weeks ago, i was kinda wishing that redis were using
> messagepack instead of a proprietary format for dumps, as there
> are so many messagepack implementations out there ... but maybe
> too late now to introduce such a breaking change.
Sooner or later we may actually do it, because we break the RDB format
whenever needed, just bumping the RDB version.
But anyway we'll need to implement a *modified* message pack since we
also support LZF compression (there are free opcodes in the
specification). This means that a random parser will not be able to
read the Redis RDB dump, but still, reusing a specification as much as
possible, given the extreme similarities we already have, can be a good
idea.
Now we are full of other priorities of course ;) But IMHO something to
keep in the back of our minds for the next RDB refactoring.
+1, if you noticed, every in-memory data structure in Redis is now
little endian, and we convert from big endian to little endian in
endian.[ch] if the arch is big endian. On a little endian arch the
macros expand to nothing.
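The idea of converting only on big endian hosts can be sketched in C as follows. The function names are hypothetical and this is not the actual content of endian.[ch]; in particular, a runtime check is used here instead of compile-time macros, only to keep the example portable and self-contained.

```c
#include <stdint.h>

/* Hypothetical helper names; not the actual contents of endian.[ch]. */

/* Returns 1 on a big endian host, 0 on a little endian one. */
int host_is_big_endian(void) {
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Unconditionally reverse the byte order of a 32 bit value. */
uint32_t swap32(uint32_t v) {
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Convert between host order and the little endian stored form.
 * The same function works in both directions, and it returns its
 * argument unchanged on little endian hosts. */
uint32_t le32_from_host(uint32_t v) {
    return host_is_big_endian() ? swap32(v) : v;
}
```

On the common little endian case the conversion costs nothing beyond the check, which is the property the macro-based approach gets for free at compile time.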
> Yes, whereas with MessagePack you have to do the opposite. I prefer
> the Redis way.
Yep indeed... I started an implementation, and hope to find more time to
work on it today and during the weekend. Now I have to go pick up my
wife's test results; apparently we are going to fork(2) in 8 months
from now :)
> [...] apparently we are going to fork(2) in 8 months
> from now :)
wow, congratulations!
Lua does a similar test in some places; mostly for the array
optimization of tables. (a table with reasonably dense integer keys
stores those values in an array, so every table access first checks if
the key is an integer within a per-table range)
--
Javier
Congratulations on your fork success with fsck :)
- Josiah
I guess you could say your wife is going to do a pull request in 8 months? ;)
Cheers,
m
> I don't know if this could be a problem but there is something I
> really dislike in the MessagePack specification: endianness. We live
> in a world where most CPUs are little endian but the MessagePack spec
> is big endian.
All networking protocols (TCP, IP, UDP, etc.) are in "network byte
order", which happens to be big-endian. While there may be no
specific need for an application-level protocol to be in network-byte
order, it is a deeply ingrained best practice for any standardized
protocol that regularly flows over an IP network to use network byte
order for any binary objects.
It may not be ideal for today's popular desktop CPUs, but remember
that there are a lot of other devices out there on the Internet.
While the application environments of Android and iOS are also
little-endian, almost all game systems are big-endian, including all
the popular consoles (Xbox 360, PlayStation 3, Wii). Along
similar lines, many low-power embedded processors are also
big-endian-- the type of thing you'd expect to show up in
environmental monitoring, WiFi enabled light switches and other
"throw away" devices we are likely to start seeing in our homes
and offices by the dozens. Plenty of network nodes are big endian.
From a memory, speed, and processing standpoint, it makes all the
sense in the world for these low-power types of devices to favor
formats like MessagePack over JSON. And if endian translation must
be done, I'd rather do it on my beefy server than on an ultra-low-power
embedded device with a few kilobytes of memory.
Of course, that doesn't help when you're trying to transfer
multi-gigabyte data sets between two high performance servers, but
that type of environment also has a lot more flexibility. At the end
of the day, you need to pick SOME default, and for good or for bad
that networking-centric default was picked many years ago-- and
isn't as completely irrelevant as one might first assume.
-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >
"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson
What is wrong with good old foo % 1 == 0? (Assuming Lua code,
not C, of course.)
> printf("Encode as int\n");
> } else {
> printf("Encode as double\n");
> }
>
> If this is good enough we can just have a simple mp.encode / decode
> without additional complexities, and maybe an *optional* interface for
> fine control of the encoding process.
FWIW, I would prefer the fixed format approach — always use double
encoding so that I would always know the resulting format beforehand
(or, at least a run-time option to tune this). Finer control, as you
say, to be provided via format string or something like that.
My 2c.
Alexander.
Regards,
- Josiah
Btw I just released my message pack implementation for Lua:
https://github.com/antirez/lua-msgpack
In the README I documented the tradeoffs I used about the integer
types, and tables, that are the two "interesting" topics about message
pack and Lua. I'll include this implementation in Redis unstable
branch tomorrow.
I wrote tests for the lib but I currently consider it beta quality
because of the lack of accurate testing and usage in a real project.
Cheers,
Salvatore
--
tim
Salvatore
> On Fri, Feb 17, 2012 at 11:04 AM, Tim Lossen <t...@lossen.de> wrote:
>
>> actually, when i was hacking on the redis dump file parser
>> a few weeks ago, i was kinda wishing that redis were using
>> messagepack instead of a proprietary format for dumps, as there
>> are so many messagepack implementations out there ... but maybe
>> too late now to introduce such a breaking change.
>
> But anyway we'll need to implement a *modified* message pack since we
> also support LZF compression (there are free opcodes in the
> specification). However this will not mean that a random parser can
> read the Redis RDB dump
it could also be an interesting option to compress complex values
(sets, hashes etc.) instead of single strings, like in the current
implementation -- i.e. first convert to binary, then compress.
this would be both easier to parse, and result in equal (or possibly
even bigger?) savings i think.
We do that now, finally. Basically if something is
intset/zipmap/ziplist-encoded in memory, it gets dumped as it is in
RDB files. This improved some saving/loading times by 10x in Redis
2.4.
Example:
redis 127.0.0.1:6379> flushall
OK
redis 127.0.0.1:6379> sadd set 1 2 3 4 5 6 7 8 9 10
(integer) 10
redis 127.0.0.1:6379> save
OK
redis 127.0.0.1:6379> quit
bash-3.2$ hexdump -C dump.rdb
00000000 52 45 44 49 53 30 30 30 33 fe 00 0b 03 73 65 74 |REDIS0003....set|
00000010 1c 02 00 00 00 0a 00 00 00 01 00 02 00 03 00 04 |................|
00000020 00 05 00 06 00 07 00 08 00 09 00 0a 00 ff |..............|
0000002e
bash-3.2$ ls -l dump.rdb
-rw-r--r-- 1 antirez staff 46 Feb 20 14:57 dump.rdb
We encoded the RDB header, the key name, type, and 10 integers in 46 bytes.
Salvatore
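To make the hexdump above easier to read: the 28 bytes that follow the 0x1c length byte are the intset as it sits in memory. The C sketch below decodes such a blob, assuming the layout visible in the dump: a 4 byte little endian encoding field (2 means 16 bit members), a 4 byte little endian member count, then the members themselves. The function names are hypothetical, and only the 16 bit encoding is handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little endian integers byte by byte, so the result is the same
 * on any host regardless of its own byte order. */
uint32_t rd_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int16_t rd_le16(const unsigned char *p) {
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: decode an intset blob whose encoding field is 2
 * (16 bit members). Returns the member count, or -1 if the blob is
 * truncated, uses another encoding, or does not fit in `out`. */
long intset16_decode(const unsigned char *blob, size_t len,
                     int16_t *out, size_t cap) {
    if (len < 8) return -1;
    uint32_t encoding = rd_le32(blob);
    uint32_t count = rd_le32(blob + 4);
    if (encoding != 2 || count > cap || (len - 8) / 2 < count) return -1;
    for (uint32_t i = 0; i < count; i++)
        out[i] = rd_le16(blob + 8 + 2 * (size_t)i);
    return (long)count;
}
```

Fed the 28 payload bytes from the dump (02 00 00 00, 0a 00 00 00, then 01 00 through 0a 00), the function reports 10 members with the values 1 to 10. That accounts for the file size: 9 bytes of header, 2 for the database selector, 1 for the type, 4 for the key, 1 for the length, 28 for the intset and 1 for the end marker, 46 in total.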
--
http://tim.lossen.de