compressing strings before inserting into Redis, how would this be done?

S Ahmed

Mar 4, 2011, 10:50:29 AM
to redi...@googlegroups.com
I was reading that some people compress string data that they are storing in Redis.

What kind of string compression can be done? (preferably in Ruby)

Obviously I would have to test compression/decompression times, but in general the strings are probably going to be fairly small.

Demis Bellot

Mar 4, 2011, 10:53:29 AM
to redi...@googlegroups.com
As Redis is binary-safe, it can store any kind of compressed data.

I believe Protocol Buffers is the most efficient/compact data format, but I don't know if there is a Ruby implementation for it.

IMHO I would just use the standard compression routine in Ruby, since any binary data is supported.
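
For instance, a minimal sketch using zlib from Ruby's standard library (assuming the value fits comfortably in memory):

require 'zlib'

compressed = Zlib::Deflate.deflate("some string value")  # binary-safe payload for Redis
original   = Zlib::Inflate.inflate(compressed)           # round-trips back to the source string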

Cheers,


--
- Demis


Salvatore Sanfilippo

Mar 4, 2011, 10:58:38 AM
to redi...@googlegroups.com, S Ahmed
A common pattern is to store the first byte as a "type" flag, so you
don't need to compress everything, only the values where there is
actually some memory win.
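
For illustration, a minimal Ruby sketch of this pattern using zlib; the flag bytes and the pack/unpack helper names are made up for the example, not a standard:

require 'zlib'

RAW, COMPRESSED = "0", "1"  # illustrative type-flag bytes

def pack(value)
  deflated = Zlib::Deflate.deflate(value)
  # Keep the compressed form only when it is actually smaller.
  deflated.bytesize < value.bytesize ? COMPRESSED + deflated : RAW + value
end

def unpack(stored)
  type, payload = stored[0, 1], stored[1..-1]
  type == COMPRESSED ? Zlib::Inflate.inflate(payload) : payload
end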

Cheers,
Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware

http://invece.org
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit." -- Aristotle

Jak Sprats

Mar 5, 2011, 8:31:59 AM
to Redis DB

I did some research on compression algorithms lately. If you want
quick but not optimal compression, use lzf. If you need high
compression, the quickest option is bzip2.
I would say go with lzf: it is BSD-licensed and very fast (Redis uses it
internally).
Compressing strings not only saves memory server-side, it also saves
network I/O in both directions (i.e., on SET and on the subsequent GET).
It is a winner.
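
A minimal sketch of that round trip, assuming the redis-rb gem and zlib (lzf would slot in the same way if you prefer it; the key and file name are hypothetical):

require 'redis'  # assumes the redis-rb gem
require 'zlib'

redis = Redis.new

article = File.read('article.txt')  # hypothetical input
# Fewer bytes cross the wire on SET...
redis.set('article:42', Zlib::Deflate.deflate(article))
# ...and again on the subsequent GET.
restored = Zlib::Inflate.inflate(redis.get('article:42'))
restored == article  # => true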

S Ahmed

Mar 5, 2011, 11:39:28 AM
to redi...@googlegroups.com, Jak Sprats
Would storing articles in Redis be a good idea?
What about compressing them?

The best thing is to test, but I'm just wondering if storing large amounts of text (like an article) is a bad idea to begin with :)

Josiah Carlson

Mar 5, 2011, 2:41:49 PM
to redi...@googlegroups.com
Thought I would toss in my opinion here...

Though it is fairly old, gzip (and the underlying zlib compression
scheme) is actually fairly fast and is available on literally every
platform you could want to use (never mind that the zlib license is
about as liberal as you can get, even more so than BSD). Unless you
have huge numbers of processor cycles to spare, I would recommend
against bzip2; it runs roughly 10-20x slower than gzip, typically for
only a modest reduction in compressed data size.

If your platform has no existing compression libraries, does not
support data compression using zlib/gzip, or has almost no spare
cycles (for latency reasons), only then would I recommend lzf.
Otherwise zlib/gzip covers a much broader range of compression-ratio
vs. time tradeoffs.
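
A quick way to measure that tradeoff on your own data, sketched with Ruby's zlib and Benchmark (the input file name is hypothetical):

require 'zlib'
require 'benchmark'

sample = File.read('article.txt')  # hypothetical sample of your data

# Compare compression levels: ratio of compressed to original size vs. time taken.
[Zlib::BEST_SPEED, Zlib::DEFAULT_COMPRESSION, Zlib::BEST_COMPRESSION].each do |level|
  deflated = nil
  seconds = Benchmark.realtime { deflated = Zlib::Deflate.deflate(sample, level) }
  ratio = deflated.bytesize.to_f / sample.bytesize
  puts "level=#{level} ratio=#{'%.2f' % ratio} time=#{'%.4f' % seconds}s"
end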

- Josiah

Josiah Carlson

Mar 5, 2011, 2:49:53 PM
to redi...@googlegroups.com
Without knowing how many articles you have, what kind of dynamic
content you add/remove from them over time, etc., it's hard to say one
way or another whether it would be right for you.

Quite a few frameworks use local caching to store
pre-rendered/pre-compiled templates, which is sufficient in a lot of
cases. If I remember correctly (or believe the rumors), Reddit
pre-renders basically all of its pages and stores them in Redis.

Storing large chunks of text in Redis isn't necessarily a bad idea.
Heck, on a modestly sized Redis box you could probably store the
entire back archive of the New York Times (only the article text, not
the markup copied/pasted onto every page). Compressing the data would
just push the number of articles you could store even higher.

- Josiah
