Experience using redis for time series data?

Patrik Sundberg

Jun 15, 2010, 8:12:23 AM

to Redis DB

Hi,

I'm relatively new to Redis (and other NoSQL systems for that matter)
and I'm considering if it's a good tool for working with time series
data. The sorted set/list data types seem like a good base for working
with time series data so I wanted to ask the group if anyone has
experience in this area? Any links to examples? I'd be interested in
use cases and example queries etc. Searching the google group only
turns up a couple of snippets.

Any suggestions besides Redis for working with time series data? I
know about specialized time series databases like kdb, fame, etc but
I'm looking for open source solutions.

Thanks,
Patrik

Demis Bellot

Jun 15, 2010, 8:28:49 AM

to redi...@googlegroups.com

Using a sorted set ordered by unix timestamp or ticks is the way to go:

I've got a couple of examples of doing this with the 'recent lists' in my article:

Designing a NoSQL Database using Redis

http://code.google.com/p/servicestack/wiki/DesigningNoSqlDatabase
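
A minimal sketch of the sorted-set idea, for readers who want something concrete. It does not talk to a Redis server: the `TimeSeries` class below is a hypothetical in-memory stand-in that only mimics what `ZADD key <timestamp> <member>` and `ZRANGEBYSCORE key <start> <end>` would do, with the Unix timestamp as the score. One simplification to be aware of: members of a real sorted set are unique, so adding the same member again just updates its score. For a time series that usually means putting the timestamp into the member as well, so that two identical readings at different times do not collapse into one entry.

```python
import bisect


class TimeSeries:
    """In-memory stand-in for one Redis sorted set (score = Unix timestamp)."""

    def __init__(self):
        # Two parallel lists kept sorted by timestamp.
        self._scores = []
        self._members = []

    def add(self, timestamp, member):
        # Roughly what ZADD does: place the member at its position by score.
        index = bisect.bisect_right(self._scores, timestamp)
        self._scores.insert(index, timestamp)
        self._members.insert(index, member)

    def range_by_time(self, start, end):
        # Roughly ZRANGEBYSCORE ... WITHSCORES: entries with start <= score <= end.
        low = bisect.bisect_left(self._scores, start)
        high = bisect.bisect_right(self._scores, end)
        return list(zip(self._scores[low:high], self._members[low:high]))


if __name__ == "__main__":
    series = TimeSeries()
    series.add(1276589543, "100.5")
    series.add(1276589540, "100.1")
    series.add(1276589550, "101.0")
    # Ask for a window in time; entries come back ordered by timestamp.
    print(series.range_by_time(1276589540, 1276589545))
```

The point of the sketch is the access pattern: writes can arrive in any order, and a window in time is a single range lookup by score.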

Cheers,

Demis

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Aníbal Rojas

Jun 15, 2010, 3:52:20 PM

to redi...@googlegroups.com

Patrik,

We have spent a couple of months putting together an RDF-like Redis DB
for realtime analytics.

It works, but it is complex. In our case realtime "queries" are a
strong requirement, and so far Redis has been a good fit.

Basically you need to set up a set of counters and keep track of
them. As Demis says, lists and sorted sets are your choices
for keeping track of that set of counters.

Be aware that you can't "query" Redis, so you have to design your
set of keys around the data you want to pull out of Redis. This
means that if you want to handle different levels of aggregation, you
need to denormalize the data in advance.

Redis will handle a LOT of updates and queries without much
effort. But take care to purge old data and avoid unlimited
growth.
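
To make the "denormalize in advance" point concrete, here is a rough sketch, not our actual schema. The key layout `stats:<metric>:<granularity>:<bucket>` and the function names are assumptions made up for the example, and a `Counter` stands in for the server. Against a real instance each update would be an `INCRBY` on the key, and the fine-grained keys could be given an expiry with `EXPIRE` as one way of purging.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical key layout: stats:<metric>:<granularity>:<bucket>
FORMATS = {"minute": "%Y%m%d%H%M", "hour": "%Y%m%d%H", "day": "%Y%m%d"}


def counter_keys(metric, timestamp):
    """Return one key per aggregation level for an event at `timestamp` (Unix seconds, UTC)."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return [
        f"stats:{metric}:{name}:{moment.strftime(fmt)}"
        for name, fmt in FORMATS.items()
    ]


def record(store, metric, timestamp, amount=1):
    # Every event updates all aggregation levels at write time, so a read
    # is a plain key lookup and no query language is needed.
    for key in counter_keys(metric, timestamp):
        store[key] += amount


if __name__ == "__main__":
    store = Counter()
    record(store, "pageviews", 1276589543)
    record(store, "pageviews", 1276589543 + 60)
    for key in sorted(store):
        print(key, store[key])
```

The cost is write amplification (one increment per aggregation level), which is the trade you accept for being able to read any level with a single key.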

Best regards,

--
Aníbal Rojas
Ruby on Rails Web Developer
http://www.google.com/profiles/anibalrojas

Patrik Sundberg

Jun 15, 2010, 5:51:33 PM

to Redis DB

thanks guys. useful info.

In terms of purely numerical data and redis storing everything as
strings (as I understand) - I'd assume that increases storage space by
multiples compared to storing integers/doubles in a binary format?
Storage space isn't really an issue for me but want to verify how it
works.

Also, your answers confirm it may be a viable solution for working
with time series data - but does it seem like the "correct" one? As in,
should I be looking at other things as well?

Patrik


Aníbal Rojas

Jun 15, 2010, 6:13:52 PM

to redi...@googlegroups.com

Patrik,

> In terms of purely numerical data and redis storing everything as
> strings (as I understand) - I'd assume that increases storage space by
> multiples compared to storing integers/doubles in a binary format?
> Storage space isn't really an issue for me but want to verify how it
> works.

Redis interprets strings as integers when appropriate, as in the case of
counters: INCR and INCRBY.

But everything you store in Redis is a string, and so is everything you read from it.

Most of Redis' RAM usage is devoted to the pointers that hold its data
structures together. Unless you are putting really big values inside Redis, you
can leave the values themselves out of the equation.

If you want to optimize RAM usage, compile Redis as a 32-bit binary, run it on a
64-bit machine, and shard the data across several instances. This will raise
the management burden, but Redis in my experience is so stable you can
just forget it is running.
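
A sketch of what client-side sharding across several instances could look like, under assumptions: the instance addresses are invented for the example, and the key-to-instance rule (CRC32 of the key name modulo the instance count) is just one simple choice, not something Redis provides.

```python
import zlib

# Hypothetical addresses of four 32-bit Redis instances on one 64-bit host.
INSTANCES = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381", "127.0.0.1:6382"]


def instance_for(key, instances=INSTANCES):
    """Pick the instance that owns `key` using a stable hash of the key name."""
    # zlib.crc32 gives the same result in every process, unlike the
    # built-in hash() for strings, which is randomized per process.
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]


if __name__ == "__main__":
    for key in ["IBM:20100615", "IBM:20100616", "MSFT:20100615"]:
        print(key, "->", instance_for(key))
```

Note that with plain modulo hashing, changing the number of instances remaps most keys, so the count is best fixed up front.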

> Also, your answers verifies it may be a viable solution for working
> with time series data - but does it seem like the "correct" one? As in
> should I be looking at other things as well?

That's more difficult to answer. We explored different options for
this problem, like MongoDB and Cassandra. You will find pros and cons
in every NoSQL system around; it will depend heavily on your
problem, restrictions, etc., so your best option is to try different
solutions.

Konstantin Merenkov

Jun 15, 2010, 6:18:39 PM

to redi...@googlegroups.com

2010/6/16 Aníbal Rojas <aniba...@gmail.com>:

> Patrick,
>
>> In terms of purely numerical data and redis storing everything as
>> strings (as I understand) - I'd assume that increases storage space by
>> multiples compared to storing integers/doubles in a binary format?
>> Storage space isn't really an issue for me but want to verify how it
>> works.
>
> Redis interprets strings as integers when appropriate, as in the case of
> counters: INCR and INCRBY.
>
> But everything you store in Redis is a string, and so is everything you read from it.

I thought that internally if you set a "12345" as a value to a key
using SET command (for example),
redis will try to interpret it as integer and if it is possible it
will store it that way (to save memory).

Also there is an array of integers (configurable, 10M by default as
far as I remember) that is populated on start
of server instance. And if value you supplied is integer and already
contained in this array, redis will make a reference key<->ints array
index.
It helps to save you memory when you have a lot of small integers in
your database.

If I am wrong please correct me.

--
Best Regards,
Konstantin Merenkov

Patrik Sundberg

Jun 15, 2010, 6:35:29 PM

to Redis DB

Understood on the storage.

I envision using it together with the VM feature since in my case I'd
have many keys but only a few used at a time, and the associated
values being potentially large. I'd have keys being symbols
identifying time series and values being sorted sets (that could
become large, millions of entries). I doubt that I'd have values
bigger than the 1 GB limit, but it's something for me to keep in mind.

Will the VM always swap in complete values? i.e. I have a 500 MB time
series as a value on disk and want to access a few entries at the
start of the series, will the full series be read into memory?

It doesn't feel like redis is a 100% fit for the task of storing very
large sets of permanent and growing time series data, but it may be
workable for my specific use cases. I shall ponder further.

Thanks for all the info, very helpful.


Aníbal Rojas

Jun 15, 2010, 7:26:23 PM

to redi...@googlegroups.com

Konstantin,

> I thought that internally if you set a "12345" as a value to a key
> using SET command (for example),
> redis will try to interpret it as integer and if it is possible it
> will store it that way (to save memory).

Yes. Some months ago I explored base64-encoding ids to lower the memory
footprint, and Salvatore pointed out that most memory is
used by internal pointers. But yes, you are right that Redis will
encode strings that hold only numbers to optimize their footprint.

This is why, if you have sets or lists of keys, it is much better to
store just the numeric id instead of the whole key with its text prefix:
45678 instead of book:45678

> Also there is an array of integers (configurable, 10M by default as
> far as I remember) that is populated on start
> of server instance. And if value you supplied is integer and already
> contained in this array, redis will make a reference key<->ints array
> index.

Didn't know this one, thanks.

> It helps to save you memory when you have a lot of small integers in
> your database.

A common pattern, I think.

--
Aníbal

Jan Rychter

Jun 16, 2010, 3:20:19 AM

to redi...@googlegroups.com

On Jun 16, 2010, at 12:18 AM, Konstantin Merenkov wrote:

> Also there is an array of integers (configurable, 10M by default as
> far as I remember) that is populated on start
> of server instance. And if value you supplied is integer and already
> contained in this array, redis will make a reference key<->ints array
> index.
> It helps to save you memory when you have a lot of small integers in
> your database.

For storing sequences of integers it might be worth taking a look at Golomb coding (see http://en.wikipedia.org/wiki/Golomb_coding). In general, these days RAM is much more precious than the CPU, so the space-speed tradeoff is different than what it used to be.

More generally, though — I'm also looking at Redis as a tool for storing and manipulating numerical data. It seems to me it could do much better. In particular, I would love to see real floats and doubles, passed in binary over the wire, as well as Golomb-coded lists and sets of integers. Think sparse vectors and matrices here, where you have lists of pairs <index> <value>.

The problem with pretending everything is a string is that you eventually hit performance problems — reading and writing floating point numbers from/to strings is very expensive.
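
To illustrate the Golomb coding suggestion, here is a small sketch of Rice coding, which is the special case of Golomb coding where the divisor is a power of two (2**k). It is only an illustration of the technique, not anything Redis implements: the function names are made up, and the output is a Python string of "0"/"1" characters for readability rather than packed bytes. A sorted list of integers is stored as the gaps between neighbours, which are small numbers and therefore get short codes.

```python
def rice_encode(values, k):
    """Encode non-negative integers as a bit string using Rice coding with parameter k."""
    bits = []
    for value in values:
        quotient = value >> k
        remainder = value & ((1 << k) - 1)
        bits.append("1" * quotient + "0")  # quotient in unary, terminated by a 0
        if k:
            bits.append(format(remainder, f"0{k}b"))  # remainder in exactly k bits
    return "".join(bits)


def rice_decode(bits, k):
    """Inverse of rice_encode."""
    values, pos = [], 0
    while pos < len(bits):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating 0
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append((quotient << k) | remainder)
    return values


def encode_sorted(ids, k):
    # Store the gaps between consecutive sorted integers instead of the integers.
    gaps = [current - previous for previous, current in zip([0] + ids[:-1], ids)]
    return rice_encode(gaps, k)


def decode_sorted(bits, k):
    total, ids = 0, []
    for gap in rice_decode(bits, k):
        total += gap
        ids.append(total)
    return ids


if __name__ == "__main__":
    ids = [3, 7, 8, 20, 21, 40]
    encoded = encode_sorted(ids, 2)
    print(encoded, len(encoded), "bits")
    print(decode_sorted(encoded, 2))
```

The parameter k should roughly match the typical gap size: too small and the unary part grows, too large and every value pays for unused remainder bits.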

--J.

Patrik Sundberg

Jun 16, 2010, 5:27:17 AM

to Redis DB

Yep, if it had efficient storage of lists of integers and doubles then
a key/value solution such as redis could be quite cool for working
with time series. It'd give both storage efficiency and also fixed
length record lists for fast search/retrieval of windows in time. You
could partition the data into different keys for different periods to
avoid values getting too large (i.e. symbol:day1, symbol:day2) and add
an abstraction layer to stitch different keys together.
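
Roughly what I have in mind for the partitioning, as a sketch only. The `<symbol>:<YYYYMMDD>` key format and the function names are assumptions for the example, and a plain dict of lists stands in for Redis (each list playing the role of one per-day sorted set). The "abstraction layer" is just a function that works out which day keys a time window touches and concatenates the matching entries.

```python
from datetime import datetime, timedelta, timezone


def day_key(symbol, timestamp):
    """Key for the day (UTC) that `timestamp` falls in, e.g. IBM:20100615."""
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y%m%d")
    return f"{symbol}:{day}"


def keys_for_window(symbol, start, end):
    """All day keys that a window [start, end] (Unix seconds) can touch, in order."""
    day = datetime.fromtimestamp(start, tz=timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    last = datetime.fromtimestamp(end, tz=timezone.utc)
    keys = []
    while day <= last:
        keys.append(f"{symbol}:{day.strftime('%Y%m%d')}")
        day += timedelta(days=1)
    return keys


def append(store, symbol, timestamp, value):
    # Assumes points arrive in time order, so each per-day list stays sorted.
    store.setdefault(day_key(symbol, timestamp), []).append((timestamp, value))


def read_window(store, symbol, start, end):
    # Stitch the per-day partitions back into one series for the caller.
    points = []
    for key in keys_for_window(symbol, start, end):
        points.extend((t, v) for t, v in store.get(key, []) if start <= t <= end)
    return points


if __name__ == "__main__":
    midnight = 1276560000  # 2010-06-15 00:00:00 UTC
    store = {}
    append(store, "IBM", midnight - 10, 1.0)
    append(store, "IBM", midnight + 5, 2.0)
    append(store, "IBM", midnight + 86400 + 1, 3.0)
    print(read_window(store, "IBM", midnight - 10, midnight + 5))
```

With this layout no single value grows beyond one day of data, and old days can be dropped by deleting their keys.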

btw, is my assumption that whole keys are always read into memory
correct? repeating the paragraph from before for completeness:

Will the VM always swap in complete values? i.e. I have a 500 MB time
series as a value on disk and want to access a few entries at the
start of the series, will the full series be read into memory?


Aníbal Rojas

Jun 16, 2010, 11:21:41 AM

to redi...@googlegroups.com

Jan,

> For storing sequences of integers it might be worth taking a look at Golomb coding (see http://en.wikipedia.org/wiki/Golomb_coding). In general, these days RAM is much more precious than the CPU, so the space-speed tradeoff is different than what it used to be.
>

Interesting, thanks for the pointer. And yes, agreed.

> More generally, though — I'm also looking at Redis as a tool for storing and manipulating numerical data. It seems to me it could do much better. In particular, I would love to see real floats and doubles, passed in binary over the wire, as well as Golomb-coded lists and sets of integers. Think sparse vectors and matrices here, where you have lists of pairs <index> <value>.
> The problem with pretending everything is a string is that you eventually hit performance problems — reading and writing floating point numbers from/to strings is very expensive.

Umm, client side or server side, you mean? Also, the Redis protocol would
have to be considered for "native" float and double support.

> --J.

Aníbal Rojas

Jun 16, 2010, 11:25:58 AM

to redi...@googlegroups.com

Patrik,

> length record lists for fast search/retrieval of windows in time. You
> could partition the data into different keys for different periods to
> avoid values getting too large (i.e. symbol:day1, symbol:day2) and add
> an abstraction layer to stitch different keys together.

Yes, there are *lots* of approaches for data sharding in the context
of this problem.

Something I found interesting in my research about scalability is
designing your backend so you can turn features on and off to handle load
spikes, etc.

> btw, is my assumption that whole keys are always read into memory
> correct? repeating the paragraph from before for completeness:
> Will the VM always swap in complete values? i.e. I have a 500mb time
> series as a value on disk and want to access a few entries at the
> start of the series, will the full series be read into memory?

As far as I understand Redis, yes, but check with the Sempai and Koai ;)

You can also "tune" object size for swapping considering this.


Jan Rychter

Jun 16, 2010, 12:23:35 PM

to redi...@googlegroups.com

On Jun 17, 2010, at 10:51 AM, Aníbal Rojas wrote:

> Jan,
>
>> For storing sequences of integers it might be worth taking a look at Golomb coding (see http://en.wikipedia.org/wiki/Golomb_coding). In general, these days RAM is much more precious than the CPU, so the space-speed tradeoff is different than what it used to be.
>>
>
> Interesting, thanks for the pointer. And yes, agreed.

The more I think about it, the more Golomb coding makes sense for Redis.

>> More generally, though — I'm also looking at Redis as a tool for storing and manipulating numerical data. It seems to me it could do much better. In particular, I would love to see real floats and doubles, passed in binary over the wire, as well as Golomb-coded lists and sets of integers. Think sparse vectors and matrices here, where you have lists of pairs <index> <value>.
>> The problem with pretending everything is a string is that you eventually hit performance problems — reading and writing floating point numbers from/to strings is very expensive.
>
> Umm, client side or server side, you mean? Also, the Redis protocol would
> have to be considered for "native" float and double support.

Well, both — the current protocol is text-based and the server stores floats as strings. I think both should change. While I appreciate the simplicity and "debuggability" of the text-based protocol, I think it will become limiting soon.
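
A tiny illustration of the text-versus-binary difference for a single double. It only compares the size of the two representations and shows that both round-trip. It does not measure parsing or formatting cost, and it says nothing about how Redis itself encodes values internally. The variable names are just for the example.

```python
import struct

value = 0.1 + 0.2  # a double whose shortest exact decimal form is long

as_text = repr(value).encode("ascii")  # what a text-based protocol would carry
as_binary = struct.pack("<d", value)   # IEEE 754 double, little-endian

# Both forms recover exactly the same double.
assert float(as_text.decode("ascii")) == value
assert struct.unpack("<d", as_binary)[0] == value

print(len(as_text), "bytes as text,", len(as_binary), "bytes as binary")
```

The binary form is a fixed 8 bytes regardless of the value, while the text form varies with the number of digits needed and has to be formatted on write and parsed on read.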