timeseries processing in redis


Stefan Parvu

Aug 21, 2014, 6:19:14 PM
to redi...@googlegroups.com
Hi,

We are considering using Redis for our future analytics project, where we need to process time series data, for example
computer performance data. We plan to record data every minute from lots of hosts and send it over HTTP or HTTPS
to our analytics platform. All raw data is in a CSV-like format:

timestamp (seconds since Epoch):value:value:value ...

1. Per-CPU data:

1408658702:0:2.06:0.65:2.25:0.10:94.95:5.05
1408658702:1:2.04:0.52:2.12:0.16:95.16:4.84
1408658702:2:2.49:0.56:2.03:0.05:94.87:5.13
1408658702:3:2.48:0.55:1.99:0.02:94.97:5.03

1408658703:0:0.00:0.00:0.79:0.00:99.21:0.79
1408658703:1:2.36:0.00:1.57:0.79:95.28:4.72
1408658703:2:0.00:0.00:2.33:0.00:97.67:2.33
1408658703:3:0.00:0.00:0.79:0.00:99.21:0.79

2. Overall System data:

1408659072:4.99:19.94:380.06:2.27:0.56:2.08:0.08:95.01:0.00:0.00:0.00:0.00:125.00:353164.00:860568.00:871112.00:29916.00:3848060.00:4738544.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.30:0.30:0.28
1408659073:0.59:2.35:397.65:0.00:0.00:0.59:0.00:99.41:0.00:0.00:0.00:0.00:125.00:353168.00:860568.00:871192.00:29916.00:3847976.00:4738460.00:0.00:0.00:0.00:154.48:648.83:0.02:0.12:0.00:3.20:0.30:0.30:0.28

We are using OpenResty (NGINX + Lua) and RRDtool at the moment. We are thinking of feeding all raw data into Redis to
simplify our processing and drop RRDtool.

Can anyone comment on the following:

1. Is it possible to maintain time window frames, say last 3hrs, last 12hrs, last 24hrs, etc., within Redis and
    present the associated statistics? We would like to have AVG, MIN, MAX, LAST ? Do we need to
    calculate each statistic function ourselves?

2. We will basically not be able to keep lots of raw data inside Redis if we want to keep 1 or 2 years of data. We
    want to store the raw data in flat files after some period of time has passed. How can we do that?
    How could one extract or move parts of the raw data to flat files?

3. Do we need to process all raw data before displaying the stats? Normalization, like RRDtool does?

4. What if we need to offer different types of statistics functions; where and how would we calculate these?
    In Redis directly? In Lua?

I'm currently reading about Redis and find it very interesting for our project. Sorry if these questions have been asked before.

Thanks a lot,
Stefan

Josiah Carlson

Aug 21, 2014, 8:57:48 PM
to redi...@googlegroups.com
Replies inline.

On Thu, Aug 21, 2014 at 3:19 PM, Stefan Parvu <spa...@systemdatarecorder.org> wrote:
Hi,

We are considering using Redis for our future analytics project, where we need to process time series data, for example
computer performance data. We plan to record data every minute from lots of hosts and send it over HTTP or HTTPS
to our analytics platform. All raw data is in a CSV-like format:

timestamp (seconds since Epoch):value:value:value ...

1. Per-CPU data:

1408658702:0:2.06:0.65:2.25:0.10:94.95:5.05
1408658702:1:2.04:0.52:2.12:0.16:95.16:4.84
1408658702:2:2.49:0.56:2.03:0.05:94.87:5.13
1408658702:3:2.48:0.55:1.99:0.02:94.97:5.03

1408658703:0:0.00:0.00:0.79:0.00:99.21:0.79
1408658703:1:2.36:0.00:1.57:0.79:95.28:4.72
1408658703:2:0.00:0.00:2.33:0.00:97.67:2.33
1408658703:3:0.00:0.00:0.79:0.00:99.21:0.79

2. Overall System data:

1408659072:4.99:19.94:380.06:2.27:0.56:2.08:0.08:95.01:0.00:0.00:0.00:0.00:125.00:353164.00:860568.00:871112.00:29916.00:3848060.00:4738544.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.00:0.30:0.30:0.28
1408659073:0.59:2.35:397.65:0.00:0.00:0.59:0.00:99.41:0.00:0.00:0.00:0.00:125.00:353168.00:860568.00:871192.00:29916.00:3847976.00:4738460.00:0.00:0.00:0.00:154.48:648.83:0.02:0.12:0.00:3.20:0.30:0.30:0.28

We are using OpenResty (NGINX + Lua) and RRDtool at the moment. We are thinking of feeding all raw data into Redis to
simplify our processing and drop RRDtool.

Can anyone comment on the following:

1. Is it possible to maintain time window frames, say last 3hrs, last 12hrs, last 24hrs, etc., within Redis and
    present the associated statistics? We would like to have AVG, MIN, MAX, LAST ? Do we need to
    calculate each statistic function ourselves?

Yes, you can do time windows (how you do it depends on the features of the statistics you want to gather).

Your second question is not a question.

You will always need to calculate the average; the other 3 aggregates (MIN, MAX, LAST) might be computable with Redis' built-in commands, depending on the data representation.
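For a concrete (untested) sketch with made-up key names: if each window's samples also go into a sorted set scored by value, MIN and MAX fall out of ZRANGE, LAST can be a plain string key, and the average is derived from a running sum and count at read time (old sorted set members would still need periodic trimming):

ZADD cpu0:user:3h 2.06 "1408658702"     # score = value, member = timestamp
ZRANGE cpu0:user:3h 0 0 WITHSCORES      # MIN of the window
ZRANGE cpu0:user:3h -1 -1 WITHSCORES    # MAX of the window
SET cpu0:user:last 2.06                 # LAST, overwritten on each sample
INCRBYFLOAT cpu0:user:3h:sum 2.06       # AVG = sum / count, divided on read
INCR cpu0:user:3h:count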
 

2. We will basically not be able to keep lots of raw data inside Redis if we want to keep 1 or 2 years of data. We
    want to store the raw data in flat files after some period of time has passed. How can we do that?
    How could one extract or move parts of the raw data to flat files?

Well, there are 1440 minutes/day. I estimate CPU rows to be about 55 bytes long on the upper end. That's 29 megs/year per CPU. System rows look to be about 250 bytes on the upper end, so 131.5 megs/year per system. That's not a lot if you don't have many machines, but I'm guessing you've got more than a few machines to record.

How to pull data out will depend on how the data is stored in Redis itself. I doubt you will actually be storing the raw rows in Redis (it may make sense to pass through a Lua script for processing/aggregation, but it doesn't make sense to store the non-processed data in Redis for much longer than it takes to write the data to disk), so I would suggest just sending the data into Redis while at the same time appending to a flat file on disk. You can periodically rotate the flat file, backing up the old file anywhere you want.

If you want to keep local filesystems out of the loop, you can have an analytics Lua script analyze your rows and then add them to a "pending disk write" LIST. Then you'd just have an external process that pulls those rows periodically and dumps them to disk (or somewhere like S3). Then if anything goes wrong and Redis dies a horrible death, you can scan over the rows that fit the timeframe you need and re-inject them into Redis (skipping the part where they get backed up again).
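A rough sketch of that pattern (untested; the queue key name and file path are made up). The ingest script queues each raw row:

-- at the end of the analytics EVAL script: queue the raw row for archiving
redis.call('rpush', 'pending:diskwrite', ARGV[1])

and a separate archiver process drains the queue to disk, e.g. in plain Lua, assuming a connected Redis client handle `red` that returns nil when the list is empty:

local out = assert(io.open('/var/data/raw.csv', 'a'))
while true do
    local row = red:lpop('pending:diskwrite')
    if not row then break end          -- queue drained for now
    out:write(row, '\n')
end
out:close()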

3. Do we need to process all raw data before displaying the stats? Normalization, like RRDtool does?

You don't *need* to, but if you don't, then you are going to have to pull down all of your non-normalized data for processing on the client, or write a Lua script to do the same inside Redis. Neither of those is a good idea if you have a nontrivial amount of data to go over during an analytics query.

I would suggest that you process lines as they come in; then your dashboard basically just performs a few commands to fetch the data, possibly calculates the average, and displays it.

4. What if we need to offer different types of statistics functions; where and how would we calculate these?
    In Redis directly? In Lua?

That depends on whether you are primarily processing your data outside Redis (using the typical API to update in-Redis stats) or inside Redis using Lua. Both have benefits and drawbacks, but generally I'd suggest sticking with using Lua inside Redis for actually processing your data.

I'm currently reading about Redis and find it very interesting for our project. Sorry if these questions have been asked before.

In terms of an analytics system design: how you would store your data depends on the API you want for reading the data, how precise you need your sliding window to be (1-hour granularity is easy, 1-minute granularity is less easy), and a few other things. In section 5.2 of Redis in Action[1], I cover basic statistics and how you can do min, max, average, and standard deviation in Redis. It's primarily focused on small numbers of counters, so it won't work as well for the many CPU/system counters you are looking to handle.

But ultimately, how to store data, compute on the data, etc., will depend on the access patterns you expect to have. For what it's worth, I've built real-time analytics systems using Redis 3 times now, one of which could ingest 40k rows/second of logs on a single Redis server. I don't see a reason why you couldn't build a similar system for your CPU and System analytics... Though it brings up an interesting question: why not just use Graphite?

 - Josiah


Thanks a lot,
Stefan


Stefan Parvu

Aug 22, 2014, 4:03:12 AM
to redi...@googlegroups.com

Many thanks. See my answers below:

> You will always need to calculate the average; the other 3 aggregates (MIN,
> MAX, LAST) might be computable with Redis' built-in commands, depending on
> the data representation.

Right, I need to read up and understand what data structures we will really need
in order to get some useful statistics.

> Well, there are 1440 minutes/day. I estimate CPU rows to be about 55 bytes
> long on the upper end. That's 29 megs/year per CPU. System rows look to be
> about 250 bytes on the upper end, so 131.5 megs/year per system. That's not
> a lot if you don't have many machines, but I'm guessing you've got more
> than a few machines to record.

We could have from 500 up to 5000 hosts to monitor. Of course, for the large
configuration we will probably need more than 64GB RAM. But we probably
need to get rid of the raw (unprocessed) data almost as soon as it arrives,
moving it to a flat file on disk (we want to keep the CSV records for future
archiving ...)

> How to pull data out will depend on how the data is stored in Redis itself.
> I doubt you will actually be storing the raw rows in Redis (it may make
> sense to pass through a Lua script for processing/aggregation, but it
> doesn't make sense to store the non-processed data in Redis for much longer

Exactly. That's what I was also thinking - we could store the aggregated data in Redis
for up to 6 months or so. But only the aggregated data.

> than it takes to write the data to disk), so I would suggest just sending
> the data into Redis while at the same time appending to a flat file on
> disk. You can periodically rotate the flat file, backing up the old file
> anywhere you want.

Right, the aggregation part could be done 100% in Lua, within Redis.
As I understand it, Redis embeds a Lua interpreter, Lua 5.1 exactly. Right? So we could
process all raw data within Redis/Lua and append every record to a flat file on disk.
Then we could keep the aggregated values within Redis for our dashboards
and off-load the rest to disk.

>
> If you want to keep local filesystems out of the loop, you can have an
> analytics Lua script analyze your rows and then add them to a "pending
> disk write" LIST.

Interesting. But that will probably increase memory usage if the list grows.

Probably the first approach is simpler: as each raw data record arrives and is processed,
it is sent to the raw data flat file on disk. Is there anything to be aware of regarding file I/O,
blocking / nonblocking? I suppose that while that Lua function executes it blocks the other
activities until the record is flushed to disk? Or how does this happen within Redis?

> I would suggest that you process lines as they come in; then your dashboard
> basically just performs a few commands to fetch the data, possibly calculates
> the average, and displays it.

right.

> That depends on whether you are primarily processing your data outside
> Redis (using the typical API to update in-Redis stats) or inside Redis
> using Lua. Both have benefits and drawbacks, but generally I'd suggest
> sticking with using Lua inside Redis for actually processing your data.

OK, I was thinking as well that we could use Lua within Redis. I don't understand yet
how Redis will function if data arrives from different hosts at the same time.
Are they processed one by one? Redis is single threaded, single process,
so there is no form of concurrency within? Or?

> In terms of an analytics system design: how you would store your data
> depends on the API you want for reading the data, how precise you need your
> sliding window to be (1-hour granularity is easy, 1-minute granularity is less

We want to present data for the last 3hrs, 6hrs, 12hrs, 24hrs, 3days, 7days, 30days, 90days,
for example. RRDtool built its own types of archives (RRA) which kept these
stats within. We could vary the granularity based on the type of archive we want to keep or
present:
5min for 3hrs stats
15min for 6hrs stats
30min for 12hrs stats
1hr for 24hrs stats
3hr for 3 days
...

Not sure if it is easy to build something like this in Redis/Lua.

> easy), and a few other things. In section 5.2 of Redis in Action[1], I

Super, we have already ordered the book. I'm waiting for it.

> cover basic statistics and how you can do min, max, average, and standard
> deviation in Redis. It's primarily focused on small numbers of counters, so
> it won't work as well for the many CPU/system counters you are looking
> to handle.

Right, I need to read that part. Thanks for the pointer. Waiting for the book.

> But ultimately, how to store data, compute on the data, etc., will depend
> on the access patterns you expect to have. For what it's worth, I've built
> real-time analytics systems using Redis 3 times now, one of which could
> ingest 40k rows/second of logs on a single Redis server. I don't see a
> reason why you couldn't build a similar system for your CPU and System
> analytics... Though it brings up an interesting question: why not just use
> Graphite?

I see. Well, it sounds like a bit of work, but worth doing.

We used RRDtool / Perl for a long time, with no big troubles. But lately we have been discovering
OpenResty and Lua, and the performance has been fantastic. We wanted to move away from
RRDtool plotting to a JavaScript JSON-based library. So we were thinking about how we could
easily do all these things without having many components around, and do as much as
we can within Lua.

Then we started to read about in-memory databases, and I was thinking we could calculate some
dashboard numbers and keep them in memory to speed up access.

We want to minimize the number of trips from our authentication layer to the storage
and processing layer and perform as much as possible within Lua.

So somehow I was thinking to perform all numerical processing in memory, without
accessing RRDtool via Lua etc., and pass a JSON record to the plotting JavaScript
library. We have never used Graphite.

Thanks again for the explanations.

--
Stefan Parvu <spa...@systemdatarecorder.org>

Josiah Carlson

Aug 23, 2014, 2:08:11 AM
to redi...@googlegroups.com
Replies inline.

On Fri, Aug 22, 2014 at 1:02 AM, Stefan Parvu <spa...@systemdatarecorder.org> wrote:

Many thanks. See my answers below:

> You will always need to calculate the average; the other 3 aggregates (MIN,
> MAX, LAST) might be computable with Redis' built-in commands, depending on
> the data representation.

Right, I need to read up and understand what data structures we will really need
in order to get some useful statistics.

What I would recommend for a schema is as follows...

<metric>:<time_block>:<precision>:<type> -> {<timestamp>: <value>, ...}

As an example:

metric = host1.foo.com:CPU0:user
time_block = 391319
precision = 300
type = max

timestamp = 1408750200
value = 0.25

The structure is just a hash, which is sufficient for storing this data. There is at least one other schema that can offer lower memory utilization, though it does require packing and unpacking via something like the included struct library (see http://redis.io/commands/eval), and requires additional work on read beyond unpacking.
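Spelled out as plain commands, that example becomes one HSET per incoming sample and an HGETALL on read:

HSET host1.foo.com:CPU0:user:391319:300:max 1408750200 0.25
HGETALL host1.foo.com:CPU0:user:391319:300:max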

Generally the method that I like to use pre-fills all precision levels in advance so that reading requires a minimum amount of work. Also, I'm not a fan of taking more precise data and aggregating it into lower-precision data. In particular, some systems would take your 5-minute precision information and squish it into 15-minute precision when the 5-minute precision data expires. I think that is terribly silly, and if you aren't *really* careful with how they get rolled up, you get data aliasing errors (I ran into this issue when I last used Graphite - though your precision requirements are fine for the other types of systems).

Here is an example Lua script that takes one of your per-CPU rows and translates it into analytics that can be read without much difficulty. Note that this assumes a non-sharded, non-clustered Redis, as we construct keys dynamically inside Redis.

-- KEYS = {<partial metric name>}
-- ARGV = {<input row>}

-- {window seconds, precision seconds} pairs: e.g. 5-minute precision
-- is kept for a 3-hour window
local precision = {
  {3*3600, 5*60},
  {6*3600, 15*60},
  {12*3600, 30*60},
  {24*3600, 3600},
  {3*24*3600, 3*3600}
}
-- NOTE: in the sample rows the second field is the CPU number, so these
-- names may need shifting to match the actual column order
local metrics = {':user:', ':sys:', ':nice:', ':idle:', ':io:', ':hirq:', ':sint:'}
local pattern = '^([0-9.]+):([0-9.]+):([0-9.]+):([0-9.]+):([0-9.]+):([0-9.]+):([0-9.]+):([0-9.]+)$'
local data = {string.match(ARGV[1], pattern)}
local timestamp = tonumber(table.remove(data, 1))
for i = 1, #precision do
    local ai = precision[i]
    -- which window this sample falls into, and the sample time rounded
    -- down to this tier's precision (used as the hash field)
    local time_block = math.floor(timestamp / ai[1])
    local rounded = ai[2] * math.floor(timestamp / ai[2])
    for j, metric in ipairs(metrics) do
        -- prepare the <metric>:<time_block>:<precision>: key prefix (per the
        -- schema described above) and the current value for the metric
        local fkey = KEYS[1] .. metric .. time_block .. ':' .. ai[2] .. ':'
        local value = tonumber(data[j])

        -- handle min
        local key = fkey .. 'min'
        local existing = redis.call('hget', key, rounded)
        if not existing or tonumber(existing) > value then
            redis.call('hset', key, rounded, value)
            redis.call('expire', key, 2 * ai[1])
        end

        -- handle max
        key = fkey .. 'max'
        existing = redis.call('hget', key, rounded)
        if not existing or tonumber(existing) < value then
            redis.call('hset', key, rounded, value)
            redis.call('expire', key, 2 * ai[1])
        end

        -- running sum and count, from which the average is derived on read
        redis.call('hincrbyfloat', fkey .. 'sum', rounded, data[j])
        redis.call('hincrby', fkey .. 'count', rounded, 1)
        redis.call('expire', fkey .. 'sum', 2 * ai[1])
        redis.call('expire', fkey .. 'count', 2 * ai[1])
    end
end

I've not tested the above, but it should give you an idea of what's going on (and I'm assuming that those per-CPU lines were the outputs of some fairly-standard CPU reporting tool).
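Loading and calling it would look roughly like this (the script file name is a placeholder, and EVALSHA takes the SHA1 that SCRIPT LOAD returns):

redis-cli SCRIPT LOAD "$(cat cpu_stats.lua)"
redis-cli EVALSHA <sha1> 1 host1.foo.com:CPU0 "1408658702:0:2.06:0.65:2.25:0.10:94.95:5.05"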

Some features: for data that covers X hours, we always tell Redis to keep the data for 2X hours. As long as log lines make it into Redis in a timely fashion, the oldest data may live 3x as long as is absolutely necessary.


To get data for your 'min' graph at a particular precision, you would pull the full contents of two hashes, (optionally) trim the oldest data, and display it. For the average, you would actually pull 4 hashes, two each from 'sum' and 'count', again optionally trim, then perform the division for each hash field in the count hash data (if there is an entry, it's nonzero, so you know that the division will succeed, which is a nice bit of insurance even if the lines are right after each other in the Lua script).
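As an untested sketch of that read path, using the same key layout as the script above (KEYS[1] would be the full metric name, e.g. host1.foo.com:CPU0:user): fetch sum and count for the current and previous time blocks and divide inside Redis:

-- KEYS = {<metric name, e.g. host1.foo.com:CPU0:user>}
-- ARGV = {<current time_block>, <precision>}
local avgs = {}
for block = tonumber(ARGV[1]) - 1, tonumber(ARGV[1]) do
    local fkey = KEYS[1] .. ':' .. block .. ':' .. ARGV[2] .. ':'
    local sums = redis.call('hgetall', fkey .. 'sum')
    for i = 1, #sums, 2 do     -- HGETALL returns a flat field, value list
        local count = redis.call('hget', fkey .. 'count', sums[i])
        if count then          -- nonzero by construction, safe to divide
            avgs[#avgs + 1] = sums[i]
            avgs[#avgs + 1] = tostring(tonumber(sums[i + 1]) / tonumber(count))
        end
    end
end
return avgs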

> Well, there are 1440 minutes/day. I estimate CPU rows to be about 55 bytes
> long on the upper end. That's 29 megs/year per CPU. System rows look to be
> about 250 bytes on the upper end, so 131.5 megs/year per system. That's not
> a lot if you don't have many machines, but I'm guessing you've got more
> than a few machines to record.

We could have from 500 up to 5000 hosts to monitor. Of course, for the large
configuration we will probably need more than 64GB RAM. But we probably
need to get rid of the raw (unprocessed) data almost as soon as it arrives,
moving it to a flat file on disk (we want to keep the CSV records for future
archiving ...)

Well, let's do some back of the envelope calculations :)

For a single metric, there are 36 + 4*24 = 132 data points across the full-precision tiers, times 3 for incidentally keeping data around 3x as long as necessary. So 396 data points per metric. There are 7 metrics per row, and with min, max, sum, and count, that's 4 different aggregates. That leaves us with 396 * 7 * 4 = 11088 total data points for one CPU on one host. With ziplist encoding of those hashes, you can probably expect somewhere under 16 bytes/entry on average. So roughly 180k of data per CPU. With 5000 CPUs, that's roughly 900 megs of data to store. With 8 CPUs per host on 5000 hosts, that's 7.2 gigs.

Here's the fun part. With an alternate packing scheme, you can easily get your data down to 4 bytes per data point, and only ever store exactly as much data as necessary... for 1/12 the storage space of what I describe above. Though at that point, the "key" portion of the "key/value" part of Redis starts taking up a nontrivial amount of space relative to your data.
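A sketch of such a packed layout (an illustration only, untested): one string per metric and time block, with fixed 4-byte big-endian float slots indexed by the sample's offset into the block, written via the struct library mentioned above:

-- inside an EVAL script; KEYS[1] = <metric>:<time_block>:<precision>:packed
-- ARGV = {<timestamp>, <value>, <block span in seconds>, <precision>}
local ts, span, prec = tonumber(ARGV[1]), tonumber(ARGV[3]), tonumber(ARGV[4])
local slot = math.floor((ts % span) / prec)   -- fixed slot within the block
redis.call('setrange', KEYS[1], 4 * slot, struct.pack('>f', tonumber(ARGV[2])))
-- read side: struct.unpack('>f', redis.call('getrange', KEYS[1], 4*slot, 4*slot + 3))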

> How to pull data out will depend on how the data is stored in Redis itself.
> I doubt you will actually be storing the raw rows in Redis (it may make
> sense to pass through a Lua script for processing/aggregation, but it
> doesn't make sense to store the non-processed data in Redis for much longer

Exactly. That's what I was also thinking - we could store the aggregated data in Redis
for up to 6 months or so. But only the aggregated data.

Some of it you could definitely store for 6 months. But for 5000 machines, maybe not all of them at the 5 minute precision.

> than it takes to write the data to disk), so I would suggest just sending
> the data into Redis while at the same time appending to a flat file on
> disk. You can periodically rotate the flat file, backing up the old file
> anywhere you want.

Right, the aggregation part could be done 100% in Lua, within Redis.
As I understand it, Redis embeds a Lua interpreter, Lua 5.1 exactly. Right? So we could
process all raw data within Redis/Lua and append every record to a flat file on disk.
Then we could keep the aggregated values within Redis for our dashboards
and off-load the rest to disk.

Yep, and you've got a basic skeleton of what to do farther up in this email :)

The one limitation is that the Lua 5.1 in Redis does not have the full set of Lua libraries. You can see what is available here: http://redis.io/commands/eval . It also can't write to disk, so any disk writing will have to be outside of Redis.

> If you want to keep local filesystems out of the loop, you can have an
> analytics Lua script analyze your rows and then add them to a "pending
> disk write" LIST.

Interesting. But that will probably increase memory usage if the list grows.

That's why I like to run *very* aggressive cleanup scripts when I do things like this, and usually start paging people when the list gets past what I expect to see.

Probably the first approach is simpler: as each raw data record arrives and is processed,
it is sent to the raw data flat file on disk. Is there anything to be aware of regarding file I/O,
blocking / nonblocking? I suppose that while that Lua function executes it blocks the other
activities until the record is flushed to disk? Or how does this happen within Redis?

Lua in Redis does not have access to the filesystem, and Lua in Redis does not ever directly write data to disk. If you are using AOF persistence, then the file syncing on command execution will depend on the policies you set in the configuration file.
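For reference, the relevant redis.conf directives are:

appendonly yes          # enable AOF persistence
appendfsync everysec    # fsync once per second; alternatives: always, no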

> I would suggest that you process lines as they come in; then your dashboard
> basically just performs a few commands to fetch the data, possibly calculates
> the average, and displays it.

right.

> That depends on whether you are primarily processing your data outside
> Redis (using the typical API to update in-Redis stats) or inside Redis
> using Lua. Both have benefits and drawbacks, but generally I'd suggest
> sticking with using Lua inside Redis for actually processing your data.

OK, I was thinking as well that we could use Lua within Redis. I don't understand yet
how Redis will function if data arrives from different hosts at the same time.
Are they processed one by one? Redis is single threaded, single process,
so there is no form of concurrency within? Or?

Single threaded (at least on the command execution side of things). But you have to remember, everything is in memory, there are no data structure locks to get in your way, and everything (in general) is written to perform fast. You can also get LuaJIT in Redis with some minor modifications, from what I understand, so you can get even more speed that way :)

> In terms of an analytics system design: how you would store your data
> depends on the API you want for reading the data, how precise you need your
> sliding window to be (1-hour granularity is easy, 1-minute granularity is less

We want to present data for the last 3hrs, 6hrs, 12hrs, 24hrs, 3days, 7days, 30days, 90days,
for example. RRDtool built its own types of archives (RRA) which kept these
stats within. We could vary the granularity based on the type of archive we want to keep or
present:
   5min for 3hrs stats
   15min for 6hrs stats
   30min for 12hrs stats
   1hr for 24hrs stats
   3hr for 3 days
   ...

Not sure if it is easy to build something like this in Redis/Lua.

It's about 75% like 2 of the 3 other systems that I've built before.

> easy), and a few other things. In section 5.2 of Redis in Action[1], I

Super, we have already ordered the book. I'm waiting for it.

I hope you find that you've gotten your money's worth :)

> cover basic statistics and how you can do min, max, average, and standard
> deviation in Redis. It's primarily focused on small numbers of counters, so
> it won't work as well for the many CPU/system counters you are looking
> to handle.

Right, I need to read that part. Thanks for the pointer. Waiting for the book.

If you can't wait to get started, there is code here:

Though I will admit, it is much harder to follow without the book.

> But ultimately, how to store data, compute on the data, etc., will depend
> on the access patterns you expect to have. For what it's worth, I've built
> real-time analytics systems using Redis 3 times now, one of which could
> ingest 40k rows/second of logs on a single Redis server. I don't see a
> reason why you couldn't build a similar system for your CPU and System
> analytics... Though it brings up an interesting question: why not just use
> Graphite?

I see. Well, it sounds like a bit of work, but worth doing.

We used RRDtool / Perl for a long time, with no big troubles. But lately we have been discovering
OpenResty and Lua, and the performance has been fantastic. We wanted to move away from
RRDtool plotting to a JavaScript JSON-based library. So we were thinking about how we could
easily do all these things without having many components around, and do as much as
we can within Lua.

Then we started to read about in-memory databases, and I was thinking we could calculate some
dashboard numbers and keep them in memory to speed up access.

We want to minimize the number of trips from our authentication layer to the storage
and processing layer and perform as much as possible within Lua.

So somehow I was thinking to perform all numerical processing in memory, without
accessing RRDtool via Lua etc., and pass a JSON record to the plotting JavaScript
library. We have never used Graphite.

Well, Graphite is just a piece of software whose only purpose is to store data for graphs exactly like this. You basically just send it data across a TCP and/or UDP socket, and it handles all of the combining internally. I actually set up a Graphite server a few months ago so I wouldn't have to build *another* full stats/metrics system over 2 days.

 - Josiah
 
Thanks again for the explanations.

--
Stefan Parvu <spa...@systemdatarecorder.org>

Stefan Parvu

Aug 23, 2014, 3:15:01 PM
to redi...@googlegroups.com

Cheers. Monday I should get your book, and I need to start reading and dig into all your
detailed answers.

I need to clarify on my side:

- what the dashboard would be. We plan to make an analytics product for
weather and climate data and computer performance: 2 different areas.
There will be differences in metrics, granularity, many things.

- how many metrics I really want to show on the dashboard?

- see how much RAM I will need to keep the counters for both databases

- for light installations (< 5 hosts) we plan to build the solution around
Raspberry Pi hardware. So we will have to carefully plan the metrics
and the dashboard.

About Lua and Redis - so most likely I will need to pre-process the raw
data and write it to disk/flat files within OpenResty, and then send each raw data record to Redis for
counter updates.

Thanks again for all the good advice. I will start working on this and will post my
progress with Redis and OpenResty later.

Thanks,

--
Stefan Parvu <spa...@systemdatarecorder.org>

Josiah Carlson

Aug 25, 2014, 7:43:05 PM
to redi...@googlegroups.com
On Sat, Aug 23, 2014 at 12:14 PM, Stefan Parvu <spa...@systemdatarecorder.org> wrote:

Cheers. Monday I should get your book, and I need to start reading and dig into all your
detailed answers.

I need to clarify on my side:

 - what the dashboard would be. We plan to make an analytics product for
   weather and climate data and computer performance: 2 different areas.
   There will be differences in metrics, granularity, many things.

 - how many metrics I really want to show on the dashboard?

 - see how much RAM I will need to keep the counters for both databases

 - for light installations (< 5 hosts) we plan to build the solution around
   Raspberry Pi hardware. So we will have to carefully plan the metrics
   and the dashboard.

You don't have to answer, but why Raspberry Pi hardware? Aside from the inexpensive purchase price and low power utilization, choosing the Pi will (primarily) strictly limit the data that you can store and process, especially on the Redis side of things, partly due to the low memory, but also due to the generally lower performance compared to more typical desktop or server hardware.

About Lua and Redis - so most likely I will need to pre-process the raw
data and write it to disk/flat files within OpenResty, and then send each raw data record to Redis for
counter updates.

More or less, yes.

Thanks again for all the good advice. I will start working on this and will post my
progress with Redis and OpenResty later.

Sounds good :)

 - Josiah
 

Thanks,

--
Stefan Parvu <spa...@systemdatarecorder.org>

Stefan Parvu

Aug 27, 2014, 9:17:56 AM
to redi...@googlegroups.com

Sorry for the delay. I finally got your book. Well written; I like it very much.

> You don't have to answer, but why Raspberry Pi hardware? Aside from
> the inexpensive purchase price and low power utilization, choosing the Pi will
> (primarily) strictly limit the data that you can store and process,
> especially on the Redis side of things, partly due to the low memory, but
> also due to the generally lower performance compared to more typical
> desktop or server hardware.

We wanted to experiment with ARM and light devices for certain types of
data subscriptions related to weather, where, for some conditions, we
expect to have few data feeds to our analytics. And the Raspberry Pi looked
like it was gaining ground and was worth testing. That's why we ended up
with these boards. But mainly the idea is to test the entire analytics
stack on ARM.

We have seen the memory limitation, and we are now thinking about all
these things, but we will try to accommodate some decent summary statistics
on it.

What seems amazing is the combination of OpenResty + Redis, which kicks a**
hands down: high throughput, low system utilization, easy scripting with Lua.

As a matter of fact I just finished a couple of performance tests, looking at our
authentication part on the RPi. For example, pushing 50 virtual users, which is
way more than we will ever need from this board, returned a response time of
R=40ms with a throughput of X=61 req/sec, using 30% CPU. Not bad at all.
Redis uses very little memory at this time, for this particular test.

--
Stefan Parvu <spa...@systemdatarecorder.org>

Josiah Carlson

Aug 28, 2014, 7:15:54 PM
to redi...@googlegroups.com
On Wed, Aug 27, 2014 at 6:17 AM, Stefan Parvu <spa...@systemdatarecorder.org> wrote:

Sorry for the delay. I finally got your book. Well written; I like it very much.

I'm glad you like it :)

I had initially thought that there might be long-term stability issues, as I'd read about other ARM-based boards having stability issues (http://www.systemcall.org/blog/2014/06/trashing-chromebooks/), but further looking into the RPi suggests that as long as you have a good power supply, they tend to be solid.

If they suit your needs, all the better :)

 - Josiah

Stefan Parvu

Sep 6, 2014, 6:59:49 AM
to redi...@googlegroups.com

Josiah, many thanks again for the tips and advice. The book is amazing and very practical.
It would be useful to see more Lua in the next update as well :)

We did some reading and experimenting, and we are now thinking of implementing something like
the below. Still some questions, later in the email:

1. The time intervals for our dashboard (we work in seconds):
Last 3hrs, precision: 60 seconds (1min)
Last 6hrs, precision: 300 seconds (5min)
Last 12hrs, precision: 900 seconds (15min)
Last 24hrs, precision: 1800 seconds (30min)
Last 3days, precision: 3600 seconds (1hr)
Last 7days, precision: 10800 seconds (3hr)
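(In Lua, I think these tiers would drop straight into the precision table from your earlier script:

local precision = {
    {3*3600, 60}, {6*3600, 300}, {12*3600, 900},
    {24*3600, 1800}, {3*86400, 3600}, {7*86400, 10800}
}
)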

2. We have a JSON file where we describe what we plan to display: basically the metrics and what
stats each metric will display. An example from our library of monitoring objects,
a config file, cpd.json - the summary statistics per metric, here the cpupct metric and its associated stats:

"statistics": {
"cpupct": {
"MIN": [
[10800, 60], --> last 3hrs: total interval seconds, precision
[21600, 300], --> last 6hrs: total interval seconds, precision
[43200, 900], --> last 12hrs: total interval seconds, precision
[86400, 1800], --> last 24hrs: total interval seconds, precision
[259200, 3600], --> last 3days: total interval seconds, precision
[604800, 10800] --> last 7days: total interval seconds, precision
],
"MAX": [
[10800, 60],
[21600, 300],
[43200, 900],
[86400, 1800],
[259200, 3600],
[604800, 10800]
],
"SUM": [
[10800, 60],
[21600, 300],
[43200, 900],
[86400, 1800],
[259200, 3600],
[604800, 10800]
],
"COUNT": [
[10800, 60],
[21600, 300],
[43200, 900],
[86400, 1800],
[259200, 3600],
[604800, 10800]
],
"SUMSQ": [
[10800, 60],
[21600, 300],
[43200, 900],
[86400, 1800],
[259200, 3600],
[604800, 10800]
],
"LAST": [
[10800, 60],
[21600, 300],
[43200, 900],
[86400, 1800],
[259200, 3600],
[604800, 10800]
]
},
...

3. Data Structure

<host_id>:<message_id>:<field_name>:<time_block>:<precision>:<type> -> { timestamp:value, ... }

Our key is around 75 chars, since we store the host UUID, which is large.
The value is a hash mapping timestamp to value.

An example of how this looks in real life:

<host_id> = 18e9e570-db88-aa05-a22c-60a44c06e603
<message_id> = cpd-linux-cpurec
<field_name> = cpupct
<time_block> = 130548
<precision> = 60
<type> = MAX

<timestamp> = 1409921993
<value> = 17.22


Still some questions:

1. Data structure
Does our data structure for one metric look OK? It is more or less the same thing
you mentioned in your previous email, except we have a slightly bigger key.

2. Time_block
I understand we need a sort of marker for where the interval starts. That is basically what the
time block is, right? How will we calculate the last and the previous time block? Is it always
timestamp / 10800 (the interval) - 1 ?

3. Roll-up data
You mentioned: "Also, I'm not a fan of taking more precise data and aggregating it into
lower-precision data. In particular, some systems would take your 5-minute precision
information and squish it into 15-minute precision when the 5-minute precision data expires."

Do you mean when we need to roll up data older than, for example in our case, 7 days ...

4. The interval of time

I understand we should be able to keep 2 times the number of hours we
plan to report in Redis. But then you mentioned 3 times ... why is that?

> Some features: for data that covers X hours, we always tell Redis to keep
> the data for 2X hours. As long as log lines make it into Redis in a timely
> fashion, the oldest data may live 3x as long as is absolutely necessary.

Do you mean that, just to be on the safe side, we should keep 3 times
the number of hours we plan to report?


Thanks a lot,

--
Stefan Parvu <spa...@systemdatarecorder.org>

Marc Gravell

Sep 6, 2014, 7:43:23 AM
to redi...@googlegroups.com

I just want to give a slightly different response, for perspective:

Redis is awesome - I love me some Redis, but that doesn't mean it is the best tool for every job. In the case of long running time-series operations, I *personally* would give serious consideration to things like Cassandra. Not because Redis *can't* be made to do it - but because it is *perhaps* a more natural fit for something like Cassandra, rather than forcing a square peg into a round hole. This should in no way be seen as a criticism of Redis, and is simply a "pick tools for the jobs you need to do, not jobs for the tools you already have" thing...

Just my tuppence.

Marc


Stefan Parvu

Sep 6, 2014, 7:56:55 AM
to redi...@googlegroups.com

> Redis is awesome - I love me some Redis, but that doesn't mean it is the
> best tool for every job. In the case of long running time-series
> operations, I *personally* would give serious consideration to things like
> Cassandra. Not because Redis *can't* be made to do it - but because it is

Think of it as an appliance type of analytics product. Small, rugged, and it can be large as well,
with zero administration. I'm stunned by how many hours people spend setting up
things and *keeping* them running. You turn that thing ON and it should run
itself.

You can't put Cassandra in 1GB RAM and expect it to run, can you?
It's large, it requires nodes, it's a cluster architecture. It is Java based, right? It is good
for something else entirely.

Redis goes hand in hand beautifully with an appliance-type analytics product. Small footprint,
simple enough, without large software dependencies, compiles and executes fast. No Java.

--
Stefan Parvu <spa...@systemdatarecorder.org>

Erdoğan Kürtür

Apr 24, 2015, 9:13:09 AM
to redi...@googlegroups.com, spa...@systemdatarecorder.org
As much as I like Redis, time series is a job for a time series database. I personally used InfluxDB and it is simply amazing; add Grafana to the mix and you've got the whole system covered.

Stefan Parvu

Apr 25, 2015, 11:30:04 AM
to redi...@googlegroups.com
On 04/24/15 15:58, Erdoğan Kürtür wrote:
> As much as I like Redis, time series is a job for a time series database.
> I personally used InfluxDB and it is simply amazing; add Grafana to the
> mix and you've got the whole system covered.

1 year of Redis:

* we managed to develop a very powerful analytics platform based on
Redis 2.x handling time series data, without the complexity and overhead
of a time series db like, for example, InfluxDB.

* we are using FreeBSD as the main platform, followed by Debian

* we target x86 and ARM platforms; on ARM we develop things which run
on batteries in rugged conditions for environmental monitoring. For
enterprise we develop on a 1RU server

* we value simplicity and speed. Lua/OpenResty and Redis is a very
powerful combination for us

* we are very storage intensive, writing heavily to Redis

* we are now entering phase 2, where we work to support a large number of
data sources, and here we need to scale and start using sharding or
cluster features

We did evaluate InfluxDB last year: it was not even compiling on
FreeBSD, it was complex, and we saw no real value in it being a time series database:
poor summary statistics functions, no data filters, not even minimal
time series analysis features within. We liked the idea of using a
query-based language, but we found it complex and suitable in general for
large installations with lots of nodes.


--
Stefan Parvu <spa...@kronometrix.org>