Many thanks to all of you who answered my question.
Special thanks to @siculars, whose answer is extremely helpful.
Your answers gave me input to add some requirements to my list:
######## <requirements version="2">
I need a database to log and retrieve sensor data.
- Node = A computer on which one instance of the database runs
- Blip = One data record sent by a sensor
- Bin = A container holding a lot of blips. The database
could store bins instead of blips to reduce the total number of keys
- Blip page = The sorted list of all blips for a specific sensor
and a specific time range.
The scale is as follows:
(01) Ten thousand sensors deliver 1 blip per second each
-> Insert rate = 10 kiloblip/s
-> Insert rate = 315 gigablip per year
(02) They have to be stored for ~3 years
-> Size of database ≈ 1 terablip
(03) Each blip has about 200 bytes
-> Size of database ≈ 200 TB
(04) The system will start with just 700 sensors but will
soon increase up to the described volume.
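For reference, the back-of-the-envelope arithmetic above can be checked in a few lines (Python, used here only for the math; the variable names are mine):

```python
SENSORS = 10_000                     # target number of sensors
BLIPS_PER_SECOND = 1                 # each sensor delivers 1 blip/s
BLIP_SIZE = 200                      # bytes per blip, see (03)
RETENTION_YEARS = 3                  # see (02)
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000

insert_rate = SENSORS * BLIPS_PER_SECOND           # 10,000 blip/s = 10 kiloblip/s
blips_per_year = insert_rate * SECONDS_PER_YEAR    # 315,360,000,000 ≈ 315 gigablip
total_blips = blips_per_year * RETENTION_YEARS     # ≈ 0.95 terablip
total_bytes = total_blips * BLIP_SIZE              # ≈ 189 TB, i.e. roughly 200 TB

print(insert_rate, blips_per_year, total_blips, total_bytes)
```

So the "1 terablip" and "200 TB" figures are rounded up slightly from ~0.95 terablip and ~189 TB.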
The main operations on the data are:
(05) Append the new blips to the database
(written blips are never changed)!
(06) Return all blips for sensor X with a timestamp
between timestamp_a and timestamp_b!
In other words: Return a blip page.
(07) Return all the blips specified in (06) ordered by timestamp!
(08) Delete all blips older than Y!
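To pin down what (05)-(08) mean, they all reduce to operations on a composite key (sensor_id, timestamp): append, range scan, and range delete. A minimal in-memory sketch (Python; all names are mine, and a real store would of course use an ordered on-disk index rather than a sorted list):

```python
import bisect

class BlipStore:
    """Toy ordered store keyed by (sensor_id, timestamp)."""

    def __init__(self):
        self._keys = []   # sorted list of (sensor_id, timestamp)
        self._data = {}   # key -> payload

    def append(self, sensor_id, ts, payload):
        # (05): append-only writes; insort also absorbs the ~0.1%
        # out-of-order inserts mentioned in (11).
        key = (sensor_id, ts)
        bisect.insort(self._keys, key)
        self._data[key] = payload

    def blip_page(self, sensor_id, ts_a, ts_b):
        # (06) + (07): a blip page is just a sorted range scan.
        lo = bisect.bisect_left(self._keys, (sensor_id, ts_a))
        hi = bisect.bisect_right(self._keys, (sensor_id, ts_b))
        return [(k[1], self._data[k]) for k in self._keys[lo:hi]]

store = BlipStore()
store.append("s1", 3, b"c"); store.append("s1", 1, b"a"); store.append("s1", 2, b"b")
print(store.blip_page("s1", 1, 2))   # [(1, b"a"), (2, b"b")]
```

Requirement (08) would then be one more range delete over the timestamp part of the key, independent of sensor.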
Further the following is true:
(09) The database system MUST be free and open source.
(10) The DB SHOULD be easy to administrate.
(11) 99.9% of the blips are inserted in
chronological order, the rest is not.
(12) All data MUST still be writable and readable while fewer
than the configurable number N of nodes are down (unexpectedly).
(13) The mechanisms to distribute the data to the available
nodes SHOULD be handled by the database.
This means that the database SHOULD automatically
redistribute the data when nodes are added or removed.
The application I am building is a kind of complex event processing system.
It is mainly written in Erlang.
#### Plain logfiles
are an interesting approach but IMO are very
much in conflict with (10, 11, 12, 13).
It would mean doing most of the work myself.
seems not to be able to fulfill the replication requirements (12, 13) with
its free variant.
#### eventmonitor data accelerator
is not free (09).
It doesn't seem to be distributed, so again I would have to do
very much myself.
As it looks I have (at least) the problem that
at 1 key/blip I get too many keys for Bitcask (InnoDB might work).
So I could build records which hold multiple blips; I will call them bins.
One bin might hold 86,400 blips (e.g. one day of one sensor at 1 blip/s)
but has only a single key.
This will cause some work for packing and unpacking.
And in some cases it would increase the amount of data that is sent
to the coordinating node, because as I understand it, a whole bin would be
sent even if only one blip is needed. Right? Or is there a mechanism to
filter the content of the bin already before sending it to the coordinating node?
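The packing/unpacking I have in mind would look roughly like this (a Python sketch; all names and the fixed blip layout are my own assumptions, with one bin per sensor per day, and filtering done after unpacking):

```python
import struct

# Hypothetical fixed blip layout: 8-byte timestamp + 192-byte payload
# = 200 bytes per blip, matching (03).
BLIP = struct.Struct("<q192s")
SECONDS_PER_DAY = 86_400

def bin_key(sensor_id, ts):
    """One bin per sensor and day -> one key per 86,400 blips."""
    return f"{sensor_id}:{ts // SECONDS_PER_DAY}"

def pack_bin(blips):
    """blips: iterable of (timestamp, payload) -> one opaque bin value."""
    return b"".join(BLIP.pack(ts, payload) for ts, payload in blips)

def unpack_bin(blob, ts_a=None, ts_b=None):
    """Unpack a bin, optionally keeping only blips in [ts_a, ts_b]."""
    out = []
    for ts, payload in BLIP.iter_unpack(blob):
        if (ts_a is None or ts >= ts_a) and (ts_b is None or ts <= ts_b):
            out.append((ts, payload))
    return out

blob = pack_bin([(10, b"a".ljust(192, b"\0")), (20, b"b".ljust(192, b"\0"))])
print(len(unpack_bin(blob, 15, 25)))   # 1 - only the ts=20 blip passes the filter
```

As far as I understand, if the store supports server-side map functions (Riak's MapReduce map phase, for example), a filter like the one in `unpack_bin` could run on the node holding the bin, so that only the matching blips travel to the coordinator — but that is exactly what I am asking about above.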
I'm still reading the pagination and discoproject stuff.
Give me some time before answering that!
At the moment I am really wondering if there is no off-the-shelf
solution for my needs. The problem I have seems so natural to me.
I just have blips to be logged away and I have users who later on
want to have a look at specific blip pages. It's just the amount of
data that is unusual.