Hi,
I am new to MongoDB and am experimenting with using it to store time series data. I have several CSV files holding this data, each about ~2 GB in size and containing about 81 million records. Each CSV holds data for one type of time series: the timestamp is stored as a number, and values are recorded at 5-minute intervals for around 4000 objects. I need to load this data into Mongo. What is the fastest way to do this? The data is in the following format:
trf1.csv contains a timestamp and value for each object at 5-minute intervals. There are around 4000 objects:
object_id,metric,timestamp,value
197666,trf1,1352985262,0.944456778
197514,trf1,1352985272,0.776543234
197666,trf1,1352987301,0.944456778
Likewise, there are trf2.csv, trf3.csv, etc. for the other types of time series.
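To make the loading question concrete: one common fast path with a driver is to insert one flat document per CSV row, sent in large batches rather than one at a time. Here is a minimal Python sketch of the parsing and batching side; the names and sample data are illustrative, and the actual insert (e.g. pymongo's collection.insert_many(batch, ordered=False)) is only indicated in a comment:

```python
import csv
import io

# Hypothetical sample standing in for the first few lines of trf1.csv.
SAMPLE = """object_id,metric,timestamp,value
197666,trf1,1352985262,0.944456778
197514,trf1,1352985272,0.776543234
"""

def rows(fileobj):
    """Yield one dict per CSV line, with numeric fields converted
    so they are stored as numbers rather than strings."""
    for rec in csv.DictReader(fileobj):
        yield {
            "object_id": int(rec["object_id"]),
            "metric": rec["metric"].strip(),
            "timestamp": int(rec["timestamp"]),
            "value": float(rec["value"]),
        }

def batches(iterable, size=10_000):
    """Group documents into fixed-size batches for bulk inserts."""
    batch = []
    for doc in iterable:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# With pymongo, each batch would go to something like:
#   collection.insert_many(batch, ordered=False)
for batch in batches(rows(io.StringIO(SAMPLE)), size=2):
    print(len(batch), "documents in batch, first metric:", batch[0]["metric"])
```

For a one-off load from files on disk, the mongoimport command-line tool is another option, since it reads CSV with a header row directly.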
I'm not sure what the best way to store this is, but here's what I'm thinking: a collection called metrics that contains one document per type of metric. For example:
db.metrics.insert([
  {
    _id : "trf1",
    objects : [
      {
        object : 197666,
        ts : [
          { timestamp : 1352985262, value : 0.944456778 },
          { timestamp : 1352987301, value : 0.944456778 }
          // and so on, ~50,000 timestamp/value pairs
        ]
      },
      {
        object : 197514,
        ts : [
          { timestamp : 1352985262, value : 0.944456778 },
          { timestamp : 1352987301, value : 0.944456778 }
          // and so on...
        ]
      }
    ]
  },
  {
    _id : "trf2",
    objects : [
      {
        object : 197514,
        ts : [
          { timestamp : 1352985262, value : 0.944456778 }
          // ...
        ]
      }
    ]
  }
])
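To show what I mean by this layout, here is a sketch of how such per-metric documents could be built from already-parsed rows. It is a pure-Python illustration with hypothetical names (the real code would pass the resulting docs to the driver's insert call); it assumes the rows can be sorted by metric and object id so consecutive rows group together:

```python
import itertools

# Hypothetical parsed rows: (metric, object_id, timestamp, value).
ROWS = [
    ("trf1", 197666, 1352985262, 0.944456778),
    ("trf1", 197666, 1352987301, 0.944456778),
    ("trf1", 197514, 1352985272, 0.776543234),
]

def build_metric_docs(rows):
    """Build one document per metric, with one sub-document per object,
    mirroring the schema sketched above. Sorting first makes
    itertools.groupby see each (metric, object) run contiguously."""
    docs = []
    for metric, metric_rows in itertools.groupby(sorted(rows), key=lambda r: r[0]):
        objects = []
        for obj, obj_rows in itertools.groupby(metric_rows, key=lambda r: r[1]):
            objects.append({
                "object": obj,
                "ts": [{"timestamp": t, "value": v} for _, _, t, v in obj_rows],
            })
        docs.append({"_id": metric, "objects": objects})
    return docs
```

In practice the grouping would probably have to be done per chunk (e.g. per object, or per object per time window) rather than one giant document per metric, since each metric here would hold 4000 objects times ~50,000 readings.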
Please share your suggestions on how best to model and insert this data. Thanks!