"ioConfig" : {
"type" : "hadoop",
"inputSpec" : {
"type" : "multi",
"children" : [
{
"type" : "dataSource",
"ingestionSpec" : {
"dataSource" : "DS1",
"intervals" : ["2018-04-05/2018-04-07"]
}
},
{
"type" : "static",
"paths" : "<Path to File number 2>"
}
]
}
},
"dataSchema" : {
"dataSource" : "DS1",
"granularitySpec" : {
"type" : "uniform",
"segmentGranularity" : "hour",
"queryGranularity" : "hour",
"intervals" : [""2018-04-05/2018-04-07"]
},
"parser" : {
.........
},
"metricsSpec" : [
{"name" : "Count2"}
]
},
.....
The question is: when I ingest the data for the first file, I see all the data and the metric column (Count1) with values. But when I then ingest the data for file 2 using the config above, I see the new metric column (Count2) with values, yet somehow the first metric column (Count1) now has all 0s in it.
What I want is for the data from all the files to be appended to the existing data, with the metric columns keeping their values, so that at the end I can see all the metric columns and get SUM values for Count1, Count2, and Count3.
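My current guess is that the metricsSpec in the delta-ingestion task may need to list every metric column, not just the new one, roughly like this (the longSum aggregator type is only my assumption about the metric types):

  "metricsSpec" : [
    {"type" : "longSum", "name" : "Count1", "fieldName" : "Count1"},
    {"type" : "longSum", "name" : "Count2", "fieldName" : "Count2"},
    {"type" : "longSum", "name" : "Count3", "fieldName" : "Count3"}
  ]

But I am not sure whether that is the actual cause or whether the problem is in the ioConfig itself.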
Can someone please point out what I am doing wrong? I have read http://druid.io/docs/latest/ingestion/update-existing-data.html, but I don't fully understand the example given there, in particular what the dataSource test1 defined in that example refers to, whether I need to set the segments in my config, and if so, how.
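For reference, this is where I understand a segments list would go if it is needed, based on my reading of the docs (so the placement may be wrong), but I don't know what I would have to put inside it:

  {
    "type" : "dataSource",
    "ingestionSpec" : {
      "dataSource" : "DS1",
      "intervals" : ["2018-04-05/2018-04-07"],
      "segments" : [ ... ]
    }
  }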