druid update existing data (metric)


Sameer Paradkar

Apr 11, 2018, 6:04:21 PM
to Druid Development
Hi -

I am having trouble ingesting multiple files into Druid, specifically with the metric columns. I have 3 different files that I want to load into Druid periodically (say, once a day). First, I ingested the first file using the 'static' input spec with a metric column (say 'Count1'). Then I ingested the second file using the 'multi' input spec, with a 'dataSource' child that points at the dataSource created when ingesting the first file. The config I am using to ingest the second file looks like this:

"ioConfig" : {
    "type" : "hadoop",
    "inputSpec" : {
        "type" : "multi",
        "children" : [
            {
                "type" : "dataSource",
                "ingestionSpec" : {
                    "dataSource" : "DS1",
                    "intervals" : ["2018-04-05/2018-04-07"]
                }
            },
            {
                "type" : "static",
                "paths" : "<Path to File number 2>"
            }
        ]
    }
},
"dataSchema" : {
    "dataSource" : "DS1",
    "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "hour",
        "queryGranularity" : "hour",
        "intervals" : ["2018-04-05/2018-04-07"]
    },
    "parser" : {
        .........
    },
    "metricsSpec" : [
        {"name" : "Count2"}
    ]
},
.....



My question: when I ingest the first file, I see all the data and the metric column Count1 with its values. But after I ingest file 2 using the config above, I see the new metric column (Count2, with values), while the first metric column (Count1) now contains all zeros.

What I want is for the data from all the files to be appended to the existing data, with each metric column populated accordingly, so that in the end I can see all the metric columns and the SUM values for Count1, Count2 and Count3.
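
To make the desired end state concrete, here is a minimal Python sketch of the merge-and-sum behaviour described above. It is not Druid code and says nothing about how Druid should be configured; the rows, the `hour` key and the `page` dimension are invented purely for illustration, and only the metric names Count1, Count2 and Count3 come from this post.

```python
from collections import defaultdict

# Hypothetical rows, one list per daily file; each file carries its own metric.
file1 = [{"hour": "2018-04-05T00", "page": "a", "Count1": 3},
         {"hour": "2018-04-05T00", "page": "a", "Count1": 2}]
file2 = [{"hour": "2018-04-05T00", "page": "a", "Count2": 7}]
file3 = [{"hour": "2018-04-05T01", "page": "b", "Count3": 1}]

METRICS = ("Count1", "Count2", "Count3")

def merge_and_sum(*files):
    """Group rows by (hour, page) and sum every metric; a missing metric adds 0."""
    totals = defaultdict(lambda: dict.fromkeys(METRICS, 0))
    for rows in files:
        for row in rows:
            key = (row["hour"], row["page"])
            for metric in METRICS:
                totals[key][metric] += row.get(metric, 0)
    return dict(totals)

result = merge_and_sum(file1, file2, file3)
for key, sums in sorted(result.items()):
    print(key, sums)
```

In this sketch, loading a later file never resets an earlier metric: Count1 keeps the sum from file 1 even after file 2 and file 3 are merged in, which is the outcome expected from the ingestion.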

Can someone please help point out what I am doing wrong? I have referred to http://druid.io/docs/latest/ingestion/update-existing-data.html but I don't fully understand the example given there, especially what the dataSource test1 defined in the example is, or whether I need to set the segments in my config and, if so, how.

