Hi,
I am new to Druid and have been exploring the indexing service on some sample data, and I have a few doubts. I have sample data covering 2 hours (00 and 01). The input is around 1.5 GB per hour. The granularity spec in my configuration is hour. I tried running both the Hadoop index task and the local index task on this data. My Hadoop cluster consists of 1 NameNode, 3 DataNodes, 1 ResourceManager and 1 Job History Server. Each node has 12 GB RAM. I am running the indexing service in local mode.
1. When I run the Hadoop index task on 1 hour's worth of data, it takes around 20-25 minutes to complete. The local index task on the same data takes around 10 minutes.
2. When I run the Hadoop index task on 2 hours' worth of data, it takes around 1 hour to complete. The local index task on the same data takes around 20 minutes.
Q. Why is there such a big difference in execution time between the local index task and the Hadoop index task? How do the two compare in terms of performance? In what scenarios should I be using the Hadoop index task? Given the same data, is the Hadoop index task supposed to run faster than the local index task?
3. The above runs were done with no partitionsSpec specified. Later, I modified the Hadoop index task to use hash-based partitioning. In that case, the determine_hashed_partitioning job succeeds, but the index_generator job fails with the following exception:
Caused by: com.metamx.common.ISE: WTF?! No bucket found for row: MapBasedInputRow{…}
Q. Is there something else that needs to be specified to use hash partitioning? When should we opt for hash partitioning, and what advantages can we expect?
Q. I also observed that when running the Hadoop index task, the last two reducers take a long time to finish, which increases the job execution time significantly. What could be the reason for this behavior?
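One thing that could be checked for the slow reducers is how evenly the input rows are spread across the hourly buckets. Below is a minimal Python sketch of such a check. It assumes each input line is a JSON object with a `timestamp` field in the form `2014-03-15T00:00:00Z`; the helper names `hour_bucket` and `rows_per_hour` are made up for illustration and are not part of Druid. It only counts rows per hour and does not reproduce how Druid assigns rows to reducers.

```python
import json
from collections import Counter
from datetime import datetime


def hour_bucket(ts):
    """Truncate a timestamp like 2014-03-15T00:12:30Z to the start of its hour."""
    # Assumption: timestamps are UTC with second precision and a trailing Z.
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%dT%H:00:00Z")


def rows_per_hour(lines):
    """Count rows per hourly bucket from newline-delimited JSON records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[hour_bucket(record["timestamp"])] += 1
    return counts


# Tiny illustrative input; real input would be the lines of the data file.
sample = [
    '{"timestamp": "2014-03-15T00:00:00Z", "count": 1}',
    '{"timestamp": "2014-03-15T00:45:10Z", "count": 1}',
    '{"timestamp": "2014-03-15T01:05:00Z", "count": 1}',
]
counts = rows_per_hour(sample)
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

For a real file, passing the open file handle works the same way, e.g. `rows_per_hour(open(path))`, where `path` is wherever the input lives. With only two hourly buckets in my data, I would expect this to show two large groups, which would at least be consistent with two reducers carrying most of the load.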
--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/c03c5743-34b2-4e65-8ee8-5b9aac00fdc7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
{"GEO_REGION":"00","CURRENT_PAGE_URL":"page_url","REFERRING_URL":"NULL","FA_LAST_INPUT":"NULL","DEVICE_ID":"generic web browser","COUNTRY_OF_PAGE":"glb","ACCOUNT_TYPE":"NULL","ACCOUNT_COUNTRY":"NULL","timestamp":"2014-03-15T00:00:00Z","REFERRER_NAME":"NULL","LINK_URL":"NULL","MOBILE_DEVICE_NAME":"NULL","GEO_STATE":"NULL","PAGE_NAME":"page","TEMPLATE_NAME":"NULL","GEO_COUNTRY":"PR","PAGE_ERROR_STRING":"NULL","INTERNAL_SEARCH_TERM":"NULL","BUSINESS_CHANNEL_NAME":"ec","PAGE_LINK_NAME":"NULL","SERVER_BUSINESSNAME":"main","REFERRER_SEARCH_KEYWORD":"NULL","REFERRER_TYPE":"NULL","PAGE_GROUP":"page_grp","ACCOUNT_VERIFIED":"NULL","PAGEGROUP_LINK_NAME":"NULL","SCREEN_WIDTH":"1920","LINK_NAME":"NULL","DEVICE_TYPE":"COMPUTER","SCREEN_HEIGHT":"1200","CLIENT_OS":"MAC", "count": 1 }
--
Nishant
Software Engineer | METAMARKETS