Felix,
Our Kafka cluster will contain many topics, each holding a large number of records, and I need an application that commits those records to HBase as fast as possible. That part by itself isn't that interesting, since I already have a Storm topology that reads from Kafka with the high-level API in real time and populates HBase. However, I have a requirement that certain Kafka topics be read at certain times, which makes it difficult to use the Storm topology. I therefore need a standalone, distributable application that can read a topic at a specified time and commit it to HBase, hence the MapReduce approach. (Originally I was thinking of just writing a high-level consumer with the same group id and deploying it to multiple machines, but that seems to replicate what MapReduce already does. I then did some research and found that Camus might suit my needs.)
Please let me know whether this makes sense to you, or whether there is a better solution.
Thanks,
Chen