I'm a newbie in Big Data, and I'm starting with Hadoop.
I have installed Hortonworks HDP 3.1.
I have to design a Big Data layer that ingests large IoT and social media datasets, processes the data with MapReduce jobs, and produces aggregations to store in HBase tables.
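To make the aggregation step concrete, here is a minimal pure-Python sketch of the map/shuffle/reduce pattern I want to run at scale; the sensor records and field names are invented placeholders, not my real schema:

```python
from collections import defaultdict

# Toy IoT records: (sensor_id, temperature reading).
# These values are made up for illustration only.
readings = [
    ("sensor-a", 20.0),
    ("sensor-b", 31.0),
    ("sensor-a", 22.0),
    ("sensor-b", 29.0),
]

# Map phase: emit (key, value) pairs -- here simply the identity mapping.
mapped = [(sensor, temp) for sensor, temp in readings]

# Shuffle phase: group all values belonging to the same key.
grouped = defaultdict(list)
for sensor, temp in mapped:
    grouped[sensor].append(temp)

# Reduce phase: aggregate each group (average temperature per sensor).
averages = {sensor: sum(temps) / len(temps) for sensor, temps in grouped.items()}

print(averages)  # {'sensor-a': 21.0, 'sensor-b': 30.0}
```

In the real pipeline, each aggregated row would be written to an HBase table keyed by sensor id instead of printed.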
For now, my focus is on the data processing part. I'm investigating the Hadoop ecosystem to find a suitable tool for batch data processing.
I have found several candidate tools: Apache Beam, Cascalog, Scalding, and Spark. What do you think of them?
Cascalog's learning curve is not simple. I need your help to understand whether Cascalog is suitable for this use case and whether it is still actively maintained and supported.
I would appreciate some help.