Data processing tool in Hadoop

orazio
Jul 2, 2019, 11:10:50 AM
to cascalog-user
Hi All,

I'm a newbie to Big Data, and I'm starting with Hadoop.
I have installed Hortonworks HDP 3.1.
I have to design a Big Data layer that ingests large IoT and social media datasets, processes the data with MapReduce jobs, and produces aggregations to store in HBase tables.
For now, my focus is on the data processing part. I'm investigating the Hadoop ecosystem to find a suitable tool for batch data processing.
I found many candidate tools, such as Apache Beam, Cascalog, Scalding, and Spark. What do you think about them?
Cascalog's learning curve is not easy. I need your help to understand whether Cascalog is suitable for this purpose and whether it is still maintained and supported.
I would appreciate some help.

Orazio