Prepare Train Set In Hadoop

Sanjay Bhosale

Jul 31, 2013, 10:16:35 AM
to chenn...@googlegroups.com
I want to create a training set from multiple files using Hadoop.
So I want to know which tool is best for this: Pig Latin, Hive, or HBase.
The structure of each file and the number of records in it are dynamic, and I want to combine the files by calculating the ratio of records to take from each file.
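To make the idea of combining by ratios concrete, here is a rough sketch in plain Python (not Hadoop code). The function name `build_train_set`, the source names "a" and "b", and the 3:1 ratio are all made up for illustration; in a real job the records would come from files in HDFS, and something like Pig's SAMPLE and UNION operators might play the same role.

```python
import random

def build_train_set(sources, ratios, total, seed=0):
    """Draw about `total` records, split across sources according to `ratios`."""
    rng = random.Random(seed)
    weight_sum = sum(ratios.values())
    train = []
    for name, records in sources.items():
        # Target count for this source, capped by how many records it holds.
        want = int(round(total * ratios[name] / weight_sum))
        take = min(want, len(records))
        # Sample without replacement so no record is duplicated.
        train.extend(rng.sample(records, take))
    # Mix the sources so the train set is not grouped by origin.
    rng.shuffle(train)
    return train

# Hypothetical in-memory stand-ins for two input files of different sizes.
sources = {
    "a": [("a", i) for i in range(100)],
    "b": [("b", i) for i in range(50)],
}
train = build_train_set(sources, {"a": 3, "b": 1}, total=40)
```

With these made-up inputs the result holds 30 records from "a" and 10 from "b". If a source has fewer records than its share, this sketch simply takes all of them, so the train set can end up smaller than `total`.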
Any details you can provide will be appreciated.