Cassandra + Hadoop JobConf example?


burtonator

Jan 11, 2012, 3:43:15 PM
to peregrine...@googlegroups.com

I posted this to the Cassandra list but didn't get a reply... could anyone here help?

...

I'm trying to port the Hadoop InputFormat to Peregrine (another MapReduce implementation I'm working on) …

The problem is that I can't get it to work with my config because the documentation is a bit sparse.

I could probably spend a ton of time tracking this down but I figured I'd be lazy and just ask :)

Can someone post their example Hadoop JobConf so I can use it as a template?

It's hard to figure out which params are required, optional, etc.

Brian O'Neill

Jan 11, 2012, 5:18:14 PM
to peregrine...@googlegroups.com
Is this what you are looking for?

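// Fragment from the body of a Tool.run() method. Classes used:
//   org.apache.hadoop.mapreduce.Job, org.apache.hadoop.io.{Text, ObjectWritable}
//   org.apache.cassandra.hadoop.{ConfigHelper, ColumnFamilyInputFormat, ColumnFamilyOutputFormat}
//   org.apache.cassandra.thrift.{SlicePredicate, SliceRange}, org.apache.cassandra.utils.ByteBufferUtil
Job job = new Job(getConf(), getConf().get("jobName"));
job.setJarByClass(RubyMapReduce.class);
job.setMapperClass(CassandraMapper.class);
job.setReducerClass(CassandraReducer.class);
job.setInputFormatClass(ColumnFamilyInputFormat.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(ObjectWritable.class);

// Cluster connection: RPC port, a node to contact first, and the cluster's partitioner.
ConfigHelper.setRpcPort(job.getConfiguration(), getConf().get("cassandraPort"));
ConfigHelper.setInitialAddress(job.getConfiguration(), getConf().get("cassandraHost"));
ConfigHelper.setPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");

// Input: the keyspace and column family to read from.
ConfigHelper.setInputColumnFamily(job.getConfiguration(), getConf().get("inputKeyspace"),
        getConf().get("inputColumnFamily"));

// Output: the keyspace and column family to write to.
ConfigHelper.setOutputColumnFamily(job.getConfiguration(), getConf().get("outputKeyspace"),
        getConf().get("outputColumnFamily"));
job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

// Input slice predicate: empty start and finish mean no bound on column names,
// in forward order, capped at MAX_COLUMNS_PER_ROW (a constant not shown in this snippet).
SlicePredicate sp = new SlicePredicate();
SliceRange sr = new SliceRange(ByteBufferUtil.EMPTY_BYTE_BUFFER, ByteBufferUtil.EMPTY_BYTE_BUFFER, false,
        MAX_COLUMNS_PER_ROW);
sp.setSlice_range(sr);
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), sp);

// Report job failure through the exit code instead of always returning 0.
return job.waitForCompletion(true) ? 0 : 1;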
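
On the question of which params are required: the snippet above reads seven keys through getConf().get(...), and any key that is unset comes back as null and is passed straight into the job setup. A minimal sketch of a pre-flight check follows. It assumes a plain Map<String, String> in place of the Hadoop Configuration so that it runs without Hadoop or Cassandra on the classpath. JobConfCheck and missingKeys are hypothetical names, not part of either library, and the host and port values are only examples.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hadoop or Cassandra.
final class JobConfCheck {
    // The keys the snippet above looks up with getConf().get(...).
    static final String[] KEYS_READ_BY_SNIPPET = {
        "jobName", "cassandraHost", "cassandraPort",
        "inputKeyspace", "inputColumnFamily",
        "outputKeyspace", "outputColumnFamily"
    };

    // Returns the keys that are absent or blank, in the order listed above.
    static List<String> missingKeys(Map<String, String> conf) {
        List<String> missing = new ArrayList<>();
        for (String key : KEYS_READ_BY_SNIPPET) {
            String value = conf.get(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("jobName", "demo");            // example value
        conf.put("cassandraHost", "localhost"); // example value
        conf.put("cassandraPort", "9160");      // example value
        // Input and output keyspace / column family are left unset on purpose.
        System.out.println("Missing: " + missingKeys(conf));
    }
}
```

With a real job, the same loop could run over getConf().get(key) at the top of run() and fail fast with a message naming the missing keys.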
--
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/

Kevin Burton

Jan 11, 2012, 8:39:02 PM
to peregrine...@googlegroups.com
Yup… exactly what I needed!  Thanks!
--

Founder/CEO Spinn3r.com

Location: San Francisco, CA
Skype: burtonator

Skype-in: (415) 871-0687

