About exception: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.bson.BSONObject


lida...@gmail.com

Feb 25, 2014, 9:49:18 AM
to mongod...@googlegroups.com
Hi,
Can anyone help me, please?
I am using the mongo-hadoop connector with Hadoop 2. I want to read data from .bson files dumped by mongodump. I wrote my code following the tutorial ( https://github.com/mongodb/mongo-hadoop/blob/master/BSON_README.md ).
I got this exception:
Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.bson.BSONObject
at com.lidl.hadoop.BSONRead$TokenzierMapper.map(BSONRead.java:27)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Here is my code:
public class BSONRead {

private static class TokenzierMapper extends Mapper<Object, BSONObject, Text, IntWritable> {

    @Override
    protected void map(Object key, BSONObject value, Context context) throws IOException,
            InterruptedException {
        logger.info("key:{},<--->value:{}", new Object[] {key, value.toString()});
        System.out.println("key-value:" + key + "---" + value);
        /**
         * 
         final int year = ((Date) pValue.get("_id")).getYear() + 1900; double bid10Year =
         * ((Number) pValue.get("bc10Year")).doubleValue();
         */
        int width = ((Number) value.get("stuffWidth")).intValue();
        if (width >= 500) {
            context.write(new Text(String.valueOf(width)), new IntWritable(1));
        }
    }
}
private static class CountSubmmitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException,
            InterruptedException {
        int count = 0;
        for (IntWritable intWritable : values) {
            count += intWritable.get();
        }
        context.write(key, new IntWritable(count));
    }
}

private static final Logger logger = LoggerFactory.getLogger(WorkerAgeStat.class);

public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
    final Configuration conf = new Configuration();
    // conf.setBoolean("mongo.input.split.create_input_splits", true);
    // conf.setBoolean("bson.split.write_splits", false);
    // conf.setBoolean("bson.split.read_splits", false);
    MongoConfigUtil.setInputFormat(conf, BSONFileInputFormat.class);
    MongoConfigUtil.setOutputURI(conf, DBHandler.getCollection("textInputAssignmentBigWidth"));
    logger.info("conf:{}", conf);
    final Job job = new Job(conf, "submit counter");

    job.setJarByClass(BSONRead.class);

    job.setMapperClass(TokenzierMapper.class);

    job.setReducerClass(CountSubmmitReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path("file:///home/tomcat/tmp/zonlolo/textInputAssignment.bson"));
    job.setOutputFormatClass(MongoOutputFormat.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

Gianfranco

Mar 10, 2014, 1:13:20 PM
to mongod...@googlegroups.com
Hi,

Are you still having this issue?

I believe the error occurs in the map() function, where you call value.toString().
The value variable, which is a BSONObject, does not have a toString() function.

You'll need to say which key of the object you want to get.
For example:
value.get( "x" ).toString()

See if this other example could help you:

Cheers

lida...@gmail.com

Apr 6, 2014, 6:03:11 AM
to mongod...@googlegroups.com
Hi,
Thanks for your reply, although it comes so late.
We have debugged the code. Our conclusion is that the error occurs while the values are being mapped, before any code in the mapper function runs.
The component works fine when the mapper reads values directly from a live MongoDB instance.
Thanks. 
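
A minimal plain-Java sketch of how a ClassCastException can be raised before any line of a map() body runs. It does not use Hadoop or mongo-hadoop: BaseMapper, TextRecord and DocRecord are hypothetical stand-ins for Mapper, org.apache.hadoop.io.Text and org.bson.BSONObject. The point it illustrates is that generic type parameters are erased at runtime, so the framework hands over whatever value type the input format produced, and the cast to the declared value type only fails at the entry to the overriding map() method.

```java
public class ErasureCastDemo {

    // Hypothetical stand-in for org.apache.hadoop.io.Text.
    static class TextRecord {
        final String line;
        TextRecord(String line) { this.line = line; }
    }

    // Hypothetical stand-in for org.bson.BSONObject.
    static class DocRecord {
        final int stuffWidth;
        DocRecord(int stuffWidth) { this.stuffWidth = stuffWidth; }
    }

    // Hypothetical stand-in for a framework base class such as Hadoop's Mapper.
    static class BaseMapper<K, V> {
        protected void map(K key, V value) { }

        // After type erasure K and V are plain Object, so these casts check nothing.
        @SuppressWarnings("unchecked")
        void run(Object key, Object value) {
            map((K) key, (V) value);
        }
    }

    // Declares DocRecord as its value type, like the mapper in the question.
    static class WidthMapper extends BaseMapper<Object, DocRecord> {
        boolean bodyEntered = false;

        @Override
        protected void map(Object key, DocRecord value) {
            // Only reached if the value really is a DocRecord.
            bodyEntered = true;
        }
    }

    // Feeds one value through the mapper and reports what happened.
    static String feed(WidthMapper mapper, Object value) {
        try {
            mapper.run("key", value);
            return "ok";
        } catch (ClassCastException e) {
            // Thrown by the compiler-generated bridge method, before the body runs.
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        WidthMapper mapper = new WidthMapper();
        System.out.println(feed(mapper, new TextRecord("raw line")));
        System.out.println("body entered: " + mapper.bodyEntered);
        System.out.println(feed(mapper, new DocRecord(600)));
        System.out.println("body entered: " + mapper.bodyEntered);
    }
}
```

If the same mechanism is at work in the job above, one thing worth checking is whether the input format is actually set on the Job itself, for example with job.setInputFormatClass(BSONFileInputFormat.class). A Job with no input format set falls back to TextInputFormat, whose values are Text. This is an untested guess based on the posted driver code, not a confirmed diagnosis.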

On Tuesday, March 11, 2014 at 1:13:20 AM UTC+8, Gianfranco wrote:

ghanakota gayatri

Apr 9, 2014, 12:18:16 AM
to mongod...@googlegroups.com
Could you explain what the problem was? I am having the same problem of casting from Text to BSONObject and have not been able to resolve it for two days.

David Beveridge

Jul 14, 2014, 8:22:35 PM
to mongod...@googlegroups.com
Were either of you able to resolve this? We are having the same issue trying to query BSON files with Hive on Elastic MapReduce clusters...