Just for grins, I was trying out Kryo serialization support, via the cascading.kryo project.
My first hurdle was that another dependency in my project indirectly pulled in Kryo 1.04, and that older version is missing a method used by cascading.kryo (which depends on Kryo 2.04).
And since the older version of Kryo uses a different groupId (com.googlecode vs. com.twitter) it was a little confusing sorting out the problem.
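
For anyone hitting the same kind of version clash, one way to see which jar a class is actually being loaded from is to ask the class for its code source. The sketch below is a minimal, stdlib-only helper; the name ClassOrigin is made up for illustration, and the class name com.esotericsoftware.kryo.Kryo passed in main is an assumption about Kryo's package that should be checked against the jars on your classpath.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper, not part of Kryo or cascading.kryo:
// reports the jar or directory a class was loaded from.
final class ClassOrigin {
    private ClassOrigin() {}

    static String describe(Class<?> type) {
        // The code source is null for classes loaded by the bootstrap loader.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return type.getName() + " <- "
                + (location == null ? "(bootstrap or unknown)" : location.toString());
    }

    static String describe(String className) {
        try {
            return describe(Class.forName(className));
        } catch (ClassNotFoundException e) {
            return className + " <- (not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed class name; adjust to whatever class you suspect is duplicated.
        System.out.println(describe("com.esotericsoftware.kryo.Kryo"));
    }
}
```

Running this from inside the test (or with the test classpath) should show whether the older or the newer jar wins, regardless of which groupId it came from.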
The next issue was that my custom class had to be public, not private (I was using a private static class for the test).
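
Going only by what is described above (the class had to be public, and it has a no-arg constructor for Kryo), a quick reflection check can flag classes that don't have that shape before they reach the serializer. This is a sketch under that assumption, not a statement of Kryo's exact rules; SerializableShapeCheck and the two bean classes are hypothetical names.

```java
import java.lang.reflect.Modifier;

// Hypothetical check: is the class public, and does it expose a public no-arg constructor?
final class SerializableShapeCheck {
    private SerializableShapeCheck() {}

    // Stand-ins for a public and a private nested test class.
    public static class PublicBean {
        public PublicBean() {}
    }

    private static class PrivateBean {
        PrivateBean() {}
    }

    static boolean isPublicWithNoArgConstructor(Class<?> type) {
        if (!Modifier.isPublic(type.getModifiers())) {
            return false;
        }
        try {
            // getConstructor() only finds public constructors.
            type.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Exposes the private class so callers outside this file can pass it to the check.
    static Class<?> privateExample() {
        return PrivateBean.class;
    }

    public static void main(String[] args) {
        System.out.println(isPublicWithNoArgConstructor(PublicBean.class));
        System.out.println(isPublicWithNoArgConstructor(PrivateBean.class));
    }
}
```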
So now I've got code that seems to be writing out the serialized custom object: when I inspect the SequenceFile, I see the object's data.
But when I try to read it back in, I get an empty Tuple (no errors, though).
Here's the snippet of code:
    public static class MyCustomClass {
        private String someValue;

        public MyCustomClass() {
            // Empty constructor for Kryo
        }

        public MyCustomClass(String someValue) {
            this.someValue = someValue;
        }

        public String getValue() {
            return someValue;
        }
    }

    @Test
    public void testSerializationOfCustomTypes() throws IOException {
        File tmpDirLHS = new File("build/test/KryoTest/testSerializationOfCustomTypes/in");
        tmpDirLHS.deleteOnExit();

        // Register Kryo alongside whatever serializations are already configured
        JobConf conf = new JobConf();
        conf.set("io.serializations", conf.get("io.serializations") + ",cascading.kryo.KryoSerialization");

        Properties properties = new Properties();
        FlowConnector.setApplicationJarClass(properties, KryoTest.class);
        MultiMapReducePlanner.setJobConf(properties, conf);

        Fields inFields = new Fields("custom");
        String filename = tmpDirLHS.getAbsolutePath();
        Lfs in = new Lfs(new SequenceFile(inFields), filename, true);

        // Write one tuple containing the custom object
        TupleEntryCollector writer = in.openForWrite(conf);
        writer.add(new Tuple(new MyCustomClass("test")));
        writer.close();

        // Read it back from the same tap
        TupleEntryIterator iter = in.openForRead(conf);
        TupleEntry te = iter.next();
        System.out.println("Tuple size: " + te.size());
    }
This prints out "Tuple size: 0".
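
A side note on the io.serializations line in the test: Hadoop's Configuration.get() returns null when a property is unset, so plain string concatenation can produce a value starting with "null,". The sketch below shows a null-safe way to build the comma-separated list that also avoids duplicate entries. SerializationList is a hypothetical helper, and this is not being offered as the cause of the empty Tuple.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: appends a class name to a comma-separated list,
// tolerating a null or empty existing value and skipping duplicates.
final class SerializationList {
    private SerializationList() {}

    static String append(String existing, String className) {
        // LinkedHashSet keeps the original order and drops repeats.
        Set<String> names = new LinkedHashSet<>();
        if (existing != null) {
            for (String name : existing.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        names.add(className);
        return String.join(",", names);
    }
}
```

In the test this would be used as conf.set("io.serializations", SerializationList.append(conf.get("io.serializations"), "cascading.kryo.KryoSerialization")).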
Any ideas what might be going wrong?
Thanks,
-- Ken
--------------------------
Ken Krugler
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr