Kite dataset schema update fails when promoting int to float

Buntu Dev

May 6, 2015, 7:59:39 PM
to cdk...@cloudera.org
The only difference between the existing schema and the new one is that an existing 'int' field is changed to 'float':

old schema: { "name" : "column1", "type" : [ "null", "int" ], "default" : null }
new schema: { "name" : "column1", "type" : [ "null", "float" ], "default" : null }

This is the error:

~~~~~~~
Validation error
org.kitesdk.data.IncompatibleSchemaException: Schema cannot read data written using existing schema. Schema: {
....
....  <new schema>
---
}
Existing schema: {
....
... <old schema>
...
}
at org.kitesdk.data.spi.Compatibility.checkCompatible(Compatibility.java:261)
at org.kitesdk.data.spi.Compatibility.checkUpdate(Compatibility.java:233)
at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.update(FileSystemDatasetRepository.java:171)
at org.kitesdk.data.spi.AbstractDatasetRepository.update(AbstractDatasetRepository.java:46)
at org.kitesdk.cli.commands.UpdateDatasetCommand.run(UpdateDatasetCommand.java:92)
at org.kitesdk.cli.Main.run(Main.java:181)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.kitesdk.cli.Main.main(Main.java:263)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
~~~~~~~~~~~~

According to the Avro schema resolution documentation, int should be promotable to float:

~~~~
 the writer's schema may be promoted to the reader's as follows:
  • int is promotable to long, float, or double
~~~~
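
To make the quoted rule concrete, here is a minimal sketch of the numeric promotions listed in the Avro specification (int to long, float, or double; long to float or double; float to double). It is an illustration only, not Avro's or Kite's code; the names `NUMERIC_PROMOTIONS` and `can_read_primitive` are made up for this example.

```python
# Numeric promotions from the Avro schema resolution rules:
# writer type -> reader types it may be promoted to.
# (Hypothetical helper names; this is not Avro's or Kite's implementation.)
NUMERIC_PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def can_read_primitive(writer: str, reader: str) -> bool:
    """Return True if data written as `writer` can be read as `reader`."""
    return writer == reader or reader in NUMERIC_PROMOTIONS.get(writer, set())


if __name__ == "__main__":
    print(can_read_primitive("int", "float"))  # promotion allowed
    print(can_read_primitive("float", "int"))  # narrowing is not a promotion
```

By this table, reading old int data with a float reader schema is a promotion, while the reverse direction is not.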

Can I update the int to float, or is that considered incompatible? I built kite-dataset from the latest source code.

Thanks!

Joey Echeverria

May 7, 2015, 4:12:15 AM
to Buntu Dev, cdk...@cloudera.org
This looks like a bug in Avro when resolving unions:

https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/parsing/ResolvingGrammarGenerator.java#L494

I think that case statement should include float. I'll see if I can
reproduce this in Avro and then file a JIRA.
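
For reference, a rough sketch of what the union case is expected to do under the spec's rules, assuming a conservative check in which every branch of the writer's union must match (directly or by promotion) some branch of the reader's union. This is not the logic in ResolvingGrammarGenerator; the helper names are hypothetical, and only primitive type names are handled.

```python
# Conservative union compatibility check based on the Avro resolution rules.
# Hypothetical helpers; handles primitive type names only (no records, etc.).
NUMERIC_PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def matches(writer, reader):
    """A writer branch matches a reader branch if equal or promotable."""
    return writer == reader or reader in NUMERIC_PROMOTIONS.get(writer, set())


def as_branches(schema_type):
    """Treat a non-union type as a union with a single branch."""
    return schema_type if isinstance(schema_type, list) else [schema_type]


def can_read(writer_type, reader_type):
    """Every writer branch must match at least one reader branch."""
    reader_branches = as_branches(reader_type)
    return all(
        any(matches(w, r) for r in reader_branches)
        for w in as_branches(writer_type)
    )


if __name__ == "__main__":
    old_type = ["null", "int"]    # "type" of column1 in the existing schema
    new_type = ["null", "float"]  # "type" of column1 in the new schema
    print(can_read(old_type, new_type))  # new schema reading old data
    print(can_read(new_type, old_type))  # old schema reading new data
```

Under these assumptions, the new schema can read data written with the old one, which is the result the update check in the original report would be expected to give.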

--
Joey Echeverria
Senior Infrastructure Engineer

Buntu Dev

May 7, 2015, 7:55:34 PM
to Joey Echeverria, cdk...@cloudera.org
Thanks, Joey, for looking into the issue.