Impala "COMPUTE STATS" fails after Kite Dataset schema update


Buntu Dev

May 1, 2015, 12:55:46 PM
to cdk...@cloudera.org
We updated the Kite Dataset schema to add a new column. When we run "COMPUTE STATS" against the table, we hit an error similar to the one in IMPALA-1104.
We are on CDH 5.3.2 with Impala 2.1, and according to the JIRA this was supposed to be fixed in Impala 2.0. Is this caused by the way kite-dataset applies the schema update, or is it something to debug on the Impala end?


~~~~~
Query: compute stats mytable
ERROR: AnalysisException: Cannot COMPUTE STATS on Avro table 'mytable' because its column definitions do not match those in the Avro schema.
Missing column definition corresponding to Avro-schema column 'thefirstcolumn' of type 'STRING' at position '0'.
Please re-create the table with column definitions, e.g., using the result of 'SHOW CREATE TABLE'

~~~~~

Thanks!

Buntu Dev

May 4, 2015, 1:52:06 PM
to cdk...@cloudera.org
Still running into this issue after upgrading to CDH 5.4. This is happening on the dataset which had a schema update. Any ideas on how to fix this?

Thanks!

Joey Echeverria

May 4, 2015, 3:07:45 PM
to Buntu Dev, cdk...@cloudera.org
That sounds like this issue:

https://issues.cloudera.org/browse/CDK-974

This was fixed after 1.0 so I'm not sure that it's in CDH 5.4.

-Joey



--
Joey Echeverria
Senior Infrastructure Engineer

Buntu Dev

May 4, 2015, 3:29:50 PM
to Joey Echeverria, cdk...@cloudera.org
Thanks Joey, CDH 5.4 has v1.0.

I do see the new column I added via dataset update when I use 'show create table <table>' or 'describe extended <table>'. I'm not clear as to what the patch would do.

Buntu Dev

May 4, 2015, 4:40:35 PM
to Joey Echeverria, cdk...@cloudera.org
I'm looking at the mvn repo to get the latest kite-tools binary, but the latest there is 1.0.0.

I built kite from sources but don't see the kite-tools-xxx-binary.jar. Is there a way to build from sources to generate the binary?

Thanks!

Joey Echeverria

May 4, 2015, 5:10:37 PM
to Buntu Dev, cdk...@cloudera.org
When you build it from sources you should end up with a `kite-dataset`
binary in the `target` directory of the `kite-tools-binary` module.

Buntu Dev

May 4, 2015, 5:33:20 PM
to Joey Echeverria, cdk...@cloudera.org
Yes, found the kite-dataset binary, thanks.

I attempted to reissue an update on a dummy dataset using the latest kite-dataset and see this error message from Impala:

~~~~~
Query: compute stats mytable
ERROR: AnalysisException: Cannot COMPUTE STATS on Avro table 'mytable' because its column definitions do not match those in the Avro schema.
Definition of column 'column20' of type 'int' does not match the Avro-schema column 'column19' of type 'STRING' at position '20'.
Please re-create the table with column definitions, e.g., using the result of 'SHOW CREATE TABLE'
~~~~~


Buntu Dev

May 4, 2015, 6:23:03 PM
to Joey Echeverria, cdk...@cloudera.org
After performing "invalidate metadata <tbl>", I'm able to run "COMPUTE STATS <tbl>" successfully.. thanks a bunch for your input!!
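
For anyone landing on this thread later, the sequence that worked can be sketched as follows. This assumes the statements are run from impala-shell and uses `mytable`, the placeholder table name from the error output above. The likely explanation (an assumption, not confirmed in this thread) is that the schema change was made outside Impala, so Impala's cached table metadata was stale until it was reloaded.

```sql
-- Discard Impala's cached metadata for the table so it is reloaded
-- from the metastore, picking up the column added by the Kite update.
INVALIDATE METADATA mytable;

-- With the column definitions and the Avro schema in agreement again,
-- statistics can be computed.
COMPUTE STATS mytable;
```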