If the input schema of the BigQuery sink contains an integer field, the sink is unable to write records to existing tables. The pipeline fails with an exception like:
org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308) ~[avro-1.8.2.jar:1.8.2]
at io.cdap.plugin.gcp.bigquery.sink.AvroRecordWriter.write(AvroRecordWriter.java:90) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.plugin.gcp.bigquery.sink.AvroRecordWriter.write(AvroRecordWriter.java:37) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.plugin.gcp.bigquery.sink.BigQueryRecordWriter.write(BigQueryRecordWriter.java:58) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
or
org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["long","null"]: 1
at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308) ~[org.apache.avro.avro-1.8.2.jar:1.8.2]
at io.cdap.plugin.gcp.bigquery.sink.AvroRecordWriter.write(AvroRecordWriter.java:90) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.plugin.gcp.bigquery.sink.AvroRecordWriter.write(AvroRecordWriter.java:37) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.plugin.gcp.bigquery.sink.BigQueryRecordWriter.write(BigQueryRecordWriter.java:58) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.plugin.gcp.bigquery.sink.BigQueryRecordWriter.write(BigQueryRecordWriter.java:32) ~[SYSTEM-google-cloud-0.18.0-SNAPSHOT.jar:na]
at io.cdap.cdap.etl.spark.io.TrackingRecordWriter.write(TrackingRecordWriter.java:41) ~[hydrator-spark-core2_2.11-6.5.0-SNAPSHOT
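Both failures point at the same type mismatch: BigQuery INTEGER columns are 64-bit, so when the sink writes to an existing table its Avro writer schema declares the column as long (or ["long","null"] when the column is nullable), while the in-flight record still carries a java.lang.Integer. The following is a minimal, standalone sketch that reproduces the Avro-level error outside the plugin; the class, record, and field names are illustrative, not plugin code.

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

import java.io.ByteArrayOutputStream;

public class IntVsLongRepro {
  public static void main(String[] args) throws Exception {
    // Writer schema as the sink would derive it from an existing table:
    // BigQuery INTEGER is INT64, so the Avro field type is ["long","null"].
    Schema schema = SchemaBuilder.record("Row").fields()
        .name("id").type().unionOf().longType().and().nullType().endUnion().noDefault()
        .endRecord();

    GenericRecord record = new GenericData.Record(schema);
    // The pipeline schema declared the field as int, so the value
    // arrives as a boxed java.lang.Integer.
    record.put("id", 1);

    try (DataFileWriter<GenericRecord> writer =
             new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
      writer.create(schema, new ByteArrayOutputStream());
      // Throws DataFileWriter$AppendWriteException caused by
      // UnresolvedUnionException: Not in union ["long","null"]: 1.
      // With a plain (non-nullable) long field, the cause is instead
      // ClassCastException: java.lang.Integer cannot be cast to java.lang.Long.
      writer.append(record);
    }
  }
}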
To reproduce, create a pipeline whose BigQuery sink input schema contains an integer-type field and run it against a BigQuery sink. If the table does not exist, the pipeline succeeds on the first run but fails on subsequent runs; if the table already exists, the pipeline always fails. A hedged sketch of the coercion a fix would likely need follows below.
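The run-to-run behavior is consistent with that diagnosis: on the first run against a missing table, the sink presumably creates the table from the pipeline's own schema and writes with it, so int round-trips cleanly; once the table exists, the writer schema is derived from the table (long), and every run hits the mismatch. If that is the root cause, the record writer would need a widening coercion before appending. The helper below is hypothetical, not plugin code, and assumes the writer has the Avro field schema available alongside the value.

import org.apache.avro.Schema;

public class WidenSketch {
  // Hypothetical helper: widen a boxed Integer to Long when the writer
  // schema expects a long, including the nullable ["long","null"] union.
  static Object widenToLong(Object value, Schema fieldSchema) {
    Schema.Type expected = fieldSchema.getType();
    if (expected == Schema.Type.UNION) {
      for (Schema branch : fieldSchema.getTypes()) {
        if (branch.getType() == Schema.Type.LONG) {
          expected = Schema.Type.LONG;
          break;
        }
      }
    }
    if (expected == Schema.Type.LONG && value instanceof Integer) {
      return ((Integer) value).longValue();
    }
    return value;
  }
}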