Importing Datastore entities into BigQuery: problem while reading CSV files (regression?)


Julien Piquot

Mar 17, 2015, 11:52:19 AM
to app-engine-...@googlegroups.com
Hi all,

I am using the pipeline and MapReduce APIs to import Datastore entities into BigQuery. My configuration is pretty much the same as the bigqueryload example:
  • The input is a DatastoreInput.
  • The output is a BigQueryGoogleCloudStorageStoreOutput.
  • An additional job, BigQueryLoadGoogleCloudStorageFilesJob, reads the CSV files and fills the table.
With appengine-mapreduce version 0.8.1, everything works fine. After updating the library to 0.8.2, the final staging job fails:

com.google.appengine.tools.mapreduce.bigqueryjobs.RetryLoadOrCleanupJob run: Job failed while writing to Bigquery. Retrying...#attempt 4 Error details : invalid: Invalid path: gs://my_bucket/Job-57656379-0175-4393-afaa-6d70d86a3322/Shard-0000/file-1426599362203 at null

I noticed that, regardless of the library version or the number of Datastore entities, some CSV files may stay empty:

com.google.appengine.tools.mapreduce.impl.WorkerShardTask run: Ending slice after 0 items read and calling the worker 0 times

I am having a hard time understanding what is going on, but my guess is that the 0.8.2 version of BigQueryLoadGoogleCloudStorageFilesJob does not handle empty CSV files well.
Any ideas?
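
One way to test this guess would be to drop zero-length files from the list before it reaches the load step, and see whether the "Invalid path" error disappears. The following is only a minimal sketch of that filtering logic in plain Java. The class name EmptyShardFilter, the method nonEmptyPaths, and the example paths are hypothetical and are not part of appengine-mapreduce; the sketch also assumes the file sizes have already been obtained from Cloud Storage by some other means.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of appengine-mapreduce.
class EmptyShardFilter {

    // Returns the paths whose recorded size is greater than zero,
    // in the iteration order of the given map.
    static List<String> nonEmptyPaths(Map<String, Long> sizeByPath) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Long> entry : sizeByPath.entrySet()) {
            Long size = entry.getValue();
            // Entries with an unknown (null) or zero size are skipped.
            if (size != null && size > 0L) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative paths and sizes only (assumed, not taken from a real job).
        Map<String, Long> sizeByPath = new LinkedHashMap<>();
        sizeByPath.put("gs://example-bucket/Shard-0000/file-a", 0L);
        sizeByPath.put("gs://example-bucket/Shard-0001/file-b", 2048L);

        // Only the non-empty file would be handed to the load step.
        System.out.println(nonEmptyPaths(sizeByPath));
    }
}
```

If the load succeeds once the empty files are excluded, that would support the empty-file explanation; if it still fails, the cause is probably elsewhere.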

Thanks for the help,

Julien

Arie Ozarov

Mar 19, 2015, 8:29:59 PM
to app-engine-...@googlegroups.com
It does not look like BigQueryLoadGoogleCloudStorageFilesJob changed between 0.8.1 and 0.8.2, but would you mind posting this issue here: https://github.com/GoogleCloudPlatform/appengine-mapreduce/issues
Both the appengine-mapreduce and appengine-pipeline projects have moved to GitHub.

Julien Piquot

Mar 20, 2015, 5:32:10 AM
to app-engine-...@googlegroups.com
Hello,

Thanks for your answer. I just posted the same message on GitHub.

Julien