Ingesting BigQuery as a dataset and training creates an error


Byron Rogers

Oct 5, 2023, 7:03:07 PM
to cloud-automl-tables-discuss

Hi,

I seem to be running into an issue with getting an AutoML model to train on a BigQuery dataset.

I created a dataset view in BigQuery

bq://velox-horse1.ml_datasets.conformation
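As a sanity check (not part of the original thread), the bq CLI can confirm the view exists and show its schema; the project, dataset, and view names below are the ones from this thread:

```shell
# Show the view's metadata and schema as JSON.
# bq uses project:dataset.view syntax rather than the bq:// URI form.
bq show --format=prettyjson velox-horse1:ml_datasets.conformation
```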

Using the console, I created a new dataset in Vertex AI and it imported without issue (dataset ID 2137926692931371008).

But when I kicked off an AutoML training job for a tabular model, after 9 minutes or so I got this error:

Training pipeline failed with error message: Access denied when accessing the BigQuery resources or the data backing the BigQuery resources.

I went back to BigQuery, opened the ml_datasets view, and explicitly added service-<myprojectIDwithheld>@gcp-sa-aiplatform.iam.gserviceaccount.com to the list of BigQuery admins on velox-horse1.ml_datasets.
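For reference, a hedged sketch of how to check which project-level roles the Vertex AI service agent currently holds (gcloud, using the project ID from this thread; the member filter matches any Vertex AI service agent):

```shell
# List every project-level role bound to the Vertex AI service agent.
gcloud projects get-iam-policy velox-horse1 \
  --flatten="bindings[].members" \
  --filter="bindings.members:gcp-sa-aiplatform.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
```

Note that dataset-level BigQuery access granted in the console is separate from these project-level IAM bindings, so an empty result here doesn't contradict the console change above.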

I then kicked off another AutoML training run; it started, but at the same 9-minute mark it produced a different error:

Training pipeline failed with error message: BigQuery resource does not exist.

The training pipeline ID is 3342406840385273856

Obviously the resource exists, since the schema was imported for the dataset. Any ideas as to why this would occur?

Thanks in advance

Byron

Byron Rogers

Oct 5, 2023, 8:08:13 PM
to cloud-automl-tables-discuss
I also tried it as an AutoML pipeline and got the error:

 The DAG failed because some tasks failed. The failed tasks are: [exit-handler-1].; Job (project_id = velox-horse1, job_id = 8224429988892377088) is failed due to the above error.; Failed to handle the job: {project_number = 780742208705, job_id = 8224429988892377088}

On the actual failed node the info said

The DAG failed because some tasks failed. The failed tasks are: [tabular-stats-and-example-gen]. 


Oxana Golodyuk

Nov 22, 2023, 4:22:07 PM
to cloud-automl-tables-discuss
Hi Byron,

I had a similar problem at around the same runtime (7-9 minutes), and the issue was permissions to access BigQuery. I found a solution that worked!

When you run training on Vertex AI, the process uses the project's default Vertex AI service account. You need to grant it the BigQuery Data Editor role (or, if you want to be completely sure, the project Editor role) so it can manipulate that project's data in BigQuery. The default service account doesn't come up in your usual service accounts list unless you search for it. It will usually look like service-YOUR_PROJECT_NUMBER@gcp-sa-aiplatform.iam.gserviceaccount.com
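A minimal command-line sketch of that grant, assuming you substitute your own project ID (the role name roles/bigquery.dataEditor corresponds to "BQ Editor" above):

```shell
# Substitute your own project ID.
PROJECT_ID=velox-horse1

# Look up the project number, which appears in the service agent's name.
PROJECT_NUMBER=$(gcloud projects describe "$PROJECT_ID" \
  --format="value(projectNumber)")

# Grant the Vertex AI service agent BigQuery Data Editor project-wide.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"
```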

Logging was very helpful for troubleshooting all errors throughout the whole process.

Hope that helps!

Oksana
