Error: Train split contains no sample.


Noel Pineiro

Feb 21, 2022, 6:23:35 PM
to cloud-automl-tables-discuss
Hello

I got the error "Train split contains no sample." when I tried to train an AutoML model for structured table data.

Before writing here, I tried a random train/test split and then created a column for a manual split, but I got the same error both times.

Luciano Giavedoni

Apr 2, 2022, 8:49:15 PM
to cloud-automl-tables-discuss
Hi,
I am getting the same error. Have you been able to solve this? Thanks in advance

Chenyu Zhao

Apr 5, 2022, 2:07:26 PM
to Luciano Giavedoni, cloud-automl-tables-discuss
Hi Luciano,

Is it possible for you to share your dataset?
If you're using a manual split, you need to make sure that there are samples in the TRAIN split.
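A quick local check along these lines can confirm whether any rows are actually labelled for training before uploading. This is only a sketch: the column name `split` and the sample rows are hypothetical, and the label values `TRAIN`, `VALIDATE`, and `TEST` are assumed here, so adjust them to whatever your dataset uses.

```python
import csv
import io
from collections import Counter


def count_splits(csv_text, split_column="split"):
    """Count how many rows carry each label in the manual split column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[split_column].strip() for row in reader)


# Hypothetical sample where nobody was assigned to the TRAIN split.
sample = """Time,Foo,Bar,split
2022/01/01,abc,123,TEST
2022/01/02,def,456,VALIDATE
2022/01/03,ghi,789,TEST
"""

counts = count_splits(sample)
if counts["TRAIN"] == 0:
    print("No rows are labelled TRAIN:", dict(counts))
```

Note that the comparison is case-sensitive, so a label such as `train` would also show up as zero `TRAIN` rows here.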


-Chenyu


Luciano Giavedoni

Apr 6, 2022, 12:11:57 PM
to cloud-automl-tables-discuss
Thanks Chenyu for looking into this.

I have tried both data split modes, Random and Chronological. In both cases it says that there is no data, but there are 1.8M rows and a lot of distinct values in the Date column.

Find attached the specifications.
Attachments: Screen Shot 2022-04-06 at 9.08.29 AM.png, Screen Shot 2022-04-06 at 9.08.14 AM.png, Screen Shot 2022-04-06 at 9.07.45 AM.png

Chenyu Zhao

Apr 7, 2022, 3:52:03 PM
to Luciano Giavedoni, cloud-automl-tables-discuss
Are you perhaps overriding the transformations when training?
This can cause certain rows to be invalid and thus excluded from training, e.g. if you use a "Numeric" transformation for a column that contains text values, or if your timestamps don't match a valid timestamp format.



Luciano Giavedoni

Apr 7, 2022, 4:22:58 PM
to Chenyu Zhao, cloud-automl-tables-discuss
Interesting, will validate. 

But I am getting the same error when I select the random option, and I assume that in that case the conversion is not involved, right? 

Chenyu Zhao

Apr 7, 2022, 4:38:13 PM
to Luciano Giavedoni, cloud-automl-tables-discuss
The validation occurs over all of the columns, not just the column used to split the data.

e.g. if you have a dataset like:

Time, Foo, Bar
2022/01/01, abc, 123
2022/01/02, def, 456
2022/01/03, ghi, xyz

And you set the transformation for column "Bar" to be "Numeric", that entire 3rd training example will be thrown out.

This means if you have an entire column that's misconfigured or misformatted, this can cause the entire dataset to be accidentally invalid and result in 0 training examples.
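To make that concrete, here is a rough local approximation of the row filtering described above. It is not the service's actual validation code: the function names, the `CHECKS` table, and the `%Y/%m/%d` timestamp format are all assumptions chosen to match the small example dataset.

```python
import csv
import io
from datetime import datetime


def is_numeric(value):
    """True if the value parses as a float (note: 'nan' and 'inf' also parse)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def is_timestamp(value, fmt="%Y/%m/%d"):
    """True if the value matches the assumed timestamp format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Hypothetical mapping from a transformation name to a per-value check.
CHECKS = {"Numeric": is_numeric, "Timestamp": is_timestamp, "Text": lambda v: True}


def valid_rows(csv_text, transformations):
    """Split rows into (kept, dropped); a row is dropped if ANY column fails."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    kept, dropped = [], []
    for row in reader:
        ok = all(CHECKS[kind](row[col].strip()) for col, kind in transformations.items())
        (kept if ok else dropped).append(row)
    return kept, dropped


data = """Time, Foo, Bar
2022/01/01, abc, 123
2022/01/02, def, 456
2022/01/03, ghi, xyz
"""

# "Bar" as Numeric: only the third row is thrown out.
kept, dropped = valid_rows(data, {"Time": "Timestamp", "Foo": "Text", "Bar": "Numeric"})
print(len(kept), "kept;", len(dropped), "dropped")

# "Foo" misconfigured as Numeric: every row fails, leaving zero training examples.
kept_bad, dropped_bad = valid_rows(data, {"Time": "Timestamp", "Foo": "Numeric", "Bar": "Numeric"})
print(len(kept_bad), "kept;", len(dropped_bad), "dropped")
```

Running a check like this over a sample of the real data, one column at a time, is one way to spot which column is wiping out the rows.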

Luciano Giavedoni

Apr 8, 2022, 5:51:31 PM
to Chenyu Zhao, cloud-automl-tables-discuss
Yes, that works!! 

Thanks for your help with this. It seems the automated conversion was not getting everything right. I made a few updates manually and (only for testing) removed the tricky columns (like timestamps).


Chenyu Zhao

Apr 8, 2022, 6:45:13 PM
to Luciano Giavedoni, cloud-automl-tables-discuss
Excellent!

I'm a bit surprised the automated transformation detection didn't get it right though. Can you let me know what you had to manually change to get it to work?