Text Extraction training requires training, validation and test datasets to be non-empty

Earl Potters

Oct 12, 2021, 9:22:08 AM
to cloud-nl-discuss
I have labeled 56 data points for entity extraction. However, when I create a training model and select 'randomly assigned', I get this error:

Training pipeline failed with error message: Text Extraction training requires training, validation and test datasets to be non-empty. The number of data items selected for training is 56, for validation is 0, for test is 0.

I set the dataset ratio to 80:10:10 for training:validation:test.

I was able to get it working by assigning the items manually, but I would like to know what is causing the error.

dikaur

Oct 13, 2021, 3:26:54 PM
to cloud-nl-discuss

Hello,

Please make sure your dataset meets the requirements listed in [1]: empty strings cannot be used, and every row must have a value for the column. As stated in the documentation [2], rows are selected for a data split randomly, but deterministically. If you are not satisfied with the generated data splits, you must use a manual split or change the training data. Training a new model with the same training data results in the same data split.
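
For anyone who ends up using a manual split, here is a minimal Python sketch of one way to divide 56 items 80:10:10 so that all three sets are non-empty and the result is the same on every run. It is only an illustration, not how the service assigns items internally. The names `stable_key` and `manual_split` and the `doc-N` item IDs are made up for this example.

```python
import hashlib


def stable_key(item_id):
    # Hash the ID so the ordering looks random but never changes between runs.
    return hashlib.sha256(item_id.encode("utf-8")).hexdigest()


def manual_split(item_ids, validation_frac=0.1, test_frac=0.1):
    # Order items by their hash, then slice into three non-empty sets.
    ordered = sorted(item_ids, key=stable_key)
    n = len(ordered)
    # Force at least one item into validation and test, even for small datasets.
    n_validation = max(1, round(n * validation_frac))
    n_test = max(1, round(n * test_frac))
    n_train = n - n_validation - n_test
    if n_train < 1:
        raise ValueError("not enough items for three non-empty sets")
    return {
        "training": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_validation],
        "test": ordered[n_train + n_validation:],
    }


# Hypothetical item IDs standing in for the 56 labeled documents.
items = [f"doc-{i}" for i in range(56)]
split = manual_split(items)
print({name: len(members) for name, members in split.items()})
```

With 56 items this gives 44 for training, 6 for validation and 6 for test. The resulting assignment could then be applied to the dataset as a manual split.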

Best Regards

Dilpreet

[1] https://cloud.google.com/automl-tables/docs/prepare#determining_what_to_include_in_your_dataset

[2] https://cloud.google.com/automl-tables/docs/prepare#ml-use
