Text Extraction training requires training, validation and test datasets to be non-empty

Earl Potters

Oct 12, 2021, 9:22:08 AM

to cloud-nl-discuss

I have labeled 56 data points for entity extraction. However, when I create a training model and select 'randomly assigned',

I get this error:

Training pipeline failed with error message: Text Extraction training requires training, validation and test datasets to be non-empty. The number of data items selected for training is 56, for validation is 0, for test is 0.

I set the dataset ratio to 80:10:10 for training:validation:test.

I was able to get training to work by assigning the items manually, but I would like to know what is causing the error.

dikaur

Oct 13, 2021, 3:26:54 PM

to cloud-nl-discuss

Hello,

Please make sure your dataset meets the following requirements [1]. Empty strings cannot be used, and every row must have a value for the column. As stated in the documentation [2], rows are selected for a data split randomly but deterministically. If you are not satisfied with the generated data splits, you must use a manual split or change the training data. Training a new model with the same training data results in the same data split.

Best Regards

Dilpreet

[1] https://cloud.google.com/automl-tables/docs/prepare#determining_what_to_include_in_your_dataset

[2] https://cloud.google.com/automl-tables/docs/prepare#ml-use
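
For anyone preparing a manual split for a small dataset like the 56 items above, here is a minimal Python sketch of one way to do it. It is only an illustration, not the service's own logic: the function name `manual_split`, the seed, and the `doc-N` item names are all hypothetical, and how each assignment is then passed to the service (for example in the import file) is not shown. It shuffles the items reproducibly and forces the validation and test sets to hold at least one item each.

```python
import random


def manual_split(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split items into training/validation/test, each guaranteed non-empty.

    Hypothetical helper for illustration; it does not call any cloud API.
    """
    if len(items) < 3:
        raise ValueError("need at least one item per split")

    # Shuffle a copy with a fixed seed so the same input gives the same split.
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Round the held-out sizes, but never let them drop to zero.
    n_val = max(1, round(n * fractions[1]))
    n_test = max(1, round(n * fractions[2]))
    n_train = n - n_val - n_test
    if n_train < 1:
        raise ValueError("fractions leave no items for training")

    return {
        "training": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# 56 placeholder item names standing in for the labeled documents.
splits = manual_split([f"doc-{i}" for i in range(56)])
print({name: len(part) for name, part in splits.items()})
# prints {'training': 44, 'validation': 6, 'test': 6}
```

With 56 items and an 80:10:10 ratio, each held-out set would nominally get 5.6 items, so the sketch rounds to 6 and 6 and leaves 44 for training.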
