Hi,
I am training a model in which one column holds free text averaging about 3,000 characters per row: a highly descriptive listing of a vehicle for sale. My label column is the price, and this text column has by far the highest feature importance of any column in the dataset.
My question: How does AutoML handle embedding of long-form natural language for tabular regression? Should I run this text through an embedding model myself and use the resulting vector columns in my training data, or is this something that should be left to AutoML?
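To make the first option concrete, here is a rough sketch of what I mean by vectorizing the column before training. It is only an illustration: the hashed bag-of-words function is a stand-in for a real embedding model, and the names `embed_text`, `expand_row`, `DIM`, the `description` column, and the two sample rows are all made up for this example, not anything taken from AutoML.

```python
import hashlib
import math
from typing import Dict, List

DIM = 16  # hypothetical width; a real embedding model would produce far more dimensions


def embed_text(text: str, dim: int = DIM) -> List[float]:
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Use a stable hash so the same token always lands in the same bucket.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty text yields an all-zero vector rather than dividing by zero.
    return [v / norm for v in vec] if norm > 0 else vec


def expand_row(row: Dict, text_col: str = "description") -> Dict:
    """Replace the free-text column with numeric columns emb_0 .. emb_{DIM-1}."""
    out = {k: v for k, v in row.items() if k != text_col}
    for i, value in enumerate(embed_text(row[text_col])):
        out[f"emb_{i}"] = value
    return out


# Made-up sample rows, only to show the shape of the transformed table.
rows = [
    {"description": "One owner, low mileage, leather seats", "year": 2018, "price": 21500},
    {"description": "Needs work, high mileage, sold as is", "year": 2009, "price": 3200},
]
training_rows = [expand_row(r) for r in rows]
```

With this approach the table I would hand to AutoML contains only numeric columns (`year`, `emb_0` ... `emb_15`, `price`) instead of the raw text column.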
Thank you.
Sincerely,
Jack