When I run h2o prediction on this dataframe, it reports:
Test/Validation dataset has a categorical response column 'verdict' with no levels in common with the model
Do I need to remove columns from the dataframe I'm testing on so that it matches the x columns passed in the original call that trained the model? Is there a canonical way to do this? I'd rather not alter the dataframe, and since it's huge, making a copy with only the necessary columns isn't a great option either.
thanks
I'm inferring there are two options:
1. The test frame includes the response column, with values in common with the training data set, in which case I'll probably get r^2 and other evaluation scores back
2. I omit the response column altogether from the test frame, in which case I will get back predictions and no accuracy scores
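To tell which of these two cases a given test frame falls into, a rough check like the one below might help. It is only a plain-Python sketch and does not call h2o at all: `common_levels`, `can_score`, and the 'good'/'bad' verdict values are hypothetical names and data made up for illustration, and it assumes the error simply means the test labels and the training levels share no values.

```python
def common_levels(train_levels, test_values):
    """Return the sorted response levels that the test data shares with training."""
    # Ignore missing values (None) in the test response column.
    seen = {v for v in test_values if v is not None}
    return sorted(seen & set(train_levels))


def can_score(train_levels, test_values):
    """True when at least one test response value is a level seen in training."""
    return len(common_levels(train_levels, test_values)) > 0


# Hypothetical data: the model was trained on 'good'/'bad' verdicts.
train_levels = ["bad", "good"]

# Option 1: the labels overlap, so evaluation metrics are possible.
print(can_score(train_levels, ["good", "bad", "good"]))  # True

# Labels that differ only in case share nothing with training.
print(can_score(train_levels, ["GOOD", "BAD"]))  # False
```

If the check comes back False, that would correspond to the situation in the error message, and leaving the response column out of the test frame (option 2) would be the way to get predictions only.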
I was confused at first because I wanted to make sure that prediction wasn't using my response column in some way, and that it recognized the column as the response from training and treated it as such. This wasn't clear to me from the error, since I had assumed the column would be ignored entirely.
I think this makes sense now. But please do correct me if any of this is wrong. Thanks!!