I'm applying the as.factor() function to the response column of an H2OFrame object, but when I then call levels() on that column I get NULL, even though is.factor() still returns TRUE. Would anyone know why this is happening?
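For reference, here is a minimal sketch of what I mean (the frame and column names here are just illustrative, and it assumes a running local H2O instance). In the R API, h2o.levels() is the H2O-aware way to inspect the levels; base levels() on an H2OFrame column can come back NULL even when the column really is a factor:

```r
library(h2o)
h2o.init()

# Hypothetical small frame mirroring my setup
df <- as.h2o(data.frame(response = c("a", "b", "c", "d")))
df$response <- as.factor(df$response)

is.factor(df$response)    # TRUE, as in my case
levels(df$response)       # may return NULL on an H2OFrame
h2o.levels(df$response)   # the H2O way to list the factor levels
```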
Ultimately, the error I want to resolve is:
"Details: ERRR on field: _hidden: Model is too large: 184861614 parameters. Try reducing the number of neurons in the hidden layers (or reduce the number of categorical factors)"
My model has 67882 predictors and two hidden layers with 10 neurons each (and a 4-level categorical response). So rather than the model genuinely being too big, I suspect the error message relates to the number of categorical factors, which is somehow not being computed properly, as suggested by levels() returning NULL.
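For what it's worth, here is my back-of-envelope parameter count (a rough sketch of weights plus biases, not H2O's exact internals). If all 67882 predictors were numeric the network would be small, so the 184M figure only makes sense if the predictors are being expanded into a huge number of one-hot input columns:

```r
# Rough parameter count for an (inputs -> 10 -> 10 -> 4) network:
# weights plus biases per layer (an approximation, not H2O internals)
param_count <- function(n_inputs, hidden = c(10, 10), n_out = 4) {
  sizes <- c(n_inputs, hidden, n_out)
  sum(sizes[-length(sizes)] * sizes[-1] + sizes[-1])
}

param_count(67882)        # ~679k if all 67882 predictors were numeric

# Total parameters are 10*n + 164 for n expanded input columns, so
# solving 10*n + 164 = 184861614 for n:
(184861614 - 164) / 10    # 18486145 expanded input columns
18486145 / 67882          # ~272 levels per predictor on average
```

So the error is consistent with each predictor being treated as a factor with a few hundred levels rather than as a number.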
Any help would be much appreciated!
Thank you,
Rajat
Yes, I'm using R. The cardinality of the response factor is indeed 4, and I checked that I'm getting the correct number in the Zeros column. Going back to the original error message (copied below), it looks like the issue is with the model size after all. Could you please comment on the number of parameters: is it unusually large, or more than typically seen?
"Details: ERRR on field: _hidden: Model is too large: 184861614 parameters. Try reducing the number of neurons in the hidden layers (or reduce the number of categorical factors)"
Thanks very much,
Rajat
Thanks Darren. The average cardinality was that high because I was using continuous RPKMs as inputs; now I'm trying integer ceilings of the RPKMs instead. The model trains fine, but now I can't pass newdata properly when calling h2o.predict or confusionMatrix with the H2OFrame of the test object. Could you please have a look at my post about this at https://groups.google.com/forum/#!topic/h2ostream/0taDdN8b0uk when you get a chance?
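In case it helps, this is roughly what I'm attempting (object names like test_df and model are placeholders for my own objects; the usual advice, as I understand it, is that newdata must be an H2OFrame whose columns match the training frame's names and types):

```r
# Hedged sketch: convert the test set to an H2OFrame and make the
# response a factor, matching how the training frame was prepared
test_h2o <- as.h2o(test_df)                      # test_df: my test data.frame
test_h2o$response <- as.factor(test_h2o$response)

# model here is the fitted H2O model (e.g. from h2o.deeplearning)
pred <- h2o.predict(model, newdata = test_h2o)
h2o.confusionMatrix(model, newdata = test_h2o)
```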
I tried to run h2o.init() with the ip and port specified manually, but I had to go with 127.0.0.1: I'm in a cluster environment, and the sysadmin said I don't have a fixed IP address (since there are many nodes available)...
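Concretely, this is the call I ended up with (54321 is just H2O's default port, and nthreads = -1 asks for all available cores on the node):

```r
library(h2o)

# Loopback address, since the node I land on has no fixed external IP;
# this starts (or attaches to) a local H2O instance on the current node
h2o.init(ip = "127.0.0.1", port = 54321, nthreads = -1)
```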
Best regards,
Rajat
I trained a stacked autoencoder for dimensionality reduction and ran into the same problem. The input data is a 5360 x 51000 table, and I created the stacked autoencoder with (12000, 6000, 3000) hidden layers. It raised the same error. How should I handle this case? Did you solve the problem?
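A rough weight count suggests why the same check trips here (this assumes the output layer reconstructs the 51000 inputs from the last hidden layer, which is my guess at the topology, and it ignores biases):

```r
# Assumed topology: 51000 -> 12000 -> 6000 -> 3000 -> 51000
sizes <- c(51000, 12000, 6000, 3000, 51000)
sum(sizes[-length(sizes)] * sizes[-1])   # 855 million weights, biases excluded
```

The 51000 x 12000 first weight matrix alone is 612 million parameters, so the usual fixes would be a much smaller first hidden layer or reducing the input dimensionality (e.g. with PCA or feature filtering) before the autoencoder.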