Remove constant columns in h2o dataframe


Hansu GU

Dec 14, 2016, 1:10:35 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi,

I would like to do something like the following, but it does not seem to be supported in h2o 3.10.1.1:

dat <- data.frame(x=1:5, y=rep(1,5))
df = as.h2o(dat)
newDF = df[, apply(df, 2, function(v) var(v, na.rm=TRUE)!=0)]

I know that when running h2o models (GLM, DRF), constant columns are ignored. However, I would like to export the cleaned dataframe for use in other software, so I need to be able to drop those columns myself. I would appreciate any help.

Hansu

Hansu GU

Dec 15, 2016, 11:45:28 AM
to H2O Open Source Scalable Machine Learning - h2ostream
May I get some help from anyone please?

Lauren DiPerna

Dec 16, 2016, 6:08:25 PM
to Hansu GU, H2O Open Source Scalable Machine Learning - h2ostream
There is a bug with using h2o.var (var()) inside h2o's apply. You can track the JIRA ticket here: https://0xdata.atlassian.net/browse/PUBDEV-3815

If you are going to use the dataset with other modeling software, I would recommend removing the constant columns from a regular R data.frame instead. At the moment you can't do this through h2o's R interface.
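As a rough sketch of that workaround, something along these lines should work in base R on the data frame from the original question. The names keep and cleaned are only illustrative, and the check counts distinct non-NA values rather than calling var(), so it also handles non-numeric columns:

```r
# Work on the plain R data.frame, before (or instead of) calling as.h2o()
dat <- data.frame(x = 1:5, y = rep(1, 5))

# TRUE for every column that has more than one distinct non-NA value
keep <- vapply(dat, function(v) length(unique(v[!is.na(v)])) > 1, logical(1))

# drop = FALSE keeps the result a data.frame even if only one column survives
cleaned <- dat[, keep, drop = FALSE]

names(cleaned)
```

If the data already lives in H2O, as.data.frame(df) should pull it back into R first, assuming the frame is small enough to fit in local memory.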

cheers,

Lauren
