That's in the h2o GitHub repo. There are a number of demos in testdir_demos that are useful as examples.
For instance, this one uses the variable importance from a GBM model result:
# Access Variable Importance from the built model
gbm.VI = my.gbm@model$varimp
print("Variable importance from GBM")
print(gbm.VI)

par(mfrow=c(2,2))
# Plot variable importance from GBM
barplot(t(gbm.VI[1]),las=2,main="VI from GBM")
The print gave the output below. It sounds like your question is just how to extract the right column/row of information from the VI result? I believe it's just a data frame, but maybe you can clarify your question so that someone more expert in R can give you the exact answer.
Here is the test output from the code above (not the barplot, although you can see above how the barplot call extracts a column):
[1] "Variable importance from GBM"
Relative importance Scaled.Values Percent.Influence
duration 165.803620 1.00000000 53.7546812
nr.employed 116.284200 0.70133692 37.7001425
pdays 23.857216 0.14388839 7.7346746
euribor3m 2.499952 0.01507779 0.8105017
age 0.000000 0.00000000 0.0000000
job 0.000000 0.00000000 0.0000000
marital 0.000000 0.00000000 0.0000000
education 0.000000 0.00000000 0.0000000
default 0.000000 0.00000000 0.0000000
housing 0.000000 0.00000000 0.0000000
loan 0.000000 0.00000000 0.0000000
contact 0.000000 0.00000000 0.0000000
month 0.000000 0.00000000 0.0000000
day_of_week 0.000000 0.00000000 0.0000000
campaign 0.000000 0.00000000 0.0000000
previous 0.000000 0.00000000 0.0000000
poutcome 0.000000 0.00000000 0.0000000
emp.var.rate 0.000000 0.00000000 0.0000000
cons.price.idx 0.000000 0.00000000 0.0000000
cons.conf.idx 0.000000 0.00000000 0.0000000
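
To make the column extraction in the barplot call a bit more concrete, here is a minimal sketch in plain R. It assumes the VI result behaves like an ordinary data frame; gbm_vi is a hypothetical stand-in built by hand from the first four rows printed above, so no H2O cluster is needed to run it.

```r
# Hypothetical stand-in for gbm.VI, typed in from the first four rows
# of the printed output above (not produced by H2O here).
gbm_vi <- data.frame(
  "Relative importance" = c(165.803620, 116.284200, 23.857216, 2.499952),
  Scaled.Values         = c(1.00000000, 0.70133692, 0.14388839, 0.01507779),
  Percent.Influence     = c(53.7546812, 37.7001425, 7.7346746, 0.8105017),
  row.names   = c("duration", "nr.employed", "pdays", "euribor3m"),
  check.names = FALSE  # keep the space in "Relative importance"
)

first_col <- gbm_vi[1]     # single bracket: still a data frame, 4 rows x 1 column
vi_row    <- t(first_col)  # t() coerces it to a 1 x 4 numeric matrix
vi_values <- gbm_vi[[1]]   # double bracket: a plain numeric vector

# The variable names end up as the column names of the transposed matrix.
print(colnames(vi_row))
print(vi_values)
```

Passing a one-row matrix like vi_row to barplot draws one bar per column, labelled with the column names, which is presumably why the demo transposes the first column before plotting.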
-kevin
Hi,
The varimp component is just an R data frame, so you can access the names/values in the usual ways. Here's a complete script with iris:
library(h2o)
h <- h2o.init()
hex <- as.h2o(h, iris)
m <- h2o.gbm(x=1:4, y=5, data=hex, importance=TRUE)
m@model$varimp
Relative importance Scaled.Values Percent.Influence
Petal.Width 7.216290000 1.0000000000 51.22833426
Petal.Length 6.851120500 0.9493965043 48.63600147
Sepal.Length 0.013625654 0.0018881799 0.09672831
Sepal.Width 0.005484723 0.0007600474 0.03893596
is.data.frame(m@model$varimp)
# [1] TRUE
names(m@model$varimp)
# [1] "Relative importance" "Scaled.Values" "Percent.Influence"
rownames(m@model$varimp)
# [1] "Petal.Width" "Petal.Length" "Sepal.Length" "Sepal.Width"
m@model$varimp$"Relative importance"
# [1] 7.216290000 6.851120500 0.013625654 0.005484723
etc.
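
Since it is an ordinary data frame, the usual sorting and filtering idioms apply as well. A minimal sketch, assuming nothing beyond base R: vi is a hypothetical copy of the table above typed in by hand (so it runs without an H2O cluster), and the 1 percent cutoff is an arbitrary value chosen for illustration.

```r
# Hypothetical copy of m@model$varimp, typed in from the output above.
vi <- data.frame(
  "Relative importance" = c(7.216290000, 6.851120500, 0.013625654, 0.005484723),
  Scaled.Values         = c(1.0000000000, 0.9493965043, 0.0018881799, 0.0007600474),
  Percent.Influence     = c(51.22833426, 48.63600147, 0.09672831, 0.03893596),
  row.names   = c("Petal.Width", "Petal.Length", "Sepal.Length", "Sepal.Width"),
  check.names = FALSE  # keep the space in "Relative importance"
)

# Sort rows by percent influence, largest first.
vi_sorted <- vi[order(-vi$Percent.Influence), ]

# Names of the variables above an (arbitrary) 1 percent influence cutoff.
important <- rownames(vi)[vi$Percent.Influence > 1]

# Named numeric vector, so a single variable can be looked up by name.
rel <- setNames(vi[["Relative importance"]], rownames(vi))

print(important)
print(rel["Petal.Width"])
```

The same expressions should work on m@model$varimp directly if it is a plain data frame as shown above.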
HTH,
Spencer
The variable importance should be a plain old R data frame. I've opened a JIRA ticket to track this (https://0xdata.atlassian.net/browse/PUB-1020).
Spencer
Launch R
> library(h2o)
> conn <- ...
> train = h2o.getFrame(conn, '<previously_imported_key>')
> is.data.frame(train)
[1] FALSE
> as.data.frame(train)
Imported frames like train are H2O objects, not R data frames. What should be returned as R data frames are things like the variable importances inside models.
Thanks,
Spencer