Poisson/Gamma GBM with monotonicity


Nimrod Maniv

Jan 18, 2022, 5:42:22 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hello,

I'm trying to run a GBM with monotonicity constraints on some of the predictors, with a Gamma or Poisson target distribution. According to the documentation, this can be done by specifying the Tweedie distribution and setting the Tweedie power to 1 (Poisson) or 2 (Gamma). However, when I do that I get an error: _tweedie_power: Tweedie power must be between 1 and 2 (exclusive). Could you please clarify this issue?

I'm using Python 3.8 and h2o version 3.32.1.3 (but also got the same issue with 3.36.0.1).

Below is code to reproduce the issue:

import h2o
from h2o.estimators import H2OGradientBoostingEstimator
from sklearn.datasets import fetch_california_housing

cal_housing = fetch_california_housing()
h2o.init()

# Build an H2OFrame from the features and attach the target column
data = h2o.H2OFrame(cal_housing.data, column_names=cal_housing.feature_names)
data["target"] = h2o.H2OFrame(cal_housing.target)
train, test = data.split_frame([0.6], seed=123)

feature_names = ['MedInc', 'AveOccup', 'HouseAge']
# 1 = increasing constraint, -1 = decreasing constraint
monotone_constraints = {"MedInc": 1, "AveOccup": -1, "HouseAge": 1}

# tweedie_power=2 (intended as Gamma) is what triggers the error
gbm_mono = H2OGradientBoostingEstimator(monotone_constraints=monotone_constraints,
                                         distribution='tweedie', tweedie_power=2)
gbm_mono.train(x=feature_names, y="target", training_frame=train, validation_frame=test)

Thanks,
Nimrod

Darren Cook

Jan 18, 2022, 4:46:13 PM
to h2os...@googlegroups.com
> I'm trying to run a GBM with monotonicity on some of the predictors, and
> with the target distribution of Gamma/Poisson.

Set distribution to "gamma" or "poisson" in those cases:

https://h2o-release.s3.amazonaws.com/h2o/rel-zipf/3/docs-website/h2o-docs/data-science/algo-params/distribution.html

That page also mentions that the 1 to 2 range is exclusive.

If you want to tune tweedie_power as a hyperparameter, I imagine using
1.0001 and 1.9999 for the two extremes is just as good. If 1.9999 is
found to be the best value, use "gamma" in your final model.
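
To illustrate why values just inside the bounds should behave like the limiting cases, here is a minimal sketch that compares the textbook Tweedie unit deviance at 1.0001 and 1.9999 with the Poisson and Gamma deviances. This is plain Python, not H2O code, and the function names are made up for the example; H2O's internal loss may differ in detail.

```python
import math


def tweedie_deviance(y, mu, p):
    """Textbook Tweedie unit deviance, valid for 1 < p < 2, y >= 0, mu > 0."""
    if not 1 < p < 2:
        raise ValueError("p must be strictly between 1 and 2")
    return 2 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )


def poisson_deviance(y, mu):
    """Poisson unit deviance (the p -> 1 limit), for y > 0, mu > 0."""
    return 2 * (y * math.log(y / mu) - (y - mu))


def gamma_deviance(y, mu):
    """Gamma unit deviance (the p -> 2 limit), for y > 0, mu > 0."""
    return 2 * (-math.log(y / mu) + (y - mu) / mu)


if __name__ == "__main__":
    y, mu = 2.0, 3.0  # arbitrary illustrative values
    print("gap to Poisson:", abs(tweedie_deviance(y, mu, 1.0001) - poisson_deviance(y, mu)))
    print("gap to Gamma:  ", abs(tweedie_deviance(y, mu, 1.9999) - gamma_deviance(y, mu)))
```

Both gaps should come out small compared with the deviances themselves, which is consistent with treating the two near-boundary powers as stand-ins for Poisson and Gamma.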

Darren


> According to the documentation
> <https://h2o-release.s3.amazonaws.com/h2o/rel-zipf/3/docs-website/h2o-docs/data-science/gbm.html>,

Michal Kurka

Jan 18, 2022, 5:08:55 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Thanks for answering, Darren. The specific error is triggered by this check:

if (_parms._tweedie_power <= 1 || _parms._tweedie_power >= 2) {
  error("_tweedie_power", "Tweedie power must be between 1 and 2 (exclusive).");
}

The value 2.0 is outside these bounds. Please follow Darren's suggestion.

Michal Kurka

Erin LeDell

Jan 18, 2022, 5:28:31 PM
to H2O Open Source Scalable Machine Learning - h2ostream
We will update the user guide to be clearer, because I agree that "The range is from 1 to 2." implies that 1 and 2 are valid. Thanks for raising this issue, and thanks for your answer, Darren.

-Erin

Nimrod Maniv

Jan 19, 2022, 5:01:58 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Thank you all for answering.
I don't want to set the distribution to Poisson/gamma because then I won't be able to use the monotone_constraints parameter. I will use tweedie with 1.99999 for gamma and 1.00001 for Poisson.
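
A minimal sketch of that workaround: a small helper that nudges a requested power into the open interval (1, 2) before it is passed to the estimator. The helper name and the default eps are made up for this example; they are not part of H2O.

```python
def clamp_tweedie_power(p, eps=1e-5):
    """Hypothetical helper: move p strictly inside (1, 2).

    p <= 1 becomes 1 + eps (near-Poisson), p >= 2 becomes 2 - eps
    (near-Gamma); values already inside [1 + eps, 2 - eps] are unchanged.
    """
    if not 0 < eps < 0.5:
        raise ValueError("eps must be in (0, 0.5)")
    return min(max(p, 1 + eps), 2 - eps)


if __name__ == "__main__":
    for requested in (1, 1.5, 2):
        print(requested, "->", clamp_tweedie_power(requested))
```

The result would then go into the call from the reproduction code above, i.e. H2OGradientBoostingEstimator(monotone_constraints=monotone_constraints, distribution='tweedie', tweedie_power=clamp_tweedie_power(2)).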

Erin, the confusing part in the guide is "...For a normal distribution, enter 0. For Poisson distribution, enter 1. For a gamma distribution, enter 2". It made me think that it's actually possible to set tweedie_power to 1 or 2 to get Poisson and gamma models. Apparently, it's not possible.

Thanks,
Nimrod


On Wednesday, January 19, 2022 at 00:28:31 UTC+2, Erin LeDell wrote:

Michal Kurka

Jan 27, 2022, 10:51:04 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Hello Nimrod,

We clarified the documentation in the latest release: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/tweedie_power.html

After your clarification, I now see that you would like to use monotone constraints for the Poisson/Gamma distributions as well, which we don't currently support. It would make sense to add it. If you do want to see the feature added to H2O, please file a request here: https://h2oai.atlassian.net/

Thank you,
MK

Ludwig Beckerling

Oct 30, 2022, 2:40:15 PM
to H2O Open Source Scalable Machine Learning - h2ostream
Hi,

It does not seem as if a request for this functionality was ever filed, so I opened this ticket: https://h2oai.atlassian.net/browse/PUBDEV-8891

Kind Regards,
Ludwig