How to add weight variable in H2O model


Tanay

Jun 2, 2015, 11:29:07 AM
to h2os...@googlegroups.com
I am building a model in H2O (random forest, GBM, or deep learning), but I can't find an argument for supplying a weight variable, unlike in normal caret models. Can you let me know a way to add a weight variable to an H2O regression or classification model?

Tom Kraljevic

Jun 2, 2015, 12:39:44 PM
to Tanay, h2os...@googlegroups.com

Hi,

Weights are not yet available, but they are coming...

Thanks
Tom


Tanay Chowdhury

Jun 2, 2015, 12:42:38 PM
to h2os...@googlegroups.com, tanaych...@gmail.com
Tom,

Is there any workaround you could suggest that I can do in R?
For example, my target is loss ratio and I want to add policy count
as the weight variable.

Thanks
Tanay

Tom Kraljevic

Jun 2, 2015, 12:51:23 PM
to Tanay Chowdhury, h2os...@googlegroups.com

Sorry, I can't really think of a workaround.
It's just not implemented yet.
It is in progress, though.

Tom

canth...@gmail.com

Jul 6, 2015, 4:59:03 PM
to h2os...@googlegroups.com, tanaych...@gmail.com
A workaround for most models is to duplicate each training row a number of times equal to its weight. This can be practical for small data sets with small integer weights, but obviously isn't going to work if some of your weights are 1000000.

Example:

# Repeat each row of Orange as many times as its circumference value
Orange[rep(seq_len(nrow(Orange)), times = Orange$circumference), ]
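
To see why row duplication can stand in for weights, here is a minimal sketch in base R. It uses lm rather than H2O, and the toy data and column names are made up for illustration. With integer weights, fitting on the expanded rows gives the same coefficients as passing the weights directly:

```r
# Toy data: hypothetical integer weights (e.g. policy counts) per row.
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1),
  w = c(1L, 3L, 2L, 5L, 1L)
)

# Workaround: repeat each row index w times, then subset.
expanded <- df[rep(seq_len(nrow(df)), times = df$w), ]

# Fit once on the expanded rows and once with an explicit weights argument.
fit_dup <- lm(y ~ x, data = expanded)
fit_wts <- lm(y ~ x, data = df, weights = w)

# Compare the coefficient estimates up to floating-point tolerance.
same_coefs <- isTRUE(all.equal(coef(fit_dup), coef(fit_wts)))
print(same_coefs)
```

Only the point estimates match. Standard errors differ because the expanded data set appears to have more observations than really exist. Models that subsample rows or enforce a minimum leaf size may also not behave identically under the two approaches.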

ma...@0xdata.com

Jul 6, 2015, 5:11:39 PM
to h2os...@googlegroups.com, tanaych...@gmail.com, canth...@gmail.com
Hi Tanay,

The underlying model reacts much as canthony suggests, so you can test it that way. Observation weights make the model pay more attention to the rows with a high policy count, just as duplicating those records would.

However, you can now use weights in H2O.
See, for example, the h2o.gbm documentation here, on pg 36:
http://h2o-release.s3.amazonaws.com/h2o/rel-shannon/26/docs-website/h2o-r/h2o_package.pdf

In your case, you can simply pass the name of the policy count column as the weights_column parameter.
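
A minimal sketch of what that call might look like from R. The data frame, its column names, and the helper fit_weighted_gbm are hypothetical. Only h2o.init, as.h2o, and h2o.gbm with its weights_column argument come from the h2o package. The helper is defined but not called, since running it needs the h2o package and a Java runtime:

```r
# Simulated policy-level data; all column names here are hypothetical.
set.seed(1)
n <- 200
policies <- data.frame(
  age          = sample(18:80, n, replace = TRUE),
  region       = factor(sample(c("N", "S", "E", "W"), n, replace = TRUE)),
  loss_ratio   = round(rexp(n, rate = 2), 3),
  policy_count = sample(1:50, n, replace = TRUE)
)

# Hypothetical helper: fit a GBM in which each row is weighted by its
# policy count. Defined only; calling it starts a local H2O cluster.
fit_weighted_gbm <- function(df) {
  h2o::h2o.init()
  train <- h2o::as.h2o(df)
  h2o::h2o.gbm(
    x = c("age", "region"),          # predictors only, weight column left out
    y = "loss_ratio",
    training_frame = train,
    weights_column = "policy_count"  # name of the column holding row weights
  )
}
```

With h2o installed, model <- fit_weighted_gbm(policies) would train the weighted model. The weights are referenced by column name, so the weight column has to be part of the training frame.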

Mark

fansi...@gmail.com

Nov 14, 2016, 11:07:58 PM
to H2O Open Source Scalable Machine Learning - h2ostream, tanaych...@gmail.com

Hi Tom,

I wonder whether H2O can now set weights for samples?

Thanks!
Frank

Tom Kraljevic

Nov 14, 2016, 11:12:57 PM
to fansi...@gmail.com, H2O Open Source Scalable Machine Learning - h2ostream, tanaych...@gmail.com

Yes.

http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/gbm-params/weights_column.html

Tom

mikeg...@gmail.com

Jul 11, 2017, 11:05:13 AM
to H2O Open Source Scalable Machine Learning - h2ostream
Is there a way to add a general weight matrix in the loss function? For example, (Y - m(X))' W (Y - m(X)). It seems the current implementation of the weight column only allows for a diagonal weight matrix (after square rooting weights).

Erin LeDell

Jul 11, 2017, 1:18:09 PM
to mikeg...@gmail.com, H2O Open Source Scalable Machine Learning - h2ostream
A single weight column is the only way to get weights into H2O at present.
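
As a minimal sketch of the distinction raised in the question (base R only, with made-up numbers, and not a description of H2O internals), a per-row weight vector w corresponds to the quadratic form with W = diag(w), whereas off-diagonal entries add cross terms between residuals:

```r
set.seed(42)
n <- 5
y    <- rnorm(n)            # observed targets
pred <- rnorm(n)            # stand-in for model predictions m(X)
w    <- c(1, 3, 2, 5, 1)    # hypothetical per-row weights

r <- y - pred               # residual vector

# Per-row weights: weighted sum of squared residuals.
loss_rowwise <- sum(w * r^2)

# The same loss written as a quadratic form with a diagonal weight matrix.
W <- diag(w)
loss_quadratic <- drop(t(r) %*% W %*% r)

print(all.equal(loss_rowwise, loss_quadratic))

# A W with non-zero off-diagonal entries adds cross terms r[i] * r[j],
# which a single column of row weights cannot express.
W_general <- W
W_general[1, 2] <- W_general[2, 1] <- 0.5
loss_general <- drop(t(r) %*% W_general %*% r)
```

Here loss_general differs from loss_quadratic by exactly r[1] * r[2], the cross term introduced by the two off-diagonal entries.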

-Erin



--
Erin LeDell Ph.D.
Statistician & Machine Learning Scientist | H2O.ai
