[ANN] A gradient boosting regressor package

Sina Siadat

Jan 29, 2018, 4:25:25 PM
to golang-nuts
Hi all!

I just wrote a simple gradient boosting regressor in Go. Gradient boosting is a statistical learning method: given a number of samples, it returns a function that fits those data and can be used to predict previously unseen data. The usage is simple. Here's an example:

    trainSamples := []sample.Sample{
        sample.DefaultSample{Xs: map[string]float64{"x": 0}, Y: 10},
        sample.DefaultSample{Xs: map[string]float64{"x": 1}, Y: 10},
        sample.DefaultSample{Xs: map[string]float64{"x": 2}, Y: 20},
        sample.DefaultSample{Xs: map[string]float64{"x": 3}, Y: 20},
        sample.DefaultSample{Xs: map[string]float64{"x": 4}, Y: 5},
        sample.DefaultSample{Xs: map[string]float64{"x": 5}, Y: 5},
    }
    predictFunc := gradboostreg.Learn(trainSamples, 0.5, 10)

    testSamples := []sample.Sample{
        sample.DefaultSample{Xs: map[string]float64{"x": 0.0}, Y: 10},
        sample.DefaultSample{Xs: map[string]float64{"x": 0.5}, Y: 10},
        sample.DefaultSample{Xs: map[string]float64{"x": 2.5}, Y: 20},
        sample.DefaultSample{Xs: map[string]float64{"x": 2.0}, Y: 20},
        sample.DefaultSample{Xs: map[string]float64{"x": 4.5}, Y: 5},
    }

    for i := range testSamples {
        predicted, actual := predictFunc(testSamples[i]), testSamples[i].GetY()
        fmt.Printf("predicted=%.1f actual=%.1f\n", predicted, actual)
    }
    
    // Output:
    // predicted=10.0 actual=10.0
    // predicted=10.0 actual=10.0
    // predicted=20.0 actual=20.0
    // predicted=20.0 actual=20.0
    // predicted=5.0 actual=5.0


Let me know what you think! :)
Thanks,
Sina

Sebastien Binet

Jan 29, 2018, 4:48:43 PM
to Sina Siadat, golang-nuts
hi Sina,

nice!

I have one "drive-by-comment" and a question:
you could have perhaps used gonum for the stats stuff :)

and the question: did you compare your package with XGBoost (which is kind of a standard candle nowadays) in terms of accuracy, speed and memory usage?

cheers,
-s

Sina Siadat

Jan 29, 2018, 5:43:38 PM
to Sebastien Binet, golang-nuts
Hi Sebastien,

Thanks for your comment and question :)

> I have one "drive-by-comment" and a question:
> you could have perhaps used gonum for the stats stuff :)

Actually, I did start with gonum :)) but I thought it was a large dependency and I only needed a few funcs from it, so I decided to copy/write them!

> and the question: did you compare your package with XGBoost (which is kind of a standard candle nowadays) in terms of accuracy, speed and memory usage?

I haven't used XGBoost yet. It looks pretty cool; thanks for mentioning it. I will install it and see if I can compare the two.

Cheers!
Sina

matthe...@gmail.com

Jan 29, 2018, 6:59:11 PM
to golang-nuts
Hi Sina, here’s a general code review.

gradboostreg/tree, stat, sample have no tests.

These sub-packages may be better off in package gradboostreg, since they seem straightforward and confined to this library. Keeping them in sub-packages doesn’t seem to add much value over just having separate files, in my opinion.

If you do keep the sub-packages I suggest changing sample.DefaultSample to just sample.Default.

I’m not sure I understand why tree.Decider as an interface is necessary. The only use in the package is to return a Decider from NewDecisionTree. I would define such an interface where it’s used.

The godoc has no documentation.

Do you intend to add an open source license? If not, be sure to read the GitHub default license.

Matt

Sina Siadat

Jan 30, 2018, 1:47:33 AM
to matthe...@gmail.com, golang-nuts
Hi Matt,

Thank you for the code review!


> gradboostreg/tree, stat, sample have no tests.

Yes, thanks for the reminder. I should write tests for them!

> These sub-packages may be better off in package gradboostreg, since they seem straightforward and confined to this library. Keeping them in sub-packages doesn’t seem to add much value over just having separate files, in my opinion.

I put them in separate packages to keep the code readable, to isolate the intermediary functions, and to show that they are only helper funcs by unexporting them. I know it's not interesting to have a package with one file that only one other package uses, but the benefits outweigh the uninterestingness for me. Maybe that's because I have no problem jumping around the files.

> If you do keep the sub-packages I suggest changing sample.DefaultSample to just sample.Default.

Great idea! I ran out of ideas for what to call it.

> I’m not sure I understand why tree.Decider as an interface is necessary. The only use in the package is to return a Decider from NewDecisionTree. I would define such an interface where it’s used.

I was thinking that you can implement deciders other than decision trees, e.g. a Gaussian decider!

> Do you intend to add an open source license? If not, be sure to read the GitHub default license.


I don't know much about licenses and have no preference, except that MIT looked like the most permissive one to me. I will add a license later today.

Thanks again for your time reviewing the project :)
Sina

matthe...@gmail.com

Jan 30, 2018, 10:00:59 AM
to golang-nuts
Thanks for sharing gradboostreg here.

Matt

Aliaksandr Valialkin

Feb 5, 2018, 7:51:38 AM
to golang-nuts

Sina Siadat

Feb 6, 2018, 6:34:32 AM
to Aliaksandr Valialkin, golang-nuts
Thanks for the link!