Interface for the learned output of regression algorithms


Justin Basilico

Apr 11, 2012, 12:47:50 AM
to cognitiv...@googlegroups.com
I wanted to get your thoughts on adding an interface for the learned
output of regression algorithms, similar to the Categorizer interface
for the output of categorization learning algorithms. I've started to
put together some basic regression ensembles and noticed that our
regression code lacks unity, so a simple interface for those types of
models could be useful, if only to help organize things. My thought is
to create a Regressor interface in the learning package that extends
Evaluator<T, Double> and adds at least one method, public double
evaluateAsDouble(final T input), for convenience and efficiency by
avoiding the creation of the extra Double object when we can.

We could also potentially add other methods to such an interface, for
example to get the upper and lower bounds of the regressor's output in
cases where they are known, so that bounded regression models like
logistic regression could expose that information.

I realize we have the UnivariateScalarFunction interface, but that maps
a double to a double; this new interface would take a generic input
type and output a double.
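
Roughly, the shape I have in mind is the sketch below. The bound
getters are only a possibility, and their names are placeholders:

import gov.sandia.cognition.evaluator.Evaluator;

public interface Regressor<InputType>
    extends Evaluator<InputType, Double>
{
    /**
     * Evaluates the input as a primitive double, avoiding the
     * creation of an extra Double object.
     */
    public double evaluateAsDouble(final InputType input);

    // Possibly also, for bounded regressors (placeholder names):
    // public double getMinOutputValue();
    // public double getMaxOutputValue();
}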

I have a basic implementation of this change, but I wanted to run it
by people to see whether the name Regressor makes sense, whether there
is a better name, or whether there is a better way to unify some of
the different regression support.

Thanks,
Justin

Dixon, Kevin R

Apr 11, 2012, 8:45:26 AM
to cognitiv...@googlegroups.com
I agree that the regression (and also the parameter-minimization) code isn't tied together well.

My only concern with an interface that forces regression onto a Double/scalar output is that I/we also do a lot of multivariate regression... Could we have an interface, perhaps "Regressor", with two sub-interfaces, UnivariateRegressor and MultivariateRegressor?
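
One way the hierarchy could look (just a sketch for discussion; the
names and type parameters are placeholders, not settled):

public interface Regressor<InputType, OutputType>
    extends Evaluator<InputType, OutputType>
{
}

public interface UnivariateRegressor<InputType>
    extends Regressor<InputType, Double>
{
    public double evaluateAsDouble(final InputType input);
}

public interface MultivariateRegressor<InputType>
    extends Regressor<InputType, Vector>
{
}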

Thoughts?

--
Kevin R. Dixon
Sandia National Laboratories (05635)
MS1248, TA-I: 770/202
tel: (505) 284-5615
fax: (505) 284-3977

Justin Basilico

Apr 12, 2012, 1:49:31 AM
to cognitiv...@googlegroups.com
Yeah, I've been trying to understand the interplay between some of the
existing interfaces we have here. For instance, we have
UnivariateScalarFunction, which is double to double, and
VectorFunction, which is vector to vector. What I think is missing is
a generic X-to-double interface for (univariate) regression, though I
think normally when people say univariate regression they mean one
input and one output, and multivariate means multiple inputs and one
output. Is that what you mean by MultivariateRegressor, or do you mean
multiple outputs (a vector)? If the latter, I had been thinking of
that case as more of a generalization to an X-to-vector interface,
which we at least have the VectorOutputFunction for.
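
To spell out the type signatures (paraphrasing the existing interfaces
from memory; the proposed one is just a placeholder):

// Existing, roughly:
UnivariateScalarFunction          // double -> double
VectorFunction                    // Vector -> Vector
Evaluator<InputType, OutputType>  // generic -> generic

// What seems to be missing:
Regressor<InputType>              // generic InputType -> double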

Speaking of univariate linear regression, I also noticed that we were
missing a "learner" for simple y = mx + b linear regression, so I've
put together a learner for that as well.
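
The core of it is just the standard closed-form least-squares fit,
something like the sketch below (a generic illustration with
placeholder names, not the actual class):

// Least-squares fit of y = m*x + b from paired samples.
public final class SimpleLinearFit
{
    /** Returns {slope, intercept} for the least-squares line. */
    public static double[] fit(final double[] x, final double[] y)
    {
        final int n = x.length;
        double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;
        for (int i = 0; i < n; i++)
        {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
        }
        final double meanX = sumX / n;
        final double meanY = sumY / n;
        // m = cov(x, y) / var(x); b = meanY - m * meanX
        final double m = (sumXY - n * meanX * meanY)
            / (sumXX - n * meanX * meanX);
        final double b = meanY - m * meanX;
        return new double[] { m, b };
    }
}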

Thanks, : )
Justin

Dixon, Kevin R

Apr 12, 2012, 9:18:00 AM
to cognitiv...@googlegroups.com
I think of multivariate regression as something-to-many outputs... usually Vector to Vector. When you mentioned a "Regressor", I thought you meant something like

UnivariateRegressor<InputType> extends SupervisedBatchLearner<InputType,Double>

MultivariateRegressor<InputType> extends SupervisedBatchLearner<InputType,Vector>

That is, the algorithm that returns an Evaluator, not the Evaluator itself.

We do have arbitrary polynomial regression... but it's buried and could definitely be cleaned up:

gov.sandia.cognition.learning.function.scalar.PolynomialFunction.Regression

And each subclass of PolynomialFunction.ClosedForm has a fit() function.

--
Kevin R. Dixon
Sandia National Laboratories (05635)
MS1248, TA-I: 770/202
tel: (505) 284-5615
fax: (505) 284-3977

Justin Basilico

Apr 13, 2012, 12:59:41 AM
to cognitiv...@googlegroups.com
No, I think having the SupervisedBatchLearner<InputType, Double>
interface for learning regression is fine; I am talking more about the
output of the learning. Just as Categorizer<InputType, CategoryType>
makes sense for categorization, I'm hoping that Regressor<InputType>
extends Evaluator<InputType, Double> also makes sense. I know the
terminology for regression doesn't really have a standard term for the
function that it learns, so I was thinking of just calling it a
regressor in Foundry lingo. But I wanted to run it by you to see if
there is some better terminology that I'm overlooking. : )

One of the main reasons I'm suggesting Regressor is to have an
interface that unifies the convention we've used in some other places
of providing a double evaluateAsDouble(InputType input) method
alongside the one from Evaluator<InputType, Double>. That avoids the
boxing/unboxing memory and speed overhead when evaluating lots of
regressions, especially in regression ensembles.
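
For example, an averaging ensemble could then stay entirely in
primitives. A hypothetical sketch (not an existing class; it assumes
the Regressor interface from earlier in the thread):

import java.util.List;

public class AveragingRegressorEnsemble<InputType>
    implements Regressor<InputType>
{
    private final List<? extends Regressor<InputType>> members;

    public AveragingRegressorEnsemble(
        final List<? extends Regressor<InputType>> members)
    {
        this.members = members;
    }

    @Override
    public double evaluateAsDouble(final InputType input)
    {
        double sum = 0.0;
        for (final Regressor<InputType> member : this.members)
        {
            // Stays in primitive doubles; no Double objects created.
            sum += member.evaluateAsDouble(input);
        }
        return sum / this.members.size();
    }

    @Override
    public Double evaluate(final InputType input)
    {
        return this.evaluateAsDouble(input);
    }
}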

Yeah, that PolynomialFunction.Regression (and the Linear closed form)
is pretty buried, and putting one together for the simple linear case
seems a little convoluted, though it's powerful when you do want the
polynomials. Anyway, I think a simple class for doing basic univariate
linear regression that avoids some of the conversion stages inside it
could be helpful. It may also be interesting to have an incremental
version of it.
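
An incremental version could just maintain running sums of the
sufficient statistics, roughly like the sketch below (placeholder
names, not an existing Foundry class):

// Incremental y = m*x + b estimator that keeps running sums.
public final class IncrementalSimpleLinearRegression
{
    private long n = 0;
    private double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;

    /** Incorporates one (x, y) sample into the running sums. */
    public void update(final double x, final double y)
    {
        this.n++;
        this.sumX += x;
        this.sumY += y;
        this.sumXY += x * y;
        this.sumXX += x * x;
    }

    public double getSlope()
    {
        final double meanX = this.sumX / this.n;
        final double meanY = this.sumY / this.n;
        return (this.sumXY - this.n * meanX * meanY)
            / (this.sumXX - this.n * meanX * meanX);
    }

    public double getIntercept()
    {
        return (this.sumY / this.n)
            - this.getSlope() * (this.sumX / this.n);
    }
}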

Thanks, : )
Justin

Dixon, Kevin R

Apr 16, 2012, 10:57:20 AM
to cognitiv...@googlegroups.com
Actually, I'm kind of liking the idea of unifying the regression algorithms under an interface umbrella, regardless of whether we add a "Regressor" interface for the Evaluators.

Thoughts?

--
Kevin R. Dixon
Sandia National Laboratories (05635)
MS1248, TA-I: 770/202
tel: (505) 284-5615
fax: (505) 284-3977