Apr 11, 2012, 12:47:50 AM

to cognitiv...@googlegroups.com

I wanted to get your thoughts on adding an interface for the learned output of regression algorithms, similar to the Categorizer interface for the output of categorization learning algorithms. I've started to put together some basic regression ensembles and noticed that our regression code seems to lack unity, so a simple interface for those types of models could be useful, if only to help organize things. My thought was to create a Regressor interface in the learning package that would extend Evaluator<T, Double> with at least one method, public double evaluateAsDouble(final T input), for the convenience and efficiency of avoiding the creation of an extra Double object when we can.
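
As a rough sketch of what the proposal above might look like (not actual Foundry code): the Evaluator interface below is a local stand-in so the example compiles on its own, LengthRegressor is a made-up model, and the default method is a convenience of newer Java versions that the real interface would not necessarily use.

```java
// Sketch only. Everything is nested in one class so the file is self-contained.
final class RegressorSketch {

    /** Local stand-in for the existing Evaluator<InputType, OutputType> interface. */
    interface Evaluator<InputType, OutputType> {
        OutputType evaluate(InputType input);
    }

    /** Proposed interface: the learned output of a regression algorithm. */
    interface Regressor<InputType> extends Evaluator<InputType, Double> {

        /** Evaluates the input as a primitive double, so no Double is allocated. */
        double evaluateAsDouble(InputType input);

        /** Boxed form required by Evaluator; delegates to the primitive method. */
        @Override
        default Double evaluate(InputType input) {
            return evaluateAsDouble(input);
        }
    }

    /** Hypothetical example model: y = 2 * (length of the string) + 1. */
    static final class LengthRegressor implements Regressor<String> {
        @Override
        public double evaluateAsDouble(String input) {
            return 2.0 * input.length() + 1.0;
        }
    }

    public static void main(String[] args) {
        Regressor<String> regressor = new LengthRegressor();
        System.out.println(regressor.evaluateAsDouble("abc")); // primitive path
        System.out.println(regressor.evaluate("abc"));         // boxed path via Evaluator
    }
}
```

Code that only knows about Evaluator keeps working through evaluate, while callers that know they have a Regressor can stay on the primitive path.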

We could also potentially add other methods to such an interface, such as getters for the upper and lower bounds of the regressor's output in cases where they are known. That way, bounded regression models like logistic regression could expose their output range.
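
To make the bounds idea concrete, here is a minimal sketch. The names BoundedRegressor, getLowerBound, getUpperBound, LogisticModel, and normalized are all hypothetical, chosen only for illustration; nothing here is existing Foundry API.

```java
// Sketch only: a regressor that reports the known range of its output.
final class BoundedRegressorSketch {

    /** Hypothetical interface for a regressor whose output range is known. */
    interface BoundedRegressor<InputType> {
        double evaluateAsDouble(InputType input);
        double getLowerBound();
        double getUpperBound();
    }

    /** Logistic model: 1 / (1 + exp(-(w . x + b))), which always lies in [0, 1]. */
    static final class LogisticModel implements BoundedRegressor<double[]> {
        private final double[] weights;
        private final double bias;

        LogisticModel(double[] weights, double bias) {
            this.weights = weights.clone();
            this.bias = bias;
        }

        @Override
        public double evaluateAsDouble(double[] input) {
            if (input.length != weights.length) {
                throw new IllegalArgumentException("input has the wrong dimensionality");
            }
            double z = bias;
            for (int i = 0; i < weights.length; i++) {
                z += weights[i] * input[i];
            }
            return 1.0 / (1.0 + Math.exp(-z));
        }

        @Override
        public double getLowerBound() { return 0.0; }

        @Override
        public double getUpperBound() { return 1.0; }
    }

    /** Example consumer: rescales any bounded regressor's output into [0, 1]. */
    static <T> double normalized(BoundedRegressor<T> regressor, T input) {
        double low = regressor.getLowerBound();
        double high = regressor.getUpperBound();
        return (regressor.evaluateAsDouble(input) - low) / (high - low);
    }

    public static void main(String[] args) {
        LogisticModel model = new LogisticModel(new double[] {1.0}, 0.0);
        System.out.println(normalized(model, new double[] {0.0}));
    }
}
```

An ensemble or calibration step could use the bounds generically, as normalized does, without knowing which kind of model it was handed.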

I realize we have the UnivariateScalarFunction interface, but that is just between two doubles; this new interface would have a generic input type and output a double.

I have a basic implementation of this change, but I wanted to run it by people to see if the name Regressor makes sense, or if we could come up with a better name or a better idea of how to unify some of our regression support.

Thanks,

Justin

Apr 11, 2012, 8:45:26 AM

to cognitiv...@googlegroups.com

I agree that the Regression (and also the parameter-minimization) codes aren't tied together well.

My only concern with an interface that forces Regression onto a Double/scalar is that I/we also do a lot of multivariate regression... Could we have an interface, perhaps "Regressor", which has two sub-interfaces, UnivariateRegressor and MultivariateRegressor?

Thoughts?

--

Kevin R. Dixon

Sandia National Laboratories (05635)

MS1248, TA-I: 770/202

tel: (505) 284-5615

fax: (505) 284-3977

________________________________________

From: cognitiv...@googlegroups.com [cognitiv...@googlegroups.com] on behalf of Justin Basilico [jbas...@gmail.com]

Sent: Tuesday, April 10, 2012 10:47 PM

To: cognitiv...@googlegroups.com

Subject: [EXTERNAL] [Cognitive Foundry] Interface for the learned output of regression algorithms

Apr 12, 2012, 1:49:31 AM

to cognitiv...@googlegroups.com

Yeah, I've been trying to understand the interplay between some of the existing interfaces we have here. For instance, we have UnivariateScalarFunction, which is double to double, and VectorFunction, which is vector to vector. What I think is missing is a generic X-to-double interface and/or (univariate) regression. That said, I think normally when people say univariate regression they mean one input and one output, and multivariate means multiple inputs and one output. Is that what you mean for MultivariateRegressor as well, or do you mean multiple outputs (a vector)? If the latter, I had been thinking of that case as more of a generalization to an X-to-vector interface, for which we at least have VectorOutputFunction.

Speaking of univariate linear regression, I also noticed that we were missing a simple "learner" for y=mx+b linear regression, so I've put together a learner for that as well.
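
For reference, a learner for the simple y=mx+b case could look roughly like the sketch below. It uses the standard closed-form least-squares solution and is not the implementation mentioned above; the class and method names (Line, learn) are made up for illustration.

```java
// Sketch only: batch least-squares fit of y = m*x + b on primitive arrays.
final class SimpleLinearRegressionSketch {

    /** The learned function y = slope * x + intercept. */
    static final class Line {
        final double slope;
        final double intercept;

        Line(double slope, double intercept) {
            this.slope = slope;
            this.intercept = intercept;
        }

        double evaluateAsDouble(double x) {
            return slope * x + intercept;
        }
    }

    /** Fits a line by ordinary least squares. */
    static Line learn(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs");
        }
        final int n = xs.length;

        // First pass: means of x and y.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= n;
        meanY /= n;

        // Second pass: sums of squared deviations and cross deviations.
        double sxx = 0.0;
        double sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = xs[i] - meanX;
            sxx += dx * dx;
            sxy += dx * (ys[i] - meanY);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }

        double slope = sxy / sxx;
        return new Line(slope, meanY - slope * meanX);
    }

    public static void main(String[] args) {
        Line line = learn(new double[] {0, 1, 2, 3}, new double[] {1, 3, 5, 7});
        System.out.println(line.slope + " * x + " + line.intercept);
    }
}
```

Working on primitive arrays keeps the fit free of boxing and of any intermediate vector or polynomial representation.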

Thanks, : )

Justin

Apr 12, 2012, 9:18:00 AM

to cognitiv...@googlegroups.com

I think of multivariate regression as something-to-many outputs... Usually Vector to Vector. When I thought you were talking about a "Regressor" I thought you meant something like

UnivariateRegressor<InputType> extends SupervisedBatchLearner<InputType,Double>

MultivariateRegressor<InputType> extends SupervisedBatchLearner<InputType,Vector>

That is, the algorithm that returns an Evaluator, not the Evaluator itself.

We do have arbitrary polynomial regression... but it's buried and could definitely be cleaned up:

gov.sandia.cognition.learning.function.scalar.PolynomialFunction.Regression

And each subclass of PolynomialFunction.ClosedForm has a fit() function.

--

Kevin R. Dixon

Sandia National Laboratories (05635)

MS1248, TA-I: 770/202

tel: (505) 284-5615

fax: (505) 284-3977

________________________________________

From: cognitiv...@googlegroups.com [cognitiv...@googlegroups.com] on behalf of Justin Basilico [jbas...@gmail.com]

Sent: Wednesday, April 11, 2012 11:49 PM

To: cognitiv...@googlegroups.com

Subject: Re: [EXTERNAL] [Cognitive Foundry] Interface for the learned output of regression algorithms

Apr 13, 2012, 12:59:41 AM

to cognitiv...@googlegroups.com

No, I think having the SupervisedBatchLearner<InputType, Double> interface for learning regression is fine. I am talking more about the output of the learning. While Categorizer<InputType, CategoryType> makes sense, I'm hoping that Regressor<InputType> extends Evaluator<InputType, Double> also makes sense. I know regression terminology doesn't really have a general term for the function that gets learned, so I was thinking of just declaring it a "regressor" in Foundry lingo. But I wanted to run it by you to see if there is some better terminology that I'm overlooking. : )

One of the main reasons I'm suggesting Regressor is to have an interface that unifies a convention we've used in some other places: providing a double evaluateAsDouble(InputType input) method in addition to the one from Evaluator<InputType, Double>. This avoids the memory and speed overhead of boxing/unboxing when evaluating lots of regression functions, especially in regression ensembles.
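
To illustrate the ensemble case, here is a minimal sketch of an averaging ensemble that stays on the primitive path. The Regressor interface is a cut-down local stand-in for the proposed one, and AveragingEnsemble is a hypothetical name, not an existing Foundry class.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: an ensemble that averages its members without boxing any values.
final class EnsembleSketch {

    /** Cut-down stand-in for the proposed Regressor interface. */
    interface Regressor<InputType> {
        double evaluateAsDouble(InputType input);
    }

    /** Averages the outputs of its member regressors. */
    static final class AveragingEnsemble<InputType> implements Regressor<InputType> {
        private final List<Regressor<? super InputType>> members = new ArrayList<>();

        void add(Regressor<? super InputType> member) {
            members.add(member);
        }

        @Override
        public double evaluateAsDouble(InputType input) {
            if (members.isEmpty()) {
                throw new IllegalStateException("ensemble has no members");
            }
            // The running sum is a primitive; no Double objects are created here.
            double sum = 0.0;
            for (Regressor<? super InputType> member : members) {
                sum += member.evaluateAsDouble(input);
            }
            return sum / members.size();
        }
    }

    public static void main(String[] args) {
        AveragingEnsemble<double[]> ensemble = new AveragingEnsemble<>();
        Regressor<double[]> first = x -> x[0] + 1.0;
        Regressor<double[]> second = x -> 3.0 * x[0];
        ensemble.add(first);
        ensemble.add(second);
        System.out.println(ensemble.evaluateAsDouble(new double[] {2.0}));
    }
}
```

With only Evaluator<InputType, Double> available, each member call in the loop would return a boxed Double that is immediately unboxed, which is the overhead the shared method is meant to avoid.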

Yeah, that PolynomialRegression.Linear is pretty buried, and putting one together for the simple linear case seems a little convoluted, though it is powerful when you do want the polynomials. Anyway, I think a simple class for basic univariate linear regression that avoids some of the internal conversion stages could be helpful. It may also be interesting to have an incremental version of it.
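
An incremental version could, for example, keep running sums and recompute the line on demand. The sketch below is one possible approach under that assumption, with made-up names; plain running sums can lose precision on large or badly scaled data, so a more careful version might track means and co-moments instead.

```java
// Sketch only: online least-squares fit of y = m*x + b from running sums.
final class IncrementalLinearRegressionSketch {
    private long count;
    private double sumX;
    private double sumY;
    private double sumXX;
    private double sumXY;

    /** Folds one (x, y) observation into the running sums. */
    void update(double x, double y) {
        count++;
        sumX += x;
        sumY += y;
        sumXX += x * x;
        sumXY += x * y;
    }

    /** Least-squares slope computed from the sums seen so far. */
    double getSlope() {
        double denominator = count * sumXX - sumX * sumX;
        if (count < 2 || denominator == 0.0) {
            throw new IllegalStateException("need at least two distinct x values");
        }
        return (count * sumXY - sumX * sumY) / denominator;
    }

    /** Least-squares intercept computed from the sums seen so far. */
    double getIntercept() {
        return (sumY - getSlope() * sumX) / count;
    }

    double evaluateAsDouble(double x) {
        return getSlope() * x + getIntercept();
    }

    public static void main(String[] args) {
        IncrementalLinearRegressionSketch model = new IncrementalLinearRegressionSketch();
        double[] xs = {0, 1, 2, 3};
        double[] ys = {1, 3, 5, 7};
        for (int i = 0; i < xs.length; i++) {
            model.update(xs[i], ys[i]);
        }
        System.out.println(model.getSlope() + " * x + " + model.getIntercept());
    }
}
```

The model uses constant memory regardless of how many points it has seen, and it can be queried after any update once two distinct x values have arrived.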

Thanks, : )

Justin

Apr 16, 2012, 10:57:20 AM

to cognitiv...@googlegroups.com

Actually, I'm kind of liking the idea of unifying the regression algorithms under an interface umbrella, regardless of adding a "Regressor" interface for Evaluators.

Thoughts?

--

Kevin R. Dixon

Sandia National Laboratories (05635)

MS1248, TA-I: 770/202

tel: (505) 284-5615

fax: (505) 284-3977

________________________________________

From: cognitiv...@googlegroups.com [cognitiv...@googlegroups.com] on behalf of Justin Basilico [jbas...@gmail.com]

Sent: Thursday, April 12, 2012 10:59 PM
