Discriminant Analysis


Lyron Winderbaum

Mar 8, 2017, 10:30:15 PM
to gonum-dev
Hi All,

I've been thinking for a fair while about adding some discriminant analysis/classification 
stuff to 'stat', but I thought I would ask for opinions on where and how this should be added, 
and on any design and API considerations people might have. Roughly what I had in mind 
for a start was an interface 'Classifier' with 'Test' and 'Train' methods, a 'CV' function 
that does cross validation of some data on a 'Classifier', and a few Classifier types to 
start with, say Fisher's LDA and the Naive Bayes version thereof. Later, more methods 
such as support vector machines and random forests should be able to fit into the same 
interface easily.
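
As a rough illustration of the proposal above, here is a minimal sketch in Go. Only the names 'Classifier', 'Train', 'Test' and 'CV' come from the post; every signature, the k-fold splitting scheme, and the `nearestMean` type are assumptions made for the example, and none of this is an existing gonum API. `nearestMean` is a toy nearest-class-mean classifier standing in for a real method such as Fisher's LDA.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Classifier is a hypothetical interface; the thread only names the methods.
type Classifier interface {
	// Train fits the model to the rows of x and their class labels.
	Train(x [][]float64, labels []int) error
	// Test predicts one class label per row of x.
	Test(x [][]float64) ([]int, error)
}

// CV runs k-fold cross validation and returns the fraction of held-out
// samples that c classifies correctly. Sample i is held out in fold i%k.
func CV(c Classifier, x [][]float64, labels []int, k int) (float64, error) {
	if len(x) != len(labels) || k < 2 || k > len(x) {
		return 0, errors.New("cv: need len(x) == len(labels) and 2 <= k <= len(x)")
	}
	correct := 0
	for fold := 0; fold < k; fold++ {
		var trainX, testX [][]float64
		var trainY, testY []int
		for i := range x {
			if i%k == fold {
				testX, testY = append(testX, x[i]), append(testY, labels[i])
			} else {
				trainX, trainY = append(trainX, x[i]), append(trainY, labels[i])
			}
		}
		if err := c.Train(trainX, trainY); err != nil {
			return 0, err
		}
		pred, err := c.Test(testX)
		if err != nil {
			return 0, err
		}
		if len(pred) != len(testY) {
			return 0, errors.New("cv: wrong number of predictions")
		}
		for i, p := range pred {
			if p == testY[i] {
				correct++
			}
		}
	}
	return float64(correct) / float64(len(x)), nil
}

// nearestMean is a toy stand-in (nearest class mean), not Fisher's LDA.
// It assumes len(x) == len(labels) and rows of equal length.
type nearestMean struct{ means map[int][]float64 }

func (m *nearestMean) Train(x [][]float64, labels []int) error {
	m.means = make(map[int][]float64)
	counts := make(map[int]int)
	for i, row := range x {
		l := labels[i]
		if m.means[l] == nil {
			m.means[l] = make([]float64, len(row))
		}
		counts[l]++
		for j, v := range row {
			// Running mean update for class l, feature j.
			m.means[l][j] += (v - m.means[l][j]) / float64(counts[l])
		}
	}
	return nil
}

func (m *nearestMean) Test(x [][]float64) ([]int, error) {
	if len(m.means) == 0 {
		return nil, errors.New("nearestMean: not trained")
	}
	pred := make([]int, len(x))
	for i, row := range x {
		bestDist := math.Inf(1)
		for l, mean := range m.means {
			d := 0.0
			for j, v := range row {
				d += (v - mean[j]) * (v - mean[j])
			}
			if d < bestDist {
				pred[i], bestDist = l, d
			}
		}
	}
	return pred, nil
}

func main() {
	x := [][]float64{{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}}
	acc, err := CV(&nearestMean{}, x, []int{0, 0, 0, 1, 1, 1}, 3)
	fmt.Println(acc, err)
}
```

Because 'CV' depends only on the interface, any later method (SVMs, random forests) that implements 'Train' and 'Test' could be passed to it unchanged, which is the point of the proposed design.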

Any opinions or suggestions? Also, any thoughts on including a separate 
'BinaryClassifier' interface for two-class cases? These often have more efficient 
implementations, and could take []bool instead of []int for class labels.
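
One way the two interfaces could coexist is sketched below: a small adapter lets a 'BinaryClassifier' be used wherever a 'Classifier' is expected. The method signatures, the `asClassifier` adapter, and the `threshold` toy model are all hypothetical names invented for this example; the post only proposes the interface names and the []bool versus []int label types.

```go
package main

import (
	"errors"
	"fmt"
)

// Classifier is the hypothetical general interface, with integer labels.
type Classifier interface {
	Train(x [][]float64, labels []int) error
	Test(x [][]float64) ([]int, error)
}

// BinaryClassifier is the hypothetical two-class variant, with bool labels.
type BinaryClassifier interface {
	Train(x [][]float64, labels []bool) error
	Test(x [][]float64) ([]bool, error)
}

// asClassifier adapts a BinaryClassifier to Classifier, mapping the
// integer labels 0 and 1 to false and true.
type asClassifier struct{ b BinaryClassifier }

func (a asClassifier) Train(x [][]float64, labels []int) error {
	bl := make([]bool, len(labels))
	for i, l := range labels {
		if l != 0 && l != 1 {
			return fmt.Errorf("label %d at index %d is not 0 or 1", l, i)
		}
		bl[i] = l == 1
	}
	return a.b.Train(x, bl)
}

func (a asClassifier) Test(x [][]float64) ([]int, error) {
	bp, err := a.b.Test(x)
	if err != nil {
		return nil, err
	}
	pred := make([]int, len(bp))
	for i, p := range bp {
		if p {
			pred[i] = 1
		}
	}
	return pred, nil
}

// threshold is a toy binary classifier on the first feature only: it cuts
// halfway between the two class means. It assumes non-empty rows.
type threshold struct {
	cut   float64
	above bool // whether the true class lies above the cut
}

func (t *threshold) Train(x [][]float64, labels []bool) error {
	if len(x) != len(labels) {
		return errors.New("threshold: x and labels differ in length")
	}
	var sum, n [2]float64
	for i, row := range x {
		c := 0
		if labels[i] {
			c = 1
		}
		sum[c] += row[0]
		n[c]++
	}
	if n[0] == 0 || n[1] == 0 {
		return errors.New("threshold: need samples from both classes")
	}
	meanF, meanT := sum[0]/n[0], sum[1]/n[1]
	t.cut, t.above = (meanF+meanT)/2, meanT > meanF
	return nil
}

func (t *threshold) Test(x [][]float64) ([]bool, error) {
	pred := make([]bool, len(x))
	for i, row := range x {
		pred[i] = (row[0] > t.cut) == t.above
	}
	return pred, nil
}

func main() {
	var c Classifier = asClassifier{&threshold{}}
	if err := c.Train([][]float64{{1}, {2}, {8}, {9}}, []int{0, 0, 1, 1}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(c.Test([][]float64{{0}, {10}}))
}
```

With an adapter like this, generic code written against 'Classifier' would still work for two-class models, while callers that know they have a binary problem could use the []bool methods directly.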

Cheers,
Lyron

Brendan Tracey

Mar 15, 2017, 11:42:11 PM
to gonum-dev
We've talked a few times about having a `gonum/fit`, and there's general agreement in favor. My past experience with such a library is that there are a lot of tricky issues, so I think we should have a design document before we start implementing things. The basic questions are:
a) How to organize where types and interfaces live
b) How to enable basic optimization routines while also allowing type-specific optimizations
c) How to work well with database data
d) How to work in regularizers and loss functions in a generic yet efficient way

Lyron Winderbaum

Mar 27, 2017, 8:05:34 PM
to gonum-dev
Yeah, I agree that having a design document before we start would be a good idea, which is why I was 
asking. I don't really have much experience with such things, so I'm not quite sure what to suggest with 
regard to questions such as those you mention. I have some experience with a few specific methods, and 
I have ideas about optimizations thereof, but I have no idea how these things would generalize, hence 
the inquiry.

If you do ever make a start on such a `gonum/fit`, let me know; I'd be interested in trying to contribute.