Assignment 1

Adam Sällergård

Jan 24, 2012, 6:44:42 AM
to ml_chalmers
Hi, I'm struggling to understand this Bayes classifier with mean and
variance.
In the sge function it seems like there is only one variance in the
returned data. Shouldn't there be one variance for each feature, like
with the mean values?
What does the Bayes classifier look like when we have a mean and a
variance? When I look at the equation in the lecture notes I can't
understand what the exponent T stands for, or what b stands for. I also
thought that x was a vector of many features, so should I make a
summation over the different features before applying sign()? On
Wikipedia there is a Bayes classifier equation that uses the variance
of every feature, but it does not look at all like the one in the
lecture notes. I have obviously missed something big. Can someone
please point me in the right direction?

Emil Falk

Jan 24, 2012, 7:10:07 AM
to ml_chalmers
I might be able to answer some questions.

In the assignment we are considering the simplified case where the
variances of all features are the same and the features are all
independent. Thus the covariance matrix of the features simplifies to
the variance times the identity matrix. This leads to a simpler
discriminant function when constructing the classifier.
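Concretely (my notation, not necessarily the lecture notes'): for a
class with mean vector mu and shared variance sigma^2 over d features,

    \Sigma = \sigma^2 I,
    \qquad
    p(x \mid \text{class})
      = \prod_{i=1}^{d} \frac{1}{\sqrt{2\pi}\,\sigma}
        \exp\!\left( -\frac{(x_i - \mu_i)^2}{2\sigma^2} \right)

so a single scalar sigma describes all the features, which is why sge
only returns one variance alongside the vector of means.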

A^T is the transpose of the matrix A.

I don't have the lecture notes in front of me, but I would guess that
b is the bias term of the function.
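To be explicit about the T and the b (this is my best guess at the
form in the notes, so check it against them): for two classes with
means mu_1 and mu_2, shared variance sigma^2 and priors P_1 and P_2,
the log-ratio of the posteriors is linear in x,

    g(x) = w^T x + b,
    \qquad
    w = \frac{\mu_1 - \mu_2}{\sigma^2},
    \qquad
    b = \frac{\lVert \mu_2 \rVert^2 - \lVert \mu_1 \rVert^2}{2\sigma^2}
        + \ln \frac{P_1}{P_2},

and the classifier is sign(g(x)). Note that w^T x already sums over
all the features, so that inner product is the "summation" you were
asking about.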

Hope it helps!

Vinay Jethava

Jan 24, 2012, 7:15:01 AM
to ml_ch...@googlegroups.com
Dear Adam,

1. You are correct that in general there will be a variance associated with each feature and a covariance between any two features. However, the spherical Gaussian distribution assumes the following:

a. All covariances between different features are zero.
b. The variances of all features are equal; this is the sigma in sge.

2. The naive Bayes classifier has to be designed by first using sge() to estimate the parameters for the two classes, and then designing a Bayesian classifier (this part is simpler than for the general multivariate normal case, thanks to the spherical Gaussian assumption mentioned above).

Think of this in two steps:

1. Estimate the parameters for classes 1 and 2. This gives you probability distributions for the two classes.
2. Design the naive Bayes classifier by working from first principles, using the probability distributions estimated in step 1. A rough sketch follows below.
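
To make the two steps concrete, here is a sketch in Python/NumPy. This
is only my illustration: the names, the averaging convention inside
sge, and the pooling of the two class variances are assumptions, not
the assignment code.

    import numpy as np

    def sge(X):
        # Spherical Gaussian estimate: one mean per feature, but a
        # SINGLE variance shared by all features (assumed convention).
        mu = X.mean(axis=0)
        sigma2 = ((X - mu) ** 2).mean()
        return mu, sigma2

    def classify(x, mu1, mu2, sigma2, p1=0.5, p2=0.5):
        # Linear discriminant for two spherical Gaussians with a
        # shared variance: returns +1 for class 1, -1 for class 2.
        w = (mu1 - mu2) / sigma2
        b = (mu2 @ mu2 - mu1 @ mu1) / (2 * sigma2) + np.log(p1 / p2)
        return np.sign(w @ x + b)

    # Step 1: estimate the parameters of each class from training data.
    rng = np.random.default_rng(0)
    X1 = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(100, 2))
    X2 = rng.normal(loc=[-1.0, -1.0], scale=1.0, size=(100, 2))
    mu1, s1 = sge(X1)
    mu2, s2 = sge(X2)
    sigma2 = (s1 + s2) / 2  # pool the shared variance (an assumption)

    # Step 2: classify a new point with the discriminant.
    print(classify(np.array([0.8, 1.2]), mu1, mu2, sigma2))  # -> 1.0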

Regards,
Vinay
