Iris dataset example

Roshan Santhosh

Jan 6, 2016, 8:03:24 AM

to pystruct

In the example on multiclass classification on the iris dataset given here, how can we interpret the joint_feature values and the model weights? I have used the MultiClassClf model described in the example, which is said to implement a multi-class SVM in a CRF framework.

But how can we intuitively explain these features? There are 12 features in the joint_feature list, with 4 values for each class. Can these be compared to unary potentials in any way? Am I right in assuming that this CRF framework consists of no pairwise features and only 4 unary features, i.e. 4 observed features connected to the label?

Andreas Mueller

Jan 6, 2016, 11:33:43 AM

to pyst...@googlegroups.com

Yes.
This is an implementation of the Crammer-Singer SVM, only reshaped so that everything is a dot product of the joint features and the weights.
The weights are just a 3x4 matrix, one entry for every feature and every class (similar to what you would have in multinomial logistic regression), and the features are just unary features. The joint feature vector is the outer product of a class indicator and the 4 input features.
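
A minimal sketch of that reshaping, in plain Python to stay dependency-free. The class-major layout of the flattened vector (all 4 entries for class 0, then class 1, then class 2), the helper names `joint_feature` and `score`, and the weight values are assumptions made for illustration, not something taken from pystruct's source:

```python
def joint_feature(x, y, n_classes):
    """Flattened outer product of a one-hot class indicator and the features x."""
    indicator = [1.0 if k == y else 0.0 for k in range(n_classes)]
    # Assumed class-major layout: one block of len(x) entries per class.
    return [ind * value for ind in indicator for value in x]


def score(w, x, y, n_classes):
    """Dot product of the flat weight vector with the joint feature vector."""
    return sum(wi * pi for wi, pi in zip(w, joint_feature(x, y, n_classes)))


x = [5.1, 3.5, 1.4, 0.2]           # one iris sample (4 input features)
w = [0.1 * i for i in range(12)]   # made-up weights: 3 classes x 4 features
phi = joint_feature(x, y=1, n_classes=3)

# Only the block belonging to class 1 is non-zero, so the score reduces to
# the dot product of x with that class's 4 weights.
class_1_weights = w[4:8]
direct = sum(wi * xi for wi, xi in zip(class_1_weights, x))
```

With 3 classes and 4 features this gives the 12 entries asked about: 4 per class, all zero except in the block of the class being scored.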

Roshan Santhosh

Jan 6, 2016, 7:19:25 PM

to pystruct

I have attached a representation of a graph which I believe represents the current example. Here the factors are represented by the weights that change while evaluating for each class.

Is the following pseudocode correct for the example?

for each sample in x:
    for each class:
        multiply sample features with weights for the class to get a specific value for that class
    classify sample to class which gave the largest value

Andreas Mueller

Jan 7, 2016, 9:41:32 AM

to pyst...@googlegroups.com

The pseudocode is correct.
The graph is not quite right. This is a factor graph, right? If x1, x2, x3 and x4 are the input features, they are not variable nodes; they are parts of the factors.
I'd rather draw this as a round y with a single square box attached.
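
The confirmed pseudocode can be written out as a short sketch. This is plain Python rather than pystruct's own inference code; the function name `predict` and the weight and sample values are made up for illustration:

```python
def predict(samples, class_weights):
    """Assign each sample to the class whose weight row gives the largest score.

    class_weights is a list with one row of per-feature weights per class.
    """
    labels = []
    for sample in samples:
        # One score per class: dot product of the sample with that class's weights.
        scores = [sum(w * f for w, f in zip(row, sample)) for row in class_weights]
        # Pick the index of the largest score (the first one wins on ties).
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


class_weights = [
    [1.0, 0.0, 0.0, 0.0],   # hypothetical weights for class 0
    [0.0, 1.0, 0.0, 0.0],   # hypothetical weights for class 1
    [0.0, 0.0, 1.0, 0.0],   # hypothetical weights for class 2
]
samples = [[3.0, 1.0, 2.0, 0.5], [0.0, 5.0, 1.0, 0.5], [1.0, 1.0, 4.0, 0.5]]
predicted = predict(samples, class_weights)
```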

Roshan Santhosh

Jan 11, 2016, 2:45:37 AM

to pystruct

So the graph would have just a single circle as the input, where x1-x4 are just the features that represent the single training sample?

1)
I tried the ChainCRF example given in the user guide. I get a model with 4004 weights. Considering the fact that this is directed, I get 26*26 (676) parameters for the pairwise potentials between the classes. Also considering the unary potentials for all the classes, we have 26*128 (3328) parameters, which together give the total of 4004 parameters. I would like to know how these parameters are placed in the weights vector. Are all the 3328 unary potential weights placed first, followed by the pairwise potential weights, or are they arranged in some other manner?

2)
For multi-label classification using MultiLabelClf (with no edges), does the inference() function iterate over all possible combinations of labels and see which combination maximizes np.dot(weight.T, joint_feature)?
I am assuming a naive approach like the one used in the iris dataset, where each class is independently evaluated, will not work in this case.

Andreas Mueller

Jan 14, 2016, 3:05:51 PM

to pyst...@googlegroups.com

On 01/11/2016 02:45 AM, Roshan Santhosh wrote:

> So the graph would have just a single circle as the input, where x1-x4 are just the features that represent the single training sample?

Circles are random variables in factor graphs. The input is not a random variable; it is always conditioned on. So the input would be part of a unary factor, which is drawn as a rectangle.

> 1)
> I tried the ChainCRF example given in the user guide. I get a model with 4004 weights. Considering the fact that this is directed, I get 26*26 (676) parameters for the pairwise potentials between the classes. Also considering the unary potentials for all the classes, we have 26*128 (3328) parameters, which together give the total of 4004 parameters. I would like to know how these parameters are placed in the weights vector. Are all the 3328 unary potential weights placed first, followed by the pairwise potential weights, or are they arranged in some other manner?

Yes, first unary, then pairwise. You can see this in the letters example, where I pull out the pairwise potentials and plot them:
http://pystruct.github.io/auto_examples/plot_letters.html
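
As a rough sketch of that layout, the 4004-entry vector can be split by slicing. The placeholder vector `w` stands in for a learned weight vector, and the row-per-state arrangement inside each block is an assumption made for illustration; only the ordering (unary first, then pairwise) and the counts come from the discussion above:

```python
n_states, n_features = 26, 128
n_unary = n_states * n_features        # 3328 unary weights
n_pairwise = n_states * n_states       # 676 pairwise weights (directed)

# Placeholder for a learned weight vector; in practice this would come
# from the fitted learner.
w = [0.0] * (n_unary + n_pairwise)

# Unary weights come first, pairwise weights after them.
unary_flat = w[:n_unary]
pairwise_flat = w[n_unary:]

# Assumed layout: one row per state (letter) in each block.
unary = [unary_flat[i * n_features:(i + 1) * n_features] for i in range(n_states)]
pairwise = [pairwise_flat[i * n_states:(i + 1) * n_states] for i in range(n_states)]
```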

> 2) For multi-label classification using MultiLabelClf (with no edges), does the inference() function iterate over all possible combinations of labels and see which combination maximizes np.dot(weight.T, joint_feature)?
> I am assuming a naive approach like the one used in the iris dataset, where each class is independently evaluated, will not work in this case.

If you use MultiLabelClf without edges, then all decisions will be independent. The edges are what create connections between the outputs. If you add a tree of edges, you can just use message passing for inference; if you use a graph with cycles, like a complete graph, you need to use one of the approximate inference methods.
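
A toy check of the independence claim, in plain Python. The scoring function here (each label contributes its own score when switched on and nothing otherwise) and the score values are assumptions for illustration and do not reproduce pystruct's internal potentials. With no edges the total score is a sum of per-label terms, so an exhaustive search over all label combinations lands on the same answer as deciding each label separately:

```python
from itertools import product

label_scores = [1.5, -0.3, 0.7, -2.0]   # made-up per-label scores for one sample


def total_score(labeling, scores):
    """Edge-free model: the score is a sum of independent per-label terms."""
    return sum(s for s, on in zip(scores, labeling) if on)


# Exhaustive search over all 2**4 label combinations.
brute_force = max(product([0, 1], repeat=len(label_scores)),
                  key=lambda labeling: total_score(labeling, label_scores))

# Independent decisions: switch a label on exactly when its own score is positive.
independent = tuple(1 if s > 0 else 0 for s in label_scores)
```

Once edges are added, the score no longer separates into per-label terms, which is where message passing or approximate inference comes in.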
