What is the best metric when using Junto for classification?


Mihir Kale

Aug 20, 2015, 5:00:13 AM
to The Junto Label Propagation Toolkit Open Discussion

Hello!
Thanks for making Junto public and open source :D
I'm using the toolkit for a multi-label classification problem. What would be the best metric to tune in this scenario?
The way the algorithm works, it doesn't seem to make sense to compare different label scores for a particular node since they are calculated independently of each other. So one set of label scores might always be higher than the scores for another label. Is my line of thinking correct?
I was thinking of a metric along the lines of recall among the top-K scored nodes for each label, but I'm not sure.
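Roughly what I mean, as a sketch (the inputs here are hypothetical dicts I'd build from the output, not anything Junto gives you directly):

def recall_at_k(scores, gold, k):
    # scores: {label: {node: score}}, gold: {label: set of nodes truly carrying that label}
    recalls = {}
    for label, node_scores in scores.items():
        top_k = sorted(node_scores, key=node_scores.get, reverse=True)[:k]
        hits = len(set(top_k) & gold[label])
        recalls[label] = hits / float(len(gold[label])) if gold[label] else 0.0
    return recalls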
Do you have any suggestions?
Thanks!



Partha Pratim Talukdar

Aug 21, 2015, 2:51:51 AM
to junto...@googlegroups.com
In the case of Adsorption, the label scores on each node are normalized, so they are no longer independent. In the case of MADDL (source code available separately), there is also interaction among the labels. In the case of MAD, because of the regularization targets, I think the scores across labels become somewhat comparable, although indirectly. What I mean is that, if the regularization target is 0 for two labels (except for the dummy label), then MAD needs a reason to assign those labels any score other than 0. So if a particular label gets a higher score than another label on a node, the two scores can't be considered totally independent, due to the common regularization target. Of course, this is indirect and a bit hand-wavy; if you want more comparability, you will want to enforce it explicitly.
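For example, one simple way to enforce per-node comparability explicitly is to renormalize each node's label scores yourself after the run (just a sketch operating on score dicts you have already loaded, however you parse the output):

def normalize_per_node(node_label_scores):
    # node_label_scores: {node: {label: score}}; rescale each node's scores to sum to 1
    normalized = {}
    for node, scores in node_label_scores.items():
        total = float(sum(scores.values()))
        normalized[node] = dict((lbl, s / total if total > 0 else 0.0)
                                for lbl, s in scores.items())
    return normalized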

hth,
Partha

Mihir Kale

Aug 31, 2015, 7:38:46 AM
to The Junto Label Propagation Toolkit Open Discussion
Thanks for the response! :D
Is it possible to share the MADDL source code as well?
Also, is there a way to get around the class imbalance problem? My label distribution is skewed: when I run the algorithm, the more common class always gets a higher score than the less frequent label. Are there techniques that would work well here, such as downsampling?
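For downsampling, I have something like this in mind (a sketch; seeds is a hypothetical list of (node, label) pairs, not the actual seed file):

import random

def downsample_seeds(seeds, rng_seed=0):
    # seeds: list of (node, label); sample every class down to the minority class size
    by_label = {}
    for node, label in seeds:
        by_label.setdefault(label, []).append(node)
    n_min = min(len(nodes) for nodes in by_label.values())
    rng = random.Random(rng_seed)
    balanced = []
    for label, nodes in by_label.items():
        for node in rng.sample(nodes, n_min):
            balanced.append((node, label))
    return balanced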

My graph is a bipartite one with vertex sets V1 and V2. Suppose there are two labels L1 and L2.
My task is to classify each vertex v1 in V1. Initially my dataset was the matrix V1 x V2, with V1 being the samples and V2 being the features, but given my domain it also makes sense to model it as a bipartite graph. Logistic Regression and boosted trees give decent results (using a single binary label L, where L = 0 ==> L1 = 1 and L = 1 ==> L2 = 1).
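This is roughly how I turn the sample-feature matrix into a bipartite edge list (a sketch with made-up names; I'm assuming a simple tab-separated node1/node2/weight edge file, zero weights skipped):

def write_bipartite_edges(matrix, sample_ids, feature_ids, path):
    # matrix[i][j] is the weight of feature j for sample i
    with open(path, "w") as out:
        for i, sample in enumerate(sample_ids):
            for j, feature in enumerate(feature_ids):
                w = matrix[i][j]
                if w != 0:
                    out.write("%s\t%s\t%s\n" % (sample, feature, w))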
However, when I ran Junto, the predictions were as good as random. It's weird because, going by the Logistic Regression results, the features clearly have predictive power. Can you think of any possible reason, or suggest a line of thought for debugging?
I've read the paper, so I have some understanding of how the algorithm works, but I'm confounded by the results I'm getting and not sure how to investigate. I have experimented with various values for the hyperparameters (not exhaustively, since the paper suggests the algorithm is not very sensitive to them).
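The kind of sweep I mean, as a sketch (run_junto_and_score is a hypothetical stand-in for however the toolkit is invoked and its output evaluated; the mu values are just illustrative):

import itertools

def sweep(run_junto_and_score):
    mu1_values = [1.0]                       # seed injection weight
    mu2_values = [0.001, 0.01, 0.1, 1.0]     # neighbourhood / smoothness weight
    mu3_values = [0.001, 0.01, 0.1, 1.0]     # regularization (dummy label) weight
    results = []
    for mu1, mu2, mu3 in itertools.product(mu1_values, mu2_values, mu3_values):
        score = run_junto_and_score(mu1=mu1, mu2=mu2, mu3=mu3)
        results.append(((mu1, mu2, mu3), score))
    return sorted(results, key=lambda r: r[1], reverse=True)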