Learning with noisy labels for classification

ShNaYkHs ShNaYkHs

Sep 29, 2013, 7:57:56 AM

to active-le...@googlegroups.com

Hi group,

For classification (especially in active learning), it is usually difficult to obtain a perfectly labelled training set with completely reliable labels, so the oracle who labels the training data may give some erroneous (noisy) labels.

Leaving crowdsourcing techniques aside, what is the state of the art in learning with such noisy labels? Do you know any interesting papers that deal with this issue?

Best regards,

Shnaykhs.

Pallika Kanani

Sep 29, 2013, 11:01:58 AM

to active-le...@googlegroups.com

Here's one of the early papers in this area. Please check citations for more recent work:

Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers • Victor S. Sheng, Foster Provost, and Panagiotis G. Ipeirotis • Proceedings of the Fourteenth ACM International Conference on Knowledge Discovery and Data Mining (KDD), 2008

http://www.ipeirotis.com/wp-content/uploads/2012/01/kdd2008.pdf

Best,

Pallika.
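
As a rough illustration of the repeated-labeling idea in the paper above, here is a minimal sketch of majority voting over several noisy labels collected for one item. The function name `majority_label` and the example labels are hypothetical, and the paper covers considerably more than this simple rule.

```python
from collections import Counter


def majority_label(labels):
    """Integrate repeated noisy labels for one item by majority vote.

    Returns the winning label and the fraction of votes it received.
    Ties are broken in favour of the label that was seen first.
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical example: three labelers, one of them disagrees.
label, agreement = majority_label(["spam", "ham", "spam"])
print(label, agreement)
```

The agreement fraction can serve as a crude signal of how much to trust the integrated label, or of whether it is worth asking for another one.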

--
You are subscribed to the "Active Learning (Machine Learning)" group.
To post a message, send email to: active-le...@googlegroups.com
To unsubscribe, send email to: active-learning...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/active-learning-ml?hl=en

ShNaYkHs ShNaYkHs

Nov 6, 2013, 4:57:51 PM

to active-le...@googlegroups.com

Dear Pallika,

This is basically a paper where we repeatedly ask potentially noisy labelers for a label (close to crowdsourcing). What I'm looking for is rather how to detect noisy labels (preferably for stream-based active learning), in order to either correct them or avoid updating the model with them.

Do you know any interesting papers that deal with this issue?

Thanks.
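
To make the goal above concrete, here is a minimal sketch of one naive baseline for a stream setting: skip the update whenever the current model confidently predicts a different class than the label it was given. This is only an assumption about what such a filter could look like, not a method from any paper in this thread. The names `is_suspicious` and `filter_stream`, the callables `predict_proba` and `update`, and the 0.9 threshold are all hypothetical.

```python
def is_suspicious(class_probs, given_label, confidence=0.9):
    """Flag a label when the model confidently predicts a different class.

    class_probs maps each class to the model's predicted probability.
    """
    predicted = max(class_probs, key=class_probs.get)
    return predicted != given_label and class_probs[predicted] >= confidence


def filter_stream(stream, predict_proba, update, confidence=0.9):
    """Update the model only on labels that are not flagged.

    stream yields (instance, label) pairs; predict_proba and update are
    supplied by the caller's model. Returns the flagged pairs so they can
    be re-queried or corrected later.
    """
    flagged = []
    for x, y in stream:
        if is_suspicious(predict_proba(x), y, confidence):
            flagged.append((x, y))
        else:
            update(x, y)
    return flagged
```

One obvious weakness: a model that is confidently wrong will reject correct labels, so the flagged pairs are better treated as candidates for relabeling than as certain errors.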

Arman Didandeh

Nov 6, 2013, 5:00:49 PM

to active-le...@googlegroups.com

One interesting idea that I once looked into is finding instances for which the common knowledge of the crowd is never going to be accurate.

Finding these data points might also be of interest to you, although I am not sure how much quality work has been done on it, either theoretically or technically.

ShNaYkHs ShNaYkHs

Nov 10, 2013, 12:12:42 PM

to active-le...@googlegroups.com

How would you detect instances for which the common knowledge of the crowd is never going to be accurate? Do you have an idea for such a measure?
