Usage of class_weights in model.fit()


Andrea Cimino

May 18, 2016, 6:14:31 PM
to Keras-users
I am working on a classification task where the classes to be recognized are imbalanced:
the trained model always predicts the most frequent classes instead of the classes
with fewer examples.
Suppose I have 3 classes with these frequencies: {0: 1000, 1: 500, 2: 100}.
How should the class_weight parameter be set in this case?
In my mind there are two possible configurations:
1)  {0: 0.1, 1: 0.5, 2: 1}
2)  {0: 1, 1: 0.5, 2: 0.1}

Could someone please elaborate on that?

Kind regards

Craig Pfeifer

May 22, 2016, 9:33:17 PM
to Keras-users
You can use scikit-learn's sklearn.utils.compute_class_weight(), which looks at the distribution of labels and produces weights that equally penalize under- and over-represented classes in the training set.
You can then pass this list to the class_weight param of the model.fit() function.
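
To illustrate what such weights look like for the frequencies in the question, here is a minimal pure-Python sketch. It assumes the heuristic n_samples / (n_classes * class_count), which is what scikit-learn documents for its 'balanced' mode; the function name balanced_class_weights is made up for this example and is not part of any library.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Hypothetical helper: mirrors the documented 'balanced' heuristic,
    but is not scikit-learn's own implementation.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Label distribution from the original question: {0: 1000, 1: 500, 2: 100}
labels = [0] * 1000 + [1] * 500 + [2] * 100
weights = balanced_class_weights(labels)

for cls, weight in weights.items():
    print(cls, round(weight, 4))
```

The rarest class receives the largest weight, and the ratio between any two weights equals the inverse ratio of their class counts.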

Andrea Cimino

May 23, 2016, 6:36:48 AM
to Craig Pfeifer, Keras-users
Thanks Craig for this pointer,

I will try this method to produce the weights for my imbalanced data.
I have seen that the implementation is based on this paper:


so I will look into it too.

Andrea



beaun...@gmail.com

Aug 31, 2016, 4:17:21 PM
to Keras-users
Hi Andrea,
Starting at a high level: to train unbalanced classes 'fairly', we want to increase the importance of the under-represented class(es).
To do this, we need to choose a reference class. You can pick any class to serve as the reference, but conceptually I like the majority class (the one with the most samples).
Creating your class_weight dictionary:
1. Determine the ratio reference_class/other_class for each class. If you choose class_0 as your reference, you'll have (1000/1000, 1000/500, 1000/100) = (1, 2, 10).
2. Map each class label to its ratio: class_weight={0: 1, 1: 2, 2: 10}

Conceptually, what your dict is saying is that, during training, class_1 should be treated as 2x as important as class_0. Likewise, class_2 should be treated as 10x as important as class_0 and 5x as important as class_1.
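
The two steps above can be sketched in a few lines of plain Python. The function name reference_ratio_weights and its reference argument are hypothetical, chosen only for this illustration; by default it picks the majority class as the reference, as suggested above.

```python
def reference_ratio_weights(class_counts, reference=None):
    """Return {label: reference_count / class_count} for every class.

    Hypothetical helper. If no reference label is given, the class
    with the most samples is used as the reference.
    """
    if reference is None:
        reference = max(class_counts, key=class_counts.get)
    ref_count = class_counts[reference]
    return {cls: ref_count / count for cls, count in class_counts.items()}


# Frequencies from the original question
class_weight = reference_ratio_weights({0: 1000, 1: 500, 2: 100})
print(class_weight)  # {0: 1.0, 1: 2.0, 2: 10.0}
```

Picking a different reference class only rescales all weights by a constant factor; the relative importance between classes stays the same.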

Andrea Cimino

Sep 1, 2016, 4:25:15 AM
to Keras-users, beaun...@gmail.com

Thanks for your kind suggestion.
I will try to experiment with these settings and see if I will get some improvements!

Ciao,
Andrea
 

phydo...@gmail.com

Jan 30, 2018, 1:59:39 AM
to Keras-users
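Actually, you still need to pass a dict to the class_weight param of the model.fit() function. Passing a list does not work.
Something like:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# y_train_labels: 1-D array of integer class labels for the training set
classes = np.unique(y_train_labels)
# Keyword arguments are required by recent scikit-learn versions
class_weight_list = compute_class_weight(class_weight='balanced', classes=classes, y=y_train_labels)
# Map each class label to its weight, as expected by model.fit(class_weight=...)
class_weight = dict(zip(classes, class_weight_list))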

pranitap...@gmail.com

Mar 7, 2018, 4:50:33 AM
to Keras-users
I have four unbalanced classes with one-hot encoded target labels. I saw many posts suggesting the sample_weight argument of the fit function in Keras, but I did not find a proper example or documentation. Can someone tell me how to get class weights or sample weights for one-hot encoded target labels?

Regards
Pranita
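
One possible approach to the one-hot question, offered as a sketch rather than a definitive answer: recover integer class indices with argmax, compute per-class weights from the column sums, and expand them to one weight per sample. The toy array y_onehot is invented for illustration, the weighting uses the n_samples / (n_classes * class_count) heuristic, and the code assumes every class appears at least once.

```python
import numpy as np

# Hypothetical one-hot targets for 4 classes (6 samples), for illustration only
y_onehot = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
])

# Integer class index of each sample
y_int = y_onehot.argmax(axis=1)

# Samples per class are the column sums of the one-hot matrix
counts = y_onehot.sum(axis=0)
n_samples, n_classes = y_onehot.shape

# Per-class weights; assumes no class has zero samples
class_weight = {c: float(n_samples / (n_classes * counts[c]))
                for c in range(n_classes)}

# Per-sample weights: look up each sample's class weight
sample_weight = np.array([class_weight[c] for c in y_int])

# Either form could then be passed to fit(), e.g. (model and X are assumed):
# model.fit(X, y_onehot, class_weight=class_weight)
# model.fit(X, y_onehot, sample_weight=sample_weight)
```

The dict form is the more compact of the two; the per-sample array is useful when weights need to vary for reasons other than class membership.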