Usage of class_weights in model.fit()

Andrea Cimino

May 18, 2016, 6:14:31 PM

to Keras-users

I am working on a classification task where the classes to be recognized are unbalanced: the trained model always predicts the most frequent classes instead of the classes with fewer examples.

Suppose I have 3 classes with these frequencies: {0: 1000, 1: 500, 2: 100}. How should the class_weight parameter be set?

In my mind there are two possible configurations:

1) {0: 0.1, 1: 0.5, 2: 1}

2) {0: 1, 1: 0.5, 2: 0.1}

Could someone please elaborate on that?

Kind regards

Craig Pfeifer

May 22, 2016, 9:33:17 PM

to Keras-users

You can use scikit-learn's sklearn.utils.compute_class_weight():

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/class_weight.py

It looks at the distribution of labels and produces weights that equally penalize under- and over-represented classes in the training set.

You can then pass this list to the class_weight param of the model.fit() function.
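
As a rough illustration of what the 'balanced' option computes: scikit-learn documents the heuristic as n_samples / (n_classes * np.bincount(y)). The sketch below reimplements that formula in plain Python for the class frequencies from the question. The helper name balanced_class_weights is made up for this example and is not part of scikit-learn.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: n_samples / (n_classes * count) for each class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Labels with the frequencies from the question: {0: 1000, 1: 500, 2: 100}
labels = [0] * 1000 + [1] * 500 + [2] * 100
weights = balanced_class_weights(labels)
print({cls: round(w, 3) for cls, w in weights.items()})
# {0: 0.533, 1: 1.067, 2: 5.333}
```

The rarest class gets the largest weight, and the weights keep the inverse-frequency proportions 1 : 2 : 10.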

Andrea Cimino

May 23, 2016, 6:36:48 AM

to Craig Pfeifer, Keras-users

Thanks Craig for this pointer,

I will try this method to produce the weights for my imbalanced data.

I have seen that the implementation is based on this paper:

http://gking.harvard.edu/files/0s.pdf,

so I will look into it too.

Andrea


beaun...@gmail.com

Aug 31, 2016, 4:17:21 PM

to Keras-users

Hi Andrea,

Starting at a high level: to train on unbalanced classes 'fairly', we want to increase the importance of the under-represented class(es).

To do this, we need to choose a reference class. You can pick any class as the reference, but conceptually I like the majority class (the one with the most samples).

Creating your class_weight dictionary:

1. Determine the ratio reference_class / other_class. If you choose class_0 as your reference, you'll have (1000/1000, 1000/500, 1000/100) = (1, 2, 10).

2. Map each class label to its ratio: class_weight = {0: 1, 1: 2, 2: 10}

Conceptually, the dict says that, during training, class_1 should be treated as 2x as important as class_0. Likewise, class_2 should be treated as 10x as important as class_0 and 5x as important as class_1.
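
The two steps above can be sketched in a few lines of Python. The function name ratio_class_weights and its reference argument are made up for this illustration; only the class counts come from the thread.

```python
def ratio_class_weights(counts, reference=None):
    """Weight each class by count(reference) / count(class).

    Hypothetical helper; uses the majority class as reference by default.
    """
    if reference is None:
        reference = max(counts, key=counts.get)
    ref_count = counts[reference]
    return {cls: ref_count / n for cls, n in counts.items()}


counts = {0: 1000, 1: 500, 2: 100}
class_weight = ratio_class_weights(counts)
print(class_weight)  # {0: 1.0, 1: 2.0, 2: 10.0}
```

Choosing a different reference only rescales all weights by a constant: with reference=2 the result is {0: 0.1, 1: 0.2, 2: 1.0}, which keeps the same 1 : 2 : 10 proportions.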

Andrea Cimino

Sep 1, 2016, 4:25:15 AM

to Keras-users, beaun...@gmail.com

Thanks for your kind suggestion.

I will experiment with these settings and see whether I get any improvement!

Ciao,

Andrea

phydo...@gmail.com

Jan 30, 2018, 1:59:39 AM

to Keras-users

Actually, you still need to pass a dict to the class_weight param of the model.fit() function. Passing a list does not work.

Something like:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# y_train_labels: 1-D array of integer class labels
classes = np.unique(y_train_labels)
# classes and y are passed by keyword, which recent scikit-learn versions require
class_weight_list = compute_class_weight('balanced', classes=classes, y=y_train_labels)
class_weight = dict(zip(classes, class_weight_list))

pranitap...@gmail.com

Mar 7, 2018, 4:50:33 AM

to Keras-users

I have four unbalanced classes with one-hot encoded target labels. I saw many posts suggesting the sample_weight argument of the fit function in Keras, but I did not find a proper example or documentation. Can someone tell me how to get class_weight or sample_weight values for one-hot encoded target labels?
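
One possible approach, sketched here as an assumption rather than an answer confirmed in this thread: recover integer class indices from the one-hot rows, derive class weights from their counts, and look up one weight per sample. Plain Python lists stand in for the label array, the toy data and variable names are invented for illustration, and the weighting formula is the 'balanced' heuristic n_samples / (n_classes * count).

```python
from collections import Counter

# Toy one-hot labels for four classes (invented data)
y_onehot = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

# The position of the 1 in each row is the integer class label
y_int = [row.index(max(row)) for row in y_onehot]

counts = Counter(y_int)
n_samples, n_classes = len(y_int), len(counts)

# Dict keyed by integer class index
class_weight = {cls: n_samples / (n_classes * n)
                for cls, n in sorted(counts.items())}

# One weight per sample, looked up from the sample's class
sample_weight = [class_weight[c] for c in y_int]

print(class_weight)  # {0: 0.5, 1: 1.0, 2: 2.0, 3: 2.0}
```

With NumPy arrays, the integer labels would typically come from y_onehot.argmax(axis=1). The resulting dict, or the per-sample list converted to a NumPy array, could then be tried as the class_weight or sample_weight argument of fit().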