Multi-class classification/one-hot encoded labels

Ryan Friedman

Sep 5, 2021, 7:04:42 PM
to Selene (sequence-based deep learning package)
Hi Kathy et al.,

I'm trying to train my neural net to do multi-class classification, i.e. I have K possible labels, and the output layer of my CNN is a softmax over K classes so that the predicted probabilities of a sequence belonging to each class sum to 1. More specifically, the input is a cis-regulatory sequence and the output is the probability that the sequence is a strong enhancer, weak enhancer, inactive, or silencer.
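
For concreteness, here is a simplified sketch of the output head I'm describing (not my actual architecture; `flattened_dim` and the layer sizes are placeholders):

```python
# Simplified sketch of the output head (placeholder names and sizes).
import torch
import torch.nn as nn

class FourClassHead(nn.Module):
    def __init__(self, flattened_dim, n_classes=4):
        super().__init__()
        self.classifier = nn.Linear(flattened_dim, n_classes)

    def forward(self, conv_features):
        # conv_features: (N, flattened_dim) output of the conv/pooling stack.
        logits = self.classifier(conv_features)
        # Softmax so the four class probabilities (strong enhancer,
        # weak enhancer, inactive, silencer) sum to 1 per sequence.
        return torch.softmax(logits, dim=1)
```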

Currently, the way that I am representing my labels ("features" in Selene) is by one-hot encoding them. This works fine, and I can get performance metrics on the validation data using macro averaging. However, I would also like to monitor the F1 score during training. To do that, I need to take the argmax of the predictions for each sequence, which seems incompatible with how `TrainModel.validate` works, specifically `PerformanceMetrics.update`, because it loops over each label, computes performance metrics for that label, and then averages across them.
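
This is the kind of F1 computation I have in mind (just a sketch; `preds` stands for the (N, 4) softmax outputs and `targets` for the (N, 4) one-hot label matrix):

```python
# Sketch: macro F1 from one-hot labels and softmax predictions via argmax.
import numpy as np
from sklearn.metrics import f1_score

def multiclass_f1(targets, preds):
    true_classes = np.argmax(targets, axis=1)  # (N,) true class indices
    pred_classes = np.argmax(preds, axis=1)    # (N,) predicted class indices
    return f1_score(true_classes, pred_classes, average="macro")
```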

I'm thinking that a solution could be to represent my labels as a single int (0, 1, 2, or 3) rather than one-hot encoding them. Then I could write custom wrapper functions for computing AUROC, AUPR, and F1. However, backprop would then run into problems, because the shape of my labels (a 1D array, or an N x 1 array) would no longer match the shape of the predictions (an N x 4 array).
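
Something like the following is what I mean by wrapper functions (untested, and not wired into Selene's `PerformanceMetrics`; `int_labels` is an (N,) array of class indices 0-3 and `preds` is the (N, 4) probability matrix):

```python
# Sketch of wrapper metrics that take integer labels and one-hot encode
# them internally before calling the usual multi-label scorers.
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.preprocessing import label_binarize

N_CLASSES = 4

def auroc_from_int_labels(int_labels, preds):
    onehot = label_binarize(int_labels, classes=list(range(N_CLASSES)))
    return roc_auc_score(onehot, preds, average="macro")

def aupr_from_int_labels(int_labels, preds):
    onehot = label_binarize(int_labels, classes=list(range(N_CLASSES)))
    return average_precision_score(onehot, preds, average="macro")
```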

Do you have any suggestions for how to handle this? Would it be better to one-hot encode the labels somewhere in my loss function, write my own sampler to deal with one-hot encoded labels, or something else altogether?

Thanks for your help!
Ryan