Hello all,
For face attribute training, I attach three losses at the top of the network: an L2 loss for landmark regression, an L2 loss for age regression, and a Softmax/Hinge loss for gender classification.
An overview of the network is as follows:
                                          -> 3 fc for age      -> L2 Loss
stacked convolution layers and pooling    -> 3 fc for gender   -> Softmax/Hinge Loss
                                          -> 3 fc for landmark -> L2 Loss
But there are some obvious problems:
1. The losses have very different orders of magnitude, e.g. about 1e4 for landmark regression and about 1e-2 for gender classification.
2. The losses produce different gradient directions in the shared layers, which conflict to some degree during training.
3. In practice, training of this network fluctuates severely.
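To make point 2 concrete, one way I could check for conflict is to compare the gradients that two losses send into the same shared weights. This is only a sketch: the function name and the gradient values below are made up for illustration, not taken from my network.

```python
import math

def cosine_similarity(grad_a, grad_b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(grad_a, grad_b))
    norm_a = math.sqrt(sum(a * a for a in grad_a))
    norm_b = math.sqrt(sum(b * b for b in grad_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero gradient has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical gradients of two task losses w.r.t. the same shared weights.
grad_landmark = [0.8, -0.2, 0.5]
grad_gender = [-0.6, 0.1, -0.4]

similarity = cosine_similarity(grad_landmark, grad_gender)
print(similarity)  # a negative value means the two tasks pull in opposing directions
```

A value near 1 would mean the two tasks agree on the update, near 0 that they are unrelated, and below 0 that they work against each other.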
So should I merge these three loss functions into a single loss, or are there other methods I can use to solve these problems?
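By "merging" I mean something like a weighted sum, where each weight rescales its loss to a comparable magnitude. The sketch below is only to show the idea: the function name, the age loss value, and all three weights are placeholders I made up, and only the landmark and gender magnitudes match what I actually observe.

```python
def combined_loss(losses, weights):
    """Weighted sum of per-task losses; both arguments are dicts keyed by task name."""
    return sum(weights[task] * value for task, value in losses.items())

# Landmark and gender magnitudes are of the order I observe; the age value is a placeholder.
losses = {"landmark": 1e4, "age": 50.0, "gender": 1e-2}

# Placeholder weights, picked by hand so that each weighted term is roughly 1.
weights = {"landmark": 1e-4, "age": 0.02, "gender": 100.0}

total = combined_loss(losses, weights)
print(total)  # each task now contributes about equally to the sum
```

Since a weight scales the gradient of its loss by the same factor, this would also rebalance how strongly each task updates the shared layers, but I do not know how to choose the weights in a principled way.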
Any help is greatly appreciated.
Thanks.
Yan