Hi all,
Recently, I ran an experiment with a CTC loss based on Baidu's warp-ctc library, using the default SGD optimizer. My data has about 3500 timesteps, and the ground-truth label length is at most 639. All input sequences are zero-padded to the full timestep length, while the labels are padded with -1. As I understand it, CTC does not compute loss over negative label values, and Keras masking filters out the zero-padded inputs. My batch size is 32, input dimension is 30, and output dimension is 63.
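To make the padding scheme concrete, here is a minimal sketch of what I mean (the function name and shapes here are illustrative, not my actual code):

```python
import numpy as np

# Hypothetical sketch of the padding described above.
MAX_T = 3500      # padded timestep length
MAX_LABEL = 639   # maximum label length
INPUT_DIM = 30

def pad_batch(seqs, labels):
    """Zero-pad input sequences to MAX_T; pad labels with -1."""
    x = np.zeros((len(seqs), MAX_T, INPUT_DIM), dtype=np.float32)
    y = np.full((len(labels), MAX_LABEL), -1, dtype=np.int32)
    for i, (s, lab) in enumerate(zip(seqs, labels)):
        x[i, :len(s)] = s        # real frames, rest stays zero
        y[i, :len(lab)] = lab    # real labels, rest stays -1
    return x, y
```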
In the first epoch, everything is fine. However, when it goes into the second epoch, I observe underflow: the loss becomes so small that it displays as 0. Baidu claims to have improved numerical stability by computing the loss in log space, but I still hit this problem. Does anyone know how to solve it?
Thank you very much!