
Dec 14, 2023, 9:00:13 PM


I got some old neural network code (written about 30 years ago).

It has several activation functions, which differ in only two lines, like so:

      if (activation(1:2).eq.'SI' .or. activation(1:2).eq.'LO') then
        output(i,j) = 1.0/(1.0+EXP(-output(i,j)))     ! sigmoid
        slope(i,j)  = output(i,j) * (1.0 - output(i,j)) ! sigmoid
      elseif (activation(1:2).eq.'TA') then
        output(i,j) = TANH(output(i,j))               ! TANH
        slope(i,j)  = 1.0 - output(i,j)*output(i,j)   ! TANH
      elseif (activation(1:2).eq.'AR') then
        y = output(i,j)
        output(i,j) = ATAN(y)                         ! arctan
        slope(i,j)  = 1.0/(1.0+y*y)                   ! arctan
      elseif (activation(1:5).eq.'SOFTP') then
        y = EXP(output(i,j))
        output(i,j) = LOG(1.0+y)                      ! softplus
        slope(i,j)  = 1.0/(1.0+1.0/y)                 ! softplus
      elseif (activation(1:5).eq.'SOFTS') then
        y = output(i,j)
        output(i,j) = y/(ABS(y)+1.0)                  ! softsign
        slope(i,j)  = 1.0/(1.0+ABS(y))**2             ! softsign

Now when running it, the tanh option is slowest, as expected. But the sigmoid (using exp) is faster than softsign, which needs only abs and simple arithmetic. How can this be? Even if exp is implemented with a table lookup and spline interpolation, I would expect that to be slower. Softsign does have an extra divide, but I can't see that tipping the scales.


Dec 14, 2023, 11:22:19 PM


First, what compiler and operating system? Second, how did you do the timing? Third, is there a minimum working example that others can profile?

--

steve

Dec 22, 2023, 9:37:58 AM


On 15/12/23 05:22, Steven G. Kargl wrote:

Fourth, what were the actual timing numbers?

Giorgio


Jan 30, 2024, 3:40:27 AM


The string comparisons (which may call the strncmp function) you have in your conditionals are quite expensive on today's CPUs. I would recommend using an INTEGER constant to make the switch.

Thomas
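A sketch of this suggestion, in C to mirror the strncmp analogy above (names are hypothetical): classify the activation string into an integer code once, before the loops, and branch on the integer inside them. In Fortran the same idea would use a SELECT CASE on an integer:

```c
/* Hypothetical sketch: translate the activation string to an integer
 * once, so the inner loop switches on an integer instead of calling
 * strncmp (or its Fortran equivalent) on every element. */
#include <string.h>

enum act { ACT_SIGMOID, ACT_TANH, ACT_ARCTAN,
           ACT_SOFTPLUS, ACT_SOFTSIGN, ACT_UNKNOWN };

static enum act act_code(const char *activation)
{
    /* Same prefixes as the original Fortran conditionals. */
    if (strncmp(activation, "SI", 2) == 0 ||
        strncmp(activation, "LO", 2) == 0)     return ACT_SIGMOID;
    if (strncmp(activation, "TA", 2) == 0)     return ACT_TANH;
    if (strncmp(activation, "AR", 2) == 0)     return ACT_ARCTAN;
    if (strncmp(activation, "SOFTP", 5) == 0)  return ACT_SOFTPLUS;
    if (strncmp(activation, "SOFTS", 5) == 0)  return ACT_SOFTSIGN;
    return ACT_UNKNOWN;
}
```

The inner loop then becomes `switch (code) { case ACT_SIGMOID: ... }`, with the string parsed exactly once instead of once per element.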
