
Dec 14, 2023, 9:00:13 PM


I got some old neural network code (written about 30 years ago).

It has several activation functions, which differ in only two lines, like so:

      if (activation(1:2).eq.'SI' .or. activation(1:2).eq.'LO') then
        output(i,j) = 1.0/(1.0+EXP(-output(i,j)))     ! sigmoid
        slope(i,j)  = output(i,j) * (1.0 - output(i,j)) ! sigmoid
      elseif (activation(1:2).eq.'TA') then
        output(i,j) = TANH(output(i,j))               ! TANH
        slope(i,j)  = 1.0 - output(i,j)*output(i,j)   ! TANH
      elseif (activation(1:2).eq.'AR') then
        y = output(i,j)
        output(i,j) = ATAN(y)                         ! arctan
        slope(i,j)  = 1.0/(1.0+y*y)                   ! arctan
      elseif (activation(1:5).eq.'SOFTP') then
        y = EXP(output(i,j))
        output(i,j) = LOG(1.0+y)                      ! softplus
        slope(i,j)  = 1.0/(1.0+1.0/y)                 ! softplus
      elseif (activation(1:5).eq.'SOFTS') then
        y = output(i,j)
        output(i,j) = y/(ABS(y)+1.0)                  ! softsign
        slope(i,j)  = 1.0/(1.0+ABS(y))**2             ! softsign

Now when running it, the tanh option is slowest, as expected. But the sigmoid (using exp) is faster than softsign, which needs only abs and simple arithmetic. How can this be? Even if exp is implemented with a table lookup and spline interpolation, I would expect that to be slower. Softsign does have an extra divide, but I can't see that tipping the scales.


Dec 14, 2023, 11:22:19 PM


First, what compiler and operating system? Second, how did you do the timing? Third, is there a minimum working example that others can profile?

--

steve

Dec 22, 2023, 9:37:58 AM


On 15/12/23 05:22, Steven G. Kargl wrote:

Fourth, what were the actual timing numbers?

Giorgio


Jan 30, 2024, 3:40:27 AM


The string comparisons (which may call the strncmp function) you have in your conditionals are quite expensive on today's CPUs. I would recommend using an INTEGER constant to make the switch.

Thomas
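A sketch of this suggestion, in C to mirror the strncmp analogy above (names are hypothetical): classify the activation string into an integer code once, before the loops, and branch on the integer inside them. In Fortran the same idea would use a SELECT CASE on an integer:

```c
/* Hypothetical sketch: translate the activation string to an integer
 * once, so the inner loop switches on an integer instead of calling
 * strncmp (or its Fortran equivalent) on every element. */
#include <string.h>

enum act { ACT_SIGMOID, ACT_TANH, ACT_ARCTAN,
           ACT_SOFTPLUS, ACT_SOFTSIGN, ACT_UNKNOWN };

static enum act act_code(const char *activation)
{
    /* Same prefixes as the original Fortran conditionals. */
    if (strncmp(activation, "SI", 2) == 0 ||
        strncmp(activation, "LO", 2) == 0)     return ACT_SIGMOID;
    if (strncmp(activation, "TA", 2) == 0)     return ACT_TANH;
    if (strncmp(activation, "AR", 2) == 0)     return ACT_ARCTAN;
    if (strncmp(activation, "SOFTP", 5) == 0)  return ACT_SOFTPLUS;
    if (strncmp(activation, "SOFTS", 5) == 0)  return ACT_SOFTSIGN;
    return ACT_UNKNOWN;
}
```

The inner loop then becomes `switch (code) { case ACT_SIGMOID: ... }`, with the string parsed exactly once instead of once per element.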
