Asking the neural network to predict decimal values in range [-1,1]


Shivam Srivastava

May 26, 2018, 4:53:33 AM
to Keras-users
Hi everyone,
Hi everyone,
I am trying to perform a very simple experiment: predict the input number itself. The concept is the same as an auto-encoder, but with just one layer that handles both encoding and decoding.

I also wanted to observe how the network learns with different training examples.

Initially, I took 10000 training samples of integers in the range [1, 10000]. So, for example, if I pass 767676 as one of the test samples, it should predict the output as 767676.0.
This test passed very well.
The activation function used here was 'softplus', which keeps the values in the range [0, inf).
The network:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

train_d = np.array([i for i in range(10000)], dtype=np.int)

model = Sequential()
model.add(Dense(1, input_shape=(1, ), activation='softplus'))

model.fit(train_d, train_d, epochs=5000, batch_size=256, shuffle=True)

Now, when I give it 10000 training samples of decimal values in the range [-1, 1], e.g. 0.3456, the expected output should be 0.3456, but instead it gives me 0.5721.
The network:
import random

train_d = np.array([round(random.uniform(-1, 1), 4) for i in range(10000)], dtype=np.float)

model = Sequential()
model.add(Dense(self.INPUT_DIM, input_shape=(self.INPUT_DIM,), activation='softsign'))

model.fit(train_d, train_d, epochs=5000, batch_size=256, shuffle=True)

Any suggestions or thoughts are appreciated.
Thank You

Daπid

May 26, 2018, 10:25:31 AM
to Shivam Srivastava, Keras-users
What you are doing is impossible.  You are trying to learn the inverse function of the softsign with just a linear map: one multiplication and one addition.

If you want perfect results, you will need to either remove the softsign, in which case you should converge to the true solution fairly fast, but the outputs won't be constrained, or use a more complicated network (a single hidden layer should be sufficient, since the function is continuous). But be aware, numbers close to the extremes will be slightly off. 

The middle option is to use a linear output with a clip, so you always have numbers inside the range.
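For the clipped-linear option, here is a minimal sketch (the mse loss, adam optimizer and epoch count are my own illustrative assumptions, not something prescribed in this thread):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Lambda
from keras import backend as K

model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='linear'))  # plain linear map: y = a*x + b
model.add(Lambda(lambda t: K.clip(t, -1.0, 1.0)))           # hard-clip the output into [-1, 1]
model.compile(loss='mse', optimizer='adam')

x = np.random.uniform(-1, 1, size=(10000, 1))
model.fit(x, x, epochs=100, batch_size=256, verbose=0)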

/David 


Shivam Srivastava

May 26, 2018, 10:48:47 AM
to Keras-users
Thank you for the prompt reply. I am confused by the earlier result, where I used softplus to replicate the integer input. There it behaved as expected. I think the same process is being followed there as well, but the use of softplus allows the output to be in the range 0 to inf.
I thought changing the activation to softsign would similarly match the range -1 to 1, and the NN would constrain the output to this range.
Does it not work this way?
That is, will the NN learn a function that fits the result within the provided constraints?

Daπid

May 26, 2018, 4:40:09 PM
to Shivam Srivastava, Keras-users
Softplus is pretty much the identity for x > 3 [1], so in your case, most of your range (1-10 000) works dandy. You will notice that the prediction for 1 gets worse.
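For example, softplus(3) = ln(1 + e^3) ≈ 3.05, already within about 2% of the identity, while softplus(1) = ln(1 + e) ≈ 1.31, which is roughly 31% off, so the error concentrates at the low end of the range.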

An NN is a function with a bunch of parameters. In your case, the function is:

y = softsign(a * x + b)     (Eq. 1)

where a and b are two numbers, the parameters you try to optimise in training. Your target function is

y = x     (Eq. 2)

There are no values of a and b that make (1) close to (2) over your whole range. It is just mathematically impossible. Now, a more complicated network, with one hidden layer, can learn it.
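A minimal sketch of that one-hidden-layer version (the hidden width of 64 and the mse/adam training setup are illustrative assumptions, not values anyone in this thread used):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, input_shape=(1,), activation='softsign'))  # hidden layer bends the line
model.add(Dense(1, activation='linear'))                       # linear read-out of the hidden units
model.compile(loss='mse', optimizer='adam')

x = np.random.uniform(-1, 1, size=(10000, 1))
model.fit(x, x, epochs=200, batch_size=256, verbose=0)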



Shivam Srivastava

May 26, 2018, 5:00:54 PM
to Daπid, Keras-users
Thank you for the clarification. 
So to get the same output I will have to use something that is linear.
But image autoencoders exist, right? They use several hidden layers and very different activations. Why are they even called autoencoders if they are still meant to replicate the input? Am I thinking about this correctly?


Shivam Srivastava

May 28, 2018, 8:55:15 AM
to Keras-users
Hi, 
I understood what you are referring to as the impossible part, and you are correct that it is not mathematically possible.
So I went with your other suggestion: use a more complex network.
Here are my observations:

When using only a single layer with a softsign activation, the training loss doesn't go below 0.0163.
And for a test input of 0.9876, the predicted value is 0.7243.

I increased the network to two layers.
Architecture:

model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='softsign'))
model.add(Dense(self.INPUT_DIM, activation='softsign'))

The loss is 0.0166

I introduced one more layer:

model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='softsign'))
model.add(Dense(1, activation='softsign'))
model.add(Dense(1, activation='softsign'))

Loss - 0.0164

I then thought to increase the number of neurons in the middle layer and see how things change:

model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='softsign'))
model.add(Dense(1, activation='softsign'))
model.add(Dense(1, activation='softsign'))

Loss - 3.0713e-04 

Well here we see a dramatic change.
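Roughly, the widened variant has this form (the middle-layer width of 32 below is just an illustrative choice, not the exact value used above):

model = Sequential()
model.add(Dense(1, input_shape=(1,), activation='softsign'))
model.add(Dense(32, activation='softsign'))  # widened middle layer, width chosen only for illustration
model.add(Dense(1, activation='softsign'))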

This confirms that the more complex the task, the more complex the architecture needs to be.
Thanks a lot for the help, I really appreciate it.
