I'm trying to create a custom layer but I'm pretty well stuck.
I've read the docs at https://keras.io/layers/writing-your-own-keras-layers/, the example code at https://github.com/fchollet/keras/blob/master/examples/antirectifier.py, and several files of Keras source, but I'm still having trouble figuring out how to build a custom layer of the kind I want. Adding to the problem, calls like self.add_weight() are in the official example, and in the source, but have no documentation save for scant comments in the source itself. I've been following simple models through the debugger to figure things out, but I'm missing some conceptual pieces and I've hit the wall of frustration.
I'd like to make a layer that, ideally, contains a multitude of individual "units", conceptually like a dense layer has multiple neurons. Each unit computes a single number as output: a distance found by squaring each weighted input, summing the squares, and taking the square root. There's no additional regularization, activation function, or anything else, just that one small computation.
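To make the intended computation concrete, here is a minimal plain-Python sketch of it (no Keras involved). The function name `hyper_ellipse_forward` and the weight layout (one row per input, one column per unit) are my own assumptions for illustration, and I'm reading "squared weighted inputs" as (weight * input) squared.

```python
import math


def hyper_ellipse_forward(inputs, weights):
    """Compute one distance per unit for every sample.

    inputs:  list of samples, each a list of n_in numbers
    weights: n_in rows by n_units columns (assumed layout)
    returns: list of samples, each a list of n_units distances
    """
    n_units = len(weights[0])
    outputs = []
    for sample in inputs:
        row = []
        for j in range(n_units):
            # Square each weighted input, sum the squares, take the square root.
            total = sum((weights[i][j] * x) ** 2 for i, x in enumerate(sample))
            row.append(math.sqrt(total))
        outputs.append(row)
    return outputs


# Two samples with two inputs each, feeding two units.
batch = [[3.0, 4.0], [1.0, 0.0]]
unit_weights = [[1.0, 2.0],
                [1.0, 0.0]]
print(hyper_ellipse_forward(batch, unit_weights))  # [[5.0, 6.0], [1.0, 2.0]]
```

So for a batch of shape (batch, n_in) I'd expect an output of shape (batch, n_units), the same way a dense layer behaves.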
My code compiles, but fails during run time deep in the library. It's very hard for me to unravel what's wrong, or even if I'm going about this properly.
Specific questions:
1. Can I make multiple "units" on one layer? If so, how? If not, should I use the Functional API, create many single-unit layers with a common parent, and then combine their outputs (with, say, the concatenation layer)? In that case, I could take the output of the concatenation layer and feed it to each of the next set of one-unit layers, collect them back together, and so on. So the graph would be the input layer, a wide bunch of HyperEllipse layers that all collapse into a concatenation layer, then another wide bunch of HyperEllipse layers that again collapse, and so on until the final processing layer(s). Or is there a better way?
2. I want my units to each receive their inputs along weighted arcs. Do I use self.add_weight() for this? My guess is that this puts a learnable weight on each input that can then be updated after backprop. Is that right? So in my case, I'd want one such learnable weight on each input of each "unit" of the layer. If I create these weights with self.add_weight(), are they automatically updated during training?
Here's the code I'm using now. It uses one "unit" per layer. I'm sure it's muddle-headed in some way, but the conceptual model of a custom layer is still opaque to me, so I started with the example code for AntiRectifier and combined it with the code in Dense (as an example for using multiple units). Since this dies with an error at run time, I've obviously bungled something, but I've been chasing my own tail on this and getting nowhere.
I would be grateful for any help in figuring out any aspect of this. Thank you!
from keras import backend as K
from keras import layers
from keras.layers import Input, Dense
from keras.models import Model


class HyperEllipse(layers.Layer):
    '''A HyperEllipse layer.
    '''

    # We return one number, the square root of the summed, squared inputs.
    def compute_output_shape(self, input_shape):
        shape = list(input_shape)
        shape[-1] = 1
        return tuple(shape)

    # Create weights for the inputs (I think?).
    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], 1),
                                      initializer='uniform',
                                      trainable=True)
        super(HyperEllipse, self).build(input_shape)  # Be sure to call this somewhere!

    # Return a Tensor object of shape (?, output_length) with type float32
    # If necessary, use K.reshape with sizes (1, output_length). The first index
    # should be ? but I don't know how to make that, and this seems to work (so far)
    def call(self, inputs):
        input2 = K.square(inputs)
        summed_squares = K.sum(input2)
        dist = K.sqrt(summed_squares)
        # The above is a Tensor with shape (). Let's make it (1,1)
        dist = K.reshape(dist, (1, 1))
        print("dist=", dist)
        return dist
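About the shape comment on call() above: what I think I want is one distance per sample, so an output of shape (batch, 1), rather than a single scalar computed over the whole batch. Here is a plain-Python sketch of the difference as I understand it. The helper names are mine and purely illustrative. K.sum does accept axis and keepdims arguments, which may be the backend way to get the per-sample version, but I haven't confirmed that this is what my layer needs.

```python
import math


def distance_over_everything(batch):
    """Sum every squared entry in the batch: one scalar for all samples."""
    return math.sqrt(sum(x * x for sample in batch for x in sample))


def distance_per_sample(batch):
    """One distance per sample, keeping a trailing axis of length 1."""
    return [[math.sqrt(sum(x * x for x in sample))] for sample in batch]


batch = [[3.0, 4.0], [6.0, 8.0]]
print(distance_over_everything(batch))  # a single number for the whole batch
print(distance_per_sample(batch))       # [[5.0], [10.0]], i.e. shape (2, 1)
```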
# Test it out with a simple model
inputs = Input(shape=(2,))
x = HyperEllipse()(inputs)
output = Dense(1, activation=None)(x)
model = Model(inputs=inputs, outputs=output)
# compile the model
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
# Dumb data for fake training, just to see if it will run
x_train = [[1,2],[2,3],[3,4]]
y_train = [1, 2, 3]
x_test = [[1,2],[2,3],[3,4]]
y_test = [1, 2, 3]
# train the model
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=1,  # epochs,
          verbose=1,
          validation_data=(x_test, y_test))
print("Done")
======
This fails during the call to fit() with the following error trace (from the PyCharm IDE):
  File "/Users/Andrew/Desktop/MLBook/MyCode/debugging-foo/foo.py", line 132, in <module>
    validation_data=(x_test, y_test))
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/engine/training.py", line 1575, in fit
    self._make_train_function()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/engine/training.py", line 960, in _make_train_function
    loss=self.total_loss)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/optimizers.py", line 237, in get_updates
    new_a = self.rho * a + (1. - self.rho) * K.square(g)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1358, in square
    return tf.square(x)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 420, in square
    return gen_math_ops.square(x, name=name)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2463, in square
    result = _op_def_lib.apply_op("Square", x=x, name=name)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 508, in apply_op
    (input_name, err))
ValueError: Tried to convert 'x' to a tensor and failed. Error: None values not supported.
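
The trace bottoms out in the optimizer's get_updates, on the line that squares g, which I take to be a per-weight gradient. Below is my plain-Python mental model of that update for a single unit, to show what I expect to happen to weights created with self.add_weight(). It is only a sketch under my own assumptions: the helper names, the squared-error loss, and the hyperparameters are made up for illustration, and the None check is there because I suspect (but have not confirmed) that one of my weights ends up with no gradient at all.

```python
import math


def distance(weights, inputs):
    """Output of one unit: sqrt(sum((w_i * x_i)^2))."""
    return math.sqrt(sum((w * x) ** 2 for w, x in zip(weights, inputs)))


def distance_gradient(weights, inputs):
    """d(dist)/d(w_i) = w_i * x_i^2 / dist, assuming dist > 0."""
    dist = distance(weights, inputs)
    return [w * x * x / dist for w, x in zip(weights, inputs)]


def rmsprop_step(weights, grads, accum, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSprop-style update per weight.

    accum holds the running average of squared gradients.
    """
    new_weights, new_accum = [], []
    for w, g, a in zip(weights, grads, accum):
        if g is None:
            # A weight that never influences the output has no gradient.
            raise ValueError("no gradient for this weight")
        a = rho * a + (1.0 - rho) * g * g
        new_accum.append(a)
        new_weights.append(w - lr * g / (math.sqrt(a) + eps))
    return new_weights, new_accum


weights = [1.0, 1.0]
sample = [3.0, 4.0]
target = 2.0
accum = [0.0, 0.0]

loss_before = (distance(weights, sample) - target) ** 2
for _ in range(20):
    dist = distance(weights, sample)
    # Chain rule for the squared-error loss (dist - target)^2.
    grads = [2.0 * (dist - target) * g
             for g in distance_gradient(weights, sample)]
    weights, accum = rmsprop_step(weights, grads, accum, lr=0.01)
loss_after = (distance(weights, sample) - target) ** 2
print(loss_before, loss_after)  # the loss should shrink as the weights adapt
```

If that picture is right, every trainable weight has to take part in computing the layer's output, otherwise there is nothing for the optimizer to square.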