How to use sample weights for 3d outputs (for RNNs)?


Hristo Buyukliev

Sep 18, 2016, 4:22:10 AM
to Keras-users
So, I have time series data, and want to predict the same time series one step ahead. Unfortunately, a lot of the data is missing. I want to use sample weights so that if a field is missing, it is not added to the loss. However, sample_weights can only be 1D (by default) or 2D (when sample_weight_mode is set to "temporal"), while I want them to be 3D: e.g. my input is (100, 1000, 100) and my output is (100, 1000, 5), i.e. (samples, timesteps, features). When I use 2D weights, if a row in my output has even one missing value, the whole row's information is discarded.

What I want to do is something like this:

sample_weights = 1 - np.isnan(X)  # weight 0 wherever the value is missing
model = Model(input=inputs, output=dense)
model.compile(optimizer="rmsprop", loss="mae", sample_weight_mode="temporal")
model.fit(X, y, sample_weight=sample_weights)

But I get an exception:

Using Theano backend.
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    model.fit(values[:,:-1,:], values[:,1:,:], sample_weight=sample_weights )
  File "/home/hristo/mlenv/local/lib/python2.7/site-packages/keras/engine/training.py", line 1032, in fit
    batch_size=batch_size)
  File "/home/hristo/mlenv/local/lib/python2.7/site-packages/keras/engine/training.py", line 970, in _standardize_user_data
    in zip(y, sample_weights, class_weights, self.sample_weight_modes)]
  File "/home/hristo/mlenv/local/lib/python2.7/site-packages/keras/engine/training.py", line 367, in standardize_weights
    str(sample_weight.shape) + '. '
Exception: Found a sample_weight array with shape (214, 36, 1305). In order to use timestep-wise sample weighting, you should pass a 2D sample_weight array.
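For reference, the 2D "temporal" weights the API does accept have shape (samples, timesteps), so the feature axis has to be collapsed somehow, which is exactly why one NaN can discard a whole timestep. A minimal NumPy illustration (the array contents are made up):

```python
import numpy as np

# 1 sample, 2 timesteps, 2 features; one feature missing at timestep 0.
y = np.array([[[1.0, np.nan],
               [2.0, 3.0]]])

# Collapse the 3D mask to the 2D shape (samples, timesteps) that
# sample_weight_mode="temporal" accepts: a timestep gets weight 0 if ANY
# of its features is missing, so a single NaN wipes the whole timestep.
temporal_weights = (~np.isnan(y)).all(axis=-1).astype(float)
print(temporal_weights)  # [[0. 1.]]
```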
 
 

ar...@cardiogr.am

Jun 2, 2017, 3:21:07 PM
to Keras-users
Hey! Did you end up figuring out how this works? Did you write a custom loss function instead?

Thanks!

dieuwk...@gmail.com

Jul 3, 2017, 6:28:24 AM
to Keras-users
Hey! The answer to this question would also make my life a lot easier, did you figure it out?

Best,

Dieuwke

On Sunday, September 18, 2016 at 10:22:10 AM UTC+2, Hristo Buyukliev wrote:

ar...@cardiogr.am

Jul 5, 2017, 1:01:33 PM
to Keras-users, dieuwk...@gmail.com
From what I've seen, you can create a multi-output model (https://keras.io/getting-started/functional-api-guide/#multi-input-and-multi-output-models) and pass one 2D sample_weight array per output. If you concatenate the results at the end, you have effectively used 3D sample weights. Does this make sense?
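If anyone wants to sanity-check this idea: splitting the output per feature and giving each single-feature output its own 2D weights matches a single 3D-weighted loss up to a constant factor. A NumPy sketch, assuming the weighted loss is the mean of weight times per-entry error (roughly what Keras computes), with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(0)

# 2 samples x 3 timesteps x 2 features, with random 0/1 per-entry weights.
y_true = rng.normal(size=(2, 3, 2))
y_pred = rng.normal(size=(2, 3, 2))
w3d = rng.integers(0, 2, size=(2, 3, 2)).astype(float)

err = np.abs(y_pred - y_true)

# One 3D-weighted loss over the full tensor ...
joint = (w3d * err).mean()

# ... versus one 2D ("temporal") weight matrix per single-feature output,
# with per-output losses summed the way Keras sums multi-output losses.
per_output = sum((w3d[..., f] * err[..., f]).mean() for f in range(2))

# per_output == 2 * joint: identical up to the constant factor 2 (the number
# of outputs), so both drive the optimizer in the same direction.
```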

Dieuwke Hupkes

Jul 6, 2017, 6:30:52 AM
to ar...@cardiogr.am, Keras-users
Thanks for responding :).
I think this is not a very feasible solution, but correct me if I'm wrong. Would you have as many output units as your batch size, or as the length of the sequence?

In the first case I suppose it would work, but you lose some of the advantages of using batches: for example, you can no longer parallelize the computation of the different batches on the GPU.

In the second case I think some strange things happen, as the outputs are then dependent on each other (they represent unfolded representations from one recurrent layer). Of course you can connect the outputs sequentially, but this means that from the first output layer you'd need the first value, from the second the second value, etc. Similarly, you'd have to feed the first input to the first LSTM, the second to the second, and so on. With respect to the gates I am not even sure how this would work, as the gates of the second LSTM should depend on the hidden state of the first one, and so on. Or am I missing something?

I think a custom loss function that ignores missing values is probably the better solution (although it too may waste some computation).
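A masked loss of that kind can be sketched in plain NumPy to show the arithmetic (the name masked_mae is made up for illustration; in Keras you would express the same computation with backend ops inside a custom loss):

```python
import numpy as np

def masked_mae(y_true, y_pred):
    """MAE that skips entries where y_true is NaN: equivalent to weighting
    every (sample, timestep, feature) entry by 1 - isnan(y_true)."""
    mask = ~np.isnan(y_true)                      # True where observed
    diff = np.where(mask, np.abs(y_pred - np.nan_to_num(y_true)), 0.0)
    return diff.sum() / mask.sum()                # average over observed only

y_true = np.array([[1.0, np.nan],
                   [2.0, 4.0]])
y_pred = np.array([[1.5, 9.0],
                   [2.0, 3.0]])
# Observed errors: |1.5-1| = 0.5, |2-2| = 0, |3-4| = 1  ->  mean 0.5
print(masked_mae(y_true, y_pred))  # 0.5
```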