I was recently trying to reproduce a Chainer model in TensorFlow and ran into problems reproducing the outputs of 1D deconvolution layers. For example, in the code snippet below, I use the same weights, strides, and padding, yet the TF and Chainer deconvolutions return different outputs:
import numpy as np
import tensorflow as tf
import chainer.links as L
from tensorflow.keras.layers import Conv1DTranspose

# Chainer: DeconvolutionND(ndim, in_channels, out_channels, ksize, stride, pad)
chainer_deconv = L.DeconvolutionND(1, 100, 1024, 1, 1, 0,
                                   initialW=fsgen_weights['dc0/W'],
                                   initial_bias=fsgen_weights['dc0/b'])
# Keras: Conv1DTranspose(filters, kernel_size)
tf_deconv = Conv1DTranspose(1024, 1, strides=1, data_format='channels_first',
                            kernel_initializer=tf.constant_initializer(fsgen_weights['dc0/W']),
                            bias_initializer=tf.constant_initializer(fsgen_weights['dc0/b']))
batch_size = 5
# cast to float32, since Chainer rejects float64 inputs against float32 weights
test_in = (np.random.rand(batch_size, 100, 1) * 2 - 1).astype(np.float32)

# print a portion of the outputs
print(chainer_deconv(test_in)[0][0:4])
--> prints variable([[-0.00434582]
                     [ 0.08588306]
                     [ 0.01726495]
                     [ 0.0137401 ]])
print(tf_deconv(test_in)[0][0:4])
--> prints tf.Tensor(
    [[-0.03280227]
     [ 0.08922134]
     [-0.03946912]
     [-0.10054161]])
Can anybody tell me why these layers return different values even though they use identical weights and inputs?