My aim is to train a simple autoencoder that compresses the dataset from roughly 30 columns down to 3 (the middle layer) and then reconstructs it back to the original ~30 dimensions. The goal is to measure the reconstruction error, which gives me an idea of how well the network encodes the data and preserves its structure. When training, I get NaNs for the loss. The data looks completely normal and the inputs are n-d NumPy arrays.
I normalize the data. Outcome looks normal.
for i in range(len(data_prep.columns)):
    # cast each column to float, falling back to int
    try:
        data_prep.iloc[:, i] = data_prep.iloc[:, i].astype(float)
    except ValueError:
        print('float was not possible, trying int')
        data_prep.iloc[:, i] = data_prep.iloc[:, i].astype(int)
    maximum = np.nanmax(np.asarray(data_prep.iloc[:, i]))
    minimum = np.nanmin(np.asarray(data_prep.iloc[:, i]))
    rangevalues = maximum - minimum
    # min-max scale the first 400 rows of this column
    for x in range(400):
        data_prep.iloc[x, i] = (data_prep.iloc[x, i] - minimum) / rangevalues
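Two things in this loop can produce NaNs downstream: `np.nanmax`/`np.nanmin` skip NaNs when computing the range, but the NaNs themselves stay in the DataFrame and propagate straight into the loss, and a constant column makes `rangevalues` zero, so the division yields NaN/inf. A NaN-safe, vectorized sketch of the same min-max scaling (the toy frame and column names here are hypothetical, just to reproduce both failure modes):

```python
import numpy as np
import pandas as pd

# toy frame standing in for data_prep; the NaN and the constant
# column reproduce the two failure modes described above
data_prep = pd.DataFrame({
    'a': [1.0, 2.0, np.nan, 4.0],
    'b': [5.0, 5.0, 5.0, 5.0],   # constant column -> zero range
})

minimum = data_prep.min()
rangevalues = (data_prep.max() - minimum).replace(0, 1)  # guard against zero range
scaled = (data_prep - minimum) / rangevalues
scaled = scaled.fillna(scaled.mean())  # impute leftover NaNs before training
```

After this, `scaled` has no NaNs and every value lies in [0, 1], which is what the ReLU/MSE setup below expects.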
#-------------Variable Declaration--------------------------------------
# numberofdimensions = the number of columns we started with
input_data = Input(shape=(numberofdimensions,))  # unused; the Sequential model below defines its own input
#---------------------Autoencoder Model----------------------------------
model = Sequential()
model.add(keras.layers.Dense(3, input_shape=(numberofdimensions,), activation='relu'))
model.add(keras.layers.Dense(numberofdimensions, activation='relu'))
model.compile(optimizer='rmsprop', loss='mean_squared_error')
# I've tried different loss functions
# just splitting the data; Y is not actually used for validation yet
X = data_prep.iloc[0:int(len(data_prep.index) * 0.7), 0:numberofdimensions]
Y = data_prep.iloc[int(len(data_prep.index) * 0.7):399, 0:numberofdimensions]
X = np.asarray(X)
Y = np.asarray(Y)
model.fit(X, X, batch_size=256, epochs=number_of_epochs, verbose=2)
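Before blaming the model, it is worth verifying that `X` really contains no NaN or infinite entries right before `fit`: a single one is enough to turn the mean-squared-error loss into NaN after the first batch. A minimal pre-fit check (the helper name `check_finite` is my own; `X` is any NumPy array):

```python
import numpy as np

def check_finite(X):
    """Return the counts of NaN and infinite entries in X."""
    X = np.asarray(X, dtype=float)
    nan_count = int(np.isnan(X).sum())
    inf_count = int(np.isinf(X).sum())
    return nan_count, inf_count

# a NaN like this easily survives a visual inspection of the data
X = np.array([[0.1, 0.5], [np.nan, 0.9]])
print(check_finite(X))  # → (1, 0)
```

If the counts are nonzero, the problem is in the preprocessing, not in the network; drop or impute those entries before calling `model.fit`.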