I have an input vector of real numbers describing the current state, and the output is the next action to take (a natural number).
Is it coherent to use an LSTM network for a classification problem that maps a sequence of states to a sequence of corresponding actions?
Is this the right way to build the input for an LSTM, or are there better ways?
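For concreteness, here is a minimal sketch of one commonly used layout, not necessarily the only or best one: slide a fixed-length window over the state sequence so that each sample has shape (timesteps, features), and label each window with the action paired with its last state. The function name `make_windows`, the window length of 3, and the toy data are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def make_windows(
    states: Sequence[Sequence[float]],
    actions: Sequence[int],
    window: int,
) -> Tuple[List[List[List[float]]], List[int]]:
    """Build LSTM-style samples from aligned state/action sequences.

    Assumption: actions[t] is the action chosen in states[t].
    Returns X with shape (samples, window, features) and y with one
    integer class label per sample.
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    if window < 1 or window > len(states):
        raise ValueError("window must be between 1 and len(states)")

    X: List[List[List[float]]] = []
    y: List[int] = []
    for end in range(window, len(states) + 1):
        # The last `window` states up to and including step end - 1.
        X.append([list(s) for s in states[end - window:end]])
        # Class label: the action paired with the last state in the window.
        y.append(actions[end - 1])
    return X, y


# Toy data (hypothetical): 5 time steps, 2 real-valued features per state.
states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
actions = [0, 2, 1, 1, 0]

X, y = make_windows(states, actions, window=3)
print(len(X), len(X[0]), len(X[0][0]))  # samples, timesteps, features
print(y)
```

With this layout the network sees a short history of states for every prediction, and the integer labels can be fed to a softmax output with a categorical cross-entropy loss. Predicting an action at every time step (sequence-to-sequence) instead of only at the end of each window is another option.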
Happy to hear opinions and suggestions.