Passing mode='sparse' should return output formatted the way that's required if you're using 'sparse_categorical_crossentropy' as a loss. In other words, it should return data encoded as a 1-dimensional vector of integer labels [0, 1, 2, 3, 4, ..., n-1] where n is the number of classes, similar to the output from scikit-learn's LabelEncoder.
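A minimal numpy sketch of what that 'sparse' output might look like (the labels here are made up; `np.unique` with `return_inverse=True` mimics what LabelEncoder does):

```python
import numpy as np

# Hypothetical raw labels; 'sparse' mode should yield one integer per sample.
labels = np.array(["cat", "dog", "cat", "bird"])

# np.unique with return_inverse mimics sklearn's LabelEncoder:
# classes are sorted, and each label is replaced by its class index.
classes, sparse = np.unique(labels, return_inverse=True)

print(classes)  # ['bird' 'cat' 'dog']
print(sparse)   # [1 2 1 0] -- a 1-d integer vector, the shape
                # sparse_categorical_crossentropy expects
```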
'categorical' mode should return one-hot, 2-dimensional output identical to what you would get by applying to_categorical to an integer vector like the one I mentioned above.
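As a sketch of that one-hot shape (this is just numpy standing in for what keras.utils.to_categorical would produce):

```python
import numpy as np

# Integer label vector as described for 'sparse' mode.
sparse = np.array([1, 2, 1, 0])
n_classes = sparse.max() + 1

# Indexing an identity matrix by the label vector yields one-hot rows.
one_hot = np.eye(n_classes)[sparse]
print(one_hot)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]]
```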
I think 'binary' should require a 1-d vector containing only the labels 0 and 1, i.e. similar to 'sparse' in its dimensionality (1-d) but similar to 'categorical' in the values the integers can take (e.g. [0, 1, 0, 1, 1, 1, 0, ...]).
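The check 'binary' mode would imply could be sketched like this (the function name is hypothetical, just to illustrate the constraint):

```python
import numpy as np

def is_valid_binary(y):
    """True if y is a 1-d vector whose only values are 0 and 1."""
    y = np.asarray(y)
    return y.ndim == 1 and set(np.unique(y)) <= {0, 1}

print(is_valid_binary([0, 1, 0, 1, 1, 1, 0]))  # True
print(is_valid_binary([0, 1, 2]))              # False: more than 2 classes
print(is_valid_binary(np.eye(3)))              # False: 2-d (one-hot) input
```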
Since 'sparse' and 'binary' would require the same label format in the case where there are only 2 labels (and you shouldn't be using 'binary' if there are more than 2 anyway), I imagine they might share many of the same methods.
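A quick illustration of that overlap: with only two classes, the 'sparse' integer encoding already satisfies the 'binary' format (labels here are made up):

```python
import numpy as np

# Two-class case: sparse encoding produces class indices that
# can only be 0 or 1, which is exactly the 'binary' format.
labels = np.array(["neg", "pos", "pos", "neg"])
_, sparse = np.unique(labels, return_inverse=True)
print(sparse)  # [0 1 1 0] -- valid for both 'sparse' and 'binary'
```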