Train set: 2.5M points; test set: 0.8M points. Every time I make a prediction, I get different results: ROC AUC varies between 0.5 and 0.9. Averaging the predictions gives ROC AUC > 0.9.
I know MCMC is inherently random, but is it possible to get more stable results without averaging?
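To make the averaging concrete, here is a minimal sketch of what I mean. It does not use my real model: `noisy_run` is a hypothetical stand-in for one MCMC prediction run (true label plus seeded Gaussian noise), and `roc_auc` and `average_predictions` are illustrative helpers I wrote for this example, not functions from any library.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def noisy_run(labels, seed, noise_sd=3.0):
    """Hypothetical stand-in for one stochastic prediction run (assumption).

    Each score is the true label plus Gaussian noise drawn from a seeded
    generator, so the same seed always reproduces the same predictions.
    """
    rng = random.Random(seed)
    return [y + rng.gauss(0.0, noise_sd) for y in labels]


def average_predictions(runs):
    """Average the scores of several runs, point by point."""
    n_runs = len(runs)
    return [sum(column) / n_runs for column in zip(*runs)]


# Toy data: 200 negatives and 200 positives, interleaved.
labels = [0, 1] * 200

# 25 independent runs, each with its own seed.
runs = [noisy_run(labels, seed) for seed in range(25)]
single_aucs = [roc_auc(labels, run) for run in runs]
averaged_auc = roc_auc(labels, average_predictions(runs))

print("single-run AUC range:", min(single_aucs), max(single_aucs))
print("AUC of averaged predictions:", averaged_auc)
```

In this toy setup the single runs are noisy while the averaged scores rank the points much better, which matches what I see. Fixing the seed makes a single run repeatable, but as far as I can tell it does not make that run any more accurate.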
Second question: is it necessary to scale the input variables?
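By scaling I mean something like standardization, where each input column is shifted to zero mean and unit variance using statistics from the train set only. A minimal sketch, assuming small in-memory lists of rows; `fit_scaler` and `transform` are hypothetical helper names for this example, not part of any library:

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from the train rows."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows using previously fitted means and deviations."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy data with two features on very different scales.
train = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
test = [[2.0, 300.0]]

# Fit on train only, then apply the same parameters to both sets.
means, stds = fit_scaler(train)
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)

print(train_scaled)
print(test_scaled)
```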