Yes, different arguments will give different results. Even with the same arguments, each time you run, you should get different results. The generator randomly samples from the range specified.
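To make the sampling concrete, here is a minimal sketch of how a generator might draw augmentation parameters from the ranges you pass in. The helper `sample_augmentation` and its arguments `rotation_range` and `zoom_range` are illustrative names assumed for this example, not the actual internals of any particular library.

```python
import random

def sample_augmentation(rng, rotation_range=20.0, zoom_range=0.2):
    """Draw one set of augmentation parameters (hypothetical helper)."""
    return {
        # Uniformly sampled angle in [-rotation_range, +rotation_range] degrees.
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # Uniformly sampled zoom factor around 1.0.
        "zoom": rng.uniform(1.0 - zoom_range, 1.0 + zoom_range),
        # Flip with probability 0.5.
        "horizontal_flip": rng.random() < 0.5,
    }

# Unseeded generator: the sampled values differ from run to run.
rng = random.Random()
first = sample_augmentation(rng)
second = sample_augmentation(rng)

# A fixed seed makes the sequence repeatable, which helps when debugging.
seeded_a = sample_augmentation(random.Random(0))
seeded_b = sample_augmentation(random.Random(0))
```

Changing the ranges changes the distribution the parameters are drawn from, and even with fixed ranges each draw is different unless you seed the generator.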
Depending on your problem, you may want to change which augmentations are applied and the ranges they are sampled from. The idea is to make the NN robust, so that if it later sees a slightly rotated, scaled, or horizontally flipped version of your image, it can still classify it correctly.
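As a rough illustration of one such augmentation, the sketch below randomly mirrors a tiny "image" stored as nested lists. The function `random_horizontal_flip` and the probability parameter `p` are assumptions made up for this example; a real pipeline would operate on image arrays and usually combine several transforms.

```python
import random

def random_horizontal_flip(image, rng, p=0.5):
    """Return the image mirrored left-to-right with probability p (hypothetical helper)."""
    if rng.random() < p:
        # Reverse each row to mirror the image horizontally.
        return [row[::-1] for row in image]
    # Otherwise return an unchanged copy.
    return [row[:] for row in image]

# A 2x3 toy image; each number stands in for a pixel value.
image = [[1, 2, 3],
         [4, 5, 6]]

rng = random.Random(42)
# The network would see a mix of original and mirrored versions over time.
batch = [random_horizontal_flip(image, rng) for _ in range(8)]
```

If mirrored inputs would never occur in your data, or would change the label (for example, text or digits), you would leave this augmentation out by setting `p=0.0`.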
By number of images, I'm guessing you mean number of epochs? If so, that should also affect the results. Generally, the more augmented images the network sees, the better!
Both training and image generation are stochastic processes. This randomness is important: it helps prevent overfitting, makes the NN more robust, and helps keep your optimizer from getting stuck in a local minimum.