I don't think that approach works for an imbalanced dataset:
Consider a DB of: 112222222222333.
A shuffled DB would be:
212223232221232
Taking a batch size of 3 would result in:
212, 223, 232, 221, 232
which is imbalanced (not uniformly distributed) and follows the distribution of the imbalanced dataset.
One possibility would be to insert copies of the minority classes to the dataset:
11[1][1][1][1][1][1][1][1]2222222222333[3][3][3][3][3][3][3], where [1] denotes a copy of a random image of the '1' class.
After shuffling, the batch would be then balanced, e.g.:
[3]21[3][1]2[3]2[1]2[1]3[3][1]232[1][3]22[1]1[3]2[1]32[1][3]
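A minimal sketch of this oversampling idea (variable names and the toy string dataset are mine; a real dataset would hold image paths or arrays instead of characters):

```python
import random
from collections import Counter

# Toy imbalanced dataset: 2 ones, 10 twos, 3 threes
db = list("112222222222333")

# Find the size of the most frequent class
counts = Counter(db)
max_count = max(counts.values())

# Oversample: append random copies of each minority class
# until every class reaches the majority count
balanced = list(db)
for label, n in counts.items():
    samples = [x for x in db if x == label]
    balanced += random.choices(samples, k=max_count - n)

# After shuffling, each batch draws from a uniform class distribution
random.shuffle(balanced)
batches = [balanced[i:i + 3] for i in range(0, len(balanced), 3)]
```

Note that with real images the duplication would only be cheap if you store references (e.g. file paths) rather than copies of the pixel data.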
Nevertheless, this approach requires a huge amount of disk space, since every minority class is scaled up to the number of images in the most frequent class.
Given your image size, I'm not sure this approach is suitable.