Hi Alexis,
Thank you for using Yellowbrick; I'm glad to hear that it's useful to you!
That learning curve is very strange, and my first guess would be that it is related to a feature of the data. Without knowing too much about it, I'd suspect that the oversampling technique appended instances to the end of the dataset, which may be what caused the dramatic change in CV score.
Perhaps you could set shuffle=True in the learning curve; this shuffles the training data before the prefixes are taken. It isn't done by default in order to protect time series data, but if you're oversampling, it's probably a good step to use. If you did use shuffle, perhaps you could invert the order of your dataset and run without shuffling? If the effect still remains, then it could be that you simply need far more than 1600 instances.
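To make that concrete, here is a minimal sketch using scikit-learn's learning_curve (which Yellowbrick's LearningCurve visualizer builds on); the data here is synthetic purely for illustration, standing in for a dataset with oversampled instances appended at the end:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic stand-in data: if oversampled instances were appended at the
# end, unshuffled training prefixes would be unrepresentative of the whole.
rng = np.random.RandomState(42)
X = rng.randn(400, 5)
y = (X[:, 0] + 0.5 * rng.randn(400) > 0).astype(int)

# shuffle=True randomizes the training data before prefixes are taken;
# it is off by default to avoid breaking time series data.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(),
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    shuffle=True,
    random_state=42,
)
```

Comparing the resulting curves with and without shuffle=True should show whether the ordering of the appended instances is what's driving the anomaly.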
Hope that helps, good luck!
Best Regards,
Benjamin Bengfort