How to plot a learning curve for 10-fold cross validation

Muneera

Mar 4, 2023, 12:20:58 PM
to python-weka-wrapper
Hi everyone,
I am working on a dataset and using 10-fold cross-validation.

I want to plot a learning curve that shows the performance of the model and whether it is overfitting, underfitting, or a good fit.

When I searched for how to do this, I found that most posts talk about hold-out validation (splitting the dataset into training and test sets).

Can you please point me to a link on plotting a learning curve for k-fold cross-validation?

N.B.: I built my model using the Weka platform, but it is OK to plot the curve using Python code on the Kaggle website if needed.

Peter Reutemann

Mar 5, 2023, 5:39:39 PM
to python-we...@googlegroups.com
What are you trying to vary for your learning curve?
- size of dataset?
- particular filter or classifier parameter?
- different classifiers?

For determining the stability of a model, I would recommend using the
Experimenter. A large standard deviation means that the model
performance varies a lot between train/test splits. And such a large
variation can also mean overfitting.
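
To make the standard-deviation point concrete, here is a minimal sketch in plain Python. It is not the Experimenter itself: the helper name `summarize_splits` and both score lists are made up for illustration, standing in for the per-split accuracies an experiment would report.

```python
from statistics import mean, stdev


def summarize_splits(scores):
    """Return (mean, sample standard deviation) of per-split scores."""
    return mean(scores), stdev(scores)


# Made-up accuracies for two hypothetical classifiers over 10 train/test
# splits. These are illustrative placeholders, not real results.
stable = [0.81, 0.80, 0.82, 0.79, 0.81, 0.80, 0.82, 0.80, 0.81, 0.79]
unstable = [0.95, 0.62, 0.88, 0.70, 0.91, 0.58, 0.85, 0.66, 0.93, 0.61]

for name, scores in [("stable", stable), ("unstable", unstable)]:
    m, s = summarize_splits(scores)
    print(f"{name}: mean={m:.3f} std={s:.3f}")
```

The second list has a much larger spread, which is the kind of split-to-split variation that would be worth investigating further.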

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, Hamilton, NZ
Mobile +64 22 190 2375
https://www.cs.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

Muneera

Mar 6, 2023, 1:31:59 AM
to python-weka-wrapper
I do not know what is appropriate for my case.
However, I have 7 classifiers. I want to compare their performance and choose the best one.
So, I think I need to plot 7 learning curves (one curve for each classifier) and vary the dataset's size in each curve.

Peter Reutemann

Mar 6, 2023, 10:51:10 PM
to python-we...@googlegroups.com
> I do not know what is appropriate for my case.
> However, I have 7 classifiers. I want to compare their performance and choose the best one.
> So, I think I need to plot 7 learning curves (one curve for each classifier) and vary the dataset's size in each curve.

I've added an example for using the experiment API to generate learning curves:
https://github.com/fracpete/python-weka-wrapper3-examples/blob/master/src/wekaexamples/experiments/learning_curve.py

NB: I made some modifications to the "plot_experiment" method, so you
will need to install pww3 (python-weka-wrapper3) straight from the GitHub repository:
https://fracpete.github.io/python-weka-wrapper3/install.html#github
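
For readers who only want to see the idea discussed in this thread (vary the training-set size and run k-fold cross-validation at each size), the following is a rough standard-library sketch. It is not the linked example and does not use the pww3 API. The toy data, the nearest-centroid "classifier", and every function name here are assumptions made purely for illustration, and the folds are not stratified.

```python
import random
from statistics import mean


def make_toy_data(n=200, seed=0):
    """Two noisy 2-D Gaussian blobs; a stand-in for a real dataset."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    rng.shuffle(data)
    return data


def fit_centroids(rows):
    """Toy nearest-centroid model: store the mean point of each class."""
    centroids = {}
    for label in {y for _, y in rows}:
        pts = [x for x, y in rows if y == label]
        centroids[label] = tuple(mean(c) for c in zip(*pts))
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def accuracy(centroids, rows):
    return mean(1.0 if predict(centroids, x) == y else 0.0 for x, y in rows)


def learning_curve_cv(data, fractions, k=10):
    """For each training fraction, average train/test accuracy over k folds."""
    folds = [data[i::k] for i in range(k)]
    curve = []
    for frac in fractions:
        train_scores, test_scores = [], []
        for i in range(k):
            test = folds[i]
            train = [row for j, fold in enumerate(folds) if j != i for row in fold]
            # Use only the first `frac` of the training folds for fitting.
            subset = train[: max(2, int(frac * len(train)))]
            model = fit_centroids(subset)
            train_scores.append(accuracy(model, subset))
            test_scores.append(accuracy(model, test))
        curve.append((frac, mean(train_scores), mean(test_scores)))
    return curve


for frac, train_acc, test_acc in learning_curve_cv(
    make_toy_data(), [0.1, 0.25, 0.5, 0.75, 1.0]
):
    print(f"fraction={frac:.2f} train={train_acc:.3f} test={test_acc:.3f}")
```

Each tuple gives one point on the curve, so the fractions can go on the x-axis and the two accuracies on the y-axis in any plotting library. As a rule of thumb, a training score that stays well above the test score suggests overfitting, while two low scores that sit close together suggest underfitting. To compare 7 classifiers, the same loop would be repeated once per classifier.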