Hello there,
While checking how scikit-learn performs cross-validation, in particular with
GridSearchCV, I found the description on your official cross-validation page (in the model tuning section) somewhat ambiguous:
"
A solution to this problem is a procedure called
cross-validation (CV for short). A test set
should still be held out for final evaluation, but the validation set is no longer needed when doing CV. In the basic approach, called
k-fold CV, the training set is split into
k smaller sets (other approaches are described below, but generally follow the same principles). The following procedure is followed for each of the
k “folds”:
-
A model is trained using 𝑘−1
of the folds as training data; -
the resulting model is validated on the remaining part of the data (i.e., it is used as a test set to compute a performance measure such as accuracy).
"( cited from the link at very beginning of this mail, I hope that was your official description)
The phrase "trained using k-1 of the folds as training data", applied to each of the k folds, can be read in two ways: either
(a) fitting is done k-1 times for each fold (perhaps even with the results averaged as some aggregation), or
(b) fitting is done once on the combined data of those k-1 folds.
I try to make the two readings concrete in the sketch below.
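Here is a minimal sketch of what I mean. The CountingClassifier wrapper is my own scaffolding, not part of scikit-learn; it just counts how often fit() is called. With the default sequential execution (n_jobs=None), the module-level counter should reflect the total number of fits:

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

FIT_CALLS = 0  # global counter of how many times fit() runs

class CountingClassifier(BaseEstimator, ClassifierMixin):
    """Wrapper around LogisticRegression that counts calls to fit()."""

    def __init__(self, C=1.0):
        self.C = C

    def fit(self, X, y):
        global FIT_CALLS
        FIT_CALLS += 1  # one increment per fit, whatever data it receives
        self.model_ = LogisticRegression(C=self.C, max_iter=1000).fit(X, y)
        return self

    def predict(self, X):
        return self.model_.predict(X)

X, y = make_classification(n_samples=200, random_state=0)
cross_val_score(CountingClassifier(), X, y, cv=5)
# Reading (a) predicts 5 * 4 = 20 fit calls; reading (b) predicts 5.
print(FIT_CALLS)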
So which understanding should be taken? Also, the number of fittings does not seem to be directly retrievable when using
GridSearchCV; does number(fittings) = number(candidates) * cv hold?
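For reference, here is the same counting applied to GridSearchCV, continuing the sketch above. If I understand correctly, the default refit=True adds one extra fit on the whole training set after the search, and with verbose=1 the printed "totalling N fits" line appears to report candidates * cv without that refit:

from sklearn.model_selection import GridSearchCV

FIT_CALLS = 0  # reset the counter from the sketch above
grid = GridSearchCV(CountingClassifier(), {"C": [0.1, 1.0, 10.0]},
                    cv=5, verbose=1)
grid.fit(X, y)
# If number(fittings) = number(candidates) * cv, this should be
# 3 * 5 = 15, plus 1 for the final refit (the default refit=True).
print(FIT_CALLS)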
With regards,
Y. Gao