How do I train a model when using KFold cross-validation?
After splitting the data with "sklearn.cross_validation.KFold" (now sklearn.model_selection.KFold) I have 6 chunks (3 train, 3 test, plus the answers for them). Is there a function that takes all the chunks at once, or do I have to keep writing:
Vasya = model.fit(chank1, answer1)
a1 = Vasya.predict(chank_t_1)
?
2 answers
If you want to learn how to use KFold, here is a small example:
from sklearn.model_selection import KFold

kf = KFold(n_splits=N)  # N is the number of folds
for train, test in kf.split(X):
    print("%s %s" % (train, test))
    X_train, X_test, y_train, y_test = X[train], X[test], y[train], y[test]
    model.fit(X_train, y_train)
    ...
The simpler option:
from sklearn.model_selection import KFold, cross_val_score

kf = KFold(n_splits=N)
results = cross_val_score(model, X, y, cv=kf)  # one score per fold
Cross-validation is built into sklearn. If you need to evaluate the model on the different folds of a KFold split, the easiest way is cross_val_score or cross_val_predict:
- cross_val_score(model, chank1, answer1, cv=n) will give the score for each fold
- cross_val_predict(model, chank1, answer1, cv=n) will give a prediction for every sample in X
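A runnable sketch of the two calls, using the iris toy dataset and a logistic regression in place of the asker's chunks (the dataset and model choice here are illustrative, not from the question):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, cross_val_predict

X, y = load_iris(return_X_y=True)          # 150 samples, 3 classes
model = LogisticRegression(max_iter=1000)
kf = KFold(n_splits=3, shuffle=True, random_state=0)

scores = cross_val_score(model, X, y, cv=kf)   # one accuracy score per fold
preds = cross_val_predict(model, X, y, cv=kf)  # one prediction per sample

print(scores.shape)  # (3,)  - one score for each of the 3 folds
print(preds.shape)   # (150,) - a prediction for every row of X
```

Note that cross_val_predict returns each sample's prediction from the fold in which it was in the test set, so you get exactly one prediction per row of X.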
But usually cross-validation is used to select hyperparameters; for that there is GridSearchCV, which takes a grid of parameters and picks the best combination by itself.
NB: all these functions have an n_jobs parameter, so it's not worth writing the loops by hand — n_jobs=-1 will keep the machine busy by loading all the cores while the work is running, which is not so easy to do in plain Python.
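A minimal GridSearchCV sketch showing both the parameter grid and n_jobs=-1 (the SVC model and the grid values are illustrative assumptions, not from the question):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid of hyperparameter values to try; every combination is cross-validated.
grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# cv=3 folds; n_jobs=-1 runs the fits in parallel on all available cores.
search = GridSearchCV(SVC(), grid, cv=3, n_jobs=-1)
search.fit(X, y)

print(search.best_params_)  # the best combination found on the grid
print(search.best_score_)   # its mean cross-validated score
```

After fitting, search itself behaves like a model refit on the full data with the best parameters, so you can call search.predict directly.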