Python: ValueError too many values to unpack (expected 2)

I'm trying to find the best parameters for the model using GridSearchCV and I want to use the data for April as cross validation. Code:

x_train.head()

y_train.head()

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import make_scorer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit
import xgboost as xg

xgb_parameters={'max_depth':[3,5,7,9],'min_child_weight':[1,3,5]}
xgb=xg.XGBRegressor(learning_rate=0.1, n_estimators=100,max_depth=5, min_child_weight=1, gamma=0, subsample=0.8, colsample_bytree=0.8)
model=GridSearchCV(n_jobs=2,estimator=xgb,param_grid=xgb_parameters,cv=train_test_split(x_train,y_train,test_size=len(y_train['2016-04':'2016-04']), random_state=42, shuffle=False),scoring=my_func)
model.fit(x_train,y_train)
model.grid_scores_
model.best_params_

When I run the code, I get this error:

ValueError: too many values to unpack (expected 2)

Can you please tell me what the problem might be? Thank you

Author: MaxU, 2018-04-07

1 answer

You specified the cv parameter incorrectly in the GridSearchCV() call.

Here is what can be specified as cv:

cv : int, cross-validation generator or an iterable, optional
    Determines the cross-validation splitting strategy.
    Possible inputs for cv are:
      - None, to use the default 3-fold cross validation,
      - integer, to specify the number of folds in a `(Stratified)KFold`,
      - An object to be used as a cross-validation generator.
      - An iterable yielding train, test splits.

    For integer/None inputs, if the estimator is a classifier and ``y`` is
    either binary or multiclass, :class:`StratifiedKFold` is used. In all
    other cases, :class:`KFold` is used.

    Refer :ref:`User Guide <cross_validation>` for the various
    cross-validation strategies that can be used here.

train_test_split() returns four arrays (X_train, X_test, y_train, y_test), while GridSearchCV expects cv to be an iterable of (train_indices, test_indices) pairs. Unpacking each of those four arrays into two values is what raises the error.
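A quick way to see the mismatch (the arrays below are toy stand-ins for the question's x_train / y_train):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the question's x_train / y_train
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

result = train_test_split(X, y, test_size=0.3, random_state=42, shuffle=False)
print(len(result))  # 4 arrays: X_train, X_test, y_train, y_test
```

GridSearchCV iterates an iterable cv as `for train, test in cv`, so each of the four arrays gets unpacked into two names, which fails with "too many values to unpack (expected 2)".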

In your case, it is enough to specify cv=3 (or another integer number of folds for cross-validation).

If you specify an integer (for example, 3), GridSearchCV() will split the training data into that many train/test folds. In each split, the training part holds roughly 2/3 of the data and the test part 1/3 for cv=3 (for cv=5, this is 4/5 and 1/5 respectively). All of these splits take part in the hyperparameter search.

Example:

If you specify cv=3 and xgb_parameters={'max_depth':[3,5,7,9],'min_child_weight':[1,3,5]}, then GridSearchCV() will have a total of:

4 (max_depth) * 3 (min_child_weight) * 3 (cv) = 36

model fits.
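Putting the fix together, here is a minimal runnable sketch with cv as an integer. It uses the RandomForestRegressor already imported in the question and synthetic stand-in data; the same cv=3 fix applies unchanged to the XGBRegressor call (the parameter grid and scorer below are illustrative, not the question's my_func):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the question's x_train / y_train
rng = np.random.RandomState(42)
X = rng.rand(60, 4)
y = rng.rand(60)

params = {'max_depth': [3, 5], 'min_samples_leaf': [1, 3]}
model = GridSearchCV(
    estimator=RandomForestRegressor(n_estimators=10, random_state=42),
    param_grid=params,
    cv=3,                              # integer -> 3-fold KFold for a regressor
    scoring='neg_mean_squared_error',  # a built-in scorer, standing in for my_func
)
model.fit(X, y)
print(model.best_params_)
```

With 2 * 2 parameter combinations and cv=3 this runs 12 fits; the question's grid of 4 * 3 combinations gives the 36 fits computed above.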

P.S. When training the model, make sure it never sees the test data, to avoid data leakage.
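Since the question wants the April data held out for validation, one option worth sketching is TimeSeriesSplit (already imported in the question): each test fold lies strictly after its training fold, which avoids exactly this kind of time-series leakage. A toy illustration, with 12 samples standing in for chronologically ordered rows:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 12 samples standing in for chronologically ordered rows
X = np.arange(12).reshape(12, 1)

tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    # every test fold comes strictly after its training fold
    print(train_idx, test_idx)
```

A TimeSeriesSplit instance can be passed directly as cv=tscv in the GridSearchCV() call, in place of the integer.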

Author: MaxU, 2018-04-07 20:20:17