IndexError: too many indices for array

Please help: I can't work out what exactly I'm doing wrong. I suspect the mistake is a silly one, but I don't know enough to find it myself.

import numpy as np
from pandas import DataFrame
import pandas as pd
import warnings 
from sklearn import cross_validation, svm
warnings.simplefilter('ignore') # suppress Anaconda warnings
data = pd.read_csv('C:\\Users\\Vika\\Downloads\\ENB2012.csv', ';')
data.head()
from sklearn.cross_validation import train_test_split, cross_val_score
kfold = 5 # number of folds for cross-validation
itog_val = {} # dict for storing the cross-validation results of different algorithms
X = data.values[::, 0:8]
y = data.values[::, 0:1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print ('training set:\n', X_train[:9])
print ('\n')
print ('test set:\n', X_test[:7])
from sklearn.ensemble import AdaBoostClassifier
clf = AdaBoostClassifier(n_estimators=60) 
scores = cross_validation.cross_val_score(clf, X_train, y_train, cv=kfold)
itog_val['AdaBoostClassifier'] = scores.mean()
clf.fit(X_train, y_train) 
clf.score(X_test, y_test) 
clf.predict(X_test) 
print ('AdaBoostClassifier:\n', X_test[:9])

Error mistake

Author: MaxU, 2018-01-12

1 answer

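It looks like AdaBoostClassifier.fit(X, y, sample_weight=None) expects a 1D array as y. In your code, y = data.values[::, 0:1] is a 2D array of shape (n, 1), because slicing with 0:1 keeps the column axis.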

Try:

scores = cross_validation.cross_val_score(clf, X_train, y_train.ravel())
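To see why ravel() helps, here is a minimal sketch of the shapes involved. The values array is a made-up stand-in for data.values (an assumption, not the real ENB2012 data):

```python
import numpy as np

# Hypothetical stand-in for data.values: 6 rows, 10 columns.
values = np.arange(60, dtype=float).reshape(6, 10)

y_2d = values[:, 0:1]    # a slice keeps the column axis: shape (6, 1)
y_1d = values[:, 0]      # an integer index drops it: shape (6,)
y_flat = y_2d.ravel()    # flattens the (6, 1) column to shape (6,)

print(y_2d.shape, y_1d.shape, y_flat.shape)
```

Indexing the column with a plain integer instead of a 0:1 slice gives the same 1D result without a separate ravel() call.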

If y_train is not an integer type, then you may have to convert it first (to avoid ValueError: Unknown label type: 'continuous'):

scores = cross_validation.cross_val_score(clf, X_train, (y_train.ravel()*1000).astype(int))
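A rough end-to-end sketch of that conversion follows, under some assumptions: the data is synthetic (not ENB2012), and the import comes from sklearn.model_selection, which replaced sklearn.cross_validation when the latter was removed in scikit-learn 0.20.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(100, 8)                              # synthetic features
# Synthetic float target with a few distinct values, stored as a column (100, 1).
y = rng.choice([0.5, 1.5, 2.5], size=(100, 1))

# Flatten to 1D, then turn the floats into integer class labels.
labels = (y.ravel() * 1000).astype(int)

clf = AdaBoostClassifier(n_estimators=60)
scores = cross_val_score(clf, X, labels, cv=5)    # one accuracy value per fold
print(scores.mean())
```

Note that this treats every distinct target value as its own class, so it only makes sense when the target takes a small number of distinct values.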

Or, if the target is continuous, use one of the linear regression models instead of the classifier.
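As an illustration of the regression route, here is a small sketch with LinearRegression. The data is synthetic and the choice of model is only an example; for a regressor, cross_val_score reports R^2 per fold by default rather than accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(100, 8)                                   # synthetic features
# Synthetic continuous target: linear in the features plus a little noise.
y = X @ np.arange(1, 9) + 0.01 * rng.randn(100)        # shape (100,), already 1D

reg = LinearRegression()
scores = cross_val_score(reg, X, y, cv=5)              # R^2 for each of the 5 folds
print(scores.mean())
```

With a regressor there is no need to convert the target to integers, so the continuous values are used as they are.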

Author: MaxU, 2018-01-13 00:30:47