scikit-learn
How to sort GridSearchCV cv_results_
I'm taking a data science course that uses the sklearn library, which has a GridSearchCV class. The problem is th ... ain_fin, y_train_fin)
sorted(gridsearch.cv_results_,key = lambda x: -x.mean_validation_score)
What could be the problem?
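One likely cause: in current scikit-learn versions `cv_results_` is a dict of arrays, so `sorted` iterates over its string keys, which have no `mean_validation_score` attribute. A minimal sketch of sorting by mean score, assuming the iris data and an `SVC` grid purely for illustration (neither comes from the question):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative data and estimator (assumptions, not the asker's setup).
X, y = load_iris(return_X_y=True)
gridsearch = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3)
gridsearch.fit(X, y)

# cv_results_ is a dict of equal-length arrays, so it converts directly to a table.
results = pd.DataFrame(gridsearch.cv_results_)
ranked = results.sort_values("mean_test_score", ascending=False)
print(ranked[["params", "mean_test_score", "rank_test_score"]])
```

The `rank_test_score` column already holds the ordering, so sorting by it in ascending order should give the same result.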
How can I encode categorical features containing NaN without adding a new category?
For example, a feature that takes the values {'Male', 'Female', NaN}, when using OneHotEncoder (or some other means), transla ... such a dataset should be obtained regardless of whether the set on which the encoder was trained had NaN in some categories.
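One way to get this behavior is with pandas rather than OneHotEncoder: fix the category list up front, so NaN rows become all-zero rows and the columns do not depend on the data seen. A minimal sketch, where the `categories` list and the `encode` helper are hypothetical names introduced for illustration:

```python
import numpy as np
import pandas as pd

# Assumed fixed list of valid categories; NaN is deliberately not one of them.
categories = ["Female", "Male"]

def encode(values):
    """One-hot encode values; NaN (or anything outside `categories`) becomes an all-zero row."""
    cat = pd.Series(pd.Categorical(values, categories=categories))
    return pd.get_dummies(cat, dtype=int)

with_nan = encode(["Male", "Female", np.nan])
without_nan = encode(["Male", "Female"])
print(with_nan)
```

Both calls produce the same two columns, whether or not NaN is present in the input.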
How do I add micro avg to the classification report from sklearn.metrics?
I output the calculated metrics for the test data:
print(classification_report(y_true, y_pred_classes, target_names = CLAS ... 0.92 0.92 0.92 10000
I'm missing micro avg in this report. How do I correct the output to add this line?
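For single-label multiclass data where all classes are included, micro-averaged precision, recall and F1 all equal accuracy, which is why recent scikit-learn versions print an `accuracy` row instead. If the line is still wanted, it can be computed separately. A sketch with made-up labels (the arrays below are illustrative, not the asker's data):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Illustrative labels only.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 2, 1, 0, 1]

# average="micro" pools true/false positives over all classes before dividing.
p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average="micro")
acc = accuracy_score(y_true, y_pred)
print(f"micro avg  {p:.2f}  {r:.2f}  {f:.2f}  {len(y_true)}")
```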
Functions (metrics) for assessing the quality of classification
How can I use sklearn or numpy to find the fraction of wrongly predicted values?
There are two arrays of numbers of the same ... d to compare them, and divide the number of incorrect answers by the length. Is it possible to do this with a single function?
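Both libraries can do this in one call. A short sketch with illustrative arrays: the NumPy version averages the elementwise mismatch mask, and `sklearn.metrics.zero_one_loss` returns the same fraction by default.

```python
import numpy as np
from sklearn.metrics import zero_one_loss

# Illustrative arrays of equal length.
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 1, 1, 0, 0])

# Mean of a boolean mask = fraction of positions where the arrays differ.
error_np = np.mean(y_true != y_pred)

# With the default normalize=True this is the fraction of misclassifications.
error_sk = zero_one_loss(y_true, y_pred)
print(error_np, error_sk)
```

Equivalently, the error fraction is `1 - accuracy_score(y_true, y_pred)`.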
Different output values with the same parameters when classifying data
I select the parameters for the best training of the classification model.
I do it like this:
print('Initial training score: ... ий: ', accuracy_score(res3, y_test3))
As a result, the outputs are different.
Where did I go wrong?
Classification methods in machine learning
There is a certain classification task: for training, the classifier receives an array of strings as a class and some numbers ... the problem, because I don't know about methods with multiple classes for an object yet.
I will be grateful for your advice!
Sampling and cross-validation
I have a df... If I'm going to use cross-validation, is it enough to split my df into training and test sets, without extracting a separate validation set? Or am I misunderstanding something?
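A common workflow that matches this understanding: hold out a test set once, cross-validate on the training part only (the folds play the role of the validation set), then evaluate once on the test set. A minimal sketch, assuming the iris data and logistic regression as stand-ins for the asker's df and model:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Stand-in data (assumption); replace with features/target taken from df.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)

# Each of the 5 folds serves as a validation set in turn; the test set is untouched.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Final check on the held-out test set after refitting on all training data.
model.fit(X_train, y_train)
test_score = model.score(X_test, y_test)
print(cv_scores.mean(), test_score)
```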
Why is there such a big difference in accuracy between the Gini criterion and entropy?
Hello everyone.
I continue to slowly study ML and got to the well-known data set 'Wine'. And I hit the next point: if I use e ... rong (for example, I calculated the accuracy)? I read the theory and did not find any prerequisites for such big differences.
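One way to check whether the gap is real or just an artifact of a single train/test split (the Wine set has only 178 samples, so one split can be noisy) is to compare both criteria under cross-validation. A sketch, assuming a default `DecisionTreeClassifier` since the question's exact model is not shown:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

scores = {}
for criterion in ("gini", "entropy"):
    # Fixed random_state so tie-breaking inside the tree is reproducible.
    tree = DecisionTreeClassifier(criterion=criterion, random_state=0)
    scores[criterion] = cross_val_score(tree, X, y, cv=10).mean()

print(scores)
```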
Logistic regression in Python
Here is the code from the course
y_pred_train=logreg.predict(x_train)
y_predict_train=logreg.predict_proba(x_train)[:,1]
lo ... that is, no matter how many times you call predict(), the coefficients (weights) will be the same.
But I'm not so sure about that.
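This can be checked directly: only `fit` estimates the coefficients, while `predict` and `predict_proba` just read them. A small sketch on synthetic data (the dataset is an assumption made for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for x_train / y_train.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

logreg = LogisticRegression().fit(X, y)
coef_before = logreg.coef_.copy()

# Call the prediction methods several times.
for _ in range(3):
    logreg.predict(X)
    logreg.predict_proba(X)[:, 1]

unchanged = np.array_equal(coef_before, logreg.coef_)
print(unchanged)
```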
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import tr ... an error.
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Python: average relative approximation error for regression
I'm new to Python and don't quite understand how I can calculate the average relative approximation error using the formula
... ready-made function for getting this error, or is it just a loop? If it is a loop, do I need to normalize Y_test-real values?
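No loop is needed with NumPy. The formula in the question is not shown, so the sketch below assumes the usual definition, the mean of |y - ŷ| / |y| times 100%; the function name and sample values are illustrative.

```python
import numpy as np

def mean_relative_error(y_true, y_pred):
    """Assumed definition: mean(|y - y_hat| / |y|) * 100, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true)) * 100

# Illustrative values: relative errors are 10%, 10% and 0%.
err = mean_relative_error([100, 200, 400], [110, 180, 400])
print(err)
```

Each term is already a ratio, so Y_test does not need to be normalized first; the formula does break down if any true value is zero.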
The k-nearest neighbors (kNN) algorithm in ML
Is it possible, and if so how, to add a condition so that the label prediction knn.predict(x_test) occurs only ... _score(knn, X, Y, cv=LeaveOneOut())
print(scores.mean())
Or is there another method more suitable for this purpose? Thanks!
Nonlinear regression by the Gauss-Newton method
I need to implement nonlinear regression for a circular point cloud.
There is a point cloud in 3D circular cross-s ... ud.
Which direction should I look in for a solution, and are there any example implementations of nonlinear regression?
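As a starting point, here is a minimal Gauss-Newton sketch for the simplified 2D case: fitting a circle with center (a, b) and radius r to planar points by minimizing the geometric residuals d_i - r. The 2D reduction, the function name and the synthetic data are all assumptions; a 3D cloud would first need projecting onto the cross-section plane.

```python
import numpy as np

def fit_circle_gauss_newton(x, y, n_iter=20):
    """Fit center (a, b) and radius r by Gauss-Newton on residuals d_i - r."""
    # Initial guess: centroid and mean distance to it.
    a, b = x.mean(), y.mean()
    r = np.sqrt((x - a) ** 2 + (y - b) ** 2).mean()
    for _ in range(n_iter):
        d = np.sqrt((x - a) ** 2 + (y - b) ** 2)
        res = d - r
        # Jacobian of the residuals with respect to (a, b, r).
        J = np.column_stack([-(x - a) / d, -(y - b) / d, -np.ones_like(d)])
        # Gauss-Newton step: least-squares solution of J @ delta = -res.
        delta, *_ = np.linalg.lstsq(J, -res, rcond=None)
        a, b, r = a + delta[0], b + delta[1], r + delta[2]
    return a, b, r

# Synthetic noisy circle: center (2, -1), radius 3.
rng = np.random.default_rng(0)
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
x = 2 + 3 * np.cos(theta) + rng.normal(scale=0.01, size=theta.size)
y = -1 + 3 * np.sin(theta) + rng.normal(scale=0.01, size=theta.size)

a, b, r = fit_circle_gauss_newton(x, y)
print(a, b, r)
```

`scipy.optimize.least_squares` solves the same kind of problem with more safeguards and may be a more robust choice in practice.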
Select a parameter that maximizes the F-measure
I select the parameter k (integer) to multiply the classification threshold T.
That is, T = 0.1k.
There are three algorithm ... 6023125, 0.7659328 , 0.70362246, 0.70127618, 0.8578749 , 0.83641841,
0.62959491, 0.90445368])
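A straightforward way to select k is to evaluate the F-measure at every threshold T = 0.1k and keep the best one. A sketch with synthetic labels and scores (both are made up; the question's three algorithms and their outputs are not reproduced here):

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Synthetic ground truth and classifier scores in [0, 1] (assumptions).
y_true = rng.integers(0, 2, size=200)
scores = np.clip(0.5 * y_true + rng.normal(0.25, 0.2, size=200), 0, 1)

# F-measure for each integer k, with threshold T = 0.1 * k.
f1_by_k = {
    k: f1_score(y_true, (scores >= 0.1 * k).astype(int))
    for k in range(1, 10)
}
best_k = max(f1_by_k, key=f1_by_k.get)
print(best_k, f1_by_k[best_k])
```

To avoid an optimistic estimate, k is best chosen on a validation set rather than on the data used for the final evaluation.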
Python: ValueError: too many values to unpack (expected 2)
I'm trying to find the best parameters for the model using GridSearchCV and I want to use the data for April as cross validat ... s_
model.best_params_
When I run the code, this error occurs:
Can you please tell me what the problem might be? Thank you
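The code is not fully shown, so this is only a guess: a common cause of this error is passing `cv` as something other than an iterable of `(train_indices, test_indices)` pairs, for example a flat list of two index arrays. A sketch of one fixed validation split, where the `month` array, the data and the `Ridge` model are hypothetical:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=120)

# Hypothetical month label per row; month 4 (April) is the validation part.
month = np.repeat([1, 2, 3, 4], 30)
train_idx = np.flatnonzero(month != 4)
val_idx = np.flatnonzero(month == 4)

# cv must yield (train, test) pairs: here a list holding one tuple.
model = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=[(train_idx, val_idx)])
model.fit(X, y)
print(model.best_params_)
```

Writing `cv=[train_idx, val_idx]` instead would make scikit-learn try to unpack each index array into two values, which raises exactly this kind of ValueError.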
How to train a model if I used KFold cross-validation
After splitting the set using "sklearn.cross_validation.KFold" I have 6 chunks (3 train, 3 test, plus answers for them). Is ... all the chunks, or do you need to constantly write:
Vasya=model.fit(chank1,answer1)
a1=model.predict(Vasya,answer_t_1)
?
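The repeated fit/predict calls can go in a loop over the folds. A sketch using the current `sklearn.model_selection.KFold` API, with iris and a decision tree as illustrative stand-ins:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

# Stand-in data and model (assumptions).
X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=3, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in kf.split(X):
    # A fresh model per fold, so no fold sees another fold's test data.
    model = DecisionTreeClassifier(random_state=0)
    model.fit(X[train_idx], y[train_idx])
    fold_scores.append(model.score(X[test_idx], y[test_idx]))

print(fold_scores)
```

`cross_val_score(model, X, y, cv=kf)` wraps the same loop in one call. The per-fold models are normally used only for evaluation, and the final model is refit on all the data.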
Python Anaconda: 1) installation; 2) need for Machine Learning
Two questions about Python Anaconda
OS Ubuntu 16.04. Do I need to remove the existing Python and libraries (pandas, numpy ... the PA is valid will it greatly simplify life in this sense?
The questions are simple, so I will accept answers like yes/no.
Cross-validation
In my code, cross-validation is computed incorrectly; please help me fix it
import numpy as np
from pandas import DataFrame
import pa ... st[:9])
The result is always just this:
total {'AdaBoostClassifier': 1.0}
Original dataset: https://ru.files.fm/u/aempdy95
Missing sklearn.cross_validation module
Error: the example used the sklearn.cross_validation module. The module is missing and the program does not work. I ... =1):
8 print('{:^9} {} {!s:^25}'.format(iteration, data[0], data[1]))
TypeError: 'KFold' object is not iterable
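This TypeError usually appears when old code is ported to `sklearn.model_selection`: there, `KFold` no longer takes the dataset size and is not iterable itself; the folds come from its `split` method. A minimal sketch with a toy array (the `data` contents are illustrative):

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.arange(10)

# New API: configure the splitter first, then call split() on the data.
kf = KFold(n_splits=5)
folds = list(kf.split(data))

for iteration, (train_idx, test_idx) in enumerate(folds, start=1):
    print('{:^9} {} {!s:^25}'.format(iteration, train_idx, test_idx))
```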
IndexError: too many indices for array
Please help me. I can't understand what exactly I'm doing wrong. I think the error is a silly one, but I don't have enough knowledg ... f.fit(X_train, y_train)
clf.score(X_test, y_test)
clf.predict(X_test)
print ('AdaBoostClassifier:\n', X_test[:9])
Error