Logistic regression in Python

Question

Here is the code from the course:

y_pred_train=logreg.predict(x_train)

y_predict_train=logreg.predict_proba(x_train)[:,1]

logreg.score(x_train,y_train)

Here's a good explanation I found.

Model.predict(X_test) - predicts the values of the target variable

Model.predict_proba() - outputs the "degree of confidence" in the response (a probability); available for some models

Model.score() - most models have built-in methods for evaluating their performance

1) I don't really understand what the degree of confidence means. And why not give an answer that is 100% sure?

2) How does score() evaluate the performance of the model if x_train and y_train are the inputs? Logically, shouldn't it take y_pred and y_train? And in general, what does evaluating the model's performance mean?

Somewhere the coefficient of determination is mentioned, although I am not familiar with this concept.

score(self, X, y, sample_weight=None): Returns the coefficient of determination R^2 of the prediction. ... A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0. (From the sklearn documentation)

Finally, why do we assign everything to different variables?

Why not a=predict(x_test) b=a.predict_proba()?

I suspect that we got away with it because I fixed random_state in advance, that is, no matter how many times you call predict(), the coefficients or weights will be the same. But I'm not so sure about that.

python machine-learning scikit-learn

Author: Тима, 2020-07-04

1 answer

score 2 · Accepted Answer

In your previous question, you were advised to read at least some literature on the topic, even something written for beginners. It looks like you didn't take the advice, because the question shows no real progress in understanding what you are doing. All that has changed is that you have looked at the formats of three more commands, without understanding what they do or what all of this is for.

Nevertheless, I will try to answer.

1) I don't really understand what the degree of confidence means. And why not give an answer that is 100% sure?

In the method that you are "learning", the result is a number between zero and one. For some (in fact many) problems this is exactly what you need: it shows that both outcomes are possible and gives the probability that the object belongs to one class or the other. For other tasks (there are also many of them), you need a completely unambiguous answer about which class the object belongs to. In that case the probability is reduced to binary "0" or "1" logic using a threshold function. Keep in mind that such an answer will contain errors, which should also be evaluated (see below).
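A minimal sketch of these two kinds of answer, assuming scikit-learn and NumPy are installed. The toy data, the variable names, and the explicit 0.5 threshold are illustrative assumptions, not part of the course code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy one-feature data set (hypothetical, for illustration only).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

logreg = LogisticRegression().fit(X, y)

# Probability of class 1 for every sample: a number between 0 and 1.
proba_class1 = logreg.predict_proba(X)[:, 1]

# Hard 0/1 answer obtained by thresholding the probability at 0.5.
labels_from_proba = (proba_class1 > 0.5).astype(int)

# For this binary model, predict() returns the same labels as the threshold.
labels = logreg.predict(X)

print(np.round(proba_class1, 2))
print(labels_from_proba, labels)
```

Samples near the class boundary get probabilities close to 0.5, which is the "degree of confidence" the hard labels throw away.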

2) How does score() evaluate the performance of the model if x_train and y_train are the inputs? Logically, shouldn't it take y_pred and y_train? And in general, what does evaluating the model's performance mean?

It is exactly x_train and y_train that should be passed to this function. Knowing the expected response y_train, the function runs the classification on x_train itself and then compares the predicted response with the expected one for each sample. The score is the mean accuracy for (binary) classification and the coefficient of determination R^2 for regression models.
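As a rough illustration of what score() does for a classifier, the sketch below compares it with predicting first and computing the accuracy by hand. The synthetic data from make_classification is a hypothetical stand-in for the course data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic binary data (hypothetical stand-in for the course data).
x_train, y_train = make_classification(
    n_samples=200, n_features=4, random_state=0
)

logreg = LogisticRegression().fit(x_train, y_train)

# score() predicts internally and compares the result with the true labels.
score = logreg.score(x_train, y_train)

# The same two steps done by hand: predict, then count the share of matches.
y_pred_train = logreg.predict(x_train)
manual_accuracy = np.mean(y_pred_train == y_train)
metric_accuracy = accuracy_score(y_train, y_pred_train)

print(score, manual_accuracy, metric_accuracy)
```

All three printed numbers should agree, which is why score() takes the features and the true labels rather than ready-made predictions.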

Finally, why do we assign everything to different variables?

Why not a=predict(x_test) b=a.predict_proba()?

Something strange is written here, and yet you are asking us to tell you why you assign them this way. In theory, with a single variable it would be

a=model.predict(x_test)
a=model.predict_proba(x_test)

whereas you want to assign the results to different variables:

a=model.predict(x_test)
b=model.predict_proba(x_test)
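A small sketch of why both methods are called on the model and not chained, assuming scikit-learn and NumPy. The four-sample data set is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny made-up data set, just enough to fit a model.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
model = LogisticRegression().fit(x, y)

a = model.predict(x)        # NumPy array of class labels, one per sample
b = model.predict_proba(x)  # NumPy array of probabilities, one column per class

# The array of labels is not a model, so a.predict_proba() cannot work.
print(type(a).__name__, hasattr(a, "predict_proba"))
print(a.shape, b.shape)
```

Reusing one variable for both calls would simply overwrite the labels with the probabilities, so two names are needed if both results are used later.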

In fact, the answers to all these questions can be found directly in the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

that is, no matter how many times you call predict(), the coefficients or weights will be the same

And why should they be different if you use the same model and the same data?
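To make this concrete, here is a hedged sketch showing that the weights are set by fit() and that repeated predict() calls only read them. The synthetic data is again a hypothetical stand-in for the course data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data (hypothetical stand-in for the course data).
x_train, y_train = make_classification(
    n_samples=100, n_features=4, random_state=0
)

logreg = LogisticRegression().fit(x_train, y_train)

# Remember the learned weights before making any predictions.
coef_before = logreg.coef_.copy()

# Call predict() twice on the same data with the same fitted model.
first = logreg.predict(x_train)
second = logreg.predict(x_train)

print(np.array_equal(first, second))
print(np.array_equal(coef_before, logreg.coef_))
```

Both comparisons should come out equal: nothing is retrained between calls, so the setting of random_state plays no role at prediction time.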

Once again: knowing the format of the fit, predict, or score commands DOES NOT make a person an expert in machine learning.