Why does applying the fitted PCA transformation to the source data produce negative values?

I have data with 374 rows x 31 columns. The first column is the date; the other 30 columns are the share prices of 30 companies. I need to apply principal component analysis (PCA). To do this, I wrote the following code:

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

Location1 = r'C:\Users\...\close_prices.csv'
df = pd.read_csv(Location1)
X = df.drop('date', axis=1)  # drop the date column, keep the 30 price columns
pca = PCA(n_components=10)
pca.fit(X)
print(pca.explained_variance_ratio_)
# the first component explains most of the variance in the features
# (the prices of the 30 companies)
# now apply the transformation to the original data
X1 = pca.transform(X)
X1
Out[7]:
array([[-50.90240358, -17.63167724,  -7.7360209 , ...,   3.55657041,
         -5.82197358,  -1.72604005],
       [-52.84690919, -19.14690749,  -7.27254551, ...,   3.43259929,
         -5.63318106,  -2.0122316 ],
       ...])
X1.shape
# (374, 10)
# I need to take the first component and compute the Pearson correlation
# coefficient with the Dow Jones index, which has shape (374, 1),
# so I take a (374, 1) slice
X11 = X1[:, [0]]
X11.shape
# (374, 1)

But I can't calculate the coefficient, because the numbers in X1 are negative: when I take the square root and divide in my calculation, I get a matrix full of nan.
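For what it's worth, negative values by themselves cannot produce nan in a correctly computed Pearson coefficient: the quantities under the square root are sums of squared deviations, which are never negative. Below is a minimal sketch using scipy.stats.pearsonr, which accepts signed data directly; the component and djia arrays here are hypothetical stand-ins, since the original close_prices.csv and Dow Jones data are not available:

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# stand-ins for the real data: a first principal component and a
# hypothetical Dow Jones series of the same length (374 observations)
component = rng.normal(size=374)               # plays the role of X1[:, 0]
djia = 2.0 * component + rng.normal(size=374)  # correlated by construction

# pearsonr expects 1-D arrays, so pass X1[:, 0] rather than X1[:, [0]]
r, p = pearsonr(component, djia)
print(r, p)

# the same coefficient via numpy; negative inputs are handled fine
print(np.corrcoef(component, djia)[0, 1])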

Why does applying the fitted model to X produce a matrix with negative values?

Author: αλεχολυτ, 2016-07-25

1 answer

What prevents you from multiplying the result by -1? PCA identifies directions in the feature space, and the orientation of the eigenvectors that define these directions plays no special role: v and -v describe the same direction, so the sign of a component's scores is arbitrary. Note also that sklearn's PCA centers the data (subtracts each column's mean) before projecting, so the transformed coordinates are scattered around zero and naturally take both signs; negative values there are expected, not an error.
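As an illustration, here is a minimal sketch on synthetic data (the original CSV is not available) showing that flipping the sign of a principal component changes only the sign of the Pearson coefficient, never its magnitude:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(374, 30))  # stand-in for the 30 price columns

pca = PCA(n_components=10)
scores = pca.fit_transform(X)
first = scores[:, 0]

target = rng.normal(size=374)  # stand-in for the Dow Jones series

r_original = np.corrcoef(first, target)[0, 1]
r_flipped = np.corrcoef(-first, target)[0, 1]  # component multiplied by -1

# the magnitude is identical; only the sign flips with the component
assert np.isclose(abs(r_original), abs(r_flipped))
print(r_original, r_flipped)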

Author: q-dad, 2016-07-29 02:06:29