How should I write the code of the scikit-learn PCA `.transform()` method using its `.components_`?

皆川圭

I thought the PCA.transform() method converts a 3D point P into a 2D point simply by applying a matrix M to it, like this:

np.dot(M, P)

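To check whether this is right, I wrote the code below. However, I could not reproduce the result of the PCA.transform() method. How should I modify the code? Am I missing something?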

from sklearn.decomposition import PCA
import numpy as np

data3d = np.arange(10*3).reshape(10, 3) ** 2
pca = PCA(n_components=2)
pca.fit(data3d)
pca_transformed2d = pca.transform(data3d)

sample_index = 0
sample3d = data3d[sample_index]

# Manually transform `sample3d` to 2 dimensions.
w11, w12, w13 = pca.components_[0]
w21, w22, w23 = pca.components_[1]
my_transformed2d = np.zeros(2)
my_transformed2d[0] = w11 * sample3d[0] + w12 * sample3d[1] + w13 * sample3d[2]
my_transformed2d[1] = w21 * sample3d[0] + w22 * sample3d[1] + w23 * sample3d[2]

print("================ Validation ================")
print("pca_transformed2d:", pca_transformed2d[sample_index])
print("my_transformed2d:", my_transformed2d)
if np.all(my_transformed2d == pca_transformed2d[sample_index]):
    print("My transformation is correct!")
else:
    print("My transformation is not correct...")

Output:

================ Validation ================
pca_transformed2d: [-492.36557212   12.28386702]
my_transformed2d: [ 3.03163093 -2.67255444]
My transformation is not correct...

user6655984

PCA begins by centering the data: it subtracts the mean of all observations. In this case, the centering is done by

centered_data = data3d - data3d.mean(axis=0)

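Taking the mean along axis=0 (down the rows) collapses the rows, leaving a single row that holds the three per-feature means. After centering, multiply the data by the PCA components; rather than writing out the matrix multiplication by hand, I'd use .dot: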

my_transformed2d = pca.components_.dot(centered_data[sample_index])
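The same idea can be applied to every sample at once. The sketch below assumes the default `whiten=False` and uses the fitted `pca.mean_` attribute in place of recomputing the mean; the names `centered` and `manual_2d` are illustrative and not part of the original code.

```python
import numpy as np
from sklearn.decomposition import PCA

data3d = np.arange(10 * 3).reshape(10, 3) ** 2
pca = PCA(n_components=2)
pca.fit(data3d)

# pca.mean_ stores the per-feature mean learned during fit;
# it matches the mean of the training data along axis=0.
assert np.allclose(pca.mean_, data3d.mean(axis=0))

# Center every sample, then project onto the principal axes.
# components_ has shape (2, 3), so its transpose maps (10, 3) -> (10, 2).
centered = data3d - pca.mean_
manual_2d = centered @ pca.components_.T

# Compare with scikit-learn's own result, using a tolerance.
print(np.allclose(manual_2d, pca.transform(data3d)))
```

Each row of `manual_2d` corresponds to one sample, so `manual_2d[sample_index]` should agree with the single-sample result computed above.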

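Finally, the verification. Don't use == between floating-point numbers; exact equality is rare. Tiny differences arise when operations are carried out in a different order somewhere: for example,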

0.1 + 0.2 - 0.3 == 0.1 - 0.3 + 0.2

is False. This is why we have np.allclose, which says "they are close enough".

if np.allclose(my_transformed2d, pca_transformed2d[sample_index]):
    print("My transformation is correct!")
else:
    print("My transformation is not correct...")
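As a small illustration of the floating-point point above, the following sketch evaluates both orderings and compares them exactly and with a tolerance. The variable names `a` and `b` are only for illustration; `np.isclose` is the element-wise counterpart of `np.allclose`.

```python
import numpy as np

# The same three numbers, combined in a different order.
a = 0.1 + 0.2 - 0.3
b = 0.1 - 0.3 + 0.2

# Exact comparison fails because the rounding errors differ.
print(a == b)

# Tolerance-based comparison (default rtol=1e-05, atol=1e-08) succeeds.
print(np.isclose(a, b))
```

Both results are tiny nonzero values rather than exactly zero, which is why the exact comparison fails while the tolerance-based one passes.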
