PCA implementation issue


Serge Stinckwich

Jul 31, 2018, 10:28:20 AM
to polymath...@googlegroups.com
Dear all,

I am trying to implement a PCA (Principal Component Analysis) algorithm based on Didier's existing code.

Here is an example using the latest version of PolyMath:
==============================================================
m := PMMatrix rows: #(#(-1 -1) #(-2 -1) #(-3 -2) #(1 1) #(2 1) #(3 2)).

"Compute PCA components"
pca := PMPrincipalComponentAnalyser new.
pca componentsNumber: 1.
pca fit: m.

"Return eigen values"
pca components.
 "#(6.616285933932035)"

"Return eigen vectors"
pca transformMatrix.
 "a PMVector(0.8384922379048739 -0.5449135408239332)"

pca transform: m.
"a PMVector(-0.29357869708094075)
a PMVector(-1.1320709349858147)
a PMVector(-1.4256496320667553)
a PMVector(0.29357869708094075)
a PMVector(1.1320709349858147)
a PMVector(1.4256496320667553)"

================================================================

When I do something similar in Python, I do not get exactly the same eigenvectors. Apparently the signs are inverted on the diagonal. I don't know why.

=================================================================
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> pca = PCA(n_components=1)
>>> pca.fit(X)
PCA(copy=True, iterated_power='auto', n_components=1, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)
>>> pca.components_
array([[-0.83849224, -0.54491354]])
>>> pca.transform(X)
array([[ 1.38340578],
       [ 2.22189802],
       [ 3.6053038 ],
       [-1.38340578],
       [-2.22189802],
       [-3.6053038 ]])
===============================================================

I could implement PCA with SVD instead (as in Python), but I would like to save time.

The PolyMath implementation of PCA uses a Jacobi transformation of the covariance matrix of the data. The covariance matrix is symmetric, so the Jacobi transformation should work correctly, I think.

Does anyone have an idea of what is happening here?
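For comparison, here is a minimal NumPy sketch of the covariance-based route. It is not PolyMath's code: it assumes `numpy.linalg.eigh` as a stand-in for the Jacobi transformation, and the variable names are illustrative. It checks that the leading eigenvector of the covariance matrix and the first right singular vector of the centered data agree up to a global sign.

```python
import numpy as np

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]], dtype=float)

# Center the data, then build the covariance matrix.
# Dividing by n (not n - 1) reproduces the 6.616... eigenvalue
# shown in the PolyMath listing above.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / X.shape[0]

# eigh is meant for symmetric matrices. Eigenvalues are returned in
# ascending order and the eigenvectors are the COLUMNS of the result.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
leading_value = eigenvalues[-1]
leading_vector = eigenvectors[:, -1]

# SVD route: the rows of vt are the principal axes.
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
svd_vector = vt[0]

# v and -v are equally valid unit eigenvectors, so accept either sign.
same_up_to_sign = (np.allclose(leading_vector, svd_vector)
                   or np.allclose(leading_vector, -svd_vector))
print(leading_value, leading_vector, svd_vector, same_up_to_sign)
```

Changing the denominator from n to n - 1 rescales the eigenvalues but leaves the eigenvectors unchanged.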

Thank you.
A+

--
Serge Stinckwich
UMI UMMISCO 209 (SU/IRD/UY1)
"Programs must be written for people to read, and only incidentally for machines to execute."
http://www.doesnotunderstand.org/

Serge Stinckwich

Aug 1, 2018, 11:42:46 AM
to polymath...@googlegroups.com
I have now implemented a PCA based on SVD, and I obtain the same result as with the Jacobi transformation.
I think I found the explanation for the sign differences with the Python version:

I will implement a function that flips the signs in order to get the same answer as in Python. This will ease comparison of results with the scikit-learn framework.
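A sign-flipping step of this kind could look like the sketch below. The rule it uses (make the largest-magnitude entry of each component positive) is an assumption chosen for illustration, and `flip_component_signs` is a hypothetical helper, not a PolyMath or scikit-learn function. scikit-learn has its own helper, `sklearn.utils.extmath.svd_flip`, whose exact rule may differ, so the safest comparison is to apply the same rule to the output of both libraries.

```python
import numpy as np

def flip_component_signs(components):
    """Flip each row so that its largest-magnitude entry is positive.

    Hypothetical helper: one possible deterministic sign convention.
    """
    components = np.asarray(components, dtype=float)
    rows = np.arange(components.shape[0])
    # Column index of the largest-magnitude entry in every row.
    pivot = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[rows, pivot])
    # An all-zero row would give sign 0; leave such rows untouched.
    signs[signs == 0] = 1.0
    return components * signs[:, np.newaxis]

# Two components that differ only by a global sign map to the same result.
a = flip_component_signs([[-0.83849224, -0.54491354]])
b = flip_component_signs([[0.83849224, 0.54491354]])
print(a, b)
```

A flip of this kind only negates the projections returned by a transform; it does not change their magnitudes.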

A+

Serge Stinckwich

Aug 3, 2018, 10:52:41 AM
to polymath...@googlegroups.com
I opened an issue here describing the current situation: