Projects X onto a d-dimensional linear subspace via Principal Component Analysis.
PCA is a linear dimensionality reduction algorithm that projects a point set X onto a lower-dimensional space using an orthogonal projector built from the eigenvalue decomposition of its covariance matrix.
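To make the summary concrete, here is a minimal NumPy sketch of the computation described above. It is illustrative only: `pca_sketch` is a hypothetical stand-in and not the implementation in `geomcover.linalg`.

```python
import numpy as np

def pca_sketch(X, d=2, center=True):
    """Minimal PCA sketch: project X onto the top-d eigenvectors of its covariance."""
    X = np.asarray(X, dtype=float)
    if center:
        X = X - X.mean(axis=0)                # center the columns at the origin
    C = np.cov(X, rowvar=False)               # (D x D) covariance matrix
    evals, evecs = np.linalg.eigh(C)          # ascending eigenvalues, orthonormal eigenvectors
    top = evecs[:, np.argsort(-evals)[:d]]    # eigenvectors of the d largest eigenvalues
    return X @ top                            # (n x d) embedding
```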
Parameters
| Name | Type | Description | Default |
|------|------|-------------|---------|
| `X` | ArrayLike | (n x D) point cloud / design matrix of n points in D dimensions. | required |
| `d` | int | Dimension of the embedding to produce. | `2` |
| `center` | bool | Whether to center the data prior to computing the eigenvectors. | `True` |
| `coords` | bool | Whether to return the embedding coordinates (`True`) or the eigenvalues and eigenvectors (`False`). | `True` |
Returns
| Type | Description |
|------|-------------|
| Union[np.ndarray, tuple] | If `coords=True` (the default), the projection of `X` onto the eigenvectors of its covariance matrix with the `d` largest eigenvalues; otherwise, the eigenvalues and eigenvectors themselves. |
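A short usage sketch of the two return modes follows. Note the `(eigenvalues, eigenvectors)` tuple ordering shown for `coords=False` is an assumption based on the description above, not something the signature confirms.

```python
import numpy as np
from geomcover.linalg import pca

X = np.random.uniform(size=(50, 3))

Y = pca(X, d=2)                   # default: (50 x 2) embedding coordinates
out = pca(X, d=2, coords=False)   # eigen-decomposition instead of coordinates;
                                  # (eigenvalues, eigenvectors) ordering assumed
```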
Notes
PCA is dual to CMDS in the sense that the embedding produced by CMDS on the matrix of squared Euclidean distances of X minimizes the same reconstruction loss as PCA. In particular, when X comes from Euclidean space, the output of pca(…) will match the output of cmds(…) exactly, up to rotation and translation, as demonstrated in the example below.
Examples
```python
import numpy as np
from geomcover.linalg import pca, cmds

## Start with a random set of points in R^3 + its distance matrix
X = np.random.uniform(size=(50, 3))
D = np.linalg.norm(X - X[:, np.newaxis], axis=2)

## Note that CMDS takes as input *squared* distances
Y_pca = pca(X, d=2)
Y_mds = cmds(D**2, d=2)

## Get distance matrices for both embeddings
Y_pca_D = np.linalg.norm(Y_pca - Y_pca[:, np.newaxis], axis=2)
Y_mds_D = np.linalg.norm(Y_mds - Y_mds[:, np.newaxis], axis=2)

## Up to rotation and translation, the coordinates are identical
all_close = np.allclose(Y_pca_D, Y_mds_D)
print(f"PCA and MDS coord. distances identical? {all_close}")
```
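Since the two embeddings agree up to rotation and translation, their pairwise distance matrices coincide, so the final check should print `True` (up to floating-point tolerance).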