pca
Computes principal component analysis for the data matrix X
Syntax
comprinc = pca(X)
[comprinc, score, lambda] = pca(X)
[comprinc, score, lambda, tsquare] = pca(X)
[comprinc, score, lambda, tsquare, explained, mu] = pca(X)
comprinc = pca(X, Name, Value)
[comprinc, score, lambda] = pca(X, Name, Value)
[comprinc, score, lambda, tsquare] = pca(X, Name, Value)
[comprinc, score, lambda, tsquare, explained, mu] = pca(X, Name, Value)
Arguments
- X
is an n-by-p (n individuals, p variables) real matrix.
- Name, Value
'Centered': boolean indicator for centering the columns. Default value: %t.
'Economy': boolean indicator used to enable the economy-size singular value decomposition. Default value: %t.
'NumComponents': integer value, number of components returned. Default value: size(X, 2).
'Weights': row vector of doubles of length size(X, 1) containing observation weights. Default value: ones(1, size(X, 1)).
'VariableWeights': either the string "variance" or a row vector of doubles of length size(X, 2) containing variable weights. Default value: []. A short usage sketch of these options is given after this argument list.
- comprinc
the principal component coefficients: a p-by-p matrix, where p is equal to size(X, 2).
- score
the principal component scores: an n-by-p matrix, or an n-by-NumComponents matrix if 'NumComponents' is specified.
- lambda
a p-by-1 vector, or a NumComponents-by-1 vector if 'NumComponents' is specified. It contains the eigenvalues of the covariance matrix of X.
- tsquare
an n-by-1 column vector containing Hotelling's T^2 statistic for each data point.
- explained
a column vector of length "number of components" containing the percentage of variance explained by each principal component.
- mu
a row vector of length p containing the estimated mean of each variable of X.
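As a rough illustration of the options and outputs described above, the following sketch (with arbitrary random data; the sizes in the comments follow the descriptions given here) requests only the first two components, then checks that the percentages in explained sum to 100 when all components are returned:

x = rand(10, 4);
[comprinc, score, lambda] = pca(x, "NumComponents", 2);
size(score)    // 10 x 2: one row per individual, one column per kept component
size(lambda)   // 2 x 1: eigenvalues of the two kept components
[c, s, l, t, e] = pca(x);
sum(e)         // should be 100: e holds the share of variance of each component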
Description
This function performs the set of computations known as "principal component analysis".
The idea behind this method is to represent, in an approximate manner, a cluster of n individuals in a lower-dimensional subspace. To do so, it projects the cluster onto a subspace. The k-dimensional projection subspace is chosen so that the distances are deformed as little as possible by the projection: we look for a k-dimensional subspace such that the sum of the squared distances between projected points is as large as possible (in a projection, distances can only shrink). In other words, the inertia of the projection onto the k-dimensional subspace must be maximal.
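As a sketch of this construction (assuming the standard route via the singular value decomposition, which is not necessarily the internal implementation), the principal directions of a column-centered data matrix are its right singular vectors, and the eigenvalues of the covariance matrix follow from the singular values:

n = 20;
x = rand(n, 3);
xc = x - ones(n, 1) * mean(x, "r");   // center each column
[u, s, v] = svd(xc, "e");             // economy-size SVD
lambda_svd = diag(s) .^ 2 / (n - 1);  // eigenvalues of the sample covariance of xc
[comprinc, score, lambda] = pca(x);
// the columns of v span the same directions as comprinc
// (corresponding columns may differ by a sign, and lambda may use
// another normalization convention)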
To obtain the pca graph, use the show_pca function.
Examples
x = [395 224 35.1 79.1  6.0 14.9
     410 232 31.9 73.4  8.7 16.4
     405 233 30.7 76.5  7.0 16.5
     405 240 30.4 75.3  8.7 16.0
     390 217 31.9 76.5  7.8 15.7
     415 243 32.1 77.4  7.1 18.5
     390 229 32.1 78.4  4.6 17.0
     405 240 31.1 76.5  8.2 15.3
     420 234 32.4 76.0  7.2 16.8
     390 223 33.8 77.0  6.2 16.8
     415 247 30.7 75.5  8.4 16.1
     400 234 31.7 77.6  5.7 18.7
     400 224 28.2 73.5 11.0 15.5
     395 229 29.4 74.5  9.3 16.1
     395 219 29.7 72.8  8.7 18.5
     395 224 28.5 73.7  8.7 17.3
     400 223 28.5 73.1  9.1 17.7
     400 224 27.8 73.2 12.2 14.6
     400 221 26.5 72.3 13.2 14.5
     410 233 25.9 72.3 11.1 16.6
     402 234 27.1 72.1 10.4 17.5
     400 223 26.8 70.3 13.5 16.2
     400 213 25.8 70.4 12.1 17.5];
[comprinc, scores, lambda, tsquare, explained] = pca(wcenter(x, 1));
scf();
show_pca(lambda, comprinc)

// Keep only the first two columns.
x = x(:, 1:2);
[comprinc, scores, lambda, tsquare, explained] = pca(wcenter(x, 1));
scf();
// See how the points lie exactly on the circle.
show_pca(lambda, comprinc)
x = [1 2 1; 2 1 3; 3 2 3]
[comprinc, scores, lambda, tsquare, explained, mu] = pca(x, "Economy", %t);
scores * comprinc' + ones(3, 1) * mu   // == x
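The term ones(3, 1) * mu adds the estimated column means back, since the columns are centered by default. A similar sketch, assuming that with 'Centered' set to %f the decomposition is performed on X itself, needs no mean term:

x = [1 2 1; 2 1 3; 3 2 3];
[comprinc, scores] = pca(x, "Centered", %f);
scores * comprinc'   // == x: no mean to add back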
See also
- show_pca — Visualization of principal components analysis results
Bibliography
Saporta, Gilbert, Probabilités, Analyse des Données et Statistique, Éditions Technip, Paris, 1990.
History
Version | Description
2025.0.0 | Improvements of the function.