show_pca

Visualization of principal components analysis results

Calling Sequence

show_pca(lambda, facpr, N)

Arguments

lambda: is a p x 2 numerical matrix. In the first column we find the eigenvalues of V, where V is the correlation p x p matrix and in the second column are the ratios of the corresponding eigenvalue over the sum of eigenvalues.
facpr: are the principal factors: eigenvectors of V. Each column is an eigenvector element of the dual of R^p.
N: Is a 2x1 integer vector. Its coefficients point to the eigenvectors corresponding to the eigenvalues of the correlation matrix p by p ordered by decreasing values of eigenvalues. If N. is missing, we suppose N=[1 2]..

Description

This function visualize the pca results.

The function produces a graphics with two subplots.

The graphics on the left represents the correlation circle. This graphics is based on the two principal components 1 and 2, i.e. the main components. For each parameter, it represents the linear correlation coefficient between the component and each parameter. Points which are close to the unit circle can be interpreted in terms of correlation, but points close to the center of the circle should not be interpreted. For the Axis 1, points on the right of the graphics are positively correlated to this component, while points on the left are negatively correlated. For the Axis 2, points on the top of the graphics are positively correlated to this component, while points on the bottom are negatively correlated. For example, if the point x1 is close to the circle, on the right, close to the Axis 1, this means that the component 1 is positively correlated to the parameter 1. When the data is perfectly represented by only two components, the points are on the circle. When more than two components are needed to represent the data, the points are inside the circle.
The graphics on the right represents the eigenvalues. The bar graph represents the eigenvalues, sorted in decreasing order. More precisely, each bar has a length equal to the ratio of the eigenvalue over the sum of the eigenvalues. The line plot represents the cumulated sum, i.e. the cumulative variances explained by the associated principal components. For example, if the cumulated sum for the eigenvalue 2 is greater than 0.9, the points are flat in a subspace associated with the two first eigenvectors: representing the data with only these two directions may be a good representation.

Implementation notes. The right part of the graphics is based on the second column of the lambda output argument of the pca function.

Examples

// Test a table of standard Normal random numbers
// 100 observations in 10 dimensions.
a=rand(100,10,"n");
[lambda,facpr,comprinc] = pca(a);
show_pca(lambda,facpr)
// See how the points are inside the circle: 
// more than 2 components are required to represent 
// the data.

// Source : "Analyse en composantes principales", 
// Jean-François Delmas et Saad Salam
// Weight of several parts of 23 cows
// X1: weight (alive)
// X2: skeleton weight
// X3: first grade meat weight
// X4: total meat weight
// X5: fat weight
// X6: bones weight
x = [
395     224     35.1     79.1     6.0     14.9
410     232     31.9     73.4     8.7     16.4
405     233     30.7     76.5     7.0     16.5
405     240     30.4     75.3     8.7     16.0
390     217     31.9     76.5     7.8     15.7
415     243     32.1     77.4     7.1     18.5
390     229     32.1     78.4     4.6     17.0
405     240     31.1     76.5     8.2     15.3
420     234     32.4     76.0     7.2     16.8
390     223     33.8     77.0     6.2     16.8
415     247     30.7     75.5     8.4     16.1
400     234     31.7     77.6     5.7     18.7
400     224     28.2     73.5     11.0     15.5
395     229     29.4     74.5     9.3     16.1
395     219     29.7     72.8     8.7     18.5
395     224     28.5     73.7     8.7     17.3
400     223     28.5     73.1     9.1     17.7
400     224     27.8     73.2     12.2     14.6
400     221     26.5     72.3     13.2     14.5
410     233     25.9     72.3     11.1     16.6
402     234     27.1     72.1     10.4     17.5
400     223     26.8     70.3     13.5     16.2
400     213     25.8     70.4     12.1     17.5 
];
[lambda,facpr,comprinc] = pca(x);
scf();
show_pca(lambda,facpr)
//
// Extract the two first columns.
x = x(:,1:2);
[lambda,facpr,comprinc] = pca(x);
scf();
// See how the points are perfectly on the circle. 
show_pca(lambda,facpr)

Bibliography

Saporta, Gilbert, Probabilites, Analyse des Donnees et Statistique, Editions Technip, Paris, 2011, 3ème Edition.

Analyse en composantes principales, Jean-François Delmas et Saad Salam, http://cermics.enpc.fr/scilab_new/site/Tp/Statistique/acp/index.htm

Report an issue
<< princomp	Principal Component Analysis	Regression >>

Copyright (c) 2022-2025 (Dassault Systèmes S.E.)
Copyright (c) 2017-2022 (ESI Group)
Copyright (c) 2011-2017 (Scilab Enterprises)
Copyright (c) 1989-2012 (INRIA)
Copyright (c) 1989-2007 (ENPC)
with contributors

Last updated:
Mon Oct 01 17:37:21 CEST 2012