Naive PCA With R

Pablo Bernabeu gives us a naive method for performing a Principal Component Analysis:

STAGE 1.  Determine whether PCA is appropriate at all, considering the variables

  • Variables should be inter-correlated enough but not too much. Field et al. (2012) provide some thresholds, suggesting that no variable should have many correlations below .30, or any correlation at all above .90. Thus, in the example here, variable Q06 should probably be excluded from the PCA.

  • Bartlett’s test, on the nature of the intercorrelations, should be significant. Significance suggests that the correlation matrix is not an ‘identity matrix’, in which the variables are uncorrelated and any observed correlations are just sampling error.

  • KMO (Kaiser-Meyer-Olkin) is a measure of sampling adequacy based on common variance (so it serves a similar purpose to Bartlett’s test). As Field et al. review, ‘values between .5 and .7 are mediocre, values between .7 and .8 are good, values between .8 and .9 are great and values above .9 are superb’ (p. 761). There’s a general score as well as one per variable. The general one will often be good, whereas the individual scores are more likely to fail. Any variable with a score below .5 should probably be removed, and the test should be run again.

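  • Determinant: The determinant of the correlation matrix, used as a check for multicollinearity. The result should preferably be above .00001; a smaller value suggests that the variables are too highly intercorrelated.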
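
The four checks above can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated (six hypothetical items Q1 to Q6 driven by one shared factor), the variable names are made up, and the Bartlett and KMO calculations are written out by hand from their textbook formulas rather than taken from the quoted post. In practice, the psych package provides cortest.bartlett() and KMO() for the same purpose.

```r
# Stage 1 checks on simulated data (illustrative, not the original post's code)
set.seed(42)
n <- 300
f <- rnorm(n)                                   # one shared latent factor
X <- sapply(1:6, function(i) 0.7 * f + rnorm(n, sd = 0.7))
colnames(X) <- paste0("Q", 1:6)                 # hypothetical item names

R <- cor(X)                                     # correlation matrix
p <- ncol(R)

# 1. Inter-correlations: weak (< .30) per variable, and any very strong (> .90)
R0 <- R
diag(R0) <- 0                                   # ignore the diagonal of 1s
weak_per_var <- rowSums(abs(R) < 0.30)          # diagonal never counts as weak
any_too_high <- any(abs(R0) > 0.90)

# 2. Bartlett's test of sphericity: H0 is that R is an identity matrix
chi2    <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df      <- p * (p - 1) / 2
p_value <- pchisq(chi2, df, lower.tail = FALSE)

# 3. KMO: squared correlations relative to squared partial correlations
P <- -cov2cor(solve(R))                         # partial correlations (off-diagonal)
diag(P) <- 0
kmo_overall <- sum(R0^2) / (sum(R0^2) + sum(P^2))
kmo_per_var <- colSums(R0^2) / (colSums(R0^2) + colSums(P^2))

# 4. Determinant of the correlation matrix, to compare against .00001
det_R <- det(R)

print(weak_per_var)
print(any_too_high)
print(c(chi2 = chi2, df = df, p_value = p_value))
print(round(c(overall = kmo_overall, kmo_per_var), 2))
print(det_R)
```

Any variable whose individual KMO value falls below .5 would be dropped and the checks rerun on the reduced correlation matrix, as described in the list above.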

PCA is a powerful tool in several fields, including clinical testing.
