R: Principal Components Analysis (2024)

prcomp {stats}    R Documentation

Description

Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp.

Usage

prcomp(x, ...)

## S3 method for class 'formula'
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, rank. = NULL, ...)

## S3 method for class 'prcomp'
predict(object, newdata, ...)

Arguments

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).

subset

an optional vector used to select rows (observations) of the data matrix x.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit.

...

arguments passed to or from other methods. If x is a formula one might specify scale. or tol.

x

a numeric or complex matrix (or data frame) which provides the data for the principal components analysis.

retx

a logical value indicating whether the rotated variables should be returned.

center

a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.

scale.

a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with S, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.

tol

a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) With the default null setting, no components are omitted (unless rank. is specified as less than min(dim(x))). Other settings for tol could be tol = 0 or tol = sqrt(.Machine$double.eps), which would omit essentially constant components.

rank.

optionally, a number specifying the maximal rank, i.e., the maximal number of principal components to be used. Can be set as an alternative or in addition to tol; useful notably when the desired rank is considerably smaller than the dimensions of the matrix.

object

an object of class inheriting from "prcomp".

newdata

An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, newdata must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
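As a rough illustration of the newdata argument, the sketch below fits on part of USArrests and projects the remaining rows. The train/test split and the object names are assumptions chosen for the example; the hand-written projection simply restates what the Value section says about x.

```r
# Fit on the first 40 states, hold out the last 10 (arbitrary split).
train <- USArrests[1:40, ]
test  <- USArrests[41:50, ]

fit <- prcomp(train, scale. = TRUE)

# Scores for the held-out rows; columns are matched by name.
new_scores <- predict(fit, newdata = test)

# The same projection written out by hand: centre and scale with the
# training values, then multiply by the rotation matrix.
manual <- scale(test, center = fit$center, scale = fit$scale) %*% fit$rotation
all.equal(new_scores, manual, check.attributes = FALSE)

# With newdata omitted, the stored scores are returned.
identical(predict(fit), fit$x)
```

Note that the held-out rows are centred and scaled with the training means and standard deviations, not their own.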

Details

The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.
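The relationship between the two routes can be checked numerically. The following is a minimal sketch, not the internal code of prcomp: it recomputes the results with svd on the scaled data and, for comparison, with eigen on the correlation matrix. The variable names are illustrative.

```r
X <- as.matrix(USArrests)
n <- nrow(X)

pc <- prcomp(X, scale. = TRUE)

# SVD of the centred and scaled data matrix.
Xs <- scale(X, center = TRUE, scale = TRUE)
s  <- svd(Xs)

# Singular values divided by sqrt(n - 1) give the component standard deviations.
all.equal(pc$sdev, s$d / sqrt(n - 1))

# Right singular vectors agree with the rotation matrix up to column signs.
all.equal(abs(unname(pc$rotation)), abs(s$v))

# The eigenvalues of the correlation matrix are the same variances.
ev <- eigen(cor(X), symmetric = TRUE)
all.equal(pc$sdev^2, ev$values)
```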

Unlike princomp, variances are computed with the usual divisor N - 1.
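The effect of the divisor can be seen by comparing the two functions on the same data. This is a small sketch assuming unscaled USArrests; the standard deviations should differ only by the constant factor sqrt((N - 1) / N).

```r
X <- as.matrix(USArrests)
n <- nrow(X)

sd_prcomp   <- prcomp(X)$sdev             # divisor n - 1
sd_princomp <- unname(princomp(X)$sdev)   # divisor n

# Rescaling the prcomp values by sqrt((n - 1) / n) recovers the princomp values.
all.equal(sd_princomp, sd_prcomp * sqrt((n - 1) / n))
```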

Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
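A short sketch of this restriction, using a made-up matrix with one constant column. The toy data and the choice to drop constant columns before refitting are assumptions for illustration, not a recommendation from the documentation.

```r
# Hypothetical toy data: column k is constant.
X <- cbind(a = c(1, 2, 3, 4, 5),
           b = c(2, 1, 4, 3, 6),
           k = 7)

# Scaling a constant column to unit variance is not possible, so the call
# signals an error; capture its message instead of stopping.
res <- tryCatch(prcomp(X, scale. = TRUE),
                error = function(e) conditionMessage(e))
res

# One possible workaround: drop constant columns, then scale.
keep <- apply(X, 2, sd) > 0
pc   <- prcomp(X[, keep], scale. = TRUE)
pc$sdev
```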

Value

prcomp returns a list with class "prcomp" containing the following components:

sdev

the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).

rotation

the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). The function princomp returns this in the element loadings.

x

if retx is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the rotation matrix) is returned. Hence, cov(x) is the diagonal matrix diag(sdev^2). For the formula method, napredict() is applied to handle the treatment of values omitted by the na.action.

center, scale

the centering and scaling used, or FALSE.
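The relationships among these components can be verified directly. The sketch below assumes a scaled fit on USArrests and checks each statement in turn; it is a consistency check, not a description of how prcomp computes its results.

```r
pc <- prcomp(USArrests, scale. = TRUE)

# x is the centred and scaled data multiplied by the rotation matrix.
Xs <- scale(USArrests, center = pc$center, scale = pc$scale)
all.equal(pc$x, Xs %*% pc$rotation, check.attributes = FALSE)

# The scores are uncorrelated, with variances sdev^2.
all.equal(cov(pc$x), diag(pc$sdev^2), check.attributes = FALSE)

# The columns of rotation are orthonormal.
all.equal(crossprod(pc$rotation), diag(ncol(pc$rotation)),
          check.attributes = FALSE)

# center and scale hold the column means and standard deviations used.
all.equal(pc$center, colMeans(USArrests))
all.equal(pc$scale, apply(USArrests, 2, sd))
```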

Note

The signs of the columns of the rotation matrix are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
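When results need to be compared across programs or builds, one option is to impose a sign convention after fitting. The helper below is hypothetical (align_signs is not part of stats) and uses one arbitrary convention among many: each component is flipped so that its largest-magnitude loading is positive.

```r
# Hypothetical helper, not part of stats: flip each component so that its
# largest-magnitude loading is positive. Scores are flipped to match.
align_signs <- function(pc) {
  idx  <- apply(abs(pc$rotation), 2, which.max)
  flip <- sign(pc$rotation[cbind(idx, seq_along(idx))])
  pc$rotation <- sweep(pc$rotation, 2, flip, `*`)
  if (!is.null(pc$x)) pc$x <- sweep(pc$x, 2, flip, `*`)
  pc
}

pc <- align_signs(prcomp(USArrests, scale. = TRUE))
pc$rotation
```

Flipping a column of rotation together with the matching column of x leaves sdev and the fitted subspace unchanged, so summaries and scree plots are unaffected.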

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979) Multivariate Analysis. London: Academic Press.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer-Verlag.

See Also

biplot.prcomp, screeplot, princomp, cor, cov, svd, eigen.

Examples

C <- chol(S <- toeplitz(.9 ^ (0:31))) # Cov.matrix and its root
all.equal(S, crossprod(C))
set.seed(17)
X <- matrix(rnorm(32000), 1000, 32)
Z <- X %*% C  ## ==>  cov(Z) ~=  C'C = S
all.equal(cov(Z), S, tol = 0.08)
pZ <- prcomp(Z, tol = 0.1)
summary(pZ) # only ~14 PCs (out of 32)
## or choose only 3 PCs more directly:
pz3 <- prcomp(Z, rank. = 3)
summary(pz3) # same numbers as the first 3 above
stopifnot(ncol(pZ$rotation) == 14, ncol(pz3$rotation) == 3,
          all.equal(pz3$sdev, pZ$sdev, tol = 1e-15)) # exactly equal typically
## signs are random
require(graphics)
## the variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
prcomp(USArrests)  # inappropriate
prcomp(USArrests, scale = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale = TRUE)
plot(prcomp(USArrests))
summary(prcomp(USArrests, scale = TRUE))
biplot(prcomp(USArrests, scale = TRUE))

[Package stats version 3.4.1 Index]
