prcomp {stats} | R Documentation |
Description
Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp.
Usage
prcomp(x, ...)

## S3 method for class 'formula'
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, rank. = NULL, ...)

## S3 method for class 'prcomp'
predict(object, newdata, ...)
Arguments
formula | a formula with no response variable, referring only to numeric variables. |
data | an optional data frame (or similar: see |
subset | an optional vector used to select rows (observations) of the data matrix |
na.action | a function which indicates what should happen when the data contain |
... | arguments passed to or from other methods. If |
x | a numeric or complex matrix (or data frame) which provides the data for the principal components analysis. |
retx | a logical value indicating whether the rotated variables should be returned. |
center | a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of |
scale. | a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is |
tol | a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to |
rank. | optionally, a number specifying the maximal rank, i.e., maximal number of principal components to be used. Can be set as alternative or in addition to |
object | object of class inheriting from |
newdata | An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, |
Details
The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.
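The relationship between the SVD route and the eigen route can be checked numerically. The following is a minimal sketch, not the internal implementation: it assumes the built-in USArrests data, no scaling, and compares loadings only up to sign. The variable names (Xc, sdev_svd, sdev_eigen) are illustrative.

```r
# Compare prcomp() with a direct SVD of the centered data and with
# eigen() applied to the sample covariance matrix.
X  <- as.matrix(USArrests)
Xc <- scale(X, center = TRUE, scale = FALSE)  # column-centered data
n  <- nrow(Xc)

p <- prcomp(X)                         # centered, unscaled PCA
s <- svd(Xc)                           # SVD route
e <- eigen(cov(X), symmetric = TRUE)   # eigen route

sdev_svd   <- s$d / sqrt(n - 1)        # singular values -> standard deviations
sdev_eigen <- sqrt(e$values)           # eigenvalues -> standard deviations

stopifnot(
  isTRUE(all.equal(p$sdev, sdev_svd)),
  isTRUE(all.equal(p$sdev, sdev_eigen)),
  # loadings agree with the right singular vectors up to sign
  isTRUE(all.equal(abs(unname(p$rotation)), abs(s$v)))
)
```

Both routes agree here to within all.equal's default tolerance; the SVD route avoids forming the cross-product matrix explicitly.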
Unlike princomp, variances are computed with the usual divisor N - 1.
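The effect of the divisor can be seen by comparing the two functions on the same data. This is a small sketch assuming the built-in USArrests data and default (covariance-based) settings for both functions; since princomp divides by N, its standard deviations should be smaller by a factor of sqrt((N - 1) / N).

```r
# prcomp() uses divisor N - 1, princomp() uses divisor N.
n  <- nrow(USArrests)
p  <- prcomp(USArrests)
pc <- princomp(USArrests)

ratio <- sqrt((n - 1) / n)   # expected ratio of the standard deviations

stopifnot(isTRUE(all.equal(unname(pc$sdev), p$sdev * ratio)))
```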
Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
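A short sketch of this restriction, using a made-up data frame d with one constant column k (both hypothetical). The workaround shown, dropping constant columns before the call, is one possible choice and an assumption about what the analyst wants, not a recommendation from this page.

```r
# A constant column cannot be rescaled to unit variance.
d <- data.frame(a = c(1, 2, 3, 4),
                b = c(2, 1, 4, 3),
                k = rep(5, 4))       # constant column

# Capture the error message instead of stopping the script.
res <- tryCatch(prcomp(d, scale. = TRUE),
                error = function(e) conditionMessage(e))
failed <- is.character(res)          # TRUE if prcomp() signalled an error

# Possible workaround: keep only columns with non-zero standard deviation.
keep <- vapply(d, function(col) sd(col) > 0, logical(1))
p <- prcomp(d[, keep], scale. = TRUE)
```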
Value
prcomp returns a list with class "prcomp" containing the following components:
sdev | the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix). |
rotation | the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). The function |
x | if |
center, scale | the centering and scaling used, or |
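How the returned components fit together can be illustrated with a small sketch, assuming the built-in USArrests data and scale. = TRUE: the scores in x are the centered and scaled data multiplied by rotation, and predict applies the same centering, scaling and rotation to new rows. The names scores and new_obs are illustrative.

```r
p <- prcomp(USArrests, scale. = TRUE)

# Rebuild the scores from the stored centering, scaling and loadings.
scores <- scale(USArrests, center = p$center, scale = p$scale) %*% p$rotation
stopifnot(isTRUE(all.equal(scores, p$x, check.attributes = FALSE)))

# predict() on rows of the original data reproduces the matching scores.
new_obs <- USArrests[1:5, ]
pred <- predict(p, newdata = new_obs)
stopifnot(isTRUE(all.equal(pred, p$x[1:5, ], check.attributes = FALSE)))
```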
Note
The signs of the columns of the rotation matrix are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
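When results must be compared across programs or builds, one option is to fix the signs by a convention of your own. The sketch below uses a hypothetical convention (make the largest-magnitude loading in each column positive); it is not something prcomp does, and any consistent rule would serve. Flipping a column of rotation together with the matching column of x leaves the reconstructed data unchanged.

```r
p <- prcomp(USArrests, scale. = TRUE)

# Hypothetical convention: sign of the largest-magnitude loading per column.
flip <- apply(p$rotation, 2, function(v) sign(v[which.max(abs(v))]))

rot <- sweep(p$rotation, 2, flip, `*`)  # flip loading columns
x   <- sweep(p$x,        2, flip, `*`)  # flip the matching score columns

# The product x %*% t(rotation) is unaffected by the sign changes.
stopifnot(isTRUE(all.equal(tcrossprod(x, rot),
                           tcrossprod(p$x, p$rotation))))
```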
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Mardia, K. V., J. T. Kent, and J. M. Bibby (1979) Multivariate Analysis, London: Academic Press.
Venables, W. N. and B. D. Ripley (2002) Modern Applied Statistics with S, Springer-Verlag.
See Also
biplot.prcomp, screeplot, princomp, cor, cov, svd, eigen.
Examples
C <- chol(S <- toeplitz(.9 ^ (0:31))) # Cov.matrix and its root
all.equal(S, crossprod(C))
set.seed(17)
X <- matrix(rnorm(32000), 1000, 32)
Z <- X %*% C  ## ==> cov(Z) ~= C'C = S
all.equal(cov(Z), S, tol = 0.08)
pZ <- prcomp(Z, tol = 0.1)
summary(pZ) # only ~14 PCs (out of 32)
## or choose only 3 PCs more directly:
pz3 <- prcomp(Z, rank. = 3)
summary(pz3) # same numbers as the first 3 above
stopifnot(ncol(pZ$rotation) == 14, ncol(pz3$rotation) == 3,
          all.equal(pz3$sdev, pZ$sdev, tol = 1e-15)) # exactly equal typically

## signs are random
require(graphics)
## the variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
prcomp(USArrests)  # inappropriate
prcomp(USArrests, scale = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale = TRUE)
plot(prcomp(USArrests))
summary(prcomp(USArrests, scale = TRUE))
biplot(prcomp(USArrests, scale = TRUE))
[Package stats version 3.4.1 Index]