PCAproj {pcaPP} | R Documentation |
Computes a desired number of (robust) principal components using the algorithm of Croux and Ruiz-Gazen (JMVA, 2005).
PCAproj(x, k = 2, method = c("sd", "mad", "qn"), CalcMethod = c("eachobs", "lincomb", "sphere"), nmax = 1000, update = TRUE, scores = TRUE, maxit = 5, maxhalf = 5, control, ...)
x |
a numeric matrix or data frame which provides the data for the principal components analysis. |
k |
desired number of components to compute |
method |
scale estimator used to detect the direction with the largest
variance. Possible values are "sd", "mad" and "qn"; the
latter can also be given as "Qn". "mad" is the default value. |
CalcMethod |
the variant of the algorithm to be used. Possible values are
"eachobs", "lincomb" and "sphere", with "eachobs" being
the default. |
nmax |
maximum number of directions to search in each step (only when
using "sphere" or "lincomb" as the CalcMethod). |
update |
a logical value indicating whether an update algorithm should be used. |
scores |
a logical value indicating whether the scores of the principal components should be calculated. |
maxit |
maximum number of iterations. |
maxhalf |
maximum number of steps for angle halving. |
control |
a list whose elements must be the same as (or a subset of)
the parameters above. If the control object is supplied, the parameters from
it are used and any values given directly for the same parameters are overridden.
The ... argument is not affected by this: it cannot be given as an
element of the control object, and it is still used when the control
object is supplied. |
... |
additional arguments passed to the function ScaleAdvR. |
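The method argument selects the scale estimator that measures the spread of the projected data. As a rough illustration of why a robust choice matters, the sketch below compares base R's sd and mad on a contaminated sample. The simulated vector v and the names s_sd and s_mad are made up for this example, and "qn" is left out because it is not available in base R.

```r
# Illustration only: compare a classical and a robust scale estimate
# on a sample in which roughly 7% of the values are outliers.
set.seed(42)
v <- c(rnorm(200), rnorm(15, mean = 20))  # 200 regular points, 15 outliers

s_sd  <- sd(v)   # classical standard deviation, inflated by the outliers
s_mad <- mad(v)  # median absolute deviation, largely unaffected

print(c(sd = s_sd, mad = s_mad))
```

The same contrast is what separates method = "sd" from method = "mad" when the estimator is applied to each candidate projection.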
Basically, this algorithm considers the directions of each observation
through the origin of the centered data as possible projection directions.
As this algorithm has some drawbacks, especially if ncol(x) > nrow(x)
in the data matrix, there are several improvements that can be used with this
algorithm.
With CalcMethod "lincomb", the search is similar to the "sphere"-algorithm, but the new data points are generated using linear combinations of the original data b_1*x_1 + ... + b_n*x_n, where the coefficients b_i come from a uniform distribution in the interval [0, 1].
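The search for a single direction can be sketched in base R as follows. This is a minimal illustration, not the pcaPP implementation: the helper best_direction, the toy data, and the centering by column medians are all assumptions made for the example. It takes the centered observations as candidate directions (in the spirit of "eachobs") and then random linear combinations with coefficients drawn from U[0, 1] (in the spirit of "lincomb"), and keeps the candidate whose projection has the largest scale.

```r
# Sketch only: illustrates the idea of searching candidate directions,
# not the algorithm as implemented in pcaPP.
set.seed(1)

# Toy data: 50 observations, 3 variables; the first has the largest spread.
x <- cbind(rnorm(50, sd = 3), rnorm(50), rnorm(50))

# Center the data (column medians; this centering choice is an assumption).
xc <- sweep(x, 2, apply(x, 2, median))

# Hypothetical helper: among the candidate directions (rows of `cand`),
# return the unit vector whose projection of `xc` has the largest scale.
best_direction <- function(xc, cand, scale_fun = mad) {
  norms <- sqrt(rowSums(cand^2))
  keep  <- norms > 0
  dirs  <- cand[keep, , drop = FALSE] / norms[keep]  # normalise each row
  spread <- apply(dirs, 1, function(a) scale_fun(drop(xc %*% a)))
  dirs[which.max(spread), ]
}

# Candidates in the spirit of "eachobs": the centered observations themselves.
a_obs <- best_direction(xc, xc)

# Candidates in the spirit of "lincomb": b_1*x_1 + ... + b_n*x_n, b_i ~ U[0, 1].
nmax <- 100
b <- matrix(runif(nmax * nrow(xc)), nrow = nmax)
a_lin <- best_direction(xc, b %*% xc)

print(rbind(eachobs = a_obs, lincomb = a_lin))
```

Further components would be found by repeating the search in the subspace orthogonal to the directions already chosen; that step is omitted here.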
Similar to the function princomp, there is a print method for these objects that prints the results in a nice format, and the plot method produces a scree plot (screeplot). There is also a biplot method.
The function returns a list of class "princomp", i.e. a list similar to the output of the function princomp.
sdev |
the (robust) standard deviations of the principal components. |
loadings |
the matrix of variable loadings (i.e., a matrix whose columns
contain the eigenvectors). This is of class "loadings" :
see loadings for its print method. |
center |
the means that were subtracted. |
scale |
the scalings applied to each variable. |
n.obs |
the number of observations. |
scores |
if scores = TRUE, the scores of the supplied data on the
principal components. |
call |
the matched call. |
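Because the result has class "princomp", its components can be accessed like those of a princomp fit. The sketch below uses princomp itself on the built-in USArrests data as a stand-in, so that it runs without pcaPP; it assumes that a PCAproj result exposes the same component names as listed above.

```r
# Stand-in example: a classical princomp fit, used here only to show how
# the components listed above are accessed on an object of class "princomp".
pc <- princomp(USArrests, scores = TRUE)

print(names(pc))                      # names of the components in the list
print(pc$sdev)                        # standard deviations of the components
print(unclass(pc$loadings)[, 1:2])    # loadings of the first two components
print(head(pc$scores, 3))             # scores of the first three observations
```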
Heinrich Fritz, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>
C. Croux, P. Filzmoser, M. Oliveira (2004) Projection-pursuit Estimators for Robust Principal Component Analysis, Technical Report TS-04-4, Vienna University of Technology, Austria
# rmvnorm() comes from the mvtnorm package
library(mvtnorm)
library(pcaPP)

# multivariate data with outliers
x <- rbind(rmvnorm(200, rep(0, 6), diag(c(5, rep(1, 5)))),
           rmvnorm( 15, c(0, rep(20, 5)), diag(rep(1, 6))))

# Here we calculate the principal components with PCAproj
pc <- PCAproj(x, 6)
# we could draw a biplot too:
biplot(pc)

# we could use another calculation method and another objective function, and
# maybe only calculate the first three principal components:
pc <- PCAproj(x, 3, "qn", "sphere")
biplot(pc)

# now we want to compare the results with the non-robust principal components
pc <- princomp(x)
# again, a biplot for comparison:
biplot(pc)