candisc {candisc} | R Documentation |
candisc performs a generalized canonical discriminant analysis for one term in a multivariate linear model (i.e., an mlm object), computing canonical scores and vectors. It represents a transformation of the original variables into a canonical space of maximal differences for the term, controlling for other model terms.
To be of any use, the term should be a factor or interaction corresponding to a multivariate test with 2 or more degrees of freedom for the null hypothesis.
candisc(mod, ...)

## S3 method for class 'mlm':
candisc(mod, term, type = "2", manova, ndim = rank, ...)

## S3 method for class 'candisc':
coef(object, type = c("std", "raw", "structure"), ...)

## S3 method for class 'candisc':
plot(x, which = 1:2, conf = 0.95, col, pch, scale, asp = 1,
     var.col = "blue", var.lwd = par("lwd"), prefix = "Can", suffix = TRUE, ...)

## S3 method for class 'candisc':
print(x, digits = max(getOption("digits") - 2, 3), ...)

## S3 method for class 'candisc':
summary(object, means = TRUE, scores = FALSE, coef = c("std"), ndim,
        digits = max(getOption("digits") - 2, 4), ...)
mod |
An mlm object, such as computed by lm() with a multivariate response |
term |
the name of one term from mod |
type |
Type of test for the model term, one of: "II", "III", "2", or "3" |
manova |
The Anova.mlm object corresponding to mod. Normally, this is computed internally by Anova(mod) |
ndim |
Number of dimensions to store in (or retrieve from, for the summary method) the means, structure, scores and coeffs.* components. The default is the rank of the H matrix for the hypothesis term. |
object, x |
A candisc object |
which |
A vector of two integers, selecting the canonical dimensions to plot |
conf |
Confidence coefficient for the confidence circles plotted in the plot method |
col |
A vector of colors to be used for the levels of the term in the plot method |
pch |
A vector of point symbols to be used for the levels of the term in the plot method |
scale |
Scale factor for the variable vectors in canonical space. If not specified, a scale factor is calculated to make the variable vectors approximately fill the plot space. |
asp |
Aspect ratio for the plot method. Setting asp=1 (the default) ensures that the units on the horizontal and vertical axes are the same, so that lengths and angles of the variable vectors are interpretable. |
var.col |
Color used to plot variable vectors |
var.lwd |
Line width used to plot variable vectors |
prefix |
Prefix used to label the canonical dimensions plotted |
suffix |
Suffix for labels of canonical dimensions. If suffix=TRUE, the percent of hypothesis (H) variance accounted for by each canonical dimension is added to the axis label. |
means |
Logical value used to determine if canonical means are printed |
scores |
Logical value used to determine if canonical scores are printed |
coef |
Type of coefficients printed by the summary method. Any one or more of "std", "raw", or "structure" |
digits |
Number of significant digits to print |
... |
arguments to be passed down. In particular, type="n" can be used with
the plot method to suppress the display of canonical scores. |
Canonical discriminant analysis is typically carried out in conjunction with
a one-way MANOVA design. It represents a linear transformation of the response variables
into a canonical space in which (a) each successive canonical variate produces
maximal separation among the groups (e.g., maximum univariate F statistics), and
(b) all canonical variates are mutually uncorrelated.
For a one-way MANOVA with g groups and p responses, there are dfh = min(g-1, p) such canonical dimensions, and tests, initially stated by Bartlett (1938), allow one to determine the number of significant canonical dimensions. Computational details for the one-way case are described in Cooley & Lohnes (1971), and in the SAS/STAT User's Guide, "The CANDISC procedure: Computational Details," http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/candisc_sect12.htm.
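The one-way case can be sketched in base R without the candisc package. The block below is only an illustration under stated assumptions: it uses the built-in iris data (g = 3, p = 4), builds H and E by hand, and solves the eigenproblem through a Cholesky factor of E. The names Y, grp, U_inv, B, Z, F_can and F_raw are chosen here for illustration and are not part of the package.

```r
# One-way MANOVA on the built-in iris data: p = 4 responses, g = 3 groups.
Y   <- as.matrix(iris[, 1:4])
grp <- iris$Species
fit <- lm(Y ~ grp)

dfh <- min(nlevels(grp) - 1, ncol(Y))   # number of canonical dimensions
dfe <- fit$df.residual                  # error degrees of freedom

E <- crossprod(residuals(fit))                # within-group (error) SSP matrix
H <- crossprod(scale(Y, scale = FALSE)) - E   # between-group (hypothesis) SSP matrix

# Solve H b = lambda E b via a symmetric eigenproblem, using E = U'U.
U_inv  <- solve(chol(E))
eig    <- eigen(t(U_inv) %*% H %*% U_inv, symmetric = TRUE)
lambda <- eig$values[seq_len(dfh)]
B      <- U_inv %*% eig$vectors[, seq_len(dfh), drop = FALSE]

# Canonical variates (on centred responses) and univariate F statistics.
Z     <- scale(Y, scale = FALSE) %*% B
F_can <- lambda * dfe / dfh              # F for each canonical variate
F_raw <- diag(H) / diag(E) * dfe / dfh   # F for each original response

round(cor(Z), 10)            # off-diagonal entries are zero: variates are uncorrelated
F_can[1] >= max(F_raw)       # the first variate separates groups at least as well
```

Only dfh eigenvalues are non-zero; the remaining ones vanish up to rounding error, which is why there are min(g-1, p) canonical dimensions.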
A generalized canonical discriminant analysis extends this idea to a general multivariate linear model. Analysis of each term in the mlm produces an H matrix (the hypothesis sum of squares and crossproducts matrix) of rank dfh that is tested against the rank dfe E matrix by the standard multivariate tests (Wilks' Lambda, Hotelling-Lawley trace, Pillai trace, Roy's maximum root test). For any given term in the mlm, the generalized canonical discriminant analysis amounts to a standard discriminant analysis based on the H matrix for that term in relation to the full-model E matrix.
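As a rough sketch of the generalized case, the block below analyses one term while controlling for another, again in base R only. It assumes the built-in mtcars data, treats cyl as the term of interest and am as the term to control for, and obtains H by comparing the full additive model with the model that omits cyl (a Type II-style hypothesis matrix for this additive model). This is an illustration of the idea, not necessarily how candisc computes H internally; all variable names are hypothetical.

```r
Y   <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
cyl <- factor(mtcars$cyl)   # term of interest (3 levels)
am  <- factor(mtcars$am)    # other model term, controlled for

full    <- lm(Y ~ am + cyl)   # full model: supplies the E matrix
reduced <- lm(Y ~ am)         # model without the term of interest

E   <- crossprod(residuals(full))          # full-model error SSP matrix
H   <- crossprod(residuals(reduced)) - E   # SSP matrix for cyl, adjusted for am
dfe <- full$df.residual
dfh <- reduced$df.residual - dfe

# Eigenvalues of HE^{-1}, obtained from an equivalent symmetric problem (E = U'U).
U_inv  <- solve(chol(E))
eig    <- eigen(t(U_inv) %*% H %*% U_inv, symmetric = TRUE)
n_dim  <- min(dfh, ncol(Y))
lambda <- eig$values[seq_len(n_dim)]

canrsq <- lambda / (1 + lambda)   # squared canonical correlations
canrsq
```

Because E comes from the full model, the canonical dimensions for cyl describe group differences that remain after am has been accounted for.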
An object of class candisc with the following components:
dfh |
hypothesis degrees of freedom for term |
dfe |
error degrees of freedom for the mlm |
rank |
number of non-zero eigenvalues of HE^{-1} |
eigenvalues |
eigenvalues of HE^{-1} |
canrsq |
squared canonical correlations |
pct |
A vector containing the percentage that each canrsq value contributes to their total |
ndim |
Number of canonical dimensions stored in the means, structure and coeffs.* components |
means |
A data.frame containing the class means for the levels of the factor(s) in the term |
factors |
A data frame containing the levels of the factor(s) in the term |
term |
name of the term |
terms |
A character vector containing the names of the terms in the mlm object |
coeffs.raw |
A matrix containing the raw canonical coefficients |
coeffs.std |
A matrix containing the standardized canonical coefficients |
structure |
A matrix containing the canonical structure coefficients on ndim dimensions, i.e.,
the correlations between the original variates and the canonical scores.
These are sometimes referred to as Total Structure Coefficients. |
scores |
A data frame containing the predictors in the mlm model and the canonical scores on ndim dimensions. These are calculated as Y %*% coeffs.raw, where Y contains the standardized response variables. |
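The relation between raw coefficients, scores and structure coefficients can be illustrated with a small base-R sketch for the one-way iris case. It is an assumption-laden illustration rather than the package's code: the responses are centred (not standardized) before scoring, which does not affect the correlations, and the names raw, can_scores and struct_coef are hypothetical stand-ins for the coeffs.raw, scores and structure components.

```r
Y   <- as.matrix(iris[, 1:4])
fit <- lm(Y ~ Species, data = iris)

E <- crossprod(residuals(fit))                # error SSP matrix
H <- crossprod(scale(Y, scale = FALSE)) - E   # hypothesis SSP matrix

# Raw canonical coefficients for the two canonical dimensions (E = U'U).
U_inv <- solve(chol(E))
V     <- eigen(t(U_inv) %*% H %*% U_inv, symmetric = TRUE)$vectors[, 1:2]
raw   <- U_inv %*% V
dimnames(raw) <- list(colnames(Y), c("Can1", "Can2"))

# Canonical scores, then structure coefficients as correlations between
# the original variates and the scores.
can_scores  <- scale(Y, scale = FALSE) %*% raw
struct_coef <- cor(Y, can_scores)
struct_coef
```

The sign of each column is arbitrary, since an eigenvector is only determined up to sign.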
Michael Friendly and John Fox
Bartlett, M. S. (1938). Further aspects of the theory of multiple regression. Proc. Camb. Phil. Soc. 34, 33-34.
Cooley, W.W. & Lohnes, P.R. (1971). Multivariate Data Analysis, New York: Wiley.
Gittins, R. (1985). Canonical Analysis: A Review with Applications in Ecology, Berlin: Springer.
library(candisc)   # candisc() and the Grass data
library(car)       # Anova()
library(heplots)   # heplot()

grass.mod <- lm(cbind(N1,N9,N27,N81,N243) ~ Block + Species, data=Grass)
Anova(grass.mod, test="Wilks")

grass.can1 <- candisc(grass.mod, term="Species")
plot(grass.can1, type="n")

heplot(grass.can1, scale=6)

# iris data
iris.mod <- lm(cbind(Petal.Length, Sepal.Length, Petal.Width, Sepal.Width) ~ Species, data=iris)
iris.can <- candisc(iris.mod, data=iris)
plot(iris.can)

heplot(iris.can)

# 1-dim plot
iris.can1 <- candisc(iris.mod, data=iris, ndim=1)
plot(iris.can1)