dr {dr} R Documentation

Dimension reduction regression

Description

The function dr implements dimension reduction methods, including SIR, SAVE and pHd. 'dr' calls 'dr.compute', so only the former will be needed by most users.

Usage

dr (formula, data, subset, na.action = na.fail, weights, 
    ...)
    
dr.compute (x, y, weights, method = "sir", ...)
 

Arguments

formula a symbolic description of the model to be fit. The details of the model are the same as for lm. Although factors may not be appropriate for dr methods, they are permitted. Full rank models are recommended, although rank deficient models are permitted.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment from which 'dr' is called.
subset an optional vector specifying a subset of observations to be used in the fitting process.
weights an optional vector of weights to be used where appropriate. In the context of dimension reduction methods, weights are used to obtain elliptical symmetry, not constant variance; see dr.weights.
na.action a function which indicates what should happen when the data contain 'NA's. The default is 'na.fail', which will stop calculations. The option 'na.omit' is also permitted, but it may not work correctly when weights are used.
x The design matrix
y The response vector or matrix
method This character string specifies the method of fitting. "sir" specifies sliced inverse regression and "save" specifies sliced average variance estimation. "phdy" uses principal Hessian directions using the response as suggested by Li, and "phdres" uses the LS residuals as suggested by Cook. Other methods may be added.
... For 'dr', arguments passed to 'dr.compute'. For 'dr.compute', arguments required for a particular dimension reduction method. nslices is the number of slices used by sir and save. numdir is the maximum number of directions to compute, with default equal to 4. Other methods may have other defaults.
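
Because the formula is described as following the same conventions as lm, the mapping from formula and data to a response and a design matrix can be illustrated with base R's model-frame tools. This is only a sketch of the general mechanism, not the internals of 'dr': the data frame d and its columns are simulated for illustration, and whether 'dr' keeps the intercept column in x is not stated on this page.

```r
# Simulated data, for illustration only (not the ais data).
set.seed(3)
d <- data.frame(y = rnorm(10), a = rnorm(10), b = rnorm(10))

# Build the model frame the way lm-style functions do; na.fail stops on NAs.
mf <- model.frame(y ~ a + b, data = d, na.action = na.fail)
y <- model.response(mf)                   # response vector
x <- model.matrix(attr(mf, "terms"), mf)  # design matrix: (Intercept), a, b

# With a missing value present, na.fail raises an error instead of dropping the case.
d_na <- d
d_na$a[1] <- NA
res <- tryCatch(
  model.frame(y ~ a + b, data = d_na, na.action = na.fail),
  error = function(e) "failed"
)
```

Replacing na.fail with na.omit in the last call would drop the incomplete case instead of stopping.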

Details

The general regression problem studies F(y|x), the conditional distribution of a response y given a set of predictors x. This function provides methods for estimating the dimension and central subspace of a general regression problem. That is, we want to find a p by d matrix B such that

F(y|x)=F(y|B'x)

Both the dimension d and the subspace R(B) are unknown. These methods make few assumptions. All the methods available in this function estimate the unknowns by study of the inverse problem, F(x|y). In each, a kernel matrix M is estimated such that the column space of M should be close to the central subspace. Eigenanalysis of M is then used to estimate the central subspace. Objects created using this function have appropriate print, summary and plot methods.
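
As a rough illustration of the kernel-matrix idea, the following base R sketch implements a textbook version of sliced inverse regression: standardize the predictors, slice on the response, form M from the weighted slice means, and take its eigenvectors. The function sir_sketch and its argument handling are hypothetical names introduced here; it is unweighted and is not claimed to reproduce the computations in 'dr.compute'.

```r
# Textbook SIR sketch; sir_sketch is an illustrative name, not part of the dr package.
sir_sketch <- function(x, y, nslices = 8, numdir = 4) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  numdir <- min(numdir, p)

  # Standardize predictors: z has mean 0 and identity sample covariance.
  xc <- sweep(x, 2, colMeans(x))
  e <- eigen(crossprod(xc) / n, symmetric = TRUE)
  sigma_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)
  z <- xc %*% sigma_inv_sqrt

  # Slice the response into groups of roughly equal size.
  slice <- cut(rank(y, ties.method = "first"), breaks = nslices, labels = FALSE)
  counts <- as.vector(table(slice))
  means <- rowsum(z, slice) / counts      # one row of slice means per slice

  # Kernel matrix: M = sum_h (n_h / n) * m_h m_h'
  M <- t(means) %*% (means * (counts / n))

  # Eigenanalysis of M; map the leading directions back to the original x scale.
  eig <- eigen(M, symmetric = TRUE)
  B <- sigma_inv_sqrt %*% eig$vectors[, seq_len(numdir), drop = FALSE]
  list(M = M, evalues = eig$values, evectors = B)
}

# Simulated one-dimensional example: y depends on x only through x %*% b.
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 4), n, 4)
b <- c(1, 1, 0, 0)
y <- as.vector(x %*% b) + 0.2 * rnorm(n)
fit <- sir_sketch(x, y, nslices = 8)
```

In this simulated setting the first estimated direction, x %*% fit$evectors[, 1], should be strongly correlated with x %*% b, and the first eigenvalue should dominate the rest.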

Weights can be used, essentially to specify the relative frequency of each case in the data. Empirical weights that make the contours of the weighted sample closer to elliptical can be computed using dr.weights. This will usually result in zero weight for some cases. The function will set zero estimated weights to missing.
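
The relative-frequency interpretation can be made concrete with a small base R sketch that rescales weights to add to n and computes the weighted mean and covariance of the predictors. The helper weighted_moments is a hypothetical name for illustration and is not part of the package; it shows only the general idea, in which a zero-weight case contributes nothing to the moments.

```r
# Hypothetical helper: weighted moments with weights rescaled to add to n.
weighted_moments <- function(x, w) {
  x <- as.matrix(x)
  n <- nrow(x)
  w <- w * n / sum(w)                    # rescale so the weights add to n
  mu <- colSums(x * w) / n               # weighted mean (row i scaled by w[i])
  xc <- sweep(x, 2, mu)
  sigma <- crossprod(xc * sqrt(w)) / n   # weighted covariance
  list(weights = w, mean = mu, cov = sigma)
}

# Simulated predictors; the last five cases receive zero weight.
set.seed(4)
xs <- matrix(rnorm(60), 20, 3)
w0 <- c(rep(1, 15), rep(0, 5))
wm <- weighted_moments(xs, w0)
```

Here wm$mean agrees with colMeans(xs[1:15, ]), so giving a case zero weight has the same effect on the moments as leaving it out.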

Several functions are provided that require a dr object as input. dr.permutation.test uses a permutation test to obtain significance levels for tests of dimension. dr.coplot allows visualizing the results using a coplot of either two selected directions conditioning on a third and using color to mark the response, or the response versus one direction, conditioning on a second direction. plot.dr provides the default plot method for dr objects, based on a scatterplot matrix.

Value

dr returns an object that inherits from dr (the name of the type is the value of the method argument), with attributes:

x The design matrix
y The response vector
weights The weights used, normalized to add to n.
qr QR factorization of x.
cases Number of cases used.
call The initial call to 'dr'.
M A matrix that depends on the method of computing. The column space of M should be close to the central subspace.
evalues The eigenvalues of M (or squared singular values if M is not symmetric).
evectors The eigenvectors of M (or of M'M if M is not square and symmetric) ordered according to the eigenvalues.
numdir The maximum number of directions to be found. The output value of numdir may be smaller than the input value.
slice.info output from 'sir.slice', used by sir and save.
method the dimension reduction method used.
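
The relationship between evalues, evectors and a non-symmetric M can be checked numerically in base R. The sketch below takes the description above literally (eigenvectors of M'M) and uses an arbitrary simulated matrix, not one produced by 'dr'.

```r
# Arbitrary non-square matrix, for illustration only.
set.seed(2)
M <- matrix(rnorm(12), nrow = 4, ncol = 3)

sv <- svd(M)                                 # singular value decomposition of M
eg <- eigen(crossprod(M), symmetric = TRUE)  # eigenanalysis of M'M

# Squared singular values of M versus eigenvalues of M'M.
same_values <- isTRUE(all.equal(sv$d^2, eg$values))

# Right singular vectors versus eigenvectors of M'M, compared up to sign.
max_diff <- max(abs(abs(crossprod(sv$v, eg$vectors)) - diag(3)))
```

same_values should be TRUE and max_diff should be near zero, up to numerical tolerance, provided the eigenvalues are distinct.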


dr.weights returns a vector of estimated weights, scaled to add to the number of cases.

Author(s)

Sanford Weisberg, sandy@stat.umn.edu

For weights, see R. D. Cook and C. Nachtsheim (1994), Reweighting to achieve elliptically contoured predictors in regression. Journal of the American Statistical Association, 89, 592–599.

References

The details of these methods are given by R. D. Cook (1998). Regression Graphics. New York: Wiley. Equivalent methods are also available in Arc, R. D. Cook and S. Weisberg (1999). Applied Regression Including Computing and Graphics, New York: Wiley, www.stat.umn.edu/arc.

See Also

dr.permutation.test, dr.x, dr.y, dr.direction, dr.coplot, dr.weights

Examples

library(dr)
data(ais)
attach(ais)  # the Australian athletes data
# fit dimension reduction using sir
m1 <- dr(LBM~Wt+Ht+RCC+WCC, method="sir", nslices = 8)
summary(m1)
# repeat, using save:
m2 <- update(m1,method="save")
summary(m2)
# repeat, using phd:
m3 <- update(m2, method="phdres")
summary(m3)
# repeat, using weights:
w1 <- dr.weights(LBM~Wt+Ht+RCC+WCC, covmethod="mve")
m4 <- dr(LBM~Wt+Ht+RCC+WCC, method="sir", nslices = 8, weights=w1)

[Package dr version 2.0.3 Index]