edrcv {EDR} R Documentation

Risk assessment by Cross-Validation

Description

In addition to estimating the effective dimension reduction (EDR) space (see also function edr), this function uses cross-validation to estimate the Mean Squared Error of Prediction (MSEP) and the Mean Absolute Error of Prediction (MAEP) obtained when predicting with the estimated EDR space. Estimates of the regression function are produced using function sm.regression from package sm.

Usage

edrcv(x, y, m = 2, rho0 = 1, h0 = NULL,
      ch = exp(0.5/max(4, (dim(x)[2]))), crhomin = 1, cm = 4,
      method = "Penalized", basis = "Quadratic", cw = NULL,
      graph = FALSE, show = 1, trace = FALSE, seed = 1,
      cvsize = 1, m0 = min(m, 2), hsm = NULL)

Arguments

x x specifies the design matrix, dimension (n,d).
y y specifies the response, length n.
m Rank of matrix M in case of method="Penalized", not used for the other methods.
rho0 Initial value for the regularization parameter rho.
h0 Initial bandwidth.
ch Factor for in(de)creasing h with iterations.
crhomin Factor to in(de)crease the default value of rhomin. This is just added to explore properties of the algorithms. Defaults to 1.
cm Factor in the definition of Pi_k=C_m*rho_k^2 I_L + hat{M}_{k-1}. Only used if method="Penalized".
method Specifies the algorithm to use. The default method="Penalized" corresponds to the algorithm proposed in ... (2006). method="HJPS" corresponds to the original algorithm from Hristache et al. (2001), while method="HJPS2" specifies a modification (correction) of this algorithm.
basis Specifies the set of basis functions. Options are basis="Quadratic" (default) and basis="Linear".
cw Another regularization parameter; it secures identifiability of a minimum number of local gradient directions. Defaults to 1/d. Has to be positive or NULL.
graph If graph==TRUE intermediate results are plotted.
show If graph==TRUE the parameter show determines the dimension of the EDR that is to be used when plotting intermediate results. If trace=TRUE and !is.null(R) it determines the dimension of the EDR when computing the risk values.
trace If trace=TRUE, additional diagnostics are provided for each iteration. These include the current (iteration k) values of the regularization parameter rho_k and the bandwidth h_k, the normalized cumulative sums of eigenvalues of hat{B} and, if !is.null(R), two distances between the true EDR (specified in R) and the estimated EDR.
seed Seed for generating random groups for CV.
cvsize Group size k in leave-k-out CV.
m0 Dimension of the dimension reduction space to use when fitting the data. Should be either 1 or 2.
hsm If is.null(hsm), the bandwidth used by sm.regression for smoothing within the EDR is chosen by cross-validation within sm.regression when needed. Alternatively, a grid of bandwidths may be specified. In that case the bandwidth for sm.regression is chosen from the grid as the one that minimizes the estimated mean absolute error of prediction.
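As a rough illustration of how the arguments above fit together, the following sketch simulates a small data set and calls edrcv with mostly default settings. The simulated model, the sample size and the choice cvsize = 10 are assumptions made only for this illustration; the call is skipped when packages EDR and sm cannot be loaded.

```r
set.seed(1)
n <- 100; d <- 5
x <- matrix(rnorm(n * d), n, d)             # design matrix, dimension (n,d)
y <- sin(x[, 1] + x[, 2]) + 0.1 * rnorm(n)  # response, length n (assumed model)

# Default value of ch as given in Usage; about 1.105 for d = 5.
ch_default <- exp(0.5 / max(4, dim(x)[2]))

# Only run the fit if both packages can be loaded.
has_pkgs <- suppressWarnings(
  require(EDR, quietly = TRUE) && require(sm, quietly = TRUE)
)
if (has_pkgs) {
  # Leave-10-out CV, predicting within a one-dimensional reduction space.
  fit <- edrcv(x, y, m = 2, m0 = 1, cvsize = 10, seed = 1)
  print(c(MSEP = fit$cvmse, MAEP = fit$cvmae))
}
```

Larger values of cvsize mean fewer cross-validation groups and therefore fewer refits, which may help when exploring settings on larger data sets.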

Details

This function performs a leave-k-out cross-validation to estimate the risk in terms of Mean Squared Error of Prediction (MSEP) and Mean Absolute Error of Prediction (MAEP) when using function edr to estimate an effective dimension reduction space of dimension m0 and using this estimated space to predict values of the response. Smoothing within the dimension reduction space is performed using the function sm.regression from package sm. The bandwidth for sm.regression is chosen by Cross-Validation.
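The procedure described above can be sketched in base R. This is not the package's implementation: the projection direction beta is assumed to be known rather than re-estimated by edr in each fold, a hand-written Nadaraya-Watson smoother (the hypothetical helper nw) stands in for sm.regression, and the bandwidth grid hgrid is arbitrary. Variable names such as cvmseofh merely mirror the names of the returned components.

```r
set.seed(1)
n <- 60; d <- 3
x <- matrix(rnorm(n * d), n, d)
beta <- c(1, 0, 0)                      # assumed direction, NOT estimated by edr
z <- (x %*% beta)[, 1]                  # data projected onto the 1-d space
y <- sin(2 * z) + 0.1 * rnorm(n)

# Hypothetical stand-in for sm.regression: Gaussian-kernel local average.
nw <- function(z_train, y_train, z_new, h) {
  sapply(z_new, function(z0) {
    w <- dnorm((z_train - z0) / h)
    sum(w * y_train) / sum(w)
  })
}

cvsize <- 5                             # group size k in leave-k-out CV
groups <- split(sample(n), ceiling(seq_len(n) / cvsize))
hgrid <- c(0.2, 0.4, 0.8)               # arbitrary bandwidth grid

# One column of out-of-sample residuals per bandwidth.
cvres <- sapply(hgrid, function(h) {
  res <- numeric(n)
  for (idx in groups) {
    res[idx] <- y[idx] - nw(z[-idx], y[-idx], z[idx], h)
  }
  res
})

cvmseofh <- colMeans(cvres^2)           # MSEP estimate per bandwidth
cvmaeofh <- colMeans(abs(cvres))        # MAEP estimate per bandwidth
hopt <- hgrid[which.min(cvmaeofh)]      # bandwidth minimizing estimated MAEP
```

Each observation is predicted exactly once from a fit that excludes its own group, so both risk estimates are based on out-of-sample residuals.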

Value

Object of class "edr" with components:

x The design matrix.
y The values of the response.
bhat Matrix hat{B} characterizing the effective dimension reduction space. For a specified dimension m, hat{B}_m = hat{B} O_m, with hat{B}^T hat{B} = O Lambda O^T being the eigenvalue decomposition of hat{B}^T hat{B}, specifies the projection to the m-dimensional subspace that provides the best approximation.
fhat A highly oversmoothed estimate of the values of the regression function at the design points. This is provided as a backup only, for the case that package sm is not installed.
cumlam Cumulative amount of information explained by the first components of hat{B}.
nmean Mean numbers of observations used in each iteration.
h Final bandwidth
rho Final value of rho
h0 Initial bandwidth
rho0 Initial value of rho
cm The factor cm
call Arguments of the call to edrcv
cvres Residuals from cross-validation.
cvmseofh Estimates of MSEP for bandwidths hsm
cvmaeofh Estimates of MAEP for bandwidths hsm
cvmse Estimate of MSEP
cvmae Estimate of MAEP
hsm Set of bandwidths specified for use with sm.regression
hsmopt Bandwidth selected for use with sm.regression if hsm was specified.
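The relation between bhat and its m-dimensional projection stated above can be followed literally in a few lines of base R. The matrix B below is a random stand-in whose shape is an assumption, not the layout used by the package, and the normalized cumulative eigenvalue sum is only in the spirit of cumlam; whether the package normalizes in exactly this way is not confirmed here.

```r
set.seed(1)
B <- matrix(rnorm(4 * 6), 4, 6)         # hypothetical stand-in for bhat

# Eigenvalue decomposition B^T B = O Lambda O^T.
eig <- eigen(crossprod(B), symmetric = TRUE)
lambda <- pmax(eig$values, 0)           # clip tiny negative rounding errors

# Normalized cumulative sums of eigenvalues (assumed analogue of cumlam).
cumshare <- cumsum(lambda) / sum(lambda)

# Projection for a specified dimension m: B_m = B O_m.
m <- 2
O_m <- eig$vectors[, seq_len(m), drop = FALSE]
B_m <- B %*% O_m
```

By construction the columns of B_m are orthogonal and their squared norms equal the m largest eigenvalues, which is why the leading entries of the cumulative sum indicate how much is retained for a given m.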

Note

This function requires package sm.

Author(s)

Joerg Polzehl, polzehl@wias-berlin.de

References

M. Hristache, A. Juditsky, J. Polzehl and V. Spokoiny (2001). Structure adaptive approach for dimension reduction, The Annals of Statistics. Vol. 29, pp. 1537-1566.
J. Polzehl, S. Sperlich, V. Spokoiny (2006). Estimating Generalized Principle Components, Manuscript in preparation.

See Also

edr, plot.edr, summary.edr, print.edr, edr.R

Examples

require(EDR)
demo(edr_ex4)

[Package EDR version 0.6-2.2 Index]