kda, pda, Hkda, Hkda.diag, kda.kde, pda.pde {ks}    R Documentation

Kernel and parametric discriminant analysis

Description

Kernel and parametric discriminant analysis.

Usage

Hkda(x, x.group, Hstart, bw="plugin", nstage=2, pilot="samse",
     pre="sphere", binned=FALSE)
Hkda.diag(x, x.group, bw="plugin", nstage=2, pilot="samse", 
     pre="sphere", binned=FALSE)

kda(x, x.group, Hs, y, prior.prob=NULL)
pda(x, x.group, y, prior.prob=NULL, type="quad")

kda.kde(x, x.group, Hs, gridsize, supp=3.7, eval.points=NULL)
pda.pde(x, x.group, gridsize, type="quad", xlim, ylim, zlim)

Arguments

x matrix of training data values
x.group vector of group labels for training data
y matrix of test data
Hs (stacked) matrix of bandwidth matrices
prior.prob vector of prior probabilities
type "line" = linear discriminant, "quad" = quadratic discriminant
bw bandwidth: "plugin" = plug-in, "lscv" = LSCV, "scv" = SCV
nstage number of stages in the plug-in bandwidth selector (1 or 2)
pilot "amse" = AMSE pilot bandwidths, "samse" = single SAMSE pilot bandwidth
pre "scale" = pre-scaling, "sphere" = pre-sphering
Hstart (stacked) matrix of initial bandwidth matrices, used in numerical optimisation
binned if TRUE, use binned kernel estimation, otherwise use exact kernel estimation (default is FALSE)
gridsize vector of number of grid points
supp effective support for standard normal is [-supp, supp]
eval.points points at which the density estimate is evaluated
xlim, ylim, zlim x-axis, y-axis, z-axis limits (used only for plotting)

Details

– If prior probabilities are known, set prior.prob to these. Otherwise the default prior.prob=NULL uses the sample proportions as estimates of the prior probabilities.
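
For example, prior probabilities for three equally weighted groups could be supplied as follows (a minimal sketch re-using the argument names from the Usage section; the equal weights are illustrative only):

     kda(x, x.group, Hs, y, prior.prob=c(1/3, 1/3, 1/3))
     pda(x, x.group, y, prior.prob=c(1/3, 1/3, 1/3), type="quad")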

The linear and quadratic discriminant analysers are based on lda and qda from the MASS library.

– The valid values for bw in Hkda are "plugin", "lscv" and "scv", which call Hpi, Hlscv and Hscv respectively. For the plug-in selector, all of nstage, pilot and pre need to be set. For the SCV selector, currently nstage=1 always, but pilot and pre need to be set. For the LSCV selector, none of these are required.

For Hkda.diag, the options for bw are "plugin" and "lscv", which call Hpi.diag and Hlscv.diag respectively. Again, nstage, pilot and pre apply to Hpi.diag but are not required for Hlscv.diag.
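
As an illustration, the selectors could be called as follows (a sketch using the argument names from the Usage section and only the options documented above):

     H.pi   <- Hkda(x, x.group, bw="plugin", nstage=2, pilot="samse", pre="sphere")
     H.scv  <- Hkda(x, x.group, bw="scv", pilot="samse", pre="sphere")
     H.lscv <- Hkda(x, x.group, bw="lscv")
     H.diag <- Hkda.diag(x, x.group, bw="plugin", pilot="samse", pre="sphere")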

– The kernel density estimate is based on kde.

If eval.points=NULL (default) then the density estimate is automatically computed over a grid whose resolution is controlled by gridsize (default is 100 in each co-ordinate direction).

If xlim and ylim are not specified then they default to ranges 10% larger than the range of the data values.
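
For example (a sketch with the argument names from the Usage section; the grid resolution is illustrative, and the second call assumes the test data y have the same number of columns as x):

     fhat <- kda.kde(x, x.group, Hs, gridsize=c(150, 150))
     fhat <- kda.kde(x, x.group, Hs, eval.points=y)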

Value

– The result from Hkda and Hkda.diag is a stacked matrix of bandwidth matrices, one for each training data group. This is suitable for passing as the Hs argument to kda.
For details on the pre-transformations in pre, see pre.sphere and pre.scale.
– The result from kda and pda is a vector of group labels estimated via a discriminant (or classification) rule. If the test data y are given then these are classified. Otherwise the training data x are classified.
– The result from kda.kde and pda.pde is a density estimate for discriminant analysis: an object of class dade, which is a list with 6 fields

x data points - same as input
eval.points points at which the density estimate is evaluated
estimate density estimate at eval.points
H bandwidth matrices
prior.prob sample proportions of each group
type one of "kernel", "linear", "quadratic" indicating the type of discriminant analyser used.
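
These fields can be inspected directly from the fitted object (a minimal sketch, assuming a kernel fit obtained with the Usage argument names):

     fhat <- kda.kde(x, x.group, Hs)
     names(fhat)       ## the 6 fields listed above
     fhat$type         ## "kernel" for kda.kde
     fhat$prior.prob   ## sample proportions of each group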

References

Mardia, K.V., Kent, J.T. & Bibby, J.M. (1979) Multivariate Analysis. Academic Press. London.

Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall. London.

Simonoff, J. S. (1996) Smoothing Methods in Statistics. Springer-Verlag. New York.

Venables, W.N. & Ripley, B.D. (1997) Modern Applied Statistics with S-PLUS. Springer-Verlag. New York.

See Also

compare, compare.kda.cv, compare.pda.cv

Examples


### bivariate example - restricted iris dataset  
library(MASS)
data(iris)
ir <- iris[,1:2]
ir.gr <- iris[,5]

H <- Hkda(ir, ir.gr, bw="plugin", pre="scale")
kda.gr <- kda(ir, ir.gr, H, ir)
lda.gr <- pda(ir, ir.gr, ir, type="line")
qda.gr <- pda(ir, ir.gr, ir, type="quad")
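
## A quick sketch of how the returned group labels can be used: cross-classify
## the estimated labels against the true labels to count misclassifications
table(ir.gr, kda.gr)
table(ir.gr, qda.gr)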

## Not run: 
### multivariate example - full iris dataset
ir <- iris[,1:4]
ir.gr <- iris[,5]

H <- Hkda(ir, ir.gr, bw="plugin", pre="scale")
kda.gr <- kda(ir, ir.gr, H, ir)
lda.gr <- pda(ir, ir.gr, ir, type="line")
qda.gr <- pda(ir, ir.gr, ir, type="quad")
## End(Not run)

[Package ks version 1.4.0 Index]