ce.impute {dprep}    R Documentation

Imputation in supervised classification

Description

This function imputes missing values in datasets for supervised classification, using the mean, median, or k-nearest-neighbor (knn) imputation method.
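As a rough illustration of what mean and median imputation do, the following base R sketch fills the missing entries of selected columns with the mean or median of the observed values in that column. The helper `impute_stat` and the `toy` matrix are hypothetical and are not part of dprep; the sketch assumes numeric columns and does not claim to reproduce how ce.impute computes its statistics.

```r
# Hypothetical helper, not part of dprep: fill NAs in the selected columns
# with the mean or median of the observed values of that column.
impute_stat <- function(x, cols, method = c("mean", "median")) {
  method <- match.arg(method)
  stat <- if (method == "mean") mean else median
  for (j in cols) {
    miss <- is.na(x[, j])
    x[miss, j] <- stat(x[!miss, j])
  }
  x
}

# Toy data (assumed for illustration): column 1 is 1, NA, 3; column 2 is 10, 20, NA
toy <- matrix(c(1, NA, 3,
                10, 20, NA), ncol = 2)
filled <- impute_stat(toy, cols = 1:2, method = "mean")
# filled[2, 1] is 2 and filled[3, 2] is 15
```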

Usage

ce.impute(data, method = c("mean", "median", "knn"), atr,
 nomatr = rep(0, 0), k1 = 10)

Arguments

data      the name of the dataset
method    the name of the imputation method to be used: "mean", "median", or "knn"
atr       a vector identifying the attributes where imputations will be performed
nomatr    a vector identifying the nominal attributes
k1        the number of neighbors to be used for the knn imputation
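To make the role of the neighbor count concrete, here is a minimal base R sketch of knn imputation for numeric data. The helper `impute_knn` is hypothetical and is not the dprep implementation. It assumes that at least one row is complete, that every incomplete row has at least one observed value, and that all attributes are numeric (nominal attributes, which ce.impute handles through nomatr, are not covered). Each missing entry is replaced by the mean of that attribute over the k nearest complete rows, with Euclidean distance computed on the observed columns only.

```r
# Hypothetical helper, not part of dprep: impute each incomplete row from
# its k nearest complete rows (Euclidean distance on the observed columns).
impute_knn <- function(x, k = 10) {
  complete <- x[stats::complete.cases(x), , drop = FALSE]
  k <- min(k, nrow(complete))  # cannot use more neighbors than complete rows
  for (i in which(!stats::complete.cases(x))) {
    miss <- is.na(x[i, ])
    obs <- !miss
    # Difference between row i and every complete row, observed columns only
    diffs <- sweep(complete[, obs, drop = FALSE], 2, x[i, obs])
    d <- sqrt(rowSums(diffs^2))
    nn <- order(d)[seq_len(k)]
    # Replace each missing entry with the neighbors' mean for that column
    x[i, miss] <- colMeans(complete[nn, miss, drop = FALSE])
  }
  x
}

# Toy data (assumed for illustration): the last row lacks its second value
toy <- rbind(c(1, 1), c(2, 2), c(9, 9), c(1.5, NA))
filled <- impute_knn(toy, k = 2)
# filled[4, 2] is 1.5, the mean of the two nearest rows (1, 1) and (2, 2)
```

A larger k averages over more neighbors, which smooths the imputed values but lets more distant rows, such as (9, 9) here, influence them.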

Value

Returns a matrix without missing values.

Note

A description of all the imputations carried out may be stored in a report that is later saved to the current workspace. To produce the report, the lines at the end of the function's source code must be uncommented. The report object's name starts with Imput.rep.

Author(s)

Caroline Rodriguez

References

Acuna, E. and Rodriguez, C. (2004). The treatment of missing values and its effect in the classifier accuracy. In D. Banks, L. House, F.R. McMorris, P. Arabie, W. Gaul (Eds.), Classification, Clustering and Data Mining Applications. Springer-Verlag Berlin-Heidelberg, 639-648.

See Also

clean

Examples

data(hepatitis)
#--------Median Imputation-----------
#ce.impute(hepatitis, "median", 1:19)
#--------knn Imputation--------------
hepa.imputed <- ce.impute(hepatitis, "knn", k1 = 10)

[Package dprep version 2.0 Index]