sfs {dprep}    R Documentation

Sequential Forward Selection

Description

Applies the Sequential Forward Selection algorithm for Feature Selection.

Usage

sfs(data, method = c("lda", "knn", "rpart"), kvec = 5,
 repet = 10)

Arguments

data dataset to be used for feature selection
method classifier to be used; currently only the lda, knn and rpart classifiers are supported
kvec number of neighbors to use for the knn classification
repet number of times to repeat the selection

Details

The best subset of features is initialized as the empty set. At each step, the feature that gives the highest correct classification rate, when combined with the features already included, is added to the set. The final "best subset" of features is constructed based on the frequency with which each attribute is selected across the given number of repetitions. Because of the time complexity of the algorithm, its use is not recommended for datasets with a large number of attributes (say, more than 1000).
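
The greedy step and the frequency count described above can be sketched roughly as follows. This is an illustrative sketch only, not the dprep source. The names sfs_sketch and holdout_1nn_accuracy are hypothetical, the built-in iris data stands in for my.iris, the class label is assumed to be in the last column, a random one-third holdout with a 1-nearest-neighbour rule stands in for the package's classifiers, and the stopping rule (stop when no remaining feature improves the rate) is an assumption, since the page does not state one.

```r
# Illustrative sketch of sequential forward selection (NOT the dprep code).
# Assumption: the class label is stored in the last column of `data`.

# Hypothetical scorer: accuracy of a 1-nearest-neighbour rule on a random
# one-third holdout. The random split is what makes repetitions differ.
holdout_1nn_accuracy <- function(x, y) {
  n <- nrow(x)
  test <- sample(n, size = floor(n / 3))
  train <- setdiff(seq_len(n), test)
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  nearest <- train[apply(d[test, train, drop = FALSE], 1, which.min)]
  mean(y[nearest] == y[test])
}

# One run of greedy forward selection; returns selected column indices.
sfs_sketch <- function(data, score = holdout_1nn_accuracy) {
  p <- ncol(data) - 1
  y <- data[[p + 1]]
  selected <- integer(0)   # start from the empty set
  best_rate <- -Inf
  repeat {
    candidates <- setdiff(seq_len(p), selected)
    if (length(candidates) == 0) break
    # Score each candidate together with the features already included
    rates <- sapply(candidates, function(j)
      score(data[, c(selected, j), drop = FALSE], y))
    # Assumed stopping rule: stop when no candidate improves the rate
    if (max(rates) <= best_rate) break
    best_rate <- max(rates)
    selected <- c(selected, candidates[which.max(rates)])
  }
  selected
}

# Repeat the selection and count how often each feature is chosen
set.seed(1)
repet <- 10
runs <- replicate(repet, sfs_sketch(iris), simplify = FALSE)
freq <- table(factor(unlist(runs), levels = seq_len(ncol(iris) - 1)))
names(freq) <- names(iris)[seq_len(ncol(iris) - 1)]
print(freq)
```

Each run evaluates every remaining feature at every step, so the number of classifier fits grows roughly quadratically with the number of attributes, which is consistent with the warning above about large attribute counts. How the frequency table is turned into the final subset is not specified on this page, so the sketch stops at the counts.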

Value

bestsubset subset of features that have been determined to be relevant.

Author(s)

Edgar Acuna

References

Acuña, E. (2003). A comparison of filters and wrappers for feature selection in supervised classification. Proceedings of the Interface 2003 Computing Science and Statistics, Vol. 34.

Examples

#---- Sequential forward selection using the knn classifier----
data(my.iris)
sfs(my.iris, method = "knn", kvec = 3, repet = 10)

[Package dprep version 1.0 Index]