discretize {minet} R Documentation

Unsupervised Data Discretization

Description

discretize discretizes data using an equal-frequency or equal-width binning algorithm. "equalwidth" and "equalfreq" discretize each random variable (each column) of the data into nbins bins. "globalequalwidth" discretizes the range of the random vector data into nbins bins.
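The three methods can be illustrated with a minimal base-R sketch. The helper names below (equal_width_bins, equal_freq_bins, global_equal_width_bins) are hypothetical and are not part of minet; the package's own implementation may differ in how it handles bin boundaries and ties. The sketch also assumes that "globalequalwidth" means one set of breaks computed from the range of all values together.

```r
# Illustrative helpers only; these are NOT minet's internals.

equal_width_bins <- function(x, nbins) {
  # Split [min(x), max(x)] into nbins intervals of equal length.
  # Note: this simple version fails if all values of x are identical.
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  cut(x, breaks = breaks, labels = FALSE, include.lowest = TRUE)
}

equal_freq_bins <- function(x, nbins) {
  # Rank the values, then assign roughly length(x) / nbins values per bin.
  r <- rank(x, ties.method = "first")
  ceiling(r * nbins / length(x))
}

global_equal_width_bins <- function(m, nbins) {
  # Assumption: one set of equal-width breaks shared by all columns,
  # computed from the range of the whole matrix.
  breaks <- seq(min(m), max(m), length.out = nbins + 1)
  apply(m, 2, function(col) {
    cut(col, breaks = breaks, labels = FALSE, include.lowest = TRUE)
  })
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 100)
ew <- equal_width_bins(x, 3)  # the outlier 100 gets a bin of its own
ef <- equal_freq_bins(x, 3)   # three values per bin

m <- cbind(a = c(0, 1, 2), b = c(0, 50, 100))
gew <- global_equal_width_bins(m, 2)  # column a falls entirely into bin 1
```

With an outlier present, equal-width binning puts almost every value into the first bin, whereas equal-frequency binning keeps the bins balanced. With shared breaks, a column whose values span only a small part of the global range can collapse into a single bin.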

Usage

discretize(data, disc = "equalfreq", nbins = sqrt(nrow(data)))

Arguments

data A data.frame containing the data to be discretized. The columns contain variables and the rows contain samples.
disc The name of the discretization method to be used: "equalfreq", "equalwidth" or "globalequalwidth" (default: "equalfreq"); see References.
nbins Integer specifying the number of bins to be used for the discretization. By default the number of bins is set to sqrt(N), where N is the number of samples.
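As a small sketch of the default, sqrt(N) is generally not a whole number. The toy data frame and the choice of floor() below are assumptions for illustration; this page does not say how discretize treats a non-integer nbins. The call to discretize runs only if minet is installed.

```r
set.seed(1)
toy <- data.frame(x = rnorm(50), y = runif(50))  # hypothetical toy data

n <- nrow(toy)
default_nbins <- sqrt(n)           # documented default, about 7.07 for N = 50
explicit_nbins <- floor(sqrt(n))   # one way to get a whole number of bins: 7

# Pass the bin count explicitly instead of relying on the default.
if (requireNamespace("minet", quietly = TRUE)) {
  library(minet)
  toy.disc <- discretize(toy, disc = "equalwidth", nbins = explicit_nbins)
}
```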

Value

discretize returns the discretized dataset.

Author(s)

Patrick E. Meyer, Frederic Lafitte, Gianluca Bontempi, Korbinian Strimmer

References

J. Dougherty, R. Kohavi, M. Sahami. Supervised and unsupervised discretization of continuous features. ICML, 1995.

See Also

build.mim

Examples

library(minet)
data(syn.data)
# Discretize the same dataset with each of the three methods
ew.data <- discretize(syn.data, "equalwidth")
ef.data <- discretize(syn.data, "equalfreq")
gew.data <- discretize(syn.data, "globalequalwidth")

[Package minet version 1.1.7 Index]