Fcat {exactmaxsel} R Documentation

Distribution of maximally selected statistics for multicategorical variables

Description

The function Fcat computes the distribution function of the maximally selected association criterion (currently either the chi-square statistic or the Gini gain) when the response Y is binary and the predictor X is an unordered categorical variable, given the class sizes n0 and n1 and the category counts A.

Usage

Fcat(c, n0, n1, A, statistic)

Arguments

c the value at which the distribution function is evaluated.
n0 the number of observations in class Y=0.
n1 the number of observations in class Y=1.
A a vector of length K giving the number of observations with X=1,...,X=K. Since every observation belongs to exactly one category, its entries should sum to n0+n1.
statistic the association measure used as criterion to select the best split. Currently, only statistic="chi2" (chi-square statistic) and statistic="gini" (the Gini-gain from machine learning) are implemented.

Details

Suppose the response Y is binary (Y=0,1) and the predictor X takes K unordered categorical values (X=1,...,K). The criterion is maximized over all binary splittings of the set {1,...,K}, i.e. over all partitions of the categories into two non-empty groups; there are 2^(K-1)-1 of them. For K=3, the criterion is thus maximized over the three splittings {1,2}{3}, {1,3}{2} and {1}{2,3}.
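
The maximization over binary splittings can be sketched in R as follows. This is an illustration only, not the package's implementation: the helper names binary_splits, chi2_split and max_chi2 are hypothetical, the per-category class counts y0 and y1 are made-up data chosen to be consistent with n0=15, n1=10 and A=c(6,10,9), and the chi-square criterion is assumed to be Pearson's statistic for the resulting 2x2 table without continuity correction.

```r
# Enumerate all binary splittings of {1,...,K}.
# Category K is always kept in the right-hand group, so each splitting is
# listed exactly once; the left-hand group is a non-empty subset of
# {1,...,K-1}, encoded by the binary digits of m.
binary_splits <- function(K) {
  stopifnot(K >= 2)
  lapply(seq_len(2^(K - 1) - 1), function(m) {
    which((m %/% 2^(0:(K - 2))) %% 2 == 1)
  })
}

# Pearson chi-square statistic (no continuity correction, an assumption)
# for the 2x2 table with left/right group counts in classes Y=0 and Y=1.
chi2_split <- function(l0, l1, r0, r1) {
  n <- l0 + l1 + r0 + r1
  denom <- (l0 + l1) * (r0 + r1) * (l0 + r0) * (l1 + r1)
  if (denom == 0) return(0)  # degenerate table: no association measurable
  n * (l0 * r1 - l1 * r0)^2 / denom
}

# Maximally selected chi-square statistic.
# y0[k], y1[k]: number of observations with X=k in class Y=0 and Y=1.
max_chi2 <- function(y0, y1) {
  stopifnot(length(y0) == length(y1))
  stats <- sapply(binary_splits(length(y0)), function(S) {
    chi2_split(sum(y0[S]), sum(y1[S]), sum(y0[-S]), sum(y1[-S]))
  })
  max(stats)
}

# Hypothetical data: y0 + y1 equals A = c(6, 10, 9), n0 = 15, n1 = 10.
y0 <- c(4, 7, 4)
y1 <- c(2, 3, 5)
print(binary_splits(3))   # left-hand groups {1}, {2} and {1,2}
print(max_chi2(y0, y1))   # observed maximally selected statistic
```

A maximum computed this way is the kind of observed value at which the distribution function would be evaluated through the argument c of Fcat.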

Value

The value of the distribution function at c.

Author(s)

Anne-Laure Boulesteix (http://www.statistik.lmu.de/~socher/)

References

A.-L. Boulesteix (2006), Maximally selected chi-square statistics and binary splits of nominal variables, Biometrical Journal 48.

See Also

Ford, maxsel.

Examples

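# load the exactmaxsel package
library(exactmaxsel)

# distribution function of the maximally selected chi-square statistic at c=4 (K=3 categories)
Fcat(c = 4, n0 = 15, n1 = 10, A = c(6, 10, 9), statistic = "chi2")
# distribution function of the maximally selected Gini gain at c=5 (K=4 categories)
Fcat(c = 5, n0 = 15, n1 = 15, A = c(5, 8, 7, 10), statistic = "gini")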


[Package exactmaxsel version 1.0 Index]