Fcat {exactmaxsel} | R Documentation |
The function Fcat
computes the distribution of the maximally selected
association criterion of interest (either the chi-square statistic or the
Gini-gain in the current version) when Y is binary and X has unordered
categorical values, given n0
, n1
and A
.
Fcat(c, n0, n1, A, statistic)
c |
the value at which the distribution function has to be computed. |
n0 |
the number of observations in class Y=0. |
n1 |
the number of observations in class Y=1. |
A |
a vector of length K giving the number of observations with X=1,...,X=K. |
statistic |
the association measure used as criterion to select the
best split. Currently, only statistic="chi2" (chi-square statistic)
and statistic="gini" (the Gini-gain from machine learning) are
implemented. |
Suppose the response Y is binary (Y=0,1) and the predictor X has K unordered categorical values (X=1,...,K). The criterion is maximized over all the binary splittings of the set {1,...,K}. For example, if K=4, the criterion is thus maximized over the splittings {1}{2,3,4}, {1,2}{3,4}, {1,2,3}{4}, {1,2,4}{3}, {1,4}{2,3}, {1,3,4}{2}, {1,3}{2,4}.
the value of the distribution function at c
.
Anne-Laure Boulesteix (http://www.slcmsr.net/boulesteix)
A.-L. Boulesteix (2006), Maximally selected chi-square statistics and binary splits of nominal variables, Biometrical Journal 48:838-848.
# load exactmaxsel library library(exactmaxsel) Fcat(c=4,n0=15,n1=10,A=c(6,10,9),statistic="chi2") Fcat(c=5,n0=15,n1=15,A=c(5,8,7,10),statistic="gini")