TWIX {TWIX} | R Documentation |
Trees with extra splits
TWIX(formula, data = NULL, test.data = 0, subset = NULL, method = "deviance", topn.method = "complete", cluster = NULL, minsplit = 30, minbucket = round(minsplit/3), Devmin = 0.05, topN = 1, level = 30, st = 1, cl.level = 2, tol = 0.15, score = 1, k = 0, trace.plot=FALSE, ...)
formula |
formula of the form y ~ x1 + x2 + ... ,
where y must be a factor and x1,x2,... are numeric or factor. |
data |
an optional data frame containing the variables in the model (training data). |
test.data |
This can be a data frame containing new data, 0 (default),
or "NULL". If set to "NULL", the bad observations will be specified. |
subset |
an optional vector specifying a subset of observations to be used. |
method |
Which split points will be used? This can be "deviance"
(default), "grid" or "local". If the method is set to "local", the program uses the local maxima of the split function (entropy); if "deviance", all values of the entropy; if "grid", grid points. |
topn.method |
one of "complete" (default) or "single".
Specifies how the split points are considered.
If set to "complete", it uses split points from all variables;
otherwise it uses split points per variable. |
cluster |
name of the cluster, if parallel computing is used. |
minsplit |
the minimum number of observations that must exist in a node. |
minbucket |
the minimum number of observations in any terminal <leaf> node. |
Devmin |
the minimum improvement in entropy gained by splitting. |
topN |
integer vector. How many splits will be selected and at which
level? If length 1, the same number of splits will be selected at each level.
If length > 1, for example topN=c(3,2), 3 splits will be chosen
at the first level, 2 splits at the second level, and 1 split at all subsequent levels. |
level |
maximum depth of the trees. If level is set to 1, the trees
consist of the root node only. |
st |
step parameter for method "grid". |
cl.level |
parameter for parallel computing. |
tol |
parameter that is used only if topn.method is set to
"single". |
score |
a parameter, which can be 1 (default) or 2.
If it is set to 2, the sort-function will be used; if it is set to 1, the weight-function will be used: score = 0.25*scale(dev.tr)+0.6*scale(fit.tr)+0.15*(tree.structure) |
k |
k-fold cross-validation of the split-function. k specifies the proportion of observations that will be taken into the hold-out sample (k can be in (0,0.5)). |
trace.plot |
Should a trace plot be plotted? |
... |
further arguments to be passed to or from methods. |
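The weight-function given for the score argument can be sketched in base R as follows. This is only an illustration of the arithmetic in that formula, not the package's internal code: the three input vectors are made-up numbers, and what dev.tr, fit.tr and tree.structure measure inside TWIX is an assumption left open here.

```r
# Hypothetical per-tree values; the names follow the formula in the
# description of 'score', but the numbers are invented for illustration.
dev.tr         <- c(0.42, 0.35, 0.51, 0.38)
fit.tr         <- c(0.91, 0.94, 0.88, 0.93)
tree.structure <- c(0.20, 0.50, 0.10, 0.40)

# scale() centers and standardizes a vector and returns a one-column
# matrix, so as.numeric() converts the result back to a plain vector.
weight_score <- 0.25 * as.numeric(scale(dev.tr)) +
                0.60 * as.numeric(scale(fit.tr)) +
                0.15 * tree.structure

weight_score
```

Because the first two terms are standardized, they are measured relative to the other trees in the same set, while tree.structure enters on its original scale.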
a list with the following components:
call |
the call generating the object. |
trees |
a list of all constructed trees, which include ID, Dev ... for each tree. |
greedy.tree |
greedy tree |
multitree |
database |
agg.id |
vector specifying trees for aggregation. |
Bad.id |
ID-vector of bad observations from the training data. |
get.tree, predict.TWIX, print.single.tree, plot.TWIX, deviance.TWIX
data(olives)
i <- sample(572,150)
ic <- setdiff(1:572,i)
training <- olives[ic,]
test <- olives[i,]
#
#Tree1<-TWIX(Region~.,data=training[,1:9],topN=c(9,2),method="local")
#Tree1$trees
#
#pred<-predict(Tree1,newdata=test,sq=1:2)
#
#predict(Tree1,newdata=test,sq=1:2,ccr=TRUE)$CCR
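The commented call above uses topN=c(9,2). As a rough sketch of how such a vector is read according to the description of topN, the helper below expands it into one value per splitting level. The function splits_per_level and its argument n_levels are hypothetical names introduced for this illustration and are not part of TWIX; how n_levels relates to the level argument is not assumed here.

```r
# Hypothetical helper (not part of TWIX): number of splits selected at
# each splitting level, following the description of 'topN'.
splits_per_level <- function(topN, n_levels) {
  stopifnot(length(topN) >= 1, n_levels >= 1)
  if (length(topN) == 1) {
    # A single value is reused at every level.
    return(rep(topN, n_levels))
  }
  # A longer vector gives the first levels explicitly;
  # every remaining level gets 1 split.
  out <- c(topN, rep(1, max(0, n_levels - length(topN))))
  out[seq_len(n_levels)]
}

splits_per_level(c(9, 2), n_levels = 5)  # 9 2 1 1 1
splits_per_level(3, n_levels = 4)        # 3 3 3 3
```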