rmb {extracat}    R Documentation

Multiple Barchart for relative frequencies and generalized Spineplots

Description

The rmb function produces a multiple barchart of the relative frequencies of the target categories within each combination of the explanatory variables. The weights of these combinations (that is, their absolute frequencies) are represented by a multiple barchart with horizontal bars in the background. The width of each barchart for the target variable can be restricted to the length of the corresponding horizontal bar. In addition, rmb can draw spineplots instead of barcharts within each combination; in that sense it can be seen as a generalization of spineplots.
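
The two quantities described above can be reproduced with base R. The following is a minimal sketch and not the internal code of rmb; it assumes the housing data from the MASS package (columns Sat, Infl, Type, Cont, Freq) and treats Sat as the target variable.

```r
# Sketch only: base R, not the internals of rmb().
data(housing, package = "MASS")

# Four-way contingency table; the last variable (Sat) is the target.
tab <- xtabs(Freq ~ Type + Infl + Cont + Sat, data = housing)

# Weights: absolute frequency of each Type x Infl x Cont combination
# (the quantity shown by the horizontal background bars).
weights <- margin.table(tab, margin = 1:3)

# Relative frequencies of the Sat categories within each combination
# (the quantity shown by the bars of each small barchart).
relfreq <- prop.table(tab, margin = 1:3)

# Within every combination the relative frequencies sum to one.
range(apply(relfreq, 1:3, sum))

# Flat view of the conditional distributions of Sat.
ftable(round(relfreq, 2), col.vars = "Sat")
```

With eqwidth = FALSE the weights would limit the width of each small barchart, while the relative frequencies determine the bar heights.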

Usage

rmb(f, dset, hsplit = NULL, spine = FALSE,  hlcat = 1,  eqwidth = FALSE,
                tfreq = "Id", max.scale = 1,  use.na = FALSE, expected = NULL,
                mod.type = "poisson", resid.type = "pearson",
                use.expected.values = FALSE, resid.max = NULL, cut.rv = TRUE, cut.rs = 5,                
                base = 0.2, mult = 1.5, colv = NULL, lab = TRUE, yaxis = TRUE, 
                label = NULL, min.alpha = 0.1, base.alpha = 0.75, boxes = TRUE, 
                lab.tv = FALSE, varnames = TRUE, abbr = FALSE, lab.cex = 1.2,...)

Arguments

f The formula specifying the variables in their given order, with the last variable being the target variable. The left-hand side is either empty or names the frequency variable.
dset The dataset, either in the form of a frequency table (see ftable or subtable for more information) with a column named "Freq", or in raw format (that is, the rows represent the cases and the columns represent the variables).
hsplit Logical vector with split directions where TRUE stands for horizontal splitting. The last (target) variable is always arranged on the x-axis.
spine If TRUE, a spineplot will be drawn instead of each barchart. This is recommended for binary target variables.
hlcat A vector specifying the categories of the target variable which shall be highlighted (spineplot version) or plotted as bars (barchart version), in the given order. Specifying fewer than 2 indices keeps the original order in the barchart version.
eqwidth If TRUE the bar length of the multiple barchart in the background no longer restricts the width of the barcharts for the relative frequencies of the target variable.
tfreq The absolute frequencies used for the multiple barchart in the background can be transformed by logarithm ("log") or square root ("sqrt"). "Id" means no transformation.
max.scale The maximum value of the probability (y-axis) scale for each combination. The default is 1. The axis will be drawn if yaxis is TRUE.
use.na If TRUE, missing values are recoded to a level "N/A"; otherwise (the default) the function na.omit is called to reduce the dataset to complete cases only.
expected A list of integer vectors denoting the interaction terms in the Poisson or proportional odds model. If undefined, no residual shadings will be used.
mod.type Either "poisson" or "polr". See glm and polr.
resid.type One of "pearson", "deviance", "working", "partial" or "response". For polr models only "response" is available.
use.expected.values A logical specifying whether or not to use the frequencies predicted by the model instead of the observed frequencies.
resid.max The maximum scale value for the residuals. If undefined it will be chosen automatically.
cut.rv A logical. If FALSE, the alpha values of the residual shading are chosen exactly; otherwise they depend on the specified scale intervals.
cut.rs The number of cut points for the residual scale.
base The maximum proportion of the total plot width which is used for the gaps.
mult The incremental multiplier for the gaps of different dimensions.
colv A vector defining the colors of the bars or NULL for rainbow colors. Has no effect if an expected model is defined.
lab If TRUE, the plot size is reduced to make room for labels.
yaxis If TRUE, a vertical axis will be drawn on both sides of the plot. This is recommended when changing the max.scale argument.
label An optional vector of logicals defining which variables shall be labelled. Has no effect if lab is FALSE.
min.alpha If eqwidth = TRUE, alpha blending according to the corresponding weight is applied to the background color of the bars. A minimum alpha value keeps very sparse combinations from disappearing.
base.alpha A base alpha value which is applied to the bar colors. It also works with residual shadings, but not with colors chosen individually via colv.
boxes Should the labels be surrounded by boxes?
lab.tv Should the target variable be included in the labeling?
varnames Should the variable names be shown as labels?
abbr If TRUE, the labels are abbreviated automatically (to 3 characters) using the abbreviate function.
lab.cex The font size multiplier.
... further arguments. Usually not necessary.
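
The two input formats described for dset, a frequency table with a "Freq" column and raw case-wise data, carry the same information. The sketch below converts between them in base R. The data frame freq_tab and its variable names are made up for illustration and are not part of the package.

```r
# Hypothetical frequency-form data: one row per combination plus "Freq".
freq_tab <- data.frame(
  Group   = factor(c("a", "a", "b", "b")),
  Outcome = factor(c("no", "yes", "no", "yes")),
  Freq    = c(3, 1, 2, 4)
)

# Frequency form -> raw form: repeat each row Freq times, drop the count.
raw <- freq_tab[rep(seq_len(nrow(freq_tab)), freq_tab$Freq),
                c("Group", "Outcome")]
rownames(raw) <- NULL

# Raw form -> frequency form: cross-tabulate and flatten again.
back <- as.data.frame(table(raw))

nrow(raw)   # one row per case
back        # same counts as freq_tab, possibly in a different row order
```

According to the argument description, either freq_tab or raw could be supplied as dset together with a formula such as ~Group+Outcome.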

Details

See upcoming paper (2009).
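
As a rough illustration of the inputs to the residual shading, the sketch below fits a Poisson log-linear model with glm and extracts Pearson residuals and fitted frequencies. It assumes that expected = list(c(1,2), c(2,3), c(3,4)) corresponds to the interactions Type:Infl, Infl:Cont and Cont:Sat of the formula ~Type+Infl+Cont+Sat, with the lower-order terms included. How rmb builds its model internally is not stated on this page, so this mapping is an assumption. The housing data are taken from the MASS package.

```r
# Sketch only: a Poisson log-linear model comparable to
# expected = list(c(1,2), c(2,3), c(3,4)); the mapping is an assumption.
data(housing, package = "MASS")

fit <- glm(Freq ~ Type * Infl + Infl * Cont + Cont * Sat,
           family = poisson, data = housing)

# Pearson residuals, the kind selected by resid.type = "pearson".
res <- residuals(fit, type = "pearson")

# Frequencies predicted by the model, the kind of values that
# use.expected.values = TRUE would substitute for the observed ones.
fitted_freq <- fitted(fit)

# Largest absolute residual, one possible upper end for a residual scale.
max(abs(res))

# Fitted and observed totals agree for a Poisson model with an intercept.
all.equal(sum(fitted_freq), sum(housing$Freq))
```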

Value

No return value.

Author(s)

Alexander Pilhoefer
Department for Computer Oriented Statistics and Data Analysis
University of Augsburg
Germany

References

Alexander Pilhoefer. New approaches in visualization of categorical data: R-package extracat.
Journal of Statistical Software, submitted Jan 2010.

Examples

    data(housing)
    # simple example
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, mult = 2,
        hsplit = c(FALSE,TRUE,TRUE,FALSE), abbr = TRUE)
    
    # with sqrt-transformation and horizontal splits only
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, mult = 2,
        hsplit = c(TRUE,TRUE,TRUE,TRUE), tfreq = "sqrt",  abbr = TRUE)
    
    # a generalized spineplot with the first category highlighted
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, spine = TRUE, 
        hlcat = 1, mult = 2, hsplit = c(TRUE,FALSE,TRUE,TRUE), 
        tfreq = "sqrt", abbr = TRUE)
    
    # a generalized spineplot with all categories highlighted in changed order
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, spine = TRUE,
        hlcat = c(3,1,2), mult = 2, hsplit = c(TRUE,FALSE,TRUE,TRUE),
        tfreq = "sqrt", abbr = TRUE)
    
    # with equal widths
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, eqwidth = TRUE,
        mult = 2, hsplit = c(TRUE,FALSE,TRUE,TRUE), lab.tv = TRUE,
        abbr = TRUE)
        
    # with equal widths, residual shadings and expected values
    rmb(f = ~Type+Infl+Cont+Sat, dset = housing, eqwidth = TRUE,
        mult = 2, hsplit = c(TRUE,FALSE,TRUE,TRUE), lab.tv = TRUE,
        abbr = TRUE, expected = list(c(1,2),c(2,3),c(3,4)),
        use.expected.values = TRUE)


[Package extracat version 1.0-0 Index]