gpairs {YaleToolkit} | R Documentation |
Produces a matrix of plots showing pairwise relationships between quantitative and categorical variables in a complex data set.
gpairs(x,
       upper.pars = list(scatter = "points", conditional = "barcode",
                         mosaic = "mosaic"),
       lower.pars = list(scatter = "points", conditional = "boxplot",
                         mosaic = "mosaic"),
       diagonal = "default",
       outer.margins = list(bottom = unit(2, "lines"), left = unit(2, "lines"),
                            top = unit(2, "lines"), right = unit(2, "lines")),
       xylim = NULL, outer.labels = NULL, outer.rot = c(90, 0),
       gap = 0.05, buffer = 0.02, reorder = NULL,
       cluster.pars = NULL, stat.pars = NULL, scatter.pars = NULL,
       bwplot.pars = NULL, stripplot.pars = NULL, barcode.pars = NULL,
       mosaic.pars = NULL, axis.pars = NULL, diag.pars = NULL,
       whatis = FALSE)

corrgram(x)
x |
a data frame (or matrix), the relationships between whose columns are to be examined. Any combination of quantitative and categorical variables is acceptable. |
upper.pars |
see Details |
lower.pars |
see Details |
diagonal |
by default, the diagonal from the top left to the bottom right is used for displaying the variable names (and, in our version, the marginal distributions of the variables); diagonal="other" will place the diagonal running from the top right down to the bottom left. |
outer.margins |
a list of length 4 with units as components named bottom, left, top, and right, giving the outer margins; the default uses two lines of text. A vector of length 4 with units (ordered properly) will work, as will a vector of length 4 with numeric values (interpreted as lines). |
xylim |
optionally specify a single range to be used as xlim and ylim where appropriate. Note that if this option causes cropping, it will fail to work in barcode panels. |
outer.labels |
the default is NULL, for alternating axis labels around the perimeter. If "all", all labels are printed, and if "none", no labels are printed. |
outer.rot |
a 2-vector (x, y) rotating the top/bottom outer labels x degrees and the left/right outer labels y degrees. Only works for categorical labels of boxplot and mosaic panels. |
gap |
the gap between the tiles; defaulting to 0.05 of the width of a tile. |
buffer |
the fraction by which to expand the range of quantitative variables to provide plots that will not truncate plotting symbols. The default shown in the usage above is 0.02, that is, 2 percent of the range. |
reorder |
currently the only supported value is the string "cluster", which reorders the columns according to the output of hclust. Note that factors are coerced to numbers by replacing them with integers, which implicitly assumes what is probably an arbitrary ordering. |
cluster.pars |
a list with two elements named dist.method and hclust.method. These are passed respectively to dist and hclust. NULL is equivalent to list(dist.method = "euclidean", hclust.method = "complete"). |
stat.pars |
NULL is equivalent to list(fontsize = 7, signif = 0.05, verbose = FALSE, use.color = TRUE, missing = 'missing', just = 'centre'); stat.pars$verbose can be TRUE (providing 5 statistics), FALSE (providing 2 statistics), or NA (nothing). The string missing is used in summaries where there are missing values; fontsize and just control the size and justification of the text summaries (see grid.text and gpar). The use.color = FALSE option provides an alternative summary of the strength of the correlation (see Green and Hickey (2006)). These settings are only used with scatter = "stats" in upper.pars and lower.pars. |
scatter.pars |
NULL is equivalent to list(pch = 1, size = unit(0.25, "char"), col = "black", frame.fill = NULL, border.col = "black"). |
bwplot.pars |
NULL, passed to bwplot for producing boxplots. |
stripplot.pars |
NULL is equivalent to list(pch = 1, size = unit(0.5, 'char'), col = 'black', jitter = FALSE). |
barcode.pars |
NULL is equivalent to list(nint = 0, ptsize = unit(0.25, "char"), ptpch = 1, bcspace = NULL, use.points = FALSE). |
mosaic.pars |
NULL. Currently, only shade and gp_labels are passed through to strucplot for producing mosaic tiles. |
axis.pars |
NULL is equivalent to list(n.ticks = 5, fontsize = 9). |
diag.pars |
NULL is equivalent to list(fontsize = 9, show.hist = TRUE, hist.color = 'black'). |
whatis |
default is FALSE; TRUE returns whatis(x). |
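As a rough illustration of what reorder = "cluster" implies, the following base R sketch coerces factors to their integer codes and clusters the columns with dist and hclust, using the defaults given for cluster.pars. The helper name column_order is hypothetical and not part of YaleToolkit, and clustering the transposed, unscaled data is an assumption about how gpairs proceeds, not a statement of its implementation.

```r
# Hypothetical helper (not part of YaleToolkit): derive a column ordering
# by hierarchical clustering of the columns.
# Assumption: columns are compared as rows of the transposed numeric matrix.
column_order <- function(x,
                         dist.method = "euclidean",
                         hclust.method = "complete") {
  m <- data.matrix(x)                        # factors become integer codes
  d <- dist(t(m), method = dist.method)      # distances between columns
  hclust(d, method = hclust.method)$order    # permutation of column indices
}

ord <- column_order(iris)
names(iris)[ord]   # column names in clustered order
```

Because Species enters the distance computation as the codes 1, 2 and 3, its position in the ordering depends on the arbitrary order of the factor levels, which is the caveat noted for reorder.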
In some cases, the graphics device cannot be resized after the plot has been produced because of the way rotation of barcodes is performed.
upper.pars and lower.pars are lists possibly containing named elements 'scatter', 'conditional' and 'mosaic'. Each element of the list is a string selecting one of the following options: 'scatter' = exactly one of ('points', 'lm', 'ci', 'symlm', 'loess', 'corrgram', 'stats', 'qqplot'); 'conditional' = exactly one of ('boxplot', 'stripplot', 'barcode'); 'mosaic' = 'mosaic' (the only option currently implemented).
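The permitted strings can be checked before calling gpairs(). The sketch below is one way to do that in base R; check_panel_pars is a hypothetical helper, not part of YaleToolkit, and the final gpairs() call is left commented out because it assumes the package is installed.

```r
# Option strings as listed in the text above.
scatter_opts     <- c("points", "lm", "ci", "symlm", "loess",
                      "corrgram", "stats", "qqplot")
conditional_opts <- c("boxplot", "stripplot", "barcode")
mosaic_opts      <- "mosaic"

# Hypothetical helper (not part of YaleToolkit): validate a panel list.
# Elements that are absent from the list are simply left out.
check_panel_pars <- function(pars) {
  allowed <- list(scatter = scatter_opts,
                  conditional = conditional_opts,
                  mosaic = mosaic_opts)
  unknown <- setdiff(names(pars), names(allowed))
  if (length(unknown) > 0) {
    stop("unknown element(s): ", paste(unknown, collapse = ", "))
  }
  for (nm in names(pars)) {
    # match.arg() errors if the value matches none of the permitted strings
    # (note that it also accepts unambiguous abbreviations)
    pars[[nm]] <- match.arg(pars[[nm]], allowed[[nm]])
  }
  pars
}

upper <- check_panel_pars(list(scatter = "stats", conditional = "barcode"))
lower <- check_panel_pars(list(scatter = "corrgram"))
# gpairs(iris, upper.pars = upper, lower.pars = lower)  # requires YaleToolkit
```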
corrgram() is just a wrapper to gpairs() producing a 'corrgram' in the style of Michael Friendly.
If whatis = TRUE, the value is a data frame containing variable names, types, numbers of missing values, numbers of distinct values, precisions, maxima and minima.
John W. Emerson, Walton Green
Emerson, John W. (1998) "Mosaic Displays in S-PLUS: A General Implementation and a Case Study." Statistical Computing and Graphics Newsletter, Vol. 9, No. 1.
Basford, K. E. and J. W. Tukey (1999) Graphical Analysis of Multiresponse Data: Illustrated with a Plant Breeding Trial.
Friendly, M. (2000) Visualizing Categorical Data. SAS Press.
Friendly, M. (2002) "Corrgrams: Exploratory displays for correlation matrices." American Statistician 56(4), 316–324.
Green, W. A. (2006) "Loosening the CLAMP: An exploratory graphical approach to the Climate Leaf Analysis Multivariate Program." Palaeontologia Electronica 9(2):9A.
pairs, splom, mosaicplot, strucplot, bwplot, barcode, stripplot.
allexamples <- FALSE

y <- data.frame(A = c(rep("red", 100), rep("blue", 100)),
                B = c(rnorm(100), round(rnorm(100, 5, 1), 1)),
                C = runif(200),
                D = c(rep("big", 150), rep("small", 50)),
                E = rnorm(200))
gpairs(y)

data(iris)
gpairs(iris)
if (allexamples) {
  gpairs(iris, upper.pars = list(scatter = 'stats'),
         scatter.pars = list(pch = substr(as.character(iris$Species), 1, 1),
                             col = as.numeric(iris$Species)),
         stat.pars = list(verbose = FALSE))
  gpairs(iris, lower.pars = list(scatter = 'corrgram'),
         upper.pars = list(conditional = 'boxplot', scatter = 'loess'),
         scatter.pars = list(pch = 20))
}

data(Leaves)
gpairs(Leaves[1:10], lower.pars = list(scatter = 'loess'))
if (allexamples) {
  gpairs(Leaves[1:10], upper.pars = list(scatter = 'stats'),
         lower.pars = list(scatter = 'corrgram'),
         stat.pars = list(verbose = FALSE), gap = 0)
  corrgram(Leaves[, -33])
}

runexample <- FALSE
if (runexample) {
  data(NewHavenResidential)
  gpairs(NewHavenResidential)
}