gx.hist {rgr} | R Documentation |
Plots a histogram for a data set. The user has options for defining the axis and main titles, the x-axis limits, arithmetic or logarithmic x-axis scaling, the number of bins in which the data are displayed, and the colour of the infill.
gx.hist(xx, xlab = deparse(substitute(xx)), ylab = "Number of Observations", log = FALSE, xlim = NULL, main = "", nclass = "Scott", colr = 8, ifnright = TRUE)
xx |
name of the variable to be plotted |
xlab |
a title for the x-axis. It is often desirable to replace the default x-axis title of the input variable name text string with a more informative title, e.g., xlab = "Cu (mg/kg) in <2 mm O-horizon soil" . |
ylab |
a default y-axis title of "Number of Observations" is provided; this may be changed, e.g., ylab = "Counts" . |
log |
if it is required to display the data with logarithmic (x-axis) scaling, set log = TRUE . |
xlim |
default limits of the x-axis are determined in the function for use in other panel plots of function shape . However, when used stand-alone the limits may be user-defined by setting xlim ; see Note below. |
main |
when used stand-alone a title may be added optionally above the plot by setting main , e.g., main = "Kola Project, 1995" . |
nclass |
the default procedure for preparing the histogram is to use the Scott (1979) rule, which usually provides an informative histogram. Other optional rules are nclass = "sturges" or nclass = "fd" ; the latter stands for Freedman-Diaconis (1981), a rule that is resistant to the presence of outliers in the data. |
colr |
by default the histogram is infilled in grey, colr = 8 . If no infill is required, set colr = 0 . See function display.lty for the range of available colours. |
ifnright |
controls where the sample size is plotted in the histogram display; by default this is in the upper right corner of the plot. If the data distribution is such that the upper left corner would be preferable, set ifnright = FALSE . |
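The three bin-selection rules named for nclass can be compared using base R alone. The following is a minimal sketch, not the internals of gx.hist: it uses the nclass.Sturges, nclass.scott and nclass.FD functions from grDevices on synthetic, right-skewed data (hypothetical values, not kola.o), and applies them to both the raw and the log10-transformed values on the assumption that logarithmic scaling changes the distribution the rule sees.

```r
# Synthetic, right-skewed data standing in for a geochemical variable
# (hypothetical values, not the kola.o data).
set.seed(1)
x <- rlnorm(500, meanlog = 3, sdlog = 1)

# Number of bins suggested by each rule, using base R (grDevices).
bins_raw <- c(sturges = nclass.Sturges(x),
              scott   = nclass.scott(x),
              fd      = nclass.FD(x))

# The same rules applied to log10-transformed values; an assumption made
# here to mimic the effect of logarithmic x-axis scaling.
bins_log <- c(sturges = nclass.Sturges(log10(x)),
              scott   = nclass.scott(log10(x)),
              fd      = nclass.FD(log10(x)))

print(rbind(raw = bins_raw, log10 = bins_log))
```

The Sturges count depends only on the number of observations, so it is identical in both rows; the Scott and Freedman-Diaconis counts depend on the spread of the values and generally differ between the two scalings.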
xlim |
A two-element vector containing the actual minimum [1] and maximum [2] x-axis limits used in the histogram display is returned. These are used in function shape to ensure all panels have the same x-axis limits.
Any less than detection limit values represented by negative values, or zeros or other numeric codes representing blanks in the data vector, must be removed prior to executing this function, see ltdl.fix.df.
Any NAs in the data vector are removed prior to displaying the plots.
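As an illustration of the kind of clean-up meant here, the sketch below uses base R only and does not call ltdl.fix.df. The coding conventions are assumptions made for the example: a negative value codes a less than detection limit result whose magnitude is the detection limit, and zero codes a blank. Replacing the negative codes with half their magnitude is one common treatment; to discard them instead, set them to NA.

```r
# Hypothetical raw vector: -5 codes a value below a detection limit of 5,
# 0 codes a blank, and NA is a genuinely missing value.
raw <- c(12, -5, 30, 0, NA, 48, 7)

# Treat zeros as missing (assumed convention for this sketch).
clean <- raw
clean[!is.na(clean) & clean == 0] <- NA

# Replace negative codes with half their magnitude (assumed treatment of
# less than detection limit values).
neg <- !is.na(clean) & clean < 0
clean[neg] <- abs(clean[neg]) / 2

# Drop the NAs so the number of values to be plotted is explicit.
clean <- clean[!is.na(clean)]
print(clean)
```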
If the default selection for xlim is inappropriate it can be set, e.g., xlim = c(0, 200) or c(2, 200). If the defined limits lie within the observed data range a truncated plot will be displayed. If this occurs the number of data points omitted is displayed below the total number of observations.
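The count of omitted points can be checked before plotting. This is a minimal base R sketch on hypothetical values; it assumes that points exactly on a limit are kept, which may not match how gx.hist treats boundary values.

```r
# Hypothetical data and user-defined limits.
x    <- c(1, 3, 10, 50, 150, 250, 400)
xlim <- c(2, 200)

n_total   <- length(x)
# Points falling strictly outside the limits would not be displayed.
n_omitted <- sum(x < xlim[1] | x > xlim[2])
n_shown   <- n_total - n_omitted

cat("total:", n_total, " omitted:", n_omitted, " shown:", n_shown, "\n")
```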
If it is desired to prepare a display of data falling within a defined part of the actual data range, then either a data subset can be prepared externally using the appropriate R syntax, or xx may be defined in the function call as, for example, Cu[Cu < some.value] which would remove the influence of one or more outliers having values greater than some.value. In this case the number of data values displayed will be the number that are < some.value.
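The subsetting expression can be tried on its own to see how many values would remain. The vector and threshold below are hypothetical and chosen only to show one outlier being excluded.

```r
# Hypothetical Cu values with a single high outlier.
Cu <- c(8, 12, 15, 9, 22, 940, 11)
some.value <- 100

# Keep only the values below the threshold.
Cu_sub <- Cu[Cu < some.value]

print(length(Cu_sub))  # number of values that would be displayed
print(range(Cu_sub))   # range after the outlier is excluded
```

The resulting vector could then be passed as the first argument in place of Cu.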
Robert G. Garrett
Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition, Springer - see p. 119 for a description of histogram bin selection computations.
display.lty, ltdl.fix.df, remove.na
## Make test data available
data(kola.o)
attach(kola.o)

## Generates an initial display to have a first look at the data and
## decide how best to proceed
gx.hist(Cu)

## Provides a more appropriate initial display
gx.hist(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil", log = TRUE)

## Causes the Freedman-Diaconis rule to be used to select the number
## of histogram bins
shape(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil", log = TRUE,
      nclass = "fd")

## Detach test data
detach(kola.o)