reldist {reldist}    R Documentation

Inference for Relative Distributions

Description

Estimate and graph relative distribution and density functions for continuous or discrete data.

Usage

reldist(y, yo=FALSE, ywgt=FALSE,yowgt=FALSE,
  show="none", decomp="locadd",
  location="median", scale="IQR",
  rpmult=FALSE, 
  z=FALSE, zo=FALSE,
  smooth = 0.35, 
  quiet = TRUE, 
  cdfplot=FALSE,
  ci=FALSE,
  bar="no",
  add=FALSE,
  graph=TRUE, type="l",
  xlab="Reference proportion",ylab="Relative Density",yaxs="r",
  yolabs=pretty(yo), yolabslabs=NULL, 
  ylabs=pretty(y), ylabslabs=NULL,
  yolabsloc=0.6, ylabsloc=1, 
  ylim=NULL, cex=0.8, lty=1,
  binn=100,
  aicc=seq(0.0001, 5, length=30),
  deciles=(0:10)/10,
  discrete=FALSE,
  method="gam",
  ...)

Arguments

y Sample from comparison distribution.
yo Sample from reference distribution.
discrete Do y and yo refer to a discrete distribution? If TRUE a discrete estimator is used instead of the default continuous one.
smooth Degree of smoothness required in the fit. Higher values lead to smoother curves; lower positive values lead to closer fits to the observed data. If it is not specified, the value that minimizes GCV is used. If a value less than zero is specified, the value is chosen to minimize a corrected AIC. If discrete=TRUE, it is the minimum number of values to pool in the reference distribution in the probability mass function estimate.
method Method used to estimate the relative density. The default ("gam") uses a local likelihood approach based on smoothed Poisson regression. The option "loclik" uses log-splines. The option "quick" uses the Anscombe transformation to stabilize variances. In versions prior to 1.3 the "quick" approach was used.
graph Graph the results on the current device.
bar Graph the deciles on the current device. Possible values of bar are "no" (no deciles plotted), "yes" (deciles plotted with the non-parametric fit), and "only" (deciles plotted without the non-parametric fit).
add Add the density to the current plot?
ylim Plotting limits for the vertical axis.
lty Line type to be used for the density.
xlab Horizontal label.
ylab Vertical label.
ylabs Locations for labels to be added to the right axis.
ylabslabs Labels indicating the original scale for the comparison distribution.
ylabsloc Distance of labels to right of axis (in lines).
yolabs Locations for labels to be added to the top axis.
yolabslabs Labels indicating the original scale for the reference distribution.
yolabsloc Distance of labels above axis (in lines).
yaxs Style of vertical axis.
cdfplot Calculate and plot the CDF rather than the density.
quiet Should the output be returned invisibly?
ci Plot (pointwise) 95% confidence intervals?
ywgt Weights on the comparison sample.
yowgt Weights on the reference sample.
z Covariate on the comparison sample to be used to adjust it to the reference distribution. Only used if decomp="covariate".
zo Covariate on the reference sample to be used in the adjustment to the reference distribution. Only used if decomp="covariate".
show Type of relative distribution to produce. Possible values are "none" (comparison to reference), "residual" (location-matched reference to reference), and "effect" (comparison to location-matched reference).
decomp Form of matching to the comparison sample. Possible values are "locmult" (multiplicatively scale the reference), "locadd" (additively shift the reference), "lsadd" (location/scale additive shift), and "covariate" (covariate-adjust the reference; requires z and zo to be specified).
location How to measure location. Possible values are "mean" and "median".
scale How to measure the scale. Possible values are "standev" (standard deviation) and "IQR" (interquartile range).
rpmult Only in calculation of polarization indices: multiplicatively scale the reference sample to the comparison sample before comparing the two distributions?
binn Number of bins used in the smoother.
deciles The percentiles used for the histogram bins. Typically deciles (i.e., 0.0, 0.1, 0.2,...,0.9, 1.0), but any set can be used (e.g., quintiles, terciles).
aicc Values of the smoothing parameter to search over in minimizing the corrected AIC. Only used if method="gam" and smooth is less than 0.
type Type of plot to use. See par().
cex Character expansion to use in plots. See par().
... Additional arguments to the plot functions. See par().
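
As a rough illustration of the additive location match described under decomp and location, the following base R sketch shifts a simulated reference sample so that its median equals the comparison median. The data and the names yo_sim, y_sim, shift, and yo_matched are hypothetical, and this is only a sketch of the idea, not the package's internal code.

```r
set.seed(1)
yo_sim <- rnorm(500, mean = 0,   sd = 1)    # simulated reference sample
y_sim  <- rnorm(500, mean = 0.5, sd = 1.5)  # simulated comparison sample

# Additive shift in the spirit of decomp="locadd" with location="median":
# move the reference so that its median matches the comparison median.
shift      <- median(y_sim) - median(yo_sim)
yo_matched <- yo_sim + shift

# The shift changes location only; the spread of the reference is unchanged.
c(median(yo_matched), median(y_sim))
c(IQR(yo_matched), IQR(yo_sim))
```

Comparing y_sim with yo_matched then isolates differences in shape, while comparing yo_matched with yo_sim isolates the location difference.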

Value

x Horizontal coordinates for the density (typically percentages).
y Density at x.
rp 95% confidence interval for the median relative polarization as lower bound, estimate, upper bound.
rpl 95% confidence interval for the lower relative polarization as lower bound, estimate, upper bound.
rpu 95% confidence interval for the upper relative polarization as lower bound, estimate, upper bound.
cdf The x coordinates for the CDF (typically percentages) and the y values (the CDF at x).
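
To give a feel for what the median relative polarization in rp measures, here is a minimal sketch that assumes the usual definition of the index as 4 times the mean absolute deviation of the median-matched relative data from 1/2, minus 1. The helper mrp_sketch is hypothetical and is not part of the package; it returns only a point estimate, without the confidence bounds that reldist reports.

```r
# Hypothetical helper: median relative polarization from relative data r,
# where r holds values in [0, 1] obtained after matching the medians.
mrp_sketch <- function(r) {
  stopifnot(all(r >= 0 & r <= 1))
  4 * mean(abs(r - 0.5)) - 1
}

mrp_uniform <- mrp_sketch((seq_len(1000) - 0.5) / 1000)  # evenly spread relative data
mrp_extreme <- mrp_sketch(c(0, 1, 0, 1))                 # all mass in the two tails
mrp_center  <- mrp_sketch(rep(0.5, 4))                   # all mass at the median
```

Under this definition the index is 0 when the relative data are uniform, approaches 1 as mass moves to the tails, and approaches -1 as mass concentrates at the median.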

Note

Most of the code is for the plotting and tinkering. The guts of the method are forming the relative data at the top. The rest is a standard fixed interval density estimation with a few bells and whistles.
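
The idea of forming the relative data can be sketched in a few lines of base R. The example below uses simulated samples and a plain decile histogram as a stand-in for the smoothed density estimate; the names yo_sim, y_sim, Fo, r, and rel_density are hypothetical and the code is not taken from the package.

```r
set.seed(42)
yo_sim <- rnorm(1000)                        # simulated reference sample
y_sim  <- rnorm(1000, mean = 0.3, sd = 1.2)  # simulated comparison sample

# Relative data: the position of each comparison value in the reference
# distribution, i.e. the empirical reference CDF evaluated at y.
Fo <- ecdf(yo_sim)
r  <- Fo(y_sim)

# Crude relative density on decile bins: proportion in each bin / bin width.
breaks      <- (0:10) / 10
counts      <- table(cut(r, breaks = breaks, include.lowest = TRUE))
rel_density <- as.numeric(counts) / length(r) / diff(breaks)
```

Values of rel_density above 1 mark reference deciles where the comparison sample is over-represented, and values below 1 mark deciles where it is under-represented.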

References

For more examples see the tech report by Handcock & Aldrich (2002) available at http://www.csss.washington.edu/Papers

Examples

#
# First load the data.
#

data(nls, package="reldist")

#
# A simple example comparing permanent wages of the original to the
# recent cohort in the NLS.  See H&M (1999) for details.

reldist(y=recent$chpermwage, yo=original$chpermwage)

#
# A more sophisticated version of the same.
#

reldist(y=recent$chpermwage, yo=original$chpermwage,
        yowgt=original$wgt, ywgt=recent$wgt,      
        bar="yes",                                  
        smooth=0.1,                              
        yolabs=seq(-1, 3, by=0.5),                 
        ylim=c(0, 3.0),cex=0.8,                   
        ylab="Relative Density",                 
        xlab="Proportion of the Original Cohort")

#
# A CDF version.
#

reldist(y=recent$chpermwage, yo=original$chpermwage,
    yowgt=original$wgt, ywgt=recent$wgt,      
    cdfplot=TRUE,                               
    smooth=0.4,                              
    yolabs=seq(-1,3,by=0.5),                 
    ylabs=seq(-1,3,by=0.5),                  
    cex=0.8,                                 
    ylab="proportion of the recent cohort",  
    xlab="proportion of the original cohort")

[Package reldist version 1.5-5.1 Index]