outlierplot {compositions} | R Documentation |
A collection of plots emphasizing different aspects of possible outliers.
outlierplot(X,...)
## S3 method for class 'acomp':
outlierplot(X,colcode=colorsForOutliers1,pchcode=pchForOutliers1,
  type=c("scatter","biplot","dendrogram","ecdf","portion","nout"),
  legend.position,pch=19,...,clusterMethod="ward",
  myCls=classifier(X,alpha=alpha,type=class.type,corrected=corrected),
  classifier=OutlierClassifier1,
  alpha=0.05,
  class.type="best",
  Legend,pow=1,
  main=paste(deparse(substitute(X))),
  corrected=TRUE,robust=TRUE,princomp.robust=FALSE,
  mahRange=exp(c(-5,5))^pow,
  flagColor="red",
  meanColor="blue",
  grayColor="gray40",
  goodColor="green",
  mahalanobisLabel="Mahalanobis Distance"
  )
X |
The dataset as an acomp object |
colcode |
A color palette for the factor given by myCls, or a function to create it from the factor. Use colorsForOutliers2 if class.type="all" is used. |
pchcode |
A function to create a plot character palette for the factor
returned by the myCls call |
type |
The type of plot to be produced. See details for more precise definitions. |
legend.position |
The location of the legend. Must be given for a classical legend to be drawn. |
pch |
The default plotting character |
... |
Further arguments to the used plotting function |
clusterMethod |
The clustering method for hclust-based outlier grouping. |
myCls |
A factor representing the groups of outliers |
classifier |
The routine that heuristically creates a factor representing the groups of outliers. It is only used in the default argument to myCls. |
alpha |
The significance level to be used for the outlier classification tests |
class.type |
The type of classification that should be generated
by classifier |
Legend |
The content will be substituted and stored as the list entry legend in the result of the function. It can then be evaluated to create a separate legend on another device (e.g. for publications). |
pow |
The power of Mahalanobis distances to be used. |
main |
The title of the graphic |
corrected |
The literature typically proposes comparing the Mahalanobis distances with the distribution of a single random Mahalanobis distance. However, since the whole dataset is always tested, this needs to be corrected for (dependent) multiple testing, which means comparing against the distribution of the maximum Mahalanobis distance. This argument switches to this second behavior, giving fewer outliers. |
robust |
A robustness description as defined in robustnessInCompositions |
princomp.robust |
Either a logical determining whether or not the principal component analysis should be done robustly, or a principal component object for the dataset. |
mahRange |
The range of Mahalanobis distances displayed. This is fixed to make views comparable among datasets. However, if the preset default is not wide enough, a warning is issued and a red mark is drawn in the plot. |
flagColor |
The color to draw critical situations. |
meanColor |
The color to draw typical curves. |
goodColor |
The color to draw confidence bounds. |
grayColor |
The color to draw less important things. |
mahalanobisLabel |
The axis label to be used for axes displaying Mahalanobis distances. |
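The effect described for corrected can be illustrated outside the package. The following is a minimal sketch in base R, not the package's implementation: it assumes classical (non-robust) estimates, a chi-square approximation for squared Mahalanobis distances, and independent distances (the package text speaks of dependent testing, so this is a simplification). All variable names are hypothetical.

```r
set.seed(1)
n <- 100      # number of observations
p <- 2        # dimension of the (hypothetical) coordinate space
alpha <- 0.05

# Simulated regular data without any outliers
x <- matrix(rnorm(n * p), ncol = p)

# mahalanobis() returns squared distances
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# Cutoff for one random distance (the uncorrected comparison)
cut_single <- qchisq(1 - alpha, df = p)

# Cutoff for the maximum of n distances, assuming independence:
# P(max <= c) = F(c)^n = 1 - alpha
cut_max <- qchisq((1 - alpha)^(1 / n), df = p)

n_single <- sum(d2 > cut_single)
n_max <- sum(d2 > cut_max)
print(c(uncorrected = n_single, corrected = n_max))
```

Because the cutoff for the maximum is always larger than the cutoff for a single distance when n > 1, the corrected comparison can never flag more observations than the uncorrected one.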
See outliersInCompositions for a comprehensive introduction to the outlier treatment in compositions.
type="scatter"
type="biplot"
A coloredBiplot is used rather than the usual one.
type="dendrogram"
type="ecdf"
[...] meanColor. The alpha-quantile, i.e. a lower prediction bound, for the cdf is given in goodColor. A line in grayColor shows the minimum portion of observations above some limit that must be outliers, based on the portion of observations that would have to move down for the empirical distribution function to get above its lower prediction limit under the assumption of normality. [...] type="portion".
type="portion"
[...] In meanColor we see a curve of an estimated number of outliers above some limit, generated by estimating the portion of outliers with a Mahalanobis distance over the given limit by max(0,1-ecdf/cdf). The minimum number of outliers is computed by replacing cdf by its lower confidence limit and is displayed in goodColor. The Mahalanobis distances of the individual data points are added as a stacked stripchart, such that the influence of individual observations can be seen. [...] $mahalanobis) with a cutoff inferred from this graphic.
type="nout"
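The quantities described for type="ecdf" and type="portion" can be sketched in base R. This is only an illustration under stated assumptions, not the package's computation: the center and covariance of the regular population are taken as known, squared Mahalanobis distances are treated as chi-square distributed, and the lower bound of the cdf is approximated pointwise by a binomial quantile. The data and all variable names are hypothetical.

```r
set.seed(3)
p <- 2          # dimension of the (hypothetical) coordinate space
alpha <- 0.05

# Simulated data: 180 regular points plus 20 shifted points acting as outliers
regular <- matrix(rnorm(180 * p), ncol = p)
shifted <- matrix(rnorm(20 * p, mean = 6), ncol = p)
x <- rbind(regular, shifted)
n <- nrow(x)

# Assumption: center (0) and covariance (identity) of the regular data are known
d <- sqrt(mahalanobis(x, center = rep(0, p), cov = diag(p)))

# Limits at which the curves are evaluated
limits <- seq(0.5, 4, by = 0.5)

emp      <- ecdf(d)(limits)                 # empirical cdf of the distances
expected <- pchisq(limits^2, df = p)        # expected cdf under normality
# Assumption: pointwise alpha-quantile of the ecdf via the binomial distribution
lower    <- qbinom(alpha, size = n, prob = expected) / n

# Estimated portion of outliers above each limit: max(0, 1 - ecdf/cdf)
est_out <- n * pmax(0, 1 - emp / expected)
# Minimum number: the cdf is replaced by its lower bound
min_out <- n * pmax(0, 1 - emp / lower)

result <- data.frame(limit = limits, ecdf = emp, cdf = expected,
                     lower = lower, est_out = est_out, min_out = min_out)
print(round(result, 3))
```

The columns ecdf, cdf and lower correspond to the three curves described for type="ecdf"; est_out and min_out correspond to the two curves described for type="portion". They can be drawn with plot() and lines() if a picture is wanted.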
A list representing the criteria computed to create the plots. The content of the list depends on the plotting type selected.
The package robustbase is required for using the robust estimations.
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
OutlierClassifier1, ClusterFinder1
data(SimulatedAmounts)
outlierplot(acomp(sa.outliers5))

## Not run: 
datas <- list(data1=sa.outliers1,data2=sa.outliers2,data3=sa.outliers3,
              data4=sa.outliers4,data5=sa.outliers5,data6=sa.outliers6)

# One scatter plot per dataset, classified by outlier grade
opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
tmp<-mapply(function(x,y) {
  outlierplot(x,type="scatter",class.type="grade");
  title(y)
},datas,names(datas))

# Colors from the "best" classification, plot characters from the "all" classification
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
tmp<-mapply(function(x,y) {
  myCls2 <- OutlierClassifier1(x,alpha=0.05,type="all",corrected=TRUE)
  outlierplot(x,type="scatter",classifier=OutlierClassifier1,class.type="best",
              Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),
              pch=as.numeric(myCls2));
  legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))
  title(y)
},datas,names(datas))

# Too slow
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in seq_along(datas) )
  outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])

par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in seq_along(datas) )
  outlierplot(datas[[i]],type="portion",main=names(datas)[i])

par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in seq_along(datas) )
  outlierplot(datas[[i]],type="nout",main=names(datas)[i])

par(opar)
## End(Not run)