missing.compositions {compositions} | R Documentation |
This help section discusses some general strategies for working with missing values in a compositional, relative or vectorial context and shows how the various types of missings are represented and treated in the "compositions" package, according to each strategy/class of analysis of compositions or amounts.
is.BDL(x,mc=attr(x,"missingClassifier"))
is.SZ(x,mc=attr(x,"missingClassifier"))
is.MAR(x,mc=attr(x,"missingClassifier"))
is.MNAR(x,mc=attr(x,"missingClassifier"))
is.NMV(x,mc=attr(x,"missingClassifier"))
is.WMNAR(x,mc=attr(x,"missingClassifier"))
is.WZERO(x,mc=attr(x,"missingClassifier"))
has.missings(x,...)
## Default S3 method:
has.missings(x,mc=attr(x,"missingClassifier"),...)
## S3 method for class 'rmult':
has.missings(x,mc=attr(x,"missingClassifier"),...)
SZvalue
MARvalue
MNARvalue
BDLvalue
x | A vector, matrix, acomp, rcomp, aplus or rplus object for which we would like to know the missing status of the entries |
mc | A missing classifier function, giving for each value one of the values BDL (Below Detection Limit), SZ (Structural Zero), MAR (Missing At Random), MNAR (Missing Not At Random), NMV (Not Missing Value). These functions are introduced to allow a different coding of the missings. |
... | further generic arguments |
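In the context of compositional data we have to consider at least four types of missing and zero values: values missing at random (MAR), values missing not at random (MNAR), values below the detection limit (BDL), and structural zeros (SZ).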
Each function of type is.XXX checks the status of its argument according to
the XXX type of value from those above.
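As a minimal sketch of how the is.XXX family might be used, the following builds a small vector from the coding constants listed in the usage section (SZvalue, MARvalue, MNARvalue, BDLvalue) and queries each type. The vector x and the observed value 0.2 are made up for illustration, and the sketch assumes that the package's default classification applies when x carries no "missingClassifier" attribute.

```r
library(compositions)

# Hypothetical example vector: one observed value plus one entry of each
# missing type, built from the coding constants exported by the package.
x <- c(0.2, BDLvalue, SZvalue, MARvalue, MNARvalue)

# Each is.XXX call returns a logical vector with the same shape as x;
# binding them gives one column per type of value.
status <- cbind(
  NMV  = is.NMV(x),
  BDL  = is.BDL(x),
  SZ   = is.SZ(x),
  MAR  = is.MAR(x),
  MNAR = is.MNAR(x)
)
status

# Reports whether x contains missings at all.
has.missings(x)
```

Building the vector from the constants, rather than from literal codes, keeps the sketch independent of how each type is actually encoded.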
Different steps of a statistical analysis and different understanding
of the data will lead to different approaches with respect to missings and zeros.
In the first exploratory step, the problem is to keep the
methods working and to make the missing structure visible in the
analysis. The user should need as little extra thinking about
missings as possible, and nevertheless get a true picture of the data. To
achieve this we tried to make the basic layer of computational
functions work consistently with missings and propagate the
missingness character seamlessly. However, some of this only works with
acomp, where closed-form missing theories are available
(e.g. proportional imputation [e.g. Martín-Fernández, J.A. et
al. (2003)] or estimation with missings
[Boogaart & Tolosana 2006]). The main graphics should hint at
missings and try to add them to the plot by marking the remaining
information on the axes. However, one should again be clear that this is
only reasonably justified in the relative geometries. Unfortunately, the
missing subsystem is currently not fully compatible with the
robustness subsystem.
As a second step, the analyst might want to analyse the
missing structure in itself. These functions provide preliminary
support for this, since their result can be treated as a boolean data set in
any other R function. Additionally, missingSummary
is a convenience function that gives a fast overview of
the different types of missings in the dataset.
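A minimal sketch of this second step, assuming the sa.missings data set used in the examples of this page: the logical results of the is.XXX functions are passed to ordinary R functions, and missingSummary is called for the overview. The object names dat, mar and bdl are chosen here for illustration only.

```r
library(compositions)
data(SimulatedAmounts)       # provides sa.missings, as in the examples of this page
dat <- acomp(sa.missings)

# Logical matrices with the same shape as dat, one per type of missing.
mar <- is.MAR(dat)
bdl <- is.BDL(dat)

# Treat them as boolean data: count the missings of each type per component.
colSums(mar)
colSums(bdl)

# Convenience overview of all types of missings in the data set.
missingSummary(dat)
```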
In the later inferential steps, the problem is to get results that are valid
with respect to a model. One needs to be able to look through the data
at the true processes behind them, without being distracted by artifacts
stemming from missing values. For the moment, how analyses react to the
presence of missings depends on the value of the na.action option. If this
is set to na.omit (the default), then cases with missing values in any
variable are completely ignored by the analysis. If this is set to
na.pass, then some of the following applies.
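The sketch below uses only base R and a made-up data frame d to illustrate how the global na.action option changes which cases reach a model. It shows the general option mechanism, not how any particular function of this package reacts to it.

```r
# Hypothetical toy data with one incomplete case.
d <- data.frame(y = c(1, 2, NA, 4), x = c(1, 2, 3, 4))

# Remember the current setting so it can be restored afterwards.
old <- options(na.action = "na.omit")

# With na.omit, the incomplete case is dropped from the model frame.
n_omit <- nrow(model.frame(y ~ x, data = d))

# With na.pass, all cases are kept, including the one with the NA.
options(na.action = "na.pass")
n_pass <- nrow(model.frame(y ~ x, data = d))

options(old)                 # restore the previous setting

c(na.omit = n_omit, na.pass = n_pass)
```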
The policy on how a missing value is to be introduced into the
analysis depends on the purpose of the analysis, the type of analysis
and the model behind it. In this respect, this package, and
probably the whole science of compositional data analysis, is still
very preliminary.
The four philosophies work with different approaches to these problems:
rplus
zeroreplace. A structural zero can either
be seen as a true zero or as a MAR value.
rcomp and acomp
aplus
More information on how missings are actually processed can be found in the help files of each individual function.
A logical vector or matrix with the same shape as x stating whether or not the value is of the given type of missing.
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana Delgado, Matevz Bren
Boogaart, K.G. v.d., R. Tolosana-Delgado, M. Bren (2006) Concepts for handling of zeros and missing values in compositional data, in E. Pirard (ed.) (2006) Proceedings of the IAMG'2006 Annual Conference on "Quantitative Geology from multiple sources", September 2006, Liege, Belgium, S07-01, 4 pages, http://www.math-inf.uni-greifswald.de/~boogaart/Publications/iamg06_s07_01.pdf, ISBN: 978-2-9600644-0-7
Aitchison, J. (1986) The Statistical Analysis of Compositional
Data. Monographs on Statistics and Applied Probability. Chapman &
Hall Ltd., London (UK). 416p.
Aitchison, J., C. Barceló-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn
(2002) A concise guide to the algebraic geometric structure of the
simplex, the sample space for compositional data analysis, Terra
Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Billheimer, D., P. Guttorp, and W.F. Fagan (2001) Statistical interpretation of species composition,
Journal of the American Statistical Association, 96 (456), 1205-1214
Martín-Fernández, J.A., C. Barceló-Vidal, and V. Pawlowsky-Glahn (2003)
Dealing With Zeros and Missing Values in Compositional
Data Sets Using Nonparametric Imputation. Mathematical Geology, 35(3),
253-278
compositions-package, missingsInCompositions,
robustnessInCompositions, outliersInCompositions,
zeroreplace, rmult, ilr,
mean.acomp, acomp, plot.acomp
require(compositions)      # load library
data(SimulatedAmounts)     # load the simulated data sets (including sa.missings)
dat <- acomp(sa.missings)
dat
var(dat)
mean(dat)
plot(dat)
boxplot(dat)
barplot(dat)