rfm.test {MarkedPointProcess}                R Documentation
rfm.test

Description

rfm.test performs Monte Carlo (MC) tests that enable the user to decide
whether a marked point process may be considered as a random field
model, i.e., as a model in which the marks are independent of the locations.

Usage

rfm.test(coord=NULL, data, normalize=TRUE, MCrepetitions=99,
         MCmodel=list(model="exponential",
                      param=c(mean=0, variance=NA, nugget=0, scale=NA)),
         method=NULL, bin=c(-1, seq(0, 1.2, l=15)), MCregister=1,
         n.hypo=1000, pvalue=c(10, 5, 1), tests="l1 & w3",
         tests.lp=NULL, tests.weight=NULL, Barnard=FALSE,
         PrintLevel=RFparameters()$Print, ...)
Arguments

coord: matrix with 2 columns; the coordinates of the points.

data: vector or matrix; the univariate marks that correspond to the
locations. If data is a matrix, each column is interpreted as an
independent observation given the locations coord; see Details for
further possibilities.

normalize: logical; if TRUE the data are transformed to standard normal
data before being analysed; if data is a matrix, this is done for each
column separately.

MCrepetitions: number of simulations that are compared with the data;
usually 19 or 99.

MCmodel: variogram model to be fitted; see fitvario.

method: method used to simulate the Gaussian random fields; see GaussRF.

bin: sequence of increasing bin margins for calculating the functions
E, V, etc., in analogy to the binning for variograms; see Details.

MCregister: 0:9; the register in which intermediate results are stored
when the random fields are generated for the MC test.

n.hypo: number of repeated MC tests used to determine the pvalue
positions under the null hypothesis. If the variogram did not have to be
estimated, this position would be (1 - pvalue/100) * (MCrepetitions + 1);
see Details.

pvalue: test levels in percent. Only values below 50 are accepted;
otherwise 100 - pvalue is used as the pvalue (to be consistent with the
former definition).

tests: vector of characters; see Details.

tests.lp: vector of characters; see Details.

tests.weight: vector of characters; see Details.

Barnard: logical; if TRUE, the test by Barnard (1963) on the
independence of the marks is performed.

PrintLevel: if zero, no messages are printed; the higher the value, the
more tracing information is given.

...: any further parameter for fitvario can be passed, except for
x, y, z, T, data, model, param, mle.methods and cross.methods.
Details

data: there are three possibilities to pass the data:

  * data is a vector or matrix and coord contains the coordinates, as
    described above;
  * data=list(coord=, data=) and coord=NULL;
  * data=list(list(coord=, data=), ..., list(coord=, data=)); several
    data sets are analysed, and all the results are summed up and
    returned in a single matrix E (or VAR or SD).
bin: as for the variogram in geostatistics, the characteristics of the
marks of a marked point process depend on a distance (vector) r.
Instead of returning a cloud of values, binned values are calculated in
the same way the binned variogram is obtained; bin gives the margins of
the bins (left-open, right-closed) as an increasing sequence. The first
bin must include zero, i.e., bin=c(-1, 0, ...).
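For illustration, a minimal base-R sketch (not package code) of how distances fall into the left-open, right-closed bins; the variable names are hypothetical:

```r
## Toy sketch: classify distances into the bins (a, b] defined by the
## default margins bin = c(-1, seq(0, 1.2, l=15)).
bin   <- c(-1, seq(0, 1.2, length.out = 15))  # first bin (-1, 0] contains zero
dists <- c(0, 0.05, 0.3, 0.85, 1.19)          # toy distances
## findInterval with left.open = TRUE respects the (a, b] convention
idx   <- findInterval(dists, bin, left.open = TRUE)
## a zero distance falls into the first bin (-1, 0]
```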
n.hypo: for a correct appreciation of the relative position of the
statistic for the data set with respect to the simulations, the
reference values for the estimated pvalue levels must be determined:
n.hypo realisations of the Gaussian random field are simulated; for each
realisation, a complete MC test with MCrepetitions simulations is
performed (estimation of the parameters of the random field, and test
statistics for the MCrepetitions realisations); from these results the
pvalue positions are determined, which have values around
(1 - pvalue/100) * (MCrepetitions + 1); these positions are returned in
null.sl as reference values for the estimated pvalue levels.
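As a quick sanity check, the nominal positions for the default settings can be computed directly; this is only the idealised value the text refers to (parameters assumed known), not package code:

```r
## Nominal p-value positions if the variogram parameters were known:
## position = (1 - pvalue/100) * (MCrepetitions + 1)
MCrepetitions <- 99
pvalue        <- c(10, 5, 1)   # percent, as in the argument default
pos           <- (1 - pvalue / 100) * (MCrepetitions + 1)
```

With the defaults this gives positions 90, 95 and 99 among the 100 ranked statistics.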
tests, tests.lp, tests.weight: if tests="all", then the results for all
test variants are returned, independently of the values of tests.lp and
tests.weight. Otherwise, the test variants given by tests and by all
combinations of tests.lp and tests.weight are used.
Possible values of tests.lp are
  “max” (maximum norm),
  “l2” (l2 norm),
  “l1” (l1 norm),
  “robust” (the distance is squared for small distances only),
  “anti” (the distance is squared for large distances only).
Possible values of tests.weight are
  “const” (constant weight),
  “1/sum#” (‘sum#’ is the cumulative sum of the number of points in all
  bins to the left, and the considered bin itself),
  “sqrt(1/sum#)” (square root of ‘1/sum#’),
  “1/sumsqrt#” (similar to ‘1/sum#’, but the square roots of the numbers
  of points are summed up),
  “#” (number of points within a bin),
  “sqrt#” (square root of the number of points),
  “1/sd” (sd = estimated standard deviation within a bin),
or, equivalently,
  “w1”, “w2”, “w3”, “w4”, “w5”, “w6”, “w7”.
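Most of these weights can be computed from the bin counts alone. The following base-R sketch assumes the correspondence w1..w6 in the order listed above (“1/sd” additionally needs the marks within each bin and is omitted):

```r
## Toy bin counts n_k (number of points per bin); made-up values
n  <- c(4, 9, 16, 25)
w1 <- rep(1, length(n))        # "const"
w2 <- 1 / cumsum(n)            # "1/sum#": cumulative count up to and including the bin
w3 <- sqrt(w2)                 # "sqrt(1/sum#)"
w4 <- 1 / cumsum(sqrt(n))      # "1/sumsqrt#": square roots of counts are summed
w5 <- n                        # "#"
w6 <- sqrt(n)                  # "sqrt#"
```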
Possible values of tests are
“max & const”, “l2 & const”, “l1 & const”,
“robust & const”, “anti & const”,
“max & 1/sum#”, “l2 & 1/sum#”, “l1 & 1/sum#”,
“robust & 1/sum#”, “anti & 1/sum#”,
“max & sqrt(1/sum#)”, “l2 & sqrt(1/sum#)”,
“l1 & sqrt(1/sum#)”, “robust & sqrt(1/sum#)”,
“anti & sqrt(1/sum#)”, “max & 1/sumsqrt#”,
“l2 & 1/sumsqrt#”, “l1 & 1/sumsqrt#”,
“robust & 1/sumsqrt#”, “anti & 1/sumsqrt#”,
“max & #”, “l2 & #”, “l1 & #”,
“robust & #”, “anti & #”, “max & sqrt#”,
“l2 & sqrt#”, “l1 & sqrt#”, “robust & sqrt#”,
“anti & sqrt#”, “max & 1/sd”, “l2 & 1/sd”,
“l1 & 1/sd”, “robust & 1/sd”, “anti & 1/sd”,
or, equivalently,
“max & w1”, “l2 & w1”, “l1 & w1”, “robust & w1”, “anti & w1”, “max & w2”, “l2 & w2”, “l1 & w2”, “robust & w2”, “anti & w2”, “max & w3”, “l2 & w3”, “l1 & w3”, “robust & w3”, “anti & w3”, “max & w4”, “l2 & w4”, “l1 & w4”, “robust & w4”, “anti & w4”, “max & w5”, “l2 & w5”, “l1 & w5”, “robust & w5”, “anti & w5”, “max & w6”, “l2 & w6”, “l1 & w6”, “robust & w6”, “anti & w6”, “max & w7”, “l2 & w7”, “l1 & w7”, “robust & w7”, “anti & w7”
and, in addition, “range” (the difference between the largest positive and the largest negative deviation over all bins), “no.bin.sq” (the l2 norm where the bins are chosen such that each bin contains only one point), and “no.bin.abs” (the corresponding l1 norm where each bin contains only one point).
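To make the naming scheme concrete, here is an illustrative sketch (not the package internals) of a weighted l1 statistic as in the default “l1 & w3”, comparing the binned E function of the data with the mean over the simulations; all numbers are made up:

```r
E.data <- c(1.0, 0.8, 0.6, 0.5)                 # toy binned E function of the data
E.sims <- rbind(c(0.9, 0.9, 0.7, 0.50),         # toy E functions of two
                c(1.1, 0.7, 0.5, 0.60))         # MC simulations
w3     <- sqrt(1 / cumsum(c(4, 9, 16, 25)))     # "sqrt(1/sum#)" weights from toy counts
l1.w3  <- function(e) sum(w3 * abs(e - colMeans(E.sims)))
## the MC test position is the rank of the data's statistic among
## the statistics for the data and all simulations
stats  <- c(data = l1.w3(E.data), apply(E.sims, 1, l1.w3))
pos    <- rank(stats)[["data"]]
```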
Value

Let n be the number of MC tests chosen by the user. Then rfm.test
returns a list of the following elements:

E: (E function) matrix of n columns. The number of rows depends on the
input parameters. If only one realisation of the data is given, the
absolute test position of the MC test is returned, i.e., a value between
1 and MCrepetitions + 1, inclusively. If several realisations of the
data (and of coord) are given, the number of rows equals
MCrepetitions + 1, and the kth row gives the number of test statistics
with position k. The first situation is the standard one for the user;
the second arises when rfm.test is called recursively to calculate the
intermediate result null.hypo, see below.
VAR: (V function) matrix of n columns; see E above.

SD: (the square root of the V function) matrix of n columns; see E above.
reject.null: list of logical matrices that indicate whether the null
hypothesis for E, VAR or SD should be rejected at the given levels,
i.e., whether the positions of the test statistics for E, VAR or SD are
at least as large as the estimated reference values given by null.sl.
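The rejection rule itself is simple to state; a toy example with made-up position and reference values:

```r
## Toy MC test position of the data among 1..(MCrepetitions + 1) = 1..100
position <- 97
## Made-up reference positions, as they might appear in null.sl
null.sl  <- c("10%" = 91, "5%" = 96, "1%" = 100)
## reject when the data's position is at least as large as the reference
reject   <- position >= null.sl
```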
est: list of variogram models according to MCmodel, estimated from the data.

normalize: the input parameter normalize.

MCrepetitions: the input parameter MCrepetitions.

MCmodel: the input parameter MCmodel.
null.hypo: stores intermediate results that are usually not of interest
to the user. n.hypo simulations have been performed under the null
hypothesis to determine the pvalue test positions. (This explicit
determination is necessary, since the parameters of the variogram have
to be estimated within the null hypothesis.) For each of these n.hypo
simulations, rfm.test is run, and null.hypo collects the results. Note
that here all test variants are considered.
null.sl: list of matrices giving the reference values for the estimated
pvalue levels. The values are around
(1 - pvalue/100) * (MCrepetitions + 1), but can range between 1 and
MCrepetitions + 2. If a value of MCrepetitions + 2 occurs,
MCrepetitions and/or n.hypo have usually been chosen too small.
bin: the binning used to calculate E, VAR and SD.
Note

In comparison to version 0.1 of MarkedPointProcess and the paper by
Schlather et al. (2004), the reported positions of the test statistics
for E, VAR, SD and null.sl are all increased by 1, now ranging from
1 to 100 instead of from 0 to 99 for the standard settings.
Author(s)

Martin Schlather, martin.schlather@math.uni-goettingen.de, http://www.stochastik.math.uni-goettingen.de/institute
References

Barnard, G. (1963) Discussion paper to M.S. Bartlett on “The spectral analysis of point processes”. J. R. Statist. Soc. Ser. B, 25, 294.

Besag, J. and Diggle, P. (1977) Simple Monte Carlo tests for spatial pattern. J. R. Statist. Soc. Ser. C, 26, 327-333.

Schlather, M., Ribeiro, P. and Diggle, P. (2004) Detecting dependence between marks and locations of marked point processes. J. R. Statist. Soc. Ser. B, 66, 79-83.
See Also

mpp.characteristics, simulateMPP

Examples

data(BITOEK)
d <- steigerwald
plotWithCircles(cbind(d$coord, d$diam), factor=2)
mpp.characteristics(x=d$coord, data=d$diam, bin=c(-1, seq(0, 50, 2)),
                    show=interactive())

## testing for E=const, V=const or SD=const (this takes several minutes!)
res <- rfm.test(d$coord, d$diam, MCrep=if (interactive()) 99 else 9,
                n.hypo=if (interactive()) 100 else 2)

## test statistics for the data
res$E
res$VAR

## reference values for the estimated 10%, 5% and 1% levels
res$null.sl

## should E=const, V=const or SD=const be rejected at the given levels?
res$reject.null