fitvario {RandomFields} | R Documentation |
This function estimates arbitrary parameters of a random field specification.
fitvario(x, y=NULL, z=NULL, T=NULL, data, model, param,
         lower=NULL, upper=NULL, sill=NA, ...)

fitvario.default(x, y=NULL, z=NULL, T=NULL, data, model, param,
         lower=NULL, upper=NULL, sill=NA,
         trend,
         use.naturalscaling=TRUE,
         PrintLevel=RFparameters()$Print, trace.optim=0,
         bins=20, nphi=1, ntheta=1, ntime=20,
         distance.factor=0.5,
         upperbound.scale.factor=10, lowerbound.scale.factor=20,
         lowerbound.scale.LS.factor=5,
         upperbound.var.factor=10, lowerbound.var.factor=100,
         lowerbound.sill=1E-10,
         scale.max.relative.factor=1000,
         minbounddistance=0.001, minboundreldist=0.02,
         approximate.functioncalls=50,
         pch="*", var.name="X", time.name="T",
         transform=NULL, standard.style=NULL)
x |
(n x 2)-matrix of coordinates, or vector of x-coordinates |
y |
vector of y coordinates |
z |
vector of z coordinates |
T |
vector of T coordinates; these coordinates are given in
triple notation, see GaussRF |
data |
vector or matrix of values measured at the coordinates.
If a time component is also given, then the indices for
the spatial components run the fastest in the data.
|
model |
string or list;
covariance model, see CovarianceFct , or
type PrintModelList() to get all options.
See also t
If model is a list, then the parameters with value NA
are estimated. Parameters that have value NaN should be
explicitly defined by the function transform.
As an alternative to NaN values combined with
the function transform, each NaN may be replaced
by a real-valued function whose sole argument is a list defining
a covariance model. In the case of the anisotropy matrix, the matrix
must be replaced by a list if functions are introduced.
Within such a function, only the list elements
variance, scale or anisotropy, and kappas can be used, and
not the mean or the trend.
Further, the mean or the trend cannot be set by such a function.
See also transform below.
|
param |
vector or matrix or NULL.
If vector then
param=c(mean, variance, nugget, scale,...) ;
the parameters must be given
in this order. Further parameters are to be added in case of a
parametrised class of covariance functions, see
CovarianceFct .
Any components set to NA are estimated; the others
are kept fixed.
See also model above.
|
lower |
list or vector. Lower bounds for the parameters.
If lower and param are vectors and length(lower) <
length(param), then lower must match the number of additional
parameters a,b,c,....
If param is a matrix, the length of lower must match the
number of columns of param or be 2 elements smaller (then
lower is filled with NA from the left).
The bounds are applied equally to all rows.
If lower is a list, then elements that are not given are
considered as NA.
If lower is not given, or lower contains NA,
then the missing bounds are generated automatically.
|
upper |
list or vector. Upper bounds for the parameters. See also lower. |
sill |
If not NA the sill is kept fixed. Only used if the
standard format for the covariance model is given. See Details. |
trend |
Not programmed yet.
May only be set if missing(param);
linear formula : uses X1, X2,... and T as internal parameters for the coordinates; all parameters are estimated
matrix : must have the same number of rows as x
fixed mean + matrix or linear formula : not possible within this function (just subtract the mean from your data before calling this function) |
... |
arguments as given in mleRF.default and listed in the
following. |
use.naturalscaling |
logical. Only used if model is given in the standard (simple) way.
If TRUE then internally, rescaled
covariance functions will be used for which
cov(1)~=0.05.
use.naturalscaling has the advantage that scale
and the form parameters of the model get ‘orthogonal’,
but use.naturalscaling does not work for all models.
See Details. |
PrintLevel |
level to which messages are shown. See Details. |
trace.optim |
tracing of the function optim |
bins |
number of bins of the empirical variogram. See Details. |
nphi |
scalar or vector of 2 components. If it is a vector then the first component gives the first angle of the xy plane and the second one gives the number of directions on the half circle. If scalar then the first angle is assumed to be zero. |
ntheta |
scalar or vector of 2 components. If it is a vector then the first component gives the first angle in the third direction and the second one gives the number of directions on the half circle. If scalar then the first angle is assumed to be zero. |
ntime |
scalar or vector of 2 components.
If ntime is a vector, then the first component is the
maximum time distance (in units of the grid length T[3]) and the
second component gives the step size (in units of the grid length
T[3]). If scalar then the step size is assumed to be 1 (in units
of the grid length T[3]).
|
distance.factor |
relative right bound for the bins. See Details. |
upperbound.scale.factor |
relative upper bound for scale in LSQ and MLE. See Details. |
lowerbound.scale.factor |
relative lower bound for scale in MLE. See Details. |
lowerbound.scale.LS.factor |
relative lower bound for scale in LSQ. See Details. |
upperbound.var.factor |
relative upper bound for variance and nugget. See Details. |
lowerbound.var.factor |
relative lower bound for variance. See Details. |
lowerbound.sill |
absolute lower bound for variance and nugget. See Details. |
scale.max.relative.factor |
relative lower bound for scale below which an additional nugget effect is detected. See Details. |
minbounddistance |
absolute distance to the bounds below which a part of the algorithm is considered as having failed. See Details. |
minboundreldist |
relative distance to the bounds below which a part of the algorithm is considered as having failed. See Details. |
approximate.functioncalls |
approximate evaluations of the ML target function on a grid. See Details. |
pch |
character shown before each step of calculation; depending on the specification there are two to five steps. Default: "*". |
var.name |
basic name for the coordinates in the formula of the trend. Default: ‘X’ |
time.name |
basic name for the time component in the formula of the trend. Default: ‘T’ |
transform |
function.
Essentially, transform allows for the definition of a parameter as a
function of other estimated or fixed parameters.
All the parameters are supposed to be in a vector called ‘param’
where the positions are given by parampositions .
An example of transform is
function(param) {param[3] <- 5 - param[1]; param} .
Any parameter that is set by transform , should be NaN
in the model definition. If it is NA a warning is given.
Note that the mean and the trend of the model can be neither set nor used in
transform . See also standard.style .
Instead of giving transform, all NaN values in the model definition
may be replaced by functions whose only parameter
is a bare model list, i.e., only the list elements
variance, scale or anisotropy, and kappas can be used, and
not the mean or the trend.
Further, the mean or the trend cannot be set by such a function.
Default: NULL |
standard.style |
logical or NULL . This variable should only be
set by the advanced user. If NULL standard.style will be
TRUE if the covariance model allows for a ‘standard’
definition (see convert.to.readable and
CovarianceFct ) and transform is NULL .
Otherwise standard.style will be FALSE .
If a ‘standard’ definition is given and both the variance and the
nugget are either not estimated or do not appear on the right hand
side of the transform, then standard.style might be set
to TRUE by the user. This accelerates the MLE algorithm.
The responsibility then lies entirely with the user.
Currently mleRF is only implemented for the
‘standard’ definition of the covariance model.
Hence standard.style must always be TRUE and consequently,
neither the variance nor the nugget may appear on either side
of the transform.
|
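The transform argument described in the table above can be illustrated with a small base R sketch. It applies the example function from the text to a parameter vector. The three-entry layout of the vector is an assumption made only for this illustration; in practice the positions are reported by parampositions.

```r
# Example transform from the text: parameter 3 is defined as 5 minus parameter 1.
f <- function(param) {
  param[3] <- 5 - param[1]
  param
}

# Hypothetical parameter vector as an optimiser might pass it:
# entries 1 and 2 are estimated, entry 3 is NaN and is filled in by the transform.
param <- c(1.2, 0.7, NaN)
filled <- f(param)
print(filled)
```

The function returns the complete vector, so a dependent parameter never has to be optimised on its own.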
The maximisation is performed using optim. Since optim needs an initial vector of parameters as input, mleRF takes the initial parameter from the LSQ estimation.
If the best parameter vector of the MLE found so far is too close to some given bounds, see the specific parameters below, it is assumed that optim ran into a local minimum because of a bad starting value.
In this case the MLE target function is calculated on a grid, the best parameter vector is taken, and the optimisation is restarted with this parameter vector.
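The restart strategy described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used by the package: the target function is a toy stand-in for the negative log-likelihood, the helper near_bound is hypothetical, and "relative distance" is assumed to mean distance relative to the width of the bound interval.

```r
# Toy stand-in for the negative log-likelihood (NOT the package's target):
# a narrow minimum near p = 1 plus a gentle slope that drags a poor
# starting value towards the lower bound.
target <- function(p) -exp(-(p - 1)^2 / 0.02) + 0.1 * p

lower <- -2
upper <- 2
minbounddistance <- 0.001          # default from the usage section
minboundreldist <- 0.02            # default from the usage section
approximate.functioncalls <- 50    # default from the usage section

# Hypothetical helper. Assumption: the relative distance is measured
# relative to the width of the interval [lower, upper].
near_bound <- function(p) {
  d <- min(p - lower, upper - p)
  d < minbounddistance || d / (upper - lower) < minboundreldist
}

# First attempt from a deliberately bad starting value.
fit <- optim(par = -1, fn = target, method = "L-BFGS-B",
             lower = lower, upper = upper)
restarted <- FALSE
if (near_bound(fit$par)) {
  # Evaluate the target on a grid, take the best point, restart from there.
  grid <- seq(lower, upper, length.out = approximate.functioncalls)
  start <- grid[which.min(sapply(grid, target))]
  fit <- optim(par = start, fn = target, method = "L-BFGS-B",
               lower = lower, upper = upper)
  restarted <- TRUE
}
cat("estimate:", round(fit$par, 3), " restarted:", restarted, "\n")
```

The grid only needs to be fine enough to land inside the basin of the better optimum; the second call to optim then refines the estimate.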
Comments on specific parameters:
lower
If the model is given in standard form, the user may supply the lower bounds for the whole parameter vector, or only for the additional form parameters of the model. The lower bound for the mean will be ignored. lower may contain NAs; these values are then generated by the algorithm.
If a nested model is given, the bounds may again be supplied for all parameters or only for the additional form parameters of the model. The bounds given apply uniformly to all submodels of the nested model.
If the model is given in list format, then lower is a list, where components may be missing or NA. These are then generated by the algorithm.
If lower is NULL, all lower bounds are generated automatically.
upper.kappa
See lower.kappa.
sill
Besides estimating nugget and variance separately, they may also be estimated together under the condition that nugget + variance = sill. For the latter a finite value for sill has to be supplied, and nugget and variance are set to NA.
sill is only used for the standard model.
use.naturalscaling
If TRUE then internally, rescaled covariance functions will be used for which cov(1)~=0.05. However this parameter does not influence the output of mleRF: the parameter vector returned by mleRF always refers to the standard covariance model as given in CovarianceFct. (In contrast to PracticalRange in RFparameters.)
Advantages if use.naturalscaling==TRUE:
scale and the shape parameter of a parameterised covariance model can be estimated better if they are estimated simultaneously.
The bounds derived from upperbound.scale.factor and lowerbound.scale.factor, etc. might be more realistic.
Disadvantages if use.naturalscaling==TRUE:
mleRF.
Default: TRUE.
PrintLevel
0.
trace.optim
trace of optim. Default: 0.
bins
Default: 20.
distance.factor
The right bound for the bins is distance.factor * (maximum distance between all pairs of points). Only used if bins is a scalar.
Default: 0.5.
upperbound.scale.factor
The upper bound for scale is upperbound.scale.factor * (maximum distance between all pairs of points).
Default: 10.
lowerbound.scale.factor
The lower bound for scale in MLE is (minimum distance between different pairs of points) / lowerbound.scale.factor.
Default: 20.
lowerbound.scale.LS.factor
The lower bound for scale in LSQ is (minimum distance between different pairs of points) / lowerbound.scale.LS.factor.
Default: 5.
upperbound.var.factor
The upper bound for variance and nugget is upperbound.var.factor * var(data).
Default: 10.
lowerbound.var.factor
The lower bound for variance is var(data) / lowerbound.var.factor.
If a standard model definition is given and either the nugget or the variance is fixed, the parameter to be estimated must also be greater than lowerbound.sill.
If a non-standard model definition is given then lowerbound.var.factor is only used for the first model; the other lower bounds for the variance are zero.
Default: 100.
lowerbound.sill
See lowerbound.var.factor.
Default: 1E-10.
scale.max.relative.factor
If the scale falls below (minimum distance between different pairs of points) / scale.max.relative.factor, it is assumed that a nugget effect is present. In case the user set nugget=0, the ML estimation is automatically performed for nugget=NA instead of nugget=0.
Note: if scale.max.relative.factor is greater than lowerbound.scale.LS.factor then nugget is never set to NA, as the scale has the lower bound (minimum distance between different pairs of points) / lowerbound.scale.LS.factor.
Default: 1000.
minbounddistance
If any value of the estimated parameter vector has an absolute distance smaller than minbounddistance to any of the bounds, or if any value has a relative distance smaller than minboundreldist, then it is assumed that the MLE algorithm has dropped into a local minimum, and it will be continued with evaluating the ML target function on a grid, cf. the beginning paragraphs of the Details.
Default: 0.001.
minboundreldist
See minbounddistance.
Default: 0.02.
approximate.functioncalls
The ML target function is evaluated on a grid at approximately approximate.functioncalls points.
Default: 50.
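The automatically generated bounds listed above can be reproduced by hand as a rough check. The following base R sketch uses invented coordinates and data, and the names in the list bounds are hypothetical. It also assumes that "minimum distance between different pairs of points" means the smallest positive pairwise distance.

```r
set.seed(1)
# Hypothetical data: 100 random locations in [0, 3]^2 with arbitrary values.
coords <- cbind(runif(100, 0, 3), runif(100, 0, 3))
data <- rnorm(100)

d <- dist(coords)                 # all pairwise distances
max.dist <- max(d)
min.dist <- min(d[d > 0])         # assumption: smallest positive distance

# Defaults copied from the usage section.
distance.factor <- 0.5
upperbound.scale.factor <- 10
lowerbound.scale.factor <- 20
lowerbound.scale.LS.factor <- 5
upperbound.var.factor <- 10
lowerbound.var.factor <- 100

bounds <- list(
  bin.right      = distance.factor * max.dist,            # right bound of the bins
  scale.upper    = upperbound.scale.factor * max.dist,    # LSQ and MLE
  scale.lower.ML = min.dist / lowerbound.scale.factor,    # MLE only
  scale.lower.LS = min.dist / lowerbound.scale.LS.factor, # LSQ only
  var.upper      = upperbound.var.factor * var(data),     # variance and nugget
  var.lower      = var(data) / lowerbound.var.factor      # variance
)
str(bounds)
```

Comparing such values with the returned mle.lower and mle.upper may help to judge whether a fit ended close to a bound.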
Another maximum likelihood estimator for random fields exists as part of the package geoR, whose homepage is at http://www.maths.lancs.ac.uk/~ribeiro/geoR.html; it follows a different philosophy.
The function returns a list with the following elements:
mle.value |
|
trend.coff |
parameters for linear trend (optional) |
mle |
fitted model |
ev |
list returned by EmpiricalVariogram |
lsq |
model fitted by least squares; trends are never taken into account |
nlsq |
weighted lsq. Weight is the square root of the number of points in the bin |
slsq |
weighted lsq. Weight is the inverse of the standard deviation of the variogram cloud within the bin |
flsq |
weighted lsq. Weights are the values of the fitted variogram to the power of -2 |
mle.lower |
lower bounds for the parameters used in the optimisation algorithm |
mle.upper |
upper bounds for the parameters used in the optimisation algorithm |
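The weighting schemes behind lsq, nlsq, slsq and flsq in the list above can be compared with a short base R sketch. All numbers are invented, the exponential variogram is just an example model, and the helper names vario_fit, lsq_weights and wss are hypothetical. That the weights multiply the squared residuals is an assumption of this sketch, not something stated in the text.

```r
# Hypothetical binned empirical variogram (all numbers invented).
centers <- c(0.25, 0.75, 1.25, 1.75)      # bin centres
emp     <- c(0.30, 0.62, 0.85, 0.97)      # empirical variogram per bin
n.bin   <- c(120, 340, 410, 280)          # number of points per bin
sd.bin  <- c(0.40, 0.55, 0.70, 0.80)      # sd of the variogram cloud per bin

# Example model: exponential variogram, par = c(variance, scale).
vario_fit <- function(par) par[1] * (1 - exp(-centers / par[2]))

# Weights as described in the value list.
lsq_weights <- function(type, par) switch(type,
  lsq  = rep(1, length(centers)),
  nlsq = sqrt(n.bin),
  slsq = 1 / sd.bin,
  flsq = vario_fit(par)^(-2))

# Assumption: the weights multiply the squared residuals.
wss <- function(par, type) {
  sum(lsq_weights(type, par) * (emp - vario_fit(par))^2)
}

fits <- sapply(c("lsq", "nlsq", "slsq", "flsq"), function(type)
  optim(c(1, 1), wss, type = type, method = "L-BFGS-B",
        lower = c(0.01, 0.01), upper = c(10, 10))$par)
print(round(fits, 3))
```

Each column of fits holds the variance and scale obtained under one weighting scheme, which makes the influence of the weights easy to inspect.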
Thanks to Paulo Ribeiro for hints and comparing mleRF to likfit of the package geoR whose homepage is at http://www.est.ufpr.br/geoR/.
This function does not depend on the value of RFparameters()$PracticalRange.
The function mleRF always uses the standard specification of the covariance model as given in CovarianceFct.
Martin Schlather, martin.schlather@cu.lu http://www.cu.lu/~schlathe
Ribeiro, P. and Diggle, P. (2001) Software for geostatistical analysis using R and S-PLUS: geoR and geoS, version 0.6.15. http://www.maths.lancs.ac.uk/~ribeiro/geoR.html.
CovarianceFct, GetPracticalRange, parampositions, RandomFields
RFparameters(Print=10)
model <- "gencauchy"
param <- c(0, 1, 0, 1, 1, 2)
estparam <- c(0, NA, 0, NA, NA, 2)
## NA means: "to be estimated"
## sequence in `estparam' is
## mean, variance, nugget, scale, (+ further model parameters)
## So, variance, scale, and the first form parameter will be
## estimated here. Mean and nugget are fixed and equal zero.

points <- 100
x <- runif(points, 0, 3)
y <- runif(points, 0, 3) ## 100 random points in square [0, 3]^2
d <- GaussRF(x=x, y=y, grid=FALSE, model=model, param=param, n=10)
str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam,
             lower=c(0.1, 0.1), upper=c(1.9, 5)))

## The next two estimations give about the same result.
## For the first the sill is fixed to 1.5. For the second the sill
## is reached if the estimated variance is smaller than 1.5
estparam <- c(0, NA, NA, NA, NA, NA)
str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam,
             sill=1.5))

estparam <- c(0, NA, NaN, NA, NA, NA)
parampositions(model=model, param=estparam)
f <- function(param) {
  param[5] <- max(0, 1.5 - param[1])
  return(param)
}
str(fitvario(x=cbind(x,y), data=d, model=model, param=estparam,
             sill=1, transform=f))

## the next call gives a warning, since the user may programme
## strange things in this setup, and the program cannot check it.
estparam <- c(0, NA, NA, NA, NA, NaN)
parampositions(model=model, param=estparam)
f <- function(param) {param[3] <- param[2]; param}
## system.time replaces the deprecated unix.time
system.time(str(fitvario(x=cbind(x,y), data=d, model=model,
                         param=estparam, transform=f,
                         standard.style=TRUE), vec.len=6))

## much better programmed, but also much slower:
estmodel <- list(list(model="gencauchy", var=NA, scale=NA,
                      kappa=list(NA, function(m) m[[1]]$kappa[1])))
system.time(str(fitvario(x=cbind(x,y), data=d, model=estmodel),
                vec.len=6))