risk.index {SoPhy}    R Documentation

Risk index for soil leaching

Description

Estimation of the index of potential risk to groundwater

Usage

risk.index(data, selected.dist=0.95, 
           selected.rate=cbind(c(0.5, 0.8), c(0.4, 0.9), c(0.3, 1.0)),
           weights=1, measure=function(x) x^2,
           method=c('fix.m', 'optim.m', 'ml'),
           min.neg.xi = -10, max.neg.xi = -0.1, max.pos.xi = 10,
           endpoint.tolerance = 0, front.factor = 2,
           # min.no.paths=max(data[, 2]),
           max.no.paths=10 * max(data[, 2]),
           PrintLevel=RFparameters()$Print, max.rate=TRUE)

Arguments

data matrix of two columns. First column gives the distances (depths in the profile measured from the surface) and the second column the number of observed blue pixels.
selected.dist scalar or vector with values in (0,1), or a vector of integers. Distances for which the shape parameter of the Pareto distribution is estimated; see Details. If selected.dist is a single number in (0,1), the distances are 1:(1 + round(max(data[,1]-1) * selected.dist)). If it is a vector with values in (0,1), the vector must have an even number of elements, and pairs of elements are interpreted as intervals.
Otherwise the integers are interpreted as indices for data.
selected.rate vector or matrix with two rows. In addition to the indices given by selected.dist, the shape parameter is also estimated for those distances where the number of observed stained pixels, relative to the maximum number of observed pixels, lies within the interval given by the first column of selected.rate. The risk index is calculated as the median of the estimated shape parameters.
If no values lie in the interval given by the first column, the second column is considered, and so on; hence the first row should contain decreasing values and the second row increasing values.
weights the estimation is based on a weighted least squares algorithm; weights is usually either 1 or a vector of length nrow(data).
measure distance function to be used in place of the default least squares measure, function(x) x^2.
method the number of observed paths is a free parameter when fitting the Pareto distribution. It can either be set as the maximum number of stained pixels for the currently considered distances or depths ('fix.m') or fitted within the optimisation algorithm ('optim.m'). Usually, it is not worth using the slower 'optim.m' option. See also the Details.
max.neg.xi optimisation parameter: largest negative value that is allowed as shape parameter of the Pareto distribution, i.e. a negative value close to 0.
min.neg.xi optimisation parameter: smallest negative value that is allowed as shape parameter of the Pareto distribution.
max.pos.xi optimisation parameter: largest allowed shape parameter of the Pareto distribution.
endpoint.tolerance optimisation parameter. If the shape parameter is negative, the distribution has a finite upper endpoint. Hence, mathematically, the lowest possible upper endpoint of the Pareto distribution is the largest distance for which at least one stained pixel is observed. For stability reasons, and because the observed data might be a scale mixture of Pareto distributions, it is advantageous to allow some tolerance for the minimal upper endpoint.
If endpoint.tolerance is positive, the lower threshold for the upper endpoint is the largest distance for which the number of observed stained pixels is larger than endpoint.tolerance.
If endpoint.tolerance is negative, the lower threshold equals the largest distance for which at least one stained pixel is observed, minus the modulus of endpoint.tolerance.
front.factor optimisation parameter. The upper bound for the upper endpoint equals front.factor times the largest distance for which at least one stained pixel is observed.
This value should preferably not be changed.
max.no.paths the number of paths is estimated as a nuisance parameter when estimating the risk index; max.no.paths gives the upper bound for this nuisance parameter in the optimisation.
PrintLevel The higher the value of PrintLevel, the more tracing information is given. Up to value 1, no information is given. Note that if PrintLevel>=2 a running counter is shown that includes the printing of backspaces (^H). The backspaces may interact undesirably with a few other R functions, e.g. Sweave. See package RandomFields for the default option RFparameters()$Print.
max.rate logical. If TRUE, the lines for which m(D) / m(0) is in selected.rate are used to calculate the final risk index. Here m(D) is the maximum of p(d), d=D, D+1, ..., where p(d) is the number of stained pixels at depth d. If FALSE, the criterion m(D) / m(0) is replaced by p(D) / p(0).
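
The selection rules described for selected.dist, selected.rate and max.rate can be illustrated in base R. This is only a sketch under stated assumptions: the pixel counts and the interval are made up, the interval is treated as closed, and m(0) is taken to be the running maximum at the first row; none of this is taken from the package source.

```r
# Toy profile: depths 1..10 and stained-pixel counts p(d) (made-up numbers).
depth <- 1:10
p <- c(100, 90, 95, 60, 62, 40, 20, 22, 5, 0)
data <- cbind(depth, p)

# selected.dist given as a single number in (0,1): expand it to indices
# with the expression quoted in the argument description.
selected.dist <- 0.4
idx.dist <- 1:(1 + round(max(data[, 1] - 1) * selected.dist))

# m(D): maximum of p(d) over d = D, D+1, ... (a reverse running maximum).
m <- rev(cummax(rev(p)))

# Relative rates for max.rate=TRUE and max.rate=FALSE, respectively.
rate.max <- m / m[1]
rate.raw <- p / p[1]

# Rows whose rate lies in a hypothetical interval [0.5, 0.8]
# (closed bounds are an assumption of this sketch).
interval <- c(0.5, 0.8)
idx.rate <- which(rate.max >= interval[1] & rate.max <= interval[2])
```

Because m(D) is non-increasing in D by construction, the rows selected with max.rate=TRUE always form a contiguous block of depths, which is not guaranteed for the raw ratio p(D) / p(0).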

Details

Denote by f(d) the number of blue pixels registered at depth d (that is, at distance d from the soil surface). Then the risk index is, by definition, a shape parameter of f(d) for large distances d. Since the term "large" cannot be defined precisely, the shape parameter is calculated from the function values f(d) for distances d>=d_i, for several fixed starting distances d_i. The distances d_i are given by selected.dist and selected.rate. (The approach is similar to that used for analysing extremal events.)
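
The idea of fitting a shape parameter to f(d) for d>=d_i can be sketched in base R as follows. The helpers gp.tail and fit.tail are hypothetical and not part of SoPhy; the generalised-Pareto-type parametrisation m * (1 + xi * (d - d0) / s)^(-1/xi), the log-scale for s, and the use of optim with its default Nelder-Mead method are assumptions of this sketch. The number of paths m is fixed at the maximum count, in the spirit of method='fix.m'.

```r
# Hypothetical helper (not part of SoPhy): generalised-Pareto-type
# survival function, with the exponential as the limit xi -> 0.
gp.tail <- function(x, xi, s) {
  if (abs(xi) < 1e-8) return(exp(-x / s))
  pmax(1 + xi * x / s, 0)^(-1 / xi)
}

# Hypothetical helper: fit shape xi and scale s to the counts f(d), d >= d0.
# 'weights' must be 1 or have one entry per retained depth.
fit.tail <- function(depth, f, d0, measure = function(x) x^2, weights = 1) {
  keep <- depth >= d0
  x <- depth[keep] - d0
  y <- f[keep]
  m <- max(y)                          # number of paths held fixed
  target <- function(par) {
    xi <- par[1]
    s <- exp(par[2])                   # log-scale keeps s positive
    sum(weights * measure(y - m * gp.tail(x, xi, s)))
  }
  opt <- optim(c(0.1, log(mean(x) + 1)), target)
  c(xi = opt$par[1], s = exp(opt$par[2]), value = opt$value)
}

# Noise-free exponential decay, so the fitted xi should be close to 0.
depth <- 1:100
f <- 1000 * exp(-depth / 25)
fit <- fit.tail(depth, f, d0 = 10)
```

Repeating such a fit for several starting distances d0 gives one shape estimate per distance, which is the collection that the risk index summarises.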

The selection criterion m(D) / m(0) is always based on method='fix.m', whatever method is chosen to estimate xi.

Value

list of the following components

par matrix of estimated parameters; first row: risk index; second row: scale parameter; third row: estimated maximum number of paths m(D), except for m(0), which is given by max.freq and is always set to the maximum number of pixels; fourth row: D (sel.dist).
data the input data except for some reordering
weights the input weights except for some reordering
selected.dist the selected distances in the form of indices (written out explicitly if they were given as a real value in (0,1)).
selected.rate range of the selected number of stained pixels
sel.rate index set for the data where the observed number of stained pixels is within selected.rate
sel.dist the index set containing selected.dist and sel.rate
max.freq maximum number of observed stained pixels
values the minimal least squares values
method the input parameter method
measure the input parameter measure
raw.risk.index risk index calculated as the median of the estimated shape parameters for selected.rate
risk.index as raw.risk.index, but the median is calculated only over values greater than 0.999 * min.neg.xi and less than 0.999 * max.pos.xi
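
The difference between raw.risk.index and risk.index can be illustrated with a short base R sketch. The shape estimates in xi.hat are made-up numbers, and the variable names are chosen for illustration only; the filter follows the bounds stated above.

```r
# Hypothetical shape estimates, one per selected starting distance;
# two of them have run into the upper optimisation bound.
xi.hat <- c(-0.4, 0.05, 0.2, 10, 10)
min.neg.xi <- -10
max.pos.xi <- 10

# Median over all estimates.
raw.risk.index <- median(xi.hat)

# Median over the estimates strictly inside the shrunken bounds.
inside <- xi.hat > 0.999 * min.neg.xi & xi.hat < 0.999 * max.pos.xi
risk.index.value <- median(xi.hat[inside])
```

Estimates that sit on an optimisation bound usually indicate a fit that did not settle at an interior optimum, so excluding them makes the median less sensitive to such fits.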

Author(s)

Martin Schlather, martin.schlather@math.uni-goettingen.de http://www.stochastik.math.uni-goettingen.de/institute

References

Embrechts, P., Klueppelberg, C. and Mikosch, T. (1997) Modelling Extremal Events. Berlin: Springer.

Schlather, M. and Huwe, B. (2005) A risk index for characterising flow pattern in soils using dye tracer distributions. J. Contam. Hydrol. 79, to appear.

See Also

SoPhy, analyse.profile, xswms2d

Examples

## simulate 1000 path lengths from an exponential distribution (mean 25)
sample.depth <- 1:100
d <- rexp(1000, 1/25)
## freq[i]: number of paths that reach at least depth sample.depth[i]
freq <- numeric(length(sample.depth))
for (i in seq_along(sample.depth)) freq[i] <- sum(d >= sample.depth[i])
cr <- risk.index(cbind(sample.depth, freq),
                 selected.rate=c(0.95, 0.9),
                 endpoint.tolerance=20, method="fix.m")
cr$risk.index ## the true value is 0

[Package SoPhy version 1.0.34 Index]