gofCopula {copula} | R Documentation |
Goodness-of-fit tests for copulas based on the empirical process comparing the empirical copula with a parametric estimate of the copula derived under the null hypothesis. Approximate p-values for the test statistic can be obtained either using the parametric bootstrap (see the first two references) or by means of a fast multiplier approach that can be used when the parametric bootstrap is too slow (see the last two references).
gofCopula(copula, x, N = 1000, method = "mpl", simulation = "pb", grid = "h0", R = 10, m = nrow(x), G = nrow(x), M = 2500, print.every = 100, optim.method = "Nelder-Mead")
copula |
object of class "copula" representing the
hypothesized copula family. |
x |
a data matrix that will be transformed to pseudo-observations. |
N |
number of bootstrap or multiplier iterations to be used to simulate realizations of the test statistic under the null hypothesis. |
method |
estimation method to be used to estimate the
dependence parameter(s); can be either "mpl"
(maximum pseudo-likelihood), "itau" (inversion of
Kendall's tau) or "irho" (inversion of Spearman's rho). |
simulation |
simulation method for generating realizations
of the test statistic under the null hypothesis; can be either
"pb" (parametric bootstrap) or "mult" (multiplier). |
grid |
for simulation method "mult", grid points at which the
goodness-of-fit process is evaluated; can be either "h0" (the
goodness-of-fit process is evaluated at points randomly generated
from the hypothesized copula) or "po" (the
goodness-of-fit process is evaluated at the available
pseudo-observations); see the last two references. |
R |
for simulation method "mult", number of replications of the
basic test; should be set to 1 for very large samples. |
m |
for simulation method "mult", size of the sample
used to compute the influence functions; see the last two references. |
G |
for simulation method "mult", size of the grid
if "grid" is set to "h0"; see the last two references. |
M |
for simulation method "mult", size of the Monte Carlo
integration sample; see the last two references. |
print.every |
progress is printed every "print.every"
iterations. |
optim.method |
the "method" argument passed to "optim". |
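The argument "x" is described above as being transformed to pseudo-observations. A minimal sketch of that transformation, assuming the common definition (column-wise ranks divided by n + 1), is given below. The helper name "pseudo_obs" is hypothetical and is not part of the package.

```r
# Hypothetical helper (not from the copula package): column-wise ranks
# scaled by n + 1, a common definition of pseudo-observations.
pseudo_obs <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, rank) / (nrow(x) + 1)
}

# Small illustrative data matrix with two columns.
x <- cbind(c(10, 30, 20), c(0.5, 0.1, 0.9))
u <- pseudo_obs(x)
```

Dividing by n + 1 rather than n keeps every pseudo-observation strictly inside (0, 1), which avoids evaluating a copula density on the boundary of the unit cube.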
If the parametric bootstrap is used, the dependence parameters of the hypothesized copula family can be estimated either by maximizing the pseudo-likelihood or by inverting Kendall's tau or Spearman's rho. If the multiplier approach is used, any of these estimation methods can be used in the bivariate case, but only maximum pseudo-likelihood estimation can be used in the multivariate (multiparameter) case.
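To illustrate what inversion of Kendall's tau means, the sketch below uses the well-known closed-form relations for two one-parameter bivariate families: theta = 2 * tau / (1 - tau) for the Clayton copula and theta = 1 / (1 - tau) for the Gumbel copula. The helper names and the simulated data are assumptions made for illustration and are not part of the package.

```r
# Hypothetical helpers: closed-form inversion of Kendall's tau.
clayton_theta_from_tau <- function(tau) 2 * tau / (1 - tau)
gumbel_theta_from_tau  <- function(tau) 1 / (1 - tau)

# Illustrative positively dependent sample built from a shared factor.
set.seed(42)
z <- rnorm(200)
x <- cbind(z + rnorm(200), z + rnorm(200))

# Empirical Kendall's tau from base R, then the moment-type estimates.
tau_hat <- cor(x[, 1], x[, 2], method = "kendall")
theta_clayton <- clayton_theta_from_tau(tau_hat)
theta_gumbel  <- gumbel_theta_from_tau(tau_hat)
```

Because Kendall's tau depends only on ranks, the same estimate is obtained from the raw data and from the pseudo-observations.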
For the normal and t copulas, several dependence structures can be hypothesized: "ex" for exchangeable, "ar1" for AR(1), "toep" for Toeplitz, and "un" for unstructured (see ellipCopula). For the t copula, "df.fixed" has to be set to TRUE, which implies that the degrees of freedom are not considered as a parameter to be estimated.
Thus far, the multiplier approach is implemented for six copula families: the Clayton, Gumbel, Frank, Plackett, normal and t.
The parameter "R", used when the simulation method is "mult", specifies the number of times that the multiplier test will be repeated. Indeed, when the available sample is small, the multiplier approach can show poor repeatability with respect to the returned p-value (see the last two references). As the sample size increases, the value of "R" can be decreased.
The parameter "grid", used when the simulation method is "mult", should be set as follows: if the sample size is smaller than 500, "grid" should be set to "h0" (otherwise the test will be too liberal); when the sample size is greater than 500, "grid" should be set to "po", as this reduces repeatability problems.
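The rule of thumb for "grid" can be written as a small helper. This is only a sketch: "choose_grid" is a hypothetical name, and the treatment of a sample of exactly 500 observations (here "po") is an assumption, since the text above does not cover that case.

```r
# Hypothetical helper encoding the rule of thumb for the "grid" argument.
# Assumption: a sample size of exactly 500 is treated like the large case.
choose_grid <- function(x) {
  n <- nrow(as.matrix(x))
  if (n < 500) "h0" else "po"
}

small_sample <- matrix(0, nrow = 100, ncol = 2)
large_sample <- matrix(0, nrow = 1000, ncol = 2)
grid_small <- choose_grid(small_sample)
grid_large <- choose_grid(large_sample)

# Possible use with the function documented here (requires the package):
# gofCopula(gumbelCopula(1), x, simulation = "mult", grid = choose_grid(x))
```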
Although the processes involved in the multiplier and the parametric bootstrap-based tests are asymptotically equivalent, the finite-sample behavior of the two tests might differ significantly.
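The parametric bootstrap procedure can be sketched in base R for one simple case. The code below is not the package's implementation. It assumes a bivariate Clayton null hypothesis, estimation by inversion of Kendall's tau, a Cramér-von Mises type statistic (sum of squared differences between the empirical and fitted copulas at the pseudo-observations), and a p-value computed as a plain proportion. All function names are hypothetical.

```r
# All helpers below are illustrative and are not part of the copula package.

# Pseudo-observations: column-wise ranks scaled by n + 1.
pobs2 <- function(x) apply(x, 2, rank) / (nrow(x) + 1)

# Bivariate Clayton copula evaluated at the rows of u (theta > 0).
clayton_cdf <- function(u, theta) {
  (u[, 1]^(-theta) + u[, 2]^(-theta) - 1)^(-1 / theta)
}

# Sampling from the bivariate Clayton copula by conditional inversion.
clayton_sample <- function(n, theta) {
  u <- runif(n)
  w <- runif(n)
  v <- (u^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)
  cbind(u, v)
}

# Inversion of Kendall's tau; the lower bound keeps theta strictly positive.
clayton_itau <- function(u) {
  tau <- cor(u[, 1], u[, 2], method = "kendall")
  max(2 * tau / (1 - tau), 1e-6)
}

# Empirical copula evaluated at each pseudo-observation.
emp_copula <- function(u) {
  sapply(seq_len(nrow(u)), function(i) {
    mean(u[, 1] <= u[i, 1] & u[, 2] <= u[i, 2])
  })
}

# Fit the Clayton family and compute the Cramer-von Mises type statistic.
cvm_stat <- function(x) {
  u <- pobs2(x)
  theta <- clayton_itau(u)
  list(theta = theta,
       stat = sum((emp_copula(u) - clayton_cdf(u, theta))^2))
}

# Parametric bootstrap: resample from the fitted copula, refit, recompute.
pb_gof_clayton <- function(x, N = 200) {
  obs <- cvm_stat(x)
  boot <- replicate(N, cvm_stat(clayton_sample(nrow(x), obs$theta))$stat)
  list(statistic = obs$stat,
       pvalue = mean(boot >= obs$stat),
       parameters = obs$theta)
}

set.seed(1)
res <- pb_gof_clayton(clayton_sample(100, 3), N = 100)
```

Each bootstrap replicate repeats the full pipeline (ranking, estimation, statistic), which is why the parametric bootstrap becomes slow for large samples and why a multiplier alternative is offered.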
Returns a list whose components are:
statistic |
value of the test statistic, or median of the test
statistics if "R" multiplier replications are performed. |
pvalue |
corresponding approximate p-value, or median of the
p-values if "R" multiplier replications are performed. |
sd.pvalue |
standard deviation of the p-values when
the multiplier test is replicated "R" times. |
parameters |
estimates of the parameters for the hypothesized copula family. |
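The way the "pvalue" and "sd.pvalue" components summarize "R" multiplier replications can be illustrated with base R. The p-values below are stand-ins drawn with runif purely for illustration; they are not output of the test, and the object names are hypothetical.

```r
# Stand-in p-values for R = 10 replications (NOT real test output).
set.seed(7)
pvals <- runif(10)

# Summaries analogous to the "pvalue" and "sd.pvalue" components.
summary_mult <- list(pvalue = median(pvals),
                     sd.pvalue = sd(pvals))
```

A large "sd.pvalue" relative to the distance between "pvalue" and the chosen significance level suggests that the decision is sensitive to the replication, in which case a larger "R" may be worth considering.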
C. Genest and B. Remillard (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'Institut Henri Poincare: Probabilites et Statistiques, 44, 1096-1127.
C. Genest, B. Remillard and D. Beaudoin (2008). Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics, 44, in press.
I. Kojadinovic and J. Yan (2008). Fast large-sample goodness-of-fit tests for copulas. Submitted.
I. Kojadinovic and J. Yan (2008). A goodness-of-fit test for multivariate multiparameter copulas based on multiplier central limit theorems. Submitted.
## the following example is available in batch through
## demo(gofCopula)
## Not run:
## A two-dimensional data example
x <- rcopula(claytonCopula(3), 200)

## Does the Gumbel family seem to be a good choice?
gofCopula(gumbelCopula(1), x)
## What about the Clayton family?
gofCopula(claytonCopula(1), x)

## The same with a different estimation method
gofCopula(gumbelCopula(1), x, method="itau")
gofCopula(claytonCopula(1), x, method="itau")

## A three-dimensional example
x <- rcopula(tCopula(c(0.5, 0.6, 0.7), dim = 3, dispstr = "un"), 200)

## Does the Gumbel family seem to be a good choice?
gofCopula(gumbelCopula(1, dim = 3), x)
## What about the t copula?
t.copula <- tCopula(rep(0, 3), dim = 3, dispstr = "un", df.fixed=TRUE)
gofCopula(t.copula, x)

## The same with a different estimation method
gofCopula(gumbelCopula(1, dim = 3), x, method="itau")
gofCopula(t.copula, x, method="itau")

## The same using the multiplier approach
gofCopula(gumbelCopula(1, dim = 3), x, simulation="mult")
gofCopula(t.copula, x, simulation="mult")
## End(Not run)