surv2.neyman {surv2sample} | R Documentation |
Description

Compares survival distributions in two samples of censored data using a (possibly data-driven) Neyman's smooth test.

Usage

surv2.neyman(x, group, data.driven = FALSE, subsets = "nested",
             d = ifelse(data.driven, 5, 3), d0 = 0, basis = "legendre",
             time.transf = "F", approx = "perm", nsim = 2000,
             choltol = 1e-07)

## S3 method for class 'surv2.neyman':
summary(object, ...)

Arguments
x |
a "Surv" object, as returned by the Surv
function. |
group |
a vector indicating to which group each observation belongs. May contain values 1 and 2 only. |
data.driven |
Should the test be data-driven? |
subsets |
the class of subsets of basis functions among which
the data-driven test selects. Possible values are "nested"
and "all". |
d |
the number of basis functions for the test with fixed dimension, or the maximum number of basis functions for the data-driven test. |
d0 |
the number of high-priority functions for the data-driven test.
The selection rule selects among subsets containing basis functions
1,...,d0 . For nested subsets, d0 equal to 0 or 1
is equivalent. For all subsets, d0 equal to 0 means that there
is no high-priority function and any nonempty subset may be selected. |
basis |
the basis of functions. Possible values are "legendre"
for Legendre polynomials and "cos" for cosines. |
time.transf |
the time transformation for basis functions.
Possible values are "F" for the
distribution function (F(t)/F(tau))
(recommended), "A" for the
cumulative hazard (A(t)/A(tau)) and "I"
for no transformation (the linear transformation t/tau). |
approx |
the method of approximating the distribution of the
test statistic. Possible values are "perm" for permutations,
"boot" for the bootstrap, "asympt" for asymptotics. |
nsim |
the number of simulations. This means the number of
permutations or bootstrap samples when approx is
"perm" or "boot" . When approx is "asympt" ,
nsim is the number of simulations to approximate the asymptotic
distribution (only needed for the data-driven test with all subsets
and d0 equal to 0). |
choltol |
a tolerance parameter for the Cholesky decomposition. |
object |
an object of class "surv2.neyman" , as returned
by the function surv2.neyman . |
... |
further parameters for printing. |
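The time.transf argument determines the time scale on which the basis functions are evaluated. As a rough illustration of the "F" option, the sketch below maps observed times through an empirical distribution function rescaled by its value at tau and evaluates a cosine basis function there. This is only a conceptual Python sketch with made-up uncensored data; the package itself works with a censored-data estimator of F.

```python
import math

def ecdf(times):
    """Empirical distribution function of the pooled sample.

    Plain ECDF for illustration only; with censoring the package
    uses a survival-analysis estimator instead.
    """
    srt = sorted(times)
    n = len(srt)
    def F(t):
        return sum(1 for s in srt if s <= t) / n
    return F

def cosine_basis(j, u):
    """j-th cosine basis function on [0, 1] (basis = "cos")."""
    return math.sqrt(2) * math.cos(j * math.pi * u)

times = [2.0, 5.0, 1.0, 7.0, 3.0]   # toy uncensored times
tau = max(times)                     # end of the observation window
F = ecdf(times)
# transformed time u = F(t)/F(tau), as with time.transf = "F"
u = [F(t) / F(tau) for t in times]
vals = [cosine_basis(1, ui) for ui in u]
print([round(v, 3) for v in vals])
```

With time.transf = "I" the transformation would simply be t/tau, and with "A" the cumulative hazard would play the role of F.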
In general, Neyman's smooth tests are based on embedding the null hypothesis in a d-dimensional alternative. The embedding is here formulated in terms of hazard functions. The logarithm of the hazard ratio is expressed as a combination of d basis functions (Legendre polynomials or cosines) in transformed time, and their significance is tested by a score test. See Kraus (2007a) for details. The quadratic test statistic is asymptotically chi-square distributed with d degrees of freedom.
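The quadratic score statistic has the generic form T = U' Sigma^{-1} U for a d-dimensional score vector U with estimated covariance Sigma, which is conveniently computed via a Cholesky factorisation (this is where a tolerance like choltol enters). The following minimal Python sketch shows the d = 2 case with made-up numbers; it is not the package's implementation, merely the shape of the computation.

```python
import math

def chol2(a, b, c, choltol=1e-7):
    """Cholesky factor of the 2x2 covariance [[a, b], [b, c]]."""
    l11 = math.sqrt(a)
    l21 = b / l11
    pivot = c - l21 * l21
    if pivot < choltol:              # near-singular covariance: refuse
        raise ValueError("covariance nearly singular")
    return l11, l21, math.sqrt(pivot)

def smooth_stat(u1, u2, a, b, c):
    """T = U' Sigma^{-1} U via forward substitution with the Cholesky factor."""
    l11, l21, l22 = chol2(a, b, c)
    z1 = u1 / l11                    # z = L^{-1} U, so T = z'z
    z2 = (u2 - l21 * z1) / l22
    return z1 * z1 + z2 * z2

# Illustrative score vector and covariance for d = 2 (made-up numbers)
T = smooth_stat(1.2, -0.5, 1.0, 0.2, 1.0)
pval = math.exp(-T / 2)              # chi-square survival function, 2 d.f.
print(T, pval)
```

For general d the p-value would come from the chi-square distribution with d degrees of freedom; for d = 2 its survival function is simply exp(-T/2).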
A data-driven choice of basis functions is possible. The selected subset is the maximiser of a Schwarz-type penalised score statistic over a class of nonempty subsets of {1,...,d}. Either nested subsets with increasing dimension or all subsets may be used. By choosing d0>0 one requires that functions with indices {1,...,d0} always be included in the subsets.
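For nested subsets, the selection rule can be pictured as maximising a score statistic penalised by dimension times log(sample size) over dimensions max(1, d0), ..., d. The sketch below is a hypothetical Python illustration of such a Schwarz-type rule with invented statistics, not the package's code.

```python
import math

def select_nested(stats_by_dim, n, d0=0):
    """Schwarz-type selection among nested subsets {1,...,k}.

    stats_by_dim[k-1] is the score statistic for the first k basis
    functions; each is penalised by k * log(n), and the dimension
    maximising the penalised statistic is selected.
    """
    kmin = max(1, d0)                 # d0 = 0 and d0 = 1 coincide
    best_k, best_crit = None, -math.inf
    for k in range(kmin, len(stats_by_dim) + 1):
        crit = stats_by_dim[k - 1] - k * math.log(n)
        if crit > best_crit:
            best_k, best_crit = k, crit
    return best_k, best_crit

# Illustrative statistics for dimensions 1..5 (made-up numbers), n = 90
stats_by_dim = [6.1, 7.0, 12.4, 12.9, 13.0]
k, crit = select_nested(stats_by_dim, n=90)
print(k, round(crit, 3))
```

With all subsets rather than nested ones, the same criterion would be maximised over every admissible subset of {1,...,d} containing {1,...,d0}.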
If all subsets are used with d0=0, the data-driven test statistic is asymptotically distributed as the maximum of (generally dependent) chi-square variables with 1 d.f. This asymptotic approximation is accurate. In other cases, the statistic is asymptotically chi-square distributed with d^*=max(1,d0) degrees of freedom. For nested subsets with d^*=1 a two-term approximation may be used (see Kraus (2007b), eq. (12)); otherwise the asymptotic approximation is unreliable.
In any case, one may use permutations or the bootstrap instead.
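The permutation approximation (approx = "perm") re-randomises the group labels nsim times and counts how often the permuted statistic reaches the observed one. A generic Python sketch of this idea follows; the toy statistic (absolute difference of group means) is only a stand-in for the smooth-test statistic, which requires the full censored-data machinery.

```python
import random

def perm_pvalue(stat_fn, values, groups, nsim=2000, seed=1):
    """Permutation p-value for any two-sample statistic stat_fn.

    Shuffling the label vector preserves the group sizes, so every
    permuted data set has the same design as the original.
    """
    observed = stat_fn(values, groups)
    rng = random.Random(seed)
    hits = 0
    for _ in range(nsim):
        perm = groups[:]
        rng.shuffle(perm)
        if stat_fn(values, perm) >= observed:
            hits += 1
    return (hits + 1) / (nsim + 1)   # add-one correction

# Toy statistic: absolute difference of group means (illustration only)
def absdiff(values, groups):
    g1 = [v for v, g in zip(values, groups) if g == 1]
    g2 = [v for v, g in zip(values, groups) if g == 2]
    return abs(sum(g1) / len(g1) - sum(g2) / len(g2))

values = [1.0, 2.0, 1.5, 8.0, 9.0, 7.5]   # made-up observations
groups = [1, 1, 1, 2, 2, 2]
p = perm_pvalue(absdiff, values, groups, nsim=999)
print(p)
```

A bootstrap approximation (approx = "boot") would resample with replacement instead of permuting labels.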
If the test is data-driven, the summary method prints details on the selection procedure (the statistic and the penalised statistic for each considered subset). This is equivalent to print(x, detail = TRUE, ...).

Value

A list of class "surv2.neyman" and "neyman.test", with main components:
stat |
the test statistic. |
pval |
the p-value. |
stats, stats.penal |
the score statistic and penalised score statistic for each considered subset (only for data-driven tests). |
S.dim |
the dimension of the selected set (only for data-driven tests). |
S.set |
the selected set (only for data-driven tests). |
Most input parameters and some further components are included.
Author(s)

David Kraus (http://www.davidkraus.net/)

References
Kraus, D. (2007a) Adaptive Neyman's smooth tests of homogeneity of two samples of survival data. Research Report 2187, Institute of Information Theory and Automation, Prague. Available at http://www.davidkraus.net/surv2sample/.
Kraus, D. (2007b) Data-driven smooth tests of the proportional hazards assumption. Lifetime Data Anal. 13, 1–16.
See Also

surv2.logrank, surv2.ks, survdiff, survfit
Examples

## gastric cancer data
data(gastric)

## test with fixed dimension
surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = FALSE)

## data-driven test with nested subsets
## without minimum dimension (i.e., minimum dimension 1)
summary(surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = TRUE, subsets = "nested"))
## with minimum dimension 3
summary(surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = TRUE, subsets = "nested", d0 = 3))

## data-driven test with all subsets
## without high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = TRUE, subsets = "all"))
## with 2 high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = TRUE, subsets = "all", d0 = 2))