surv2.neyman {surv2sample}    R Documentation

Two-Sample Neyman's Smooth Test for Censored Data

Description

Compares survival distributions in two samples of censored data using (possibly data-driven) Neyman's smooth test.

Usage

surv2.neyman(x, group, data.driven = FALSE, subsets = "nested",
             d = ifelse(data.driven, 5, 3), d0 = 0,
             basis = "legendre", time.transf = "F",
             approx = "perm", nsim = 2000, choltol = 1e-07)

## S3 method for class 'surv2.neyman':
summary(object, ...)

Arguments

x a "Surv" object, as returned by the Surv function.
group a vector indicating to which group each observation belongs. May contain values 1 and 2 only.
data.driven Should the test be data-driven?
subsets the class of subsets of basis functions among which the data-driven test selects. Possible values are "nested" and "all".
d the number of basis functions for the test with fixed dimension, or the maximum number of basis functions for the data-driven test.
d0 the number of high-priority functions for the data-driven test. The selection rule selects among subsets containing basis functions 1,...,d0. For nested subsets, d0 equal to 0 or 1 is equivalent. For all subsets, d0 equal to 0 means that there is no high-priority function and any nonempty subset may be selected.
basis the basis of functions. Possible values are "legendre" for Legendre polynomials and "cos" for cosines.
time.transf the time transformation for basis functions. Possible values are "F" for the distribution function (F(t)/F(tau)) (recommended), "A" for the cumulative hazard (A(t)/A(tau)) and "I" for no transformation (the linear transformation t/tau).
approx the method of approximating the distribution of the test statistic. Possible values are "perm" for permutations, "boot" for the bootstrap, "asympt" for asymptotics.
nsim the number of simulations. This means the number of permutations or bootstrap samples when approx is "perm" or "boot". When approx is "asympt", nsim is the number of simulations to approximate the asymptotic distribution (only needed for the data-driven test with all subsets and d0 equal to 0).
choltol a tolerance parameter for the Cholesky decomposition.
object an object of class "surv2.neyman", as returned by the function surv2.neyman.
... further parameters for printing.

Details

In general, Neyman's smooth tests are based on embedding the null hypothesis in a d-dimensional alternative. The embedding is here formulated in terms of hazard functions. The logarithm of the hazard ratio is expressed as a combination of d basis functions (Legendre polynomials or cosines) in transformed time, and their significance is tested by a score test. See Kraus (2007a) for details. The quadratic test statistic is asymptotically chi-square distributed with d degrees of freedom.
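As a minimal sketch of the last point, the asymptotic p-value of the fixed-dimension test is the upper tail of the chi-square distribution with d degrees of freedom (the statistic value below is hypothetical, chosen for illustration only):

```r
## Asymptotic p-value for the fixed-dimension test: the quadratic
## score statistic is approximately chi-square with d d.f.
stat <- 7.81   # hypothetical value of the quadratic test statistic
d <- 3         # number of basis functions (the default fixed dimension)
pval <- pchisq(stat, df = d, lower.tail = FALSE)
pval
```

Since 7.81 is close to the 95% quantile of the chi-square distribution with 3 d.f., the resulting p-value is near 0.05.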

A data-driven choice of basis functions is possible. The selection is based on a Schwarz-type criterion: the selected subset is the maximiser of the penalised score statistics over a class of nonempty subsets of {1,...,d}. Either nested subsets with increasing dimension or all subsets may be used. By choosing d0>0 one requires that functions with indexes 1,...,d0 always be included in the selected subset.
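For nested subsets, the selection rule above can be sketched as follows (the score statistics and sample size are hypothetical; inside the package this computation is internal):

```r
## Sketch of the Schwarz-type selection rule for nested subsets:
## choose the dimension maximising the penalised score statistic.
n <- 90                               # hypothetical total sample size
stats <- c(2.1, 6.8, 7.0, 7.3, 7.4)   # hypothetical score statistics for dims 1..5
penal <- stats - seq_along(stats) * log(n)  # BIC-type penalty: dim * log(n)
S.dim <- which.max(penal)             # selected dimension
S.dim
```

With these values the second dimension wins: adding further basis functions increases the statistic too little to offset the log(n) penalty per function.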

If all subsets are used with d0=0, the data-driven test statistic is asymptotically distributed as the maximum of (generally dependent) chi-square variables with 1 d.f. This asymptotic approximation is accurate. In other cases, the statistic is asymptotically chi-square distributed with d^*=max(1,d0) degrees of freedom. For nested subsets with d^*=1 a two-term approximation may be used (see Kraus (2007b), eq. (12)). Otherwise the asymptotic approximation is unreliable.
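The max-of-chi-square null distribution can be approximated by simulation, which is what nsim controls in the "asympt" case. A rough sketch (the correlation matrix of the underlying score components is taken as the identity here purely for illustration; in the package it is estimated from the data):

```r
## Approximate the null distribution of max of d chi-square(1) variables
## by simulating correlated normals and squaring them.
set.seed(2)
d <- 5
R <- diag(d)  # assumed correlation matrix; identity for illustration only
Z <- matrix(rnorm(10000 * d), ncol = d) %*% chol(R)
maxchisq <- apply(Z^2, 1, max)
quantile(maxchisq, 0.95)  # approximate 95% critical value
```

Under independence the 95% critical value is noticeably larger than the chi-square(1) quantile 3.84, reflecting the multiplicity of the maximum.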

In any case, regardless of the reliability of the asymptotics, one may use the permutation or bootstrap approximation (approx = "perm" or "boot").
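The permutation approximation follows the usual scheme: recompute the statistic after randomly reassigning group labels and take the proportion of permuted statistics at least as large as the observed one. A generic sketch, using a simple placeholder statistic rather than the actual Neyman statistic:

```r
## Generic permutation p-value sketch (placeholder statistic: absolute
## difference of group means, NOT the Neyman smooth statistic).
set.seed(1)
x <- c(rnorm(20), rnorm(20, mean = 1))   # two hypothetical samples
group <- rep(1:2, each = 20)
stat.fun <- function(x, g) abs(mean(x[g == 1]) - mean(x[g == 2]))
obs <- stat.fun(x, group)
nsim <- 2000                             # number of permutations
perm <- replicate(nsim, stat.fun(x, sample(group)))
pval <- mean(perm >= obs)                # permutation p-value
```

In surv2.neyman itself the permuted quantity is the full test statistic (including, for data-driven tests, the subset selection), so the selection step is accounted for in the p-value.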

If the test is data-driven, the summary method prints details on the selection procedure (statistics and penalised statistics for each considered subset). This is equivalent to print(object, detail = TRUE, ...).

Value

A list of class "surv2.neyman" and "neyman.test", with main components:

stat the test statistic.
pval the p-value.
stats, stats.penal the score statistic and penalised score statistic for each considered subset (only for data-driven tests).
S.dim the dimension of the selected set (only for data-driven tests).
S.set the selected set (only for data-driven tests).

Most input parameters and some further components are included.

Author(s)

David Kraus (http://www.davidkraus.net/)

References

Kraus, D. (2007a) Adaptive Neyman's smooth tests of homogeneity of two samples of survival data. Research Report 2187, Institute of Information Theory and Automation, Prague. Available at http://www.davidkraus.net/surv2sample/.

Kraus, D. (2007b) Data-driven smooth tests of the proportional hazards assumption. Lifetime Data Anal. 13, 1–16.

See Also

surv2.logrank, surv2.ks, survdiff, survfit

Examples

## gastric cancer data
data(gastric)

## test with fixed dimension
surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
    data.driven = FALSE)

## data-driven test with nested subsets
## without minimum dimension (i.e., minimum dimension 1)
summary(surv2.neyman(Surv(gastric$time, gastric$event),
    gastric$treatment, data.driven = TRUE, subsets = "nested"))
## with minimum dimension 3
summary(surv2.neyman(Surv(gastric$time, gastric$event),
    gastric$treatment, data.driven = TRUE, subsets = "nested",
    d0 = 3))

## data-driven test with all subsets
## without high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event),
    gastric$treatment, data.driven = TRUE, subsets = "all"))
## with 2 high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event),
    gastric$treatment, data.driven = TRUE, subsets = "all",
    d0 = 2))

[Package surv2sample version 0.1-2 Index]