nested.stdsurv {NestedCohort} | R Documentation |
The function nested.stdsurv fits the Cox model to estimate
standardized survival curves and attributable risks for covariates
that have missing data on some cohort members. All covariates must be
factor variables.
nested.stdsurv requires knowledge of the variables that
missingness depends on, with the missingness probability modeled through a
glm sampling model. Often, the data are in the form of a
case-control sample taken within a cohort. nested.stdsurv allows
cases to have missing data, and can extract efficiency from auxiliary
variables by including them in the sampling model. nested.stdsurv
requires coxph from the survival package.
nested.stdsurv(outcome, exposures, confounders, samplingmod, data,
               exposureofinterest = "", timeofinterest = Inf,
               cuminc = FALSE, plot = FALSE, plotfilename = "",
               glmlink = binomial(link = "logit"),
               glmcontrol = glm.control(epsilon = 1e-10, maxit = 10, trace = FALSE),
               coxphcontrol = coxph.control(eps = 1e-10, iter.max = 50),
               missvarwarn = TRUE, ...)
Required arguments:
outcome |
Survival outcome of interest, must be a
Surv object |
exposures |
The part of the right side of the Cox model that parameterizes the
exposures. Never use '*' for interaction; use
interaction instead. Survival probabilities will be computed
for each level of the exposures. |
confounders |
The part of the right side of the Cox model that
parameterizes the confounders. Never use '*' for interaction; use
interaction instead. |
samplingmod |
Right side of the formula for the glm
sampling model that models the probability of missingness |
data |
Data frame that contains all the variables |
exposureofinterest |
The name of the level of the exposures for which attributable risk is desired. Default is the first level of the exposure. |
timeofinterest |
The time at which survival probabilities and attributable risks are desired. Default is the last event time. |
cuminc |
Set to TRUE for output as cumulative incidence, FALSE for survival. Default is FALSE. |
plot |
If TRUE, plot the standardized survivals. Default is FALSE. |
plotfilename |
A string for the filename to save the plot as |
glmlink |
Sampling model link function, default is logistic regression |
glmcontrol |
See glm.control |
coxphcontrol |
See coxph.control |
missvarwarn |
|
... |
Any additional arguments to be passed on to glm
or coxph |
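The exposures and confounders arguments ask for interaction rather than '*'. As a minimal sketch of what that means, the base R function interaction() combines several factors into a single factor with one level per combination. The data frame and its column names below are hypothetical and are not part of the package.

```r
# Hypothetical toy data; the column names are illustrative only.
d <- data.frame(
  sex   = factor(c("F", "M", "F", "M")),
  smoke = factor(c("no", "no", "yes", "yes"))
)

# interaction() returns one factor whose levels are the sex-by-smoke combinations.
combo <- interaction(d$sex, d$smoke)
levels(combo)

# The same term written as a right-hand-side string, in the style the
# confounders argument uses in the example for this function.
confounders_rhs <- "interaction(sex, smoke)"
```

Because every covariate must be a factor, a combined factor of this kind keeps each combination of levels as its own category.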
If nested.stdsurv reports that the sampling model "failed to converge",
the sampling model will be returned for your inspection. Note that if
some sampling probabilities are estimated at 1, the model technically
cannot converge, because the fitted probabilities only get very close to 1.
nested.stdsurv will not report non-convergence in this situation.
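To make the convergence remark concrete, here is a rough sketch of a logistic sampling model fitted directly with glm, using the same link and control defaults shown in the usage. The simulated cohort, the variable names (ec, basehist, observed) and the 1e-6 threshold are assumptions made for illustration; this is not the package's internal code.

```r
set.seed(1)
n <- 400

# Hypothetical cohort: case status and a baseline stratum.
cohort <- data.frame(
  ec       = rbinom(n, 1, 0.2),
  basehist = factor(sample(c("normal", "dysplasia"), n, replace = TRUE))
)

# Hypothetical sampling indicator: cases are measured more often than non-cases.
cohort$observed <- rbinom(n, 1, ifelse(cohort$ec == 1, 0.7, 0.3))

# Logistic sampling model for the probability of being observed.
fit <- glm(observed ~ ec * basehist,
           family  = binomial(link = "logit"),
           control = glm.control(epsilon = 1e-10, maxit = 10, trace = FALSE),
           data    = cohort)

p_hat     <- fitted(fit)              # estimated sampling probabilities
converged <- fit$converged            # glm's own convergence flag
near_one  <- sum(p_hat > 1 - 1e-6)    # fitted probabilities essentially equal to 1
```

If a stratum were sampled completely, its fitted probability would approach 1 without ever reaching it, which is the situation the note above describes.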
interaction
A list with the following components:
coxmod |
The fitted Cox model |
samplingmod |
The fitted glm sampling model |
survtable |
Standardized survival (and inference) for each exposure level |
riskdifftable |
Standardized survival (risk) differences (and inference) for each exposure level, relative to the exposure of interest. |
PARtable |
Population Attributable Risk (and inference) for the exposure of interest |
plotdata |
A matrix with data needed to plot the survivals: time, standardized survival for each exposure level, and crude survival. Name of each exposure level is converted to a proper R variable name (these are the column labels). |
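The conversion of exposure level names to proper R variable names can be illustrated with the base R function make.names. Whether nested.stdsurv uses make.names internally is an assumption here, and the level names below are hypothetical.

```r
# Hypothetical exposure level names; not taken from the zinc data.
lev <- c("Q1", "high dose", "2nd")

# make.names() replaces invalid characters with "." and prepends "X"
# when a name would otherwise not be syntactically valid.
cols <- make.names(lev)
cols
```

Names that are already valid, such as Q1, are left unchanged.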
Requires the MASS library from the VR bundle that is available from the CRAN website.
Hormuzd A. Katki
Mark, S.D. and Katki, H.A. Specifying and Implementing Nonparametric and Semiparametric Survival Estimators in Two-Stage (sampled) Cohort Studies with Missing Case Data. Journal of the American Statistical Association, 2006, 101, 460-471.
Mark, S.D. and Katki, H.A. Influence function based variance estimation and missing data issues in case-cohort studies. Lifetime Data Analysis, 2001, 7, 329-342.
Christian C. Abnet, Barry Lai, You-Lin Qiao, Stefan Vogt, Xian-Mao Luo, Philip R. Taylor, Zhi-Wei Dong, Steven D. Mark, Sanford M. Dawsey. Zinc concentration in esophageal biopsies measured by X-ray fluorescence and cancer risk. Journal of the National Cancer Institute, 2005; 97(4) 301-306
See Also: nested.coxph, zinc, nested.km, coxph, glm
## Simple analysis of zinc and esophageal cancer data:
## We sampled zinc (variable znquartiles) on a fraction of the subjects, with
## sampling fractions depending on cancer status and baseline histology.
## We observed the confounding variables on almost all subjects.
data(zinc)
mod <- nested.stdsurv(outcome="Surv(futime01,ec01==1)",
                      exposures="znquartiles",
                      confounders="sex+agestr+smoke+drink+mildysp+moddysp+sevdysp+anyhist",
                      samplingmod="ec01*basehist", exposureofinterest="Q4", data=zinc)

# This is the output:
# Standardized Survival for znquartiles by time 5893
#       Survival  StdErr 95
# Q1      0.5443 0.07232 0.3932 0.6727
# Q2      0.7595 0.07286 0.5799 0.8703
# Q3      0.7045 0.07174 0.5383 0.8203
# Q4      0.8911 0.06203 0.6863 0.9653
# Crude   0.7784 0.02491 0.7249 0.8228

# Standardized Risk Differences vs. znquartiles = Q4 by time 5893
#            Risk Difference  StdErr 95
# Q4 - Q1             0.3468 0.10376  0.143412 0.5502
# Q4 - Q2             0.1316 0.09605 -0.056694 0.3198
# Q4 - Q3             0.1866 0.09355  0.003196 0.3699
# Q4 - Crude          0.1126 0.06353 -0.011871 0.2372

# PAR if everyone had znquartiles = Q4
#            Estimate StdErr 95
# PAR          0.5084 0.2777 -0.03585 1.0526
# log(1-PAR)  -0.7100 0.5648 -0.48723 0.8375
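
The tables in the output above are linked by simple arithmetic, which the following sketch retraces from the printed survival estimates. The PAR formula used here, (crude risk - risk under Q4) / crude risk with risk = 1 - survival, is an assumption; it is not stated on this page, but it reproduces the printed estimate to within the rounding of the inputs. Standard errors and intervals are not reproduced.

```r
# Standardized survival estimates copied from the printed output above.
surv <- c(Q1 = 0.5443, Q2 = 0.7595, Q3 = 0.7045, Q4 = 0.8911, Crude = 0.7784)

# Risk differences versus the exposure of interest (Q4).
risk_diff <- unname(surv["Q4"]) - surv[c("Q1", "Q2", "Q3", "Crude")]

# Assumed PAR definition: proportion of the crude risk removed
# if everyone had znquartiles = Q4.
risk    <- 1 - surv
par_est <- unname((risk["Crude"] - risk["Q4"]) / risk["Crude"])

round(risk_diff, 4)
round(par_est, 3)
```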