ROC {DiagnosisMed}                                        R Documentation

Draw a ROC curve, estimate good cut-offs and compute validity measures for each cut-off

Description

Draws a non-parametric (empirical) ROC curve and computes test sensitivity, specificity, predictive values and likelihood ratios (with their respective confidence limits) for each decision threshold. Also estimates good decision thresholds by a variety of methods.

Usage

ROC(gold,
    test,
    CL = 0.95,
    Cost = 1,
    Prevalence = 0,
    Plot = TRUE,
    Plot.point = "Min.ROC.Dist",
    Print.full = FALSE,
    Print = TRUE)

Arguments

gold The reference standard. A column in a data frame or a vector indicating the classification by the reference test. The reference standard must have exactly two levels: either coded numerically as 0 (without the target disease) and 1 (with the target disease), or coded as a factor with the levels "negative" (without the target disease) and "positive" (with the target disease).
test The index test or test under evaluation. A column in a data frame or a vector containing the test results on a continuous scale. It may also work with results on a discrete ordinal scale.
CL Confidence level: the level used for the confidence intervals. Must be a number between 0 and 1. Default is 0.95.
Cost Cost = cost(FN)/cost(FP). This ratio is used by the MCT (misclassification cost term) - (1-prevalence)*(1-Sp) + Cost*prevalence*(1-Se) - to estimate a good cut-off. It is a value ranging from 0 to infinity and may represent a financial cost or a health outcome, under the perception that false negatives are more undesirable than false positives (or the other way around). Cost = 1 means FN and FP are equally costly. Cost = 0.9 means FP are 10 percent more costly. Cost = 0.769 means FP are 30 percent more costly. Cost = 0.555 means FP are 80 percent more costly. Cost = 0.3 means FP are about 3 times more costly. Cost = 0.2 means FP are 5 times more costly. It may also be given as a ratio, such as 1/2.5 or 1/4 (see the short sketch after this argument list).
Prevalence The disease prevalence in the population in which the test will be performed. If left at 0 (the default), it is replaced by the disease prevalence observed in the sample. This value is used in the MCT and Efficiency formulas to estimate good cut-offs.
Plot If FALSE, the ROC curve plot will not be displayed. Default is TRUE.
Plot.point The method of best cut-off estimation whose result will be displayed on the ROC curve as a dot. Default is "Min.ROC.Dist". Possible options are:
"Max.Accuracy" - the cut-off which maximize the accuracy;
"Max.DOR" - the cut-off which maximize the diagnostic odds ratio;
"Error.rate" - the cut-off which minimizes the error rate;
"Max.Accuracy.area" - the cut-off which maximize the accuracy area;
"Max.Sens+Spec" - the cut-off which maximize the sum of sensitivity with specificity;
"Max.Youden" - the cut-off which maximize the Youden index;
"Se=Sp" - the cut-off which Sensitivity is equal to Specificity;
"Min.ROC.Dist" - the cut-off which minimize the distance between the curve and the upper left corner of the graph;
"Max.Efficiency" - the cut-off which maximize the efficiency;
"Min.MCT" - the cut-off which minimize the misclassification cost term.
Print.full If TRUE, a table with sensitivity, specificity, predictive values and likelihood ratios (and their respective confidence limits) for each decision threshold will be displayed.
Print If FALSE, no results (detailed below in the Value section) will be displayed in the output window. Default is TRUE.
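
As a short sketch of how the Cost argument might be supplied (the objects Gold2 and Test_B are created in the Examples section below; the calls are illustrative, not package output):

ROC(Gold2, Test_B, Cost = 1)    # FN and FP equally costly
ROC(Gold2, Test_B, Cost = 1/4)  # FP four times more costly than FN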

Details

Test results matching the cut-off value are considered positive. ROC assumes that subjects with higher test values have the target condition and that those with lower values do not; positivity therefore corresponds to results at or above the cut-off. Tests that behave like glucose (middle values are supposed to be normal and extreme values abnormal) or like immunofluorescence (lower values - that is, higher dilutions - are supposed to be abnormal) will not be analyzed correctly. In the latter case, multiplying the test results by -1, or applying some other transformation before the analysis, may make it work (a short sketch follows below).

The result table produced with the Print.full option may have more columns than fit on the screen. R automatically wraps these columns below the table, so one has to be careful when matching the corresponding lines.

The AUC (area under the ROC curve) is estimated by the trapezoidal method (equivalent to the Mann-Whitney statistic), and its confidence interval is estimated by the DeLong method. The AUC confidence limits should be used only to compare the AUC with the null value of 0.5, not to compare AUCs from different tests. The validity measures such as sensitivity, specificity and likelihood ratios, and their confidence limits, are estimated as in the diagnosis function.
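
For instance, a minimal sketch of the sign-flip transformation mentioned above; the data frame mydata and its columns are illustrative assumptions, not objects supplied by the package:

# Flip a test in which lower values indicate disease, so that higher
# values indicate disease, as ROC expects:
mydata$Test.flipped <- -1 * mydata$Test
ROC(mydata$Gold, mydata$Test.flipped)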

Diagnostic odds ratio: DOR = (TP*TN)/(FN*FP); the same as: DOR = PLR/NLR

Accuracy area: AA = (TP*TN)/((TP+FN)*(FP+TN))

Youden index: Y = Se+Sp-1; the same as: Y = Se-FPR

Minimum ROC distance: ROC dist = (Sp-1)^2+(1-Se)^2 (the squared distance to the upper left corner; minimizing it is equivalent to minimizing the distance itself)

Efficiency: Ef = Se*prevalence+(1-prevalence)*Sp

Misclassification Cost Term: MCT = (1-prevalence)*(1-Sp)+(cost(FN)/cost(FP))*prevalence*(1-Se)
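
For illustration, these criteria can be computed directly in R from hypothetical values of sensitivity, specificity and prevalence; the objects below (Se, Sp, prev, Cost) are assumptions for this sketch, not objects created by ROC:

Se <- 0.85; Sp <- 0.90; prev <- 0.20; Cost <- 1
PLR <- Se / (1 - Sp)                   # positive likelihood ratio
NLR <- (1 - Se) / Sp                   # negative likelihood ratio
DOR <- PLR / NLR                       # diagnostic odds ratio
Youden <- Se + Sp - 1                  # Youden index
ROC.dist <- (Sp - 1)^2 + (1 - Se)^2    # squared distance to upper left corner
Ef <- Se * prev + (1 - prev) * Sp      # efficiency
MCT <- (1 - prev) * (1 - Sp) + Cost * prev * (1 - Se)  # misclassification cost term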

Value

pop.prevalence The disease prevalence supplied by the user. If not supplied, it equals the sample prevalence.
sample.prevalence The disease prevalence in the sample.
sample.size The number of subjects analyzed.
test.summary A table showing the quantiles, mean and standard deviation of the overall test results, of the results from those with the target condition, and of the results from those without it.
AUC.summary A table showing the AUC estimated by the trapezoidal method and its confidence limits estimated by the DeLong method.
test.best.cutoff A table showing the best cut-offs estimated by the methods described above, with the corresponding sensitivity, specificity and positive likelihood ratio (and their confidence limits).
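
Assuming these components are returned together as a list, as is conventional for R functions documented with a Value section, they might be inspected as follows (a sketch using the objects created in the Examples section below; only the component names above come from the package):

val <- ROC(Gold2, Test_B, Print = FALSE, Plot = FALSE)
val$AUC.summary        # the AUC and its confidence limits
val$test.best.cutoff   # the cut-offs suggested by each method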

Note

Bug reports, notices of malfunction, and suggestions for further improvements or contributions can be sent, preferably, through the DiagnosisMed email list or the R-Forge website https://r-forge.r-project.org/projects/diagnosismed/.

Author(s)

Pedro Brasil - diagnosismed-list@lists.r-forge.r-project.org

References

Knottnerus JA. The Evidence Base of Clinical Diagnosis. BMJ Books, 2002.

Zhou XH, Obuchowski NA, McClish DK. Statistical Methods in Diagnostic Medicine. Wiley, 2002.

Simel D, Samsa G, Matchar D (1991). Likelihood ratios with confidence: sample size estimation for diagnostic test studies. Journal of Clinical Epidemiology 44: 763-770.

Cantor SB, Sun CC, Tortolero-Luna G, Richards-Kortum R, Follen M (1999). A comparison of C/B ratios from studies using receiver operating characteristic curve analysis. Journal of Clinical Epidemiology 52(9): 885-892.

Greiner M (1996). Two-graph receiver operating characteristic (TG-ROC): update version supports optimisation of cut-off values that minimize overall misclassification costs. Journal of Immunological Methods 191: 93-94.

Qin G, Hotilovac L (2008). Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test. Statistical Methods in Medical Research 17: 207-221.

See Also

binom.conf.int, diagnosis, interact.ROC, performance

Examples

# loading a dataset
data(tutorial)
# The reference standard is not in the correct format.
# Recoding the reference standard to "positive" & "negative" in a new variable.
tutorial$Gold2 <- as.factor(ifelse(tutorial$Gold == "pos", "positive", "negative"))
# Attaching the data set with the modifications.
attach(tutorial)
# A little description of the data set to check if it is ok!
str(tutorial)
# Running ROC analysis with the full table option.
ROC(Gold2, Test_B, Print.full = TRUE)
# Adding a title to the graph.
title(main="ROC graph")
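
# A further sketch, using only options documented above: mark the
# cut-off that maximizes the Youden index and supply an assumed
# population prevalence of 10 percent.
ROC(Gold2, Test_B, Plot.point = "Max.Youden", Prevalence = 0.1)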

[Package DiagnosisMed version 0.1.2.3]