ROC {DiagnosisMed}                                        R Documentation
Description

Draw a non-parametric (empirical) ROC curve and compute test sensitivity, specificity, predictive values and likelihood ratios (and their respective confidence limits) for each decision threshold. Estimate a good decision threshold by a variety of methods.
Usage

ROC(gold, test, CL = 0.95, Cost = 1, Prevalence = 0, Plot = TRUE,
    Plot.point = "Min.ROC.Dist", Print.full = FALSE, Print = TRUE)
Arguments

gold
    The reference standard: a column in a data frame indicating the classification by the reference test. It must have two levels, coded either as 0 (without the target disease) and 1 (with the target disease), or as a factor with the levels "negative" (without the target disease) and "positive" (with the target disease).
test
    The index test, or test under evaluation: a column in a data frame or a vector giving the test results on a continuous scale. It may also work with a discrete ordinal scale.
CL
    Confidence limit: the coverage of the confidence interval. Must be a number between 0 and 1. Default is 0.95.
Cost
    Cost = cost(FN)/cost(FP). This value is used by the MCT (misclassification cost term) - (1-prevalence)*(1-Sp)+Cost*prevalence*(1-Se) - to estimate a good cut-off. It is a value in the range from 0 to infinity, and may represent a financial cost or a health outcome, with the perception that FN are more undesirable than FP (or the other way around). Cost = 1 means FN and FP are equally costly; Cost = 0.9 means FP are about 10 percent more costly; Cost = 0.769 means FP are 30 percent more costly; Cost = 0.555 means FP are 80 percent more costly; Cost = 0.3 means FP are about 3 times more costly; Cost = 0.2 means FP are 5 times more costly. It can also be given as a ratio, such as 1/2.5 or 1/4 (see the sketch after this argument list).
Prevalence
    Prevalence of the disease in the population in which the test will be performed. If left at 0 (the default), it is replaced by the disease prevalence in the sample. This value is used in the MCT and Efficiency formulas to estimate good cut-offs.
Plot
    If FALSE, the ROC curve plot will not be displayed. Default is TRUE.
Plot.point
    The cut-off estimation method whose result will be displayed as a point on the ROC curve. Default is "Min.ROC.Dist". Possible options are:
    "Max.Accuracy" - the cut-off that maximizes accuracy;
    "Max.DOR" - the cut-off that maximizes the diagnostic odds ratio;
    "Error.rate" - the cut-off that minimizes the error rate;
    "Max.Accuracy.area" - the cut-off that maximizes the accuracy area;
    "Max.Sens+Spec" - the cut-off that maximizes the sum of sensitivity and specificity;
    "Max.Youden" - the cut-off that maximizes the Youden index;
    "Se=Sp" - the cut-off at which sensitivity equals specificity;
    "Min.ROC.Dist" - the cut-off that minimizes the distance between the curve and the upper left corner of the graph;
    "Max.Efficiency" - the cut-off that maximizes efficiency;
    "Min.MCT" - the cut-off that minimizes the misclassification cost term.
Print.full
    If TRUE, a table with sensitivity, specificity, predictive values and likelihood ratios (and their respective confidence limits) for each decision threshold will be displayed.
Print
    If FALSE, no results (detailed in the Value section below) will be displayed in the output window. Default is TRUE.
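As a minimal sketch of the argument formats above (my.data, disease and marker are hypothetical names for illustration, not objects in the package):

# Hypothetical data frame 'my.data' with a 0/1 reference column 'disease'
# and a continuous index test 'marker'.
# The 0/1 coding may be passed directly, or recoded as a factor:
gold.f <- factor(ifelse(my.data$disease == 1, "positive", "negative"))
# Cost given as a ratio: FP assumed 2.5 times more costly than FN.
ROC(gold.f, my.data$marker, Cost = 1/2.5, Plot.point = "Min.MCT")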
Details

Test results matching the cut-off value are considered positive. ROC assumes that subjects with higher test values have the target condition and those with lower values do not. Tests that behave like glucose (middle values are supposed to be normal and extreme values abnormal) or immunofluorescence (lower values - higher dilutions - are supposed to be abnormal) will not be analyzed correctly. In the latter case, multiplying the test results by -1, or applying some other order-reversing transformation before the analysis, may make it work.

The result table produced by the Print.full option may have more columns than fit on the screen. R automatically wraps these columns below the others, so one has to be careful when matching the corresponding lines. The AUC (area under the ROC curve) is estimated by the trapezoidal method (equivalent to the Mann-Whitney statistic), and its confidence interval is estimated by the DeLong method. The AUC confidence limits should be used only to compare the AUC with the null value of 0.5, not to compare AUCs from different tests. Validity measures such as sensitivity, specificity and the likelihood ratios, and their confidence limits, are estimated as in the diagnosis function. The indices used to estimate good cut-offs are defined as:
Diagnostic odds ratio: DOR = (TP*TN)/(FN*FP), which is the same as DOR = PLR/NLR

Accuracy area: AA = (TP*TN)/((TP+FN)*(FP+TN))

Youden index: Y = Se+Sp-1, which is the same as Y = Se-FPR

Minimum ROC distance: ROC.dist = (Sp-1)^2+(1-Se)^2

Efficiency: Ef = Se*prevalence+(1-prevalence)*Sp

Misclassification cost term: MCT = (1-prevalence)*(1-Sp)+(cost(FN)/cost(FP))*prevalence*(1-Se)
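As a worked illustration of these formulas, here is a minimal sketch that evaluates each index for a single cut-off, given the counts of a 2x2 table (all counts below are made up for illustration):

# Hypothetical 2x2 counts at one cut-off: TP, FN, FP, TN.
TP <- 45; FN <- 5; FP <- 10; TN <- 40
prevalence <- (TP + FN) / (TP + FN + FP + TN)  # sample prevalence
Cost <- 1                                      # cost(FN)/cost(FP)

Se  <- TP / (TP + FN)                          # sensitivity
Sp  <- TN / (TN + FP)                          # specificity
DOR <- (TP * TN) / (FN * FP)                   # diagnostic odds ratio
AA  <- (TP * TN) / ((TP + FN) * (FP + TN))     # accuracy area
Y   <- Se + Sp - 1                             # Youden index
ROC.dist <- (Sp - 1)^2 + (1 - Se)^2            # squared distance to the (0, 1) corner
Ef  <- Se * prevalence + (1 - prevalence) * Sp # efficiency
MCT <- (1 - prevalence) * (1 - Sp) + Cost * prevalence * (1 - Se)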
Value

pop.prevalence
    The disease prevalence informed by the user. If not informed, it is the same as the sample prevalence.
sample.prevalence
    The disease prevalence in the sample.
sample.size
    The number of subjects analyzed.
test.summary
    A table showing the quantiles, mean and standard deviation of the test results: overall, for subjects with the target condition, and for subjects without the target condition.
AUC.summary
    A table showing the AUC estimated by the trapezoidal method and its confidence limits estimated by the DeLong method.
test.best.cutoff
    A table showing the best cut-offs estimated by the methods described above, with their corresponding sensitivity, specificity and positive likelihood ratio (and their confidence limits).
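Assuming the function returns these components as elements of a list (the names follow the Value entries above; Gold2 and Test_B are the objects created in the Examples section below), the output can be stored and inspected rather than only printed:

res <- ROC(Gold2, Test_B, Print = FALSE, Plot = FALSE)
res$AUC.summary       # AUC and its confidence limits
res$test.best.cutoff  # best cut-offs by the methods described above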
Bug reports, malfunctioning, or suggestions for further improvements or contributions can be sent, preferably, through the DiagnosisMed email list or the R-Forge website https://r-forge.r-project.org/projects/diagnosismed/.

Author(s)

Pedro Brasil - diagnosismed-list@lists.r-forge.r-project.org
References

Knottnerus, J.A. (2002). The Evidence Base of Clinical Diagnosis. BMJ Books.

Zhou, X.-H., Obuchowski, N.A., McClish, D.K. (2002). Statistical Methods in Diagnostic Medicine. Wiley.

Simel, D., Samsa, G., Matchar, D. (1991). Likelihood ratios with confidence: sample size estimation for diagnostic test studies. Journal of Clinical Epidemiology 44: 763-770.

Cantor, S.B., Sun, C.C., Tortolero-Luna, G., Richards-Kortum, R., Follen, M. (1999). A comparison of C/B ratios from studies using receiver operating characteristic curve analysis. Journal of Clinical Epidemiology 52(9): 885-892.

Greiner, M. (1996). Two-graph receiver operating characteristic (TG-ROC): update version supports optimisation of cut-off values that minimize overall misclassification costs. Journal of Immunological Methods 191: 93-94.

Qin, G., Hotilovac, L. (2008). Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test. Statistical Methods in Medical Research 17: 207-221.
See Also

binom.conf.int, diagnosis, interact.ROC, performance
Examples

# Loading a dataset.
data(tutorial)
# The reference standard is not in the correct format.
# Recoding the reference standard to "positive" & "negative" in a new variable.
tutorial$Gold2 <- as.factor(ifelse(tutorial$Gold == "pos", "positive", "negative"))
# Attaching the data set with the modifications.
attach(tutorial)
# A little description of the data set to check if it is ok!
str(tutorial)
# Running the ROC analysis with the full table option.
ROC(Gold2, Test_B, Print.full = TRUE)
# Adding a title to the graph.
title(main = "ROC graph")
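A further sketch along the same lines, assuming the same tutorial data, marking the cut-off that minimizes the misclassification cost term when FP are taken as 3 times more costly than FN:

# Cost = 1/3 means cost(FP) is 3 times cost(FN); mark the Min.MCT cut-off.
ROC(Gold2, Test_B, Cost = 1/3, Plot.point = "Min.MCT")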