TY - JOUR
T1 - Classification versus association models
T2 - Should the same methods apply?
AU - Feng, Ziding
N1 - Funding Information: This work is supported in part by the National Institutes of Health (U01 CA086368, P01 CA53996).
PY - 2010
Y1 - 2010
N2 - Association and classification models differ fundamentally in objectives, measurements, and clinical context specificity. Association studies aim to identify biomarker association with disease in a study population and provide etiologic insights. Common association measurements are odds ratio, hazard ratio, and correlation coefficient. Classification studies aim to evaluate biomarker use in aiding specific clinical decisions for individual patients. Common classification measurements are sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Good association is usually a necessary, but not a sufficient, condition for good classification. Methods for developing classification models have mainly used the criteria for association models, usually minimizing total classification error without consideration of clinical application settings, and therefore are not optimal for classification purposes. We suggest that developing classification models by focusing on the region of receiver operating characteristic (ROC) curve relevant to the intended clinical application optimizes the model for the intended application setting.
KW - Association
KW - Biomarkers
KW - Classification
KW - Likelihood
KW - Logistic regression
KW - Odds ratio
KW - ROC curve
KW - Sensitivity
KW - Specificity
UR - http://www.scopus.com/inward/record.url?scp=77953157234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953157234&partnerID=8YFLogxK
U2 - 10.3109/00365513.2010.493387
DO - 10.3109/00365513.2010.493387
M3 - Review article
C2 - 20515278
AN - SCOPUS:77953157234
SN - 0036-5513
VL - 70
SP - 53
EP - 58
JO - Scandinavian Journal of Clinical and Laboratory Investigation
JF - Scandinavian Journal of Clinical and Laboratory Investigation
IS - SUPPL. 242
ER -