Computer programs to estimate overoptimism in measures of discrimination for predicting the risk of cardiovascular diseases.

PubMed ID: 22409210

Journal: Journal of evaluation in clinical practice

Publication Date: April 1, 2013

Affiliation: Department of Epidemiology & Preventive Medicine, Monash University, Melbourne, Victoria, Australia. haider.mannan@monash.edu

MeSH Terms: Humans, Cardiovascular Diseases, Risk Factors, Risk Assessment, Proportional Hazards Models, Models, Statistical, Software, Victoria

Authors: Mannan HR, McNeil JJ

Cite As: Mannan HR, McNeil JJ. Computer programs to estimate overoptimism in measures of discrimination for predicting the risk of cardiovascular diseases. J Eval Clin Pract 2013 Apr;19(2):358-62. Epub 2012 Mar 12.

Abstract

BACKGROUND: The development of chronic disease risk prediction models has become a growing area of research in recent years. The internal validity of such models is sometimes lower than estimated from the development sample. Overfitting or overoptimism of the developed model and/or differences between the samples are likely causes of this. When modelling an uncommon outcome, bootstrapping is the preferred method for estimating overoptimism, which can then be used to shrink the regression coefficients and to correct the model's discrimination and calibration. However, computer programs for different types of bootstrap validation are not readily available. We developed two SAS macro programs. The first implements the simple bootstrap, which takes the Cox proportional hazards model developed in the original sample and evaluates its discriminatory performance in bootstrap samples. The second, known as stepwise bootstrap validation (which is more efficient), makes the same comparison but takes models developed by variable selection in bootstrap samples and evaluates them in the original sample. Both are illustrated through an example from cardiovascular disease (CVD) risk prediction.
METHODS: Two SAS macro programs for the Cox proportional hazards model, using PROC PHREG, were developed to estimate overoptimism in Harrell's C and Somers' D statistics. The programs were applied to data on CVD incidence from a Framingham cohort that combined the original and offspring exams. The risk factors considered were current smoking, diabetes, age, sex, systolic blood pressure, diastolic blood pressure, total cholesterol, high-density lipoprotein cholesterol, triglycerides and body mass index.
RESULTS: The degree of overoptimism in both Harrell's C and Somers' D statistics was low. Both statistics were corrected by subtracting the estimated overoptimism from their observed values. Of the two bootstrap validation algorithms, stepwise bootstrap validation gave the higher estimate of overoptimism.
CONCLUSION: The programs are very useful for evaluating the 'overoptimism corrected' predictive performance of the Cox proportional hazards model.
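
ILLUSTRATION: The correction described in the abstract (observed statistic minus estimated overoptimism) can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the authors' SAS macros: it refits a model in each bootstrap sample and compares its Harrell's C in that sample with its C in the original sample. The function names, the synthetic cohort and the toy weighting rule in fit_toy_model are all hypothetical; in particular, fit_toy_model is only a stand-in for a Cox fit (with or without variable selection) so that the example runs with the standard library alone.

```python
import math
import random


def harrell_c(times, events, scores):
    """Harrell's C: share of usable pairs in which the higher score fails first."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a usable pair needs an observed event at the earlier time
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    # Convention chosen here: no usable pairs is treated as no discrimination.
    return concordant / usable if usable else 0.5


def somers_d(c):
    """Somers' D is a rescaling of C onto the range -1 to 1."""
    return 2.0 * c - 1.0


def fit_toy_model(rows):
    """Hypothetical stand-in for a Cox fit: weight each covariate by its own D."""
    times = [r[0] for r in rows]
    events = [r[1] for r in rows]
    n_cov = len(rows[0][2])
    return [somers_d(harrell_c(times, events, [r[2][k] for r in rows]))
            for k in range(n_cov)]


def c_of(weights, rows):
    """Harrell's C of a fitted linear score evaluated on the given rows."""
    scores = [sum(w * x for w, x in zip(weights, r[2])) for r in rows]
    return harrell_c([r[0] for r in rows], [r[1] for r in rows], scores)


def optimism_corrected_c(rows, fit, n_boot=200, seed=0):
    """Return (apparent C, estimated overoptimism, corrected C)."""
    rng = random.Random(seed)
    apparent = c_of(fit(rows), rows)
    total = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(rows) for _ in rows]  # resample with replacement
        model = fit(boot)
        # Overoptimism of this replicate: C in the bootstrap sample minus C in the original.
        total += c_of(model, boot) - c_of(model, rows)
    optimism = total / n_boot
    return apparent, optimism, apparent - optimism


def make_toy_cohort(n=60, seed=1):
    """Synthetic (time, event, covariates) rows; not Framingham data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x_signal, x_noise = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        event_time = rng.expovariate(math.exp(0.8 * x_signal))
        censor_time = rng.expovariate(0.5)
        rows.append((min(event_time, censor_time),
                     event_time <= censor_time,
                     (x_signal, x_noise)))
    return rows


if __name__ == "__main__":
    apparent, optimism, corrected = optimism_corrected_c(
        make_toy_cohort(), fit_toy_model, n_boot=100)
    print("apparent C:", apparent)
    print("estimated overoptimism:", optimism)
    print("corrected C:", corrected, "corrected D:", somers_d(corrected))
```

Passing the fitting routine as an argument keeps the resampling loop independent of the model, so a real Cox fit could be substituted for the toy rule. Evaluating the original-sample model in each bootstrap sample instead, as in the simple bootstrap described above, would change only the two c_of calls inside the loop.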