Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression.

Pubmed ID: 15505893

Journal: Statistics in Medicine

Publication Date: Nov. 30, 2004

Affiliation: Department of Epidemiology and Population Health, Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK. duolao.wang@lshtm.ac.uk

MeSH Terms: Humans, Male, Female, Aged, Risk Factors, Bayes Theorem, Age Factors, Logistic Models, Middle Aged, Smoking, Body Mass Index, Coronary Disease, Sex Factors, Blood Pressure, Computer Simulation, Models, Statistical, Cholesterol, Glucose Intolerance

Authors: Wang D, Zhang W, Bakhai A

Cite As: Wang D, Zhang W, Bakhai A. Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression. Stat Med 2004 Nov 30;23(22):3451-67.

Abstract

Logistic regression is the standard method for assessing predictors of disease. In logistic regression analyses, a stepwise strategy is often adopted to choose a subset of variables, and inference about the predictors is then based only on the variables retained in the chosen model. This approach ignores both the variables not selected by the procedure and the uncertainty introduced by the selection procedure itself. This limitation may be addressed by Bayesian model averaging, which selects a subset of all possible models and uses their posterior probabilities to weight all inferences and predictions. This study compares Bayesian model averaging with stepwise procedures for selecting predictor variables in logistic regression, using simulated data sets and the Framingham Heart Study data. The results show that in most cases Bayesian model averaging selects the correct model and outperforms stepwise approaches at predicting an event of interest.
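
The idea of weighting models by their posterior probabilities can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: posterior model probabilities are approximated from BIC under equal prior model probabilities, all candidate models are enumerated rather than a selected subset, and the data are simulated here purely for illustration. The function names fit_logistic and bma_logistic are hypothetical.

```python
import itertools
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-8):
    """Fit logistic regression by Newton-Raphson; return (beta, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)                 # score vector
        hess = X.T @ (X * w[:, None])        # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    eta = X @ beta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return beta, loglik


def bma_logistic(X, y):
    """Average over all predictor subsets using BIC-based model weights.

    Assumption: exp(-BIC/2) approximates the marginal likelihood and every
    model has the same prior probability.
    """
    n, k = X.shape
    intercept = np.ones((n, 1))
    models, bics, preds = [], [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            design = np.hstack([intercept, X[:, list(subset)]])
            beta, loglik = fit_logistic(design, y)
            bics.append(-2.0 * loglik + design.shape[1] * np.log(n))
            models.append(subset)
            preds.append(1.0 / (1.0 + np.exp(-(design @ beta))))
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))   # shift by min for stability
    weights /= weights.sum()                       # approximate posterior model probabilities
    # Posterior inclusion probability: total weight of models containing predictor j.
    inclusion = np.array(
        [sum(w for m, w in zip(models, weights) if j in m) for j in range(k)]
    )
    averaged_pred = weights @ np.array(preds)      # model-averaged event probabilities
    return models, weights, inclusion, averaged_pred


# Simulated example (illustrative only): predictors 0 and 1 matter, 2 and 3 are noise.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 4))
true_eta = -0.5 + 1.2 * X[:, 0] - 1.0 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_eta))).astype(float)

models, weights, inclusion, averaged_pred = bma_logistic(X, y)
best_model = models[int(np.argmax(weights))]
print("highest-probability model:", best_model)
print("posterior inclusion probabilities:", np.round(inclusion, 3))
```

In contrast to a stepwise procedure, which reports a single chosen model, this sketch keeps every candidate model and lets the weights carry the model uncertainty into both the inclusion probabilities and the predictions. Full enumeration is feasible only for a small number of predictors, since the number of models doubles with each added variable.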