The information in covariate imbalance in studies of hormone replacement therapy

Journal: Annals of Applied Statistics

Publication Date: Dec. 1, 2021

Authors: Ruoqi Yu, Dylan S. Small, Paul R. Rosenbaum

Cite As: Yu R, Small D, Rosenbaum P. The information in covariate imbalance in studies of hormone replacement therapy. Ann Appl Stat 2021 Dec;15(4):2023-42.

Abstract

A widely noted failure of causal inference occurred when several observational studies claimed that hormone replacement therapy (HRT) reduced risk of cardiovascular disease; yet, subsequent randomized trials found an increased, not a decreased, cardiovascular risk. We take a close look at covariate imbalances in one of the observational data sets. We use some old, some recent, and some new methods, plus we update an important, simple but largely forgotten suggestion of William Cochran about screening covariates and other variables. In particular, a tapered match shows the impact on all covariates of gradually matching for additional covariates. An exterior match examines the change in the control group as additional covariates are included, and the consequences for outcomes. Because covariates are sometimes continuous, sometimes binary, sometimes ordinal, sometimes missing, we suggest keeping track of magnitudes of aggregate bias in observed covariates using a new estimate of the Kullback–Leibler information between covariate distributions in treated and matched control groups, a flexible measure with several attractive properties. The initial studies ignored some enormous imbalances in socioeconomic covariates that predict the outcomes under study. Our more comprehensive analyses mimic some post-game reanalyses done subsequent to the randomized trials; however, even these omit a large imbalance in a consequential covariate discovered by Cochran’s quick but expansive screening suggestion. Our sense is that a closer examination of covariate imbalance would not have led to a correct conclusion about the effects of HRT, but it would have heightened concerns about the magnitude of the problems in the observational studies, and it would have raised doubts about the ability of a few regression coefficients to eliminate all biases, observed and unobserved, in the comparison. 
Medical journals need to recognize that certain sources of uncertainty cannot be eliminated from certain necessary types of empirical investigation; moreover, these journals need to learn new ways to describe these sources of uncertainty with objectivity and candor.
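
For orientation, the idea of summarizing covariate imbalance by a Kullback–Leibler information can be illustrated with a simple plug-in calculation for one categorical covariate. This is a minimal sketch under stated assumptions, not the new estimator proposed in the paper, which the abstract does not specify. The function name `kl_information`, the additive smoothing constant, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def kl_information(treated, control, smoothing=0.5):
    """Plug-in estimate of KL(P_treated || P_control) for one categorical covariate.

    Missing values can be coded as their own category (for example None).
    `smoothing` is an assumed additive constant that keeps every estimated
    probability strictly positive, so the logarithm is always defined.
    """
    # Use repr as the sort key so mixed types (e.g. str and None) can be ordered.
    levels = sorted(set(treated) | set(control), key=repr)
    t_counts, c_counts = Counter(treated), Counter(control)
    t_total = len(treated) + smoothing * len(levels)
    c_total = len(control) + smoothing * len(levels)

    kl = 0.0
    for level in levels:
        p = (t_counts[level] + smoothing) / t_total  # treated-group proportion
        q = (c_counts[level] + smoothing) / c_total  # control-group proportion
        kl += p * math.log(p / q)
    return kl


# Toy data (invented for illustration): an education covariate with some
# missing values, before and after a hypothetical match.
treated = ["college"] * 65 + ["no_college"] * 30 + [None] * 5
control_unmatched = ["college"] * 35 + ["no_college"] * 55 + [None] * 10
control_matched = ["college"] * 63 + ["no_college"] * 31 + [None] * 6

kl_before = kl_information(treated, control_unmatched)
kl_after = kl_information(treated, control_matched)
print(f"KL before matching: {kl_before:.4f}")
print(f"KL after matching:  {kl_after:.4f}")
```

In this toy setting the value shrinks toward zero as the matched control group comes to resemble the treated group. Treating missingness as its own level is one simple way to put binary, ordinal, and partly missing covariates on a common scale; a continuous covariate would first need to be binned, which is a further assumption of this sketch.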