Exploring trade-offs in equitable stroke risk prediction with parity-constrained and race-free models.

Pubmed ID: 40253926

Pubmed Central ID: PMC12133243

Journal: Artificial intelligence in medicine

Publication Date: June 1, 2025

MeSH Terms: Humans; Male; Female; Aged; Risk Factors; Middle Aged; Risk Assessment; Stroke; Machine Learning; Neural Networks, Computer; Black or African American; White

Grants: N01 HC025195, N01 HC095159, U01 NS041588, UL1 RR024156, N01 HC095167, N01 HC095161, N01 HC095164, N01 HC095166, N01 HC095160, N01 HC095169, N01 HC095165, N01 HC095168, N01 HC095163, N01 HC095162, HHSN268201700004I, HHSN268201500001I, HHSN268201500001C, HHSN268201700001I, HHSN268201700003I, HHSN268201700005I, HHSN268201700002I, HHSN268201700002C, HHSN268201700005C, HHSN268201700001C, HHSN268201700003C, HHSN268201700004C, K01 MH127309, R01 HL136666, R61 NS120246, R33 NS120246

Authors: Wojdyla D, Wang H, Henao R, Pencina M, Engelhard M

Cite As: Engelhard M, Wojdyla D, Wang H, Pencina M, Henao R. Exploring trade-offs in equitable stroke risk prediction with parity-constrained and race-free models. Artif Intell Med 2025 Jun;164:103130. Epub 2025 Apr 10.

Abstract

A recent analysis of common stroke risk prediction models showed that performance differs between Black and White subgroups, and that applying standard machine learning methods does not reduce these disparities. There have been calls in the clinical literature to correct such disparities by removing race as a predictor (i.e., race-free models). Alternatively, a variety of machine learning methods have been proposed to constrain differences in model predictions between racial groups. In this work, we compare these approaches for equitable stroke risk prediction. We begin by proposing a discrete-time, neural network-based time-to-event model that incorporates a parity constraint designed to make predictions more similar between groups. Using harmonized data from the Framingham Offspring, MESA, and ARIC studies, we develop both parity-constrained and unconstrained stroke risk prediction models, then compare their performance with race-free models in a held-out test set and a secondary validation set (REGARDS). Our evaluation includes both intra-group and inter-group performance metrics for right-censored time-to-event outcomes. Results illustrate a fundamental trade-off in which parity-constrained models must sacrifice intra-group calibration to improve inter-group discrimination performance, while the race-free models strike a balance between the two. Consequently, the choice of model must depend on the potential benefits and harms associated with the intended clinical use. All models, as well as code implementing our approach, are available in a public repository. More broadly, these results provide a roadmap for development of equitable clinical risk prediction models and illustrate both merits and limitations of a race-free approach.
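To make the idea of a parity-constrained, discrete-time time-to-event objective more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the form of the constraint, so the squared difference in mean predicted risk between two groups, the weight `lam`, and all function names here are illustrative assumptions. The sketch also assumes that a censored subject survived the interval in which censoring occurred, and it takes the per-interval hazards as given (in practice they would come from a neural network's sigmoid outputs).

```python
import numpy as np

def discrete_time_nll(hazards, event_bin, observed):
    """Mean negative log-likelihood for right-censored discrete-time data.

    hazards:   (n, T) array of per-interval conditional event probabilities
    event_bin: (n,) int array, interval index of the event or of censoring
    observed:  (n,) bool array, True for an event, False for censoring
    """
    n, T = hazards.shape
    eps = 1e-12  # guards against log(0)
    # Intervals strictly before event_bin were survived by every subject.
    survived = np.arange(T)[None, :] < event_bin[:, None]
    ll = np.sum(survived * np.log(1.0 - hazards + eps), axis=1)
    # Final interval: event contributes log(h); censoring contributes
    # log(1 - h) (assumption: censored subjects survived that interval).
    h_last = hazards[np.arange(n), event_bin]
    ll += np.where(observed, np.log(h_last + eps), np.log(1.0 - h_last + eps))
    return -ll.mean()

def cumulative_risk(hazards):
    """Predicted probability of an event by the end of the last interval."""
    return 1.0 - np.prod(1.0 - hazards, axis=1)

def parity_penalty(hazards, group):
    """Hypothetical parity term: squared gap in mean predicted risk."""
    risk = cumulative_risk(hazards)
    return (risk[group == 1].mean() - risk[group == 0].mean()) ** 2

def penalized_loss(hazards, event_bin, observed, group, lam=1.0):
    """Likelihood term plus lam times the parity term (lam is assumed)."""
    nll = discrete_time_nll(hazards, event_bin, observed)
    return nll + lam * parity_penalty(hazards, group)
```

Under these assumptions, raising `lam` pulls the groups' average predicted risks together at the expense of the likelihood term, which is one way to picture the trade-off between inter-group parity and intra-group calibration that the abstract describes.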