The BIOPREVENT machine-learning algorithm predicts chronic graft-versus-host disease and mortality risk using posttransplant biomarkers.
PubMed ID: 41697751
PubMed Central ID: PMC12904722
Journal: The Journal of clinical investigation
Publication Date: Feb. 16, 2026
MeSH Terms: Humans, Male, Adult, Female, Algorithms, Middle Aged, Chronic Disease, Graft vs Host Disease, Biomarkers, Hematopoietic Stem Cell Transplantation, Machine Learning
Grants: R01 CA168814, R01 CA264921
Authors: Logan BR, Paczesny S, Ritz J, Martens MJ, Dutta D, Yu Y, Rein LE
Cite As: Martens MJ, Dutta D, Yu Y, Rein LE, Ritz J, Logan BR, Paczesny S. The BIOPREVENT machine-learning algorithm predicts chronic graft-versus-host disease and mortality risk using posttransplant biomarkers. J Clin Invest 2026 Feb 16;136(4). doi: 10.1172/JCI195228. eCollection 2026 Feb 16.
Studies:
- Blood and Marrow Clinical Trials Network (BMT CTN) Comparing Peripheral Blood Stem Cell Transplantation Versus Bone Marrow Transplantation in Individuals With Hematologic Cancers (0201)
- Blood and Marrow Clinical Trials Network (BMT CTN) Prospective Multi-Center Cohort for the Evaluation of Biomarkers Predicting Risk of Complications and Mortality Following Allogeneic HCT (1202)
Abstract
BACKGROUND: Chronic graft-versus-host disease (cGVHD) is a major contributor to nonrelapse mortality (NRM) following hematopoietic cell transplantation (HCT). Whether machine-learning (ML) models with biomarkers improve the accuracy of predicting future cGVHD/NRM is not established.
METHODS: We developed BIOPREVENT (BIOmarkers PREVENTion), an ML algorithm using data from 1,310 HCT recipients, incorporating 7 plasma proteins measured at Day 90/100 post-HCT and 9 clinical variables. Patients were divided into training and validation datasets. ML models, including CoxXGBoost, Group SCAD, Adaptive Group Lasso, Random Survival Forests, and Bayesian Additive Regression Trees (BART), were used to estimate the time-varying area under the ROC curve (AUCt) at Days 180, 270, 360, and 540. Deep learning models were also evaluated.
RESULTS: ML models with biomarkers outperformed clinical-only models for predicting cGVHD, with BART and CoxXGBoost achieving AUCt greater than 0.65 at 1 year. For NRM, models with biomarkers achieved AUCt ranging from 0.75 to 0.91. Deep learning did not outperform other ML approaches. BART consistently demonstrated high predictive accuracy and was selected for the final BIOPREVENT model. Calibration curves aligned with observed values. Variable importance analysis identified MMP3 and CXCL9 as key for cGVHD prediction and IL1RL1 and sCD163 for NRM. Cumulative incidences of cGVHD and NRM differed significantly based on BIOPREVENT-defined cutpoints.
CONCLUSION: BIOPREVENT accurately predicts individual risk of future cGVHD and NRM using biomarkers at 3 months post-HCT. A publicly available R Shiny web application supports its clinical use. Further studies are needed to explore its role in guiding preemptive therapy.
TRIAL REGISTRATION: BMTCTN 0201, BMTCTN 1202, and NCT02194439.
FUNDING: R01CA264921, U10HL069294, U24HL138660, R01HD074587, and P01HL158505.
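As a rough illustration of what a time-varying AUC at a fixed horizon measures, the Python sketch below computes a naive cumulative/dynamic AUC: among patients whose status at the horizon is known, it is the fraction of (case, control) pairs in which the case received the higher risk score. This is not the estimator used in the study. It is a simplified sketch that assumes no censoring weights and no competing-risk handling, and the function name, variable names, and toy data are all hypothetical.

```python
def cumulative_dynamic_auc(times, events, scores, horizon):
    """Naive cumulative/dynamic AUC at `horizon` (illustrative only).

    Cases:    event observed at or before the horizon.
    Controls: still event-free and under follow-up after the horizon.
    Patients censored before the horizon are dropped, which a proper
    estimator would instead handle with censoring weights (assumption:
    this sketch ignores that correction and any competing risks).
    """
    cases = [s for t, e, s in zip(times, events, scores) if e and t <= horizon]
    controls = [s for t, s in zip(times, scores) if t > horizon]
    if not cases or not controls:
        return None  # AUC is undefined without both groups

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


# Hypothetical toy cohort: follow-up day, event flag (1 = event), risk score.
toy_times = [100, 150, 200, 300, 400, 500, 600, 600]
toy_events = [1, 1, 0, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.45, 0.4, 0.5, 0.2, 0.1]

# Evaluate at the horizons named in the abstract.
auc_by_day = {
    day: cumulative_dynamic_auc(toy_times, toy_events, toy_scores, day)
    for day in (180, 270, 360, 540)
}
```

An AUCt of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking at that horizon, which is the scale on which the reported values (for example, greater than 0.65 for cGVHD at 1 year) should be read.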