TY - JOUR
T1 - Comparison among random forest, logistic regression, and existing clinical risk scores for predicting outcomes in patients with atrial fibrillation
T2 - A report from the J-RHYTHM registry
AU - Watanabe, Eiichi
AU - Noyama, Shunsuke
AU - Kiyono, Ken
AU - Inoue, Hiroshi
AU - Atarashi, Hirotsugu
AU - Okumura, Ken
AU - Yamashita, Takeshi
AU - Lip, Gregory Y.H.
AU - Kodani, Eitaro
AU - Origasa, Hideki
N1 - Funding Information:
This study was planned by the Japanese Society of Electrocardiology and supported by grants from the Japan Heart Foundation and Grants‐in‐Aid for Scientific Research from the Japan Society for the Promotion of Science (Grant Number 21K08140). This study was registered in the University Hospital Medical Information Network Clinical Trials Registry 000001569. The authors thank the J‐RHYTHM Registry staff and patients for their contribution to this work.
Funding Information:
Dr. Watanabe received lecture fees from Daiichi‐Sankyo; Dr. Noyama has no COI; Dr. Kiyono received research funding from Kurabo Industries Ltd.; Dr. Inoue received remuneration from Daiichi‐Sankyo, Bayer Healthcare and Bristol‐Myers Squibb; Dr. Atarashi received lecture fees from Daiichi‐Sankyo; Dr. Okumura received research funding from Boehringer Ingelheim and Daiichi‐Sankyo, and remuneration from Boehringer Ingelheim, Bayer Healthcare, Daiichi‐Sankyo, and Pfizer; Dr. Yamashita received research funding from Daiichi‐Sankyo, Bayer Healthcare and Bristol‐Myers Squibb, and remuneration from Boehringer Ingelheim, Daiichi‐Sankyo, Bayer Healthcare, Pfizer, Bristol‐Myers Squibb, Ono Pharmaceutical and Toa Eiyo; Dr. Lip served as a consultant and speaker for BMS/Pfizer, Boehringer Ingelheim and Daiichi‐Sankyo; no fees are received personally; Dr. Kodani received lecture fees from Daiichi‐Sankyo; Dr. Origasa received lecture fees from Daiichi‐Sankyo.
Publisher Copyright:
© 2021 The Authors. Clinical Cardiology published by Wiley Periodicals LLC.
PY - 2021/9
Y1 - 2021/9
N2 - Background: Machine learning (ML) has emerged as a promising tool for risk stratification. However, few studies have applied ML to risk assessment of patients with atrial fibrillation (AF). Hypothesis: We aimed to compare the performance of random forest (RF), logistic regression (LR), and conventional risk schemes in predicting the outcomes of AF. Methods: We analyzed data from 7406 nonvalvular AF patients (median age 71 years, female 29.2%) who were enrolled in a nationwide AF registry (J-RHYTHM Registry) and followed for 2 years. The endpoints were thromboembolisms, major bleeding, and all-cause mortality. Models were generated from potential predictors using an RF model, stepwise LR model, and the thromboembolism (CHADS2 and CHA2DS2-VASc) and major bleeding (HAS-BLED, ORBIT, and ATRIA) scores. Results: For thromboembolisms, the C-statistic of the RF model was significantly higher than that of the LR model (0.66 vs. 0.59, p = .03) or CHA2DS2-VASc score (0.61, p < .01). For major bleeding, the C-statistic of RF was comparable to the LR (0.69 vs. 0.66, p = .07) and outperformed the HAS-BLED (0.61, p < .01) and ATRIA (0.62, p < .01) but not the ORBIT (0.67, p = .07). The C-statistic of RF for all-cause mortality was comparable to the LR (0.78 vs. 0.79, p = .21). The calibration plot for the RF model was more aligned with the observed events for major bleeding and all-cause mortality. Conclusions: The RF model performed as well as or better than the LR model or existing clinical risk scores for predicting clinical outcomes of AF.
AB - Background: Machine learning (ML) has emerged as a promising tool for risk stratification. However, few studies have applied ML to risk assessment of patients with atrial fibrillation (AF). Hypothesis: We aimed to compare the performance of random forest (RF), logistic regression (LR), and conventional risk schemes in predicting the outcomes of AF. Methods: We analyzed data from 7406 nonvalvular AF patients (median age 71 years, female 29.2%) who were enrolled in a nationwide AF registry (J-RHYTHM Registry) and followed for 2 years. The endpoints were thromboembolisms, major bleeding, and all-cause mortality. Models were generated from potential predictors using an RF model, stepwise LR model, and the thromboembolism (CHADS2 and CHA2DS2-VASc) and major bleeding (HAS-BLED, ORBIT, and ATRIA) scores. Results: For thromboembolisms, the C-statistic of the RF model was significantly higher than that of the LR model (0.66 vs. 0.59, p = .03) or CHA2DS2-VASc score (0.61, p < .01). For major bleeding, the C-statistic of RF was comparable to the LR (0.69 vs. 0.66, p = .07) and outperformed the HAS-BLED (0.61, p < .01) and ATRIA (0.62, p < .01) but not the ORBIT (0.67, p = .07). The C-statistic of RF for all-cause mortality was comparable to the LR (0.78 vs. 0.79, p = .21). The calibration plot for the RF model was more aligned with the observed events for major bleeding and all-cause mortality. Conclusions: The RF model performed as well as or better than the LR model or existing clinical risk scores for predicting clinical outcomes of AF.
UR - http://www.scopus.com/inward/record.url?scp=85111398093&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111398093&partnerID=8YFLogxK
U2 - 10.1002/clc.23688
DO - 10.1002/clc.23688
M3 - Article
C2 - 34318510
AN - SCOPUS:85111398093
VL - 44
SP - 1305
EP - 1315
JO - Clinical Cardiology
JF - Clinical Cardiology
SN - 0160-9289
IS - 9
ER -