Assessing knowledge about medical physics in language-generative AI with large language model: using the medical physicist exam

Noriyuki Kadoya, Kazuhiro Arai, Shohei Tanaka, Yuto Kimura, Ryota Tozuka, Keisuke Yasui, Naoki Hayashi, Yoshiyuki Katsuta, Haruna Takahashi, Koki Inoue, Keiichi Jingu

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This study aimed to evaluate how well language-generative AI based on large language models answers the Japanese medical physicist examination and to provide a benchmark of its knowledge about medical physics. We used questions from Japan’s 2018, 2019, 2020, 2021 and 2022 medical physicist board examinations, which covered various question types, including multiple-choice questions, and mainly focused on general medicine and medical physics. ChatGPT-3.5 and ChatGPT-4 (OpenAI) were used. We compared the AI-generated answers with the correct ones. The average accuracy rates were 42.2 ± 2.5% (ChatGPT-3.5) and 72.7 ± 2.6% (ChatGPT-4), showing that ChatGPT-4 was more accurate than ChatGPT-3.5 (p < 0.05 in all categories except radiation-related laws and recommendations/medical ethics). Even with ChatGPT-4, the more accurate model, the accuracy rates were less than 60% in two categories: radiation metrology (55.6%) and radiation-related laws and recommendations/medical ethics (40.0%). These data provide a benchmark for knowledge about medical physics in ChatGPT and can be utilized as basic data for the development of various medical physics tools using ChatGPT (e.g., radiation therapy support tools with Japanese input).
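The scoring described in the abstract (comparing model answers with the answer key, then averaging accuracy across examination years) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy answer lists, and the use of the sample standard deviation for the "±" term are all assumptions made for illustration, and the numbers below are not the study's data.

```python
from statistics import mean, stdev


def accuracy(predicted, answer_key):
    """Percentage of questions where the model's choice matches the key."""
    if len(predicted) != len(answer_key):
        raise ValueError("predicted and answer_key must have the same length")
    correct = sum(p == k for p, k in zip(predicted, answer_key))
    return 100.0 * correct / len(answer_key)


def summarize(per_year_accuracy):
    """Mean and sample standard deviation of the yearly accuracy rates."""
    rates = list(per_year_accuracy.values())
    return mean(rates), stdev(rates)


# Hypothetical toy data (assumption): NOT the examination questions or results.
toy_key = {2018: ["a", "c", "b", "d"], 2019: ["b", "b", "a", "e"]}
toy_model = {2018: ["a", "c", "d", "d"], 2019: ["b", "a", "a", "c"]}

per_year = {year: accuracy(toy_model[year], toy_key[year]) for year in toy_key}
avg, sd = summarize(per_year)
print(f"{avg:.1f} ± {sd:.1f}%")
```

With real data, each year's list would hold one entry per examination question, and the same summary could be computed separately for each question category.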

Original language: English
Pages (from-to): 929-937
Number of pages: 9
Journal: Radiological Physics and Technology
Volume: 17
Issue number: 4
Publication status: Published - 12-2024

All Science Journal Classification (ASJC) codes

  • Radiation
  • Physical Therapy, Sports Therapy and Rehabilitation
  • Radiology, Nuclear Medicine and Imaging
