Performance of ChatGPT as an AI-assisted decision support tool in medicine: a proof-of-concept study for interpreting symptoms and management of common cardiac conditions (AMSTELHEART-2)

Ralf E. Harskamp; Lukas de Clercq

doi:10.1080/00015385.2024.2303528

Research output: Contribution to journal › Article › Academic › peer-review

2 Citations (Scopus)

Abstract

Background: It is thought that ChatGPT, an advanced language model developed by OpenAI, may in the future serve as an AI-assisted decision support tool in medicine. Objective: To evaluate the accuracy of ChatGPT’s recommendations on medical questions related to common cardiac symptoms or conditions. Methods: We tested ChatGPT’s ability to address medical questions in two ways. First, we assessed its accuracy in correctly answering cardiovascular trivia questions (n = 50), based on quizzes for medical professionals. Second, we entered 20 clinical case vignettes on the ChatGPT platform and evaluated its accuracy compared to expert opinion and clinical course. Lastly, we compared the latest research version (v3.5; 27 September 2023) with a prior version (v3.5; 30 January 2023) to evaluate improvement over time. Results: We found that ChatGPT latest version correctly answered 92% of the trivia questions, with slight variation in accuracy in the domains coronary artery disease (100%), pulmonary and venous thrombotic embolism (100%), atrial fibrillation (90%), heart failure (90%) and cardiovascular risk management (80%). In the 20 case vignettes, ChatGPT’s response matched in 17 (85%) of the cases with the actual advice given. Straightforward patient-to-physician questions were all answered correctly (10/10). In more complex cases, where physicians (general practitioners) asked other physicians (cardiologists) for assistance or decision support, ChatGPT was correct in 70% of cases, and otherwise provided incomplete, inconclusive, or inappropriate recommendations when compared with expert consultation. ChatGPT showed significant improvement over time; as the January version correctly answered 74% (vs 92%) of trivia questions (p = 0.031), and correctly answered a mere 50% of complex cases. 
Conclusions: Our study suggests that ChatGPT has potential as an AI-assisted decision support tool in medicine, particularly for straightforward, low-complex medical questions, but further research is needed to fully evaluate its potential.
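The percentages reported in the abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check, not the authors' analysis. It assumes the 50 trivia questions were split evenly across the five domains (10 per domain), which the abstract does not state. The variable names are illustrative.

```python
# Consistency check on the figures quoted in the abstract.
# ASSUMPTION: 10 trivia questions per domain (5 domains x 10 = 50);
# the abstract gives only the total (n = 50) and per-domain percentages.
TRIVIA_TOTAL = 50
QUESTIONS_PER_DOMAIN = 10

domain_accuracy_pct = {
    "coronary artery disease": 100,
    "pulmonary and venous thrombotic embolism": 100,
    "atrial fibrillation": 90,
    "heart failure": 90,
    "cardiovascular risk management": 80,
}


def correct_count(pct: float, n: int) -> int:
    """Convert a reported percentage back into a whole number of correct answers."""
    return round(pct * n / 100)


# Sum the per-domain counts and compare with the overall trivia accuracy.
trivia_correct = sum(
    correct_count(pct, QUESTIONS_PER_DOMAIN) for pct in domain_accuracy_pct.values()
)
trivia_pct = 100 * trivia_correct / TRIVIA_TOTAL

# Case vignettes: 17 of 20 matched overall, and all 10 straightforward
# patient-to-physician questions were correct, so the rest are complex cases.
VIGNETTES_TOTAL = 20
vignettes_correct = 17
simple_total, simple_correct = 10, 10
complex_total = VIGNETTES_TOTAL - simple_total
complex_correct = vignettes_correct - simple_correct
complex_pct = 100 * complex_correct / complex_total

# The January version's 74% of 50 trivia questions, as a count.
january_trivia_correct = correct_count(74, TRIVIA_TOTAL)

print(f"Trivia: {trivia_correct}/{TRIVIA_TOTAL} = {trivia_pct:.0f}%")
print(f"Complex cases: {complex_correct}/{complex_total} = {complex_pct:.0f}%")
print(f"January trivia: {january_trivia_correct}/{TRIVIA_TOTAL}")
```

Under the even-split assumption, the per-domain figures add up to the reported 92% overall, and the vignette counts reproduce the reported 70% for complex cases.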

Original language	English
Journal	Acta Cardiologica
Early online date	2024
DOIs	https://doi.org/10.1080/00015385.2024.2303528
Publication status	E-pub ahead of print - 2024

Keywords

Artificial intelligence
cardiovascular medicine
chatbot
ehealth

Cite this

@article{b8f31f7ab8894045b688a68a04dc7dd0,

title = "Performance of ChatGPT as an AI-assisted decision support tool in medicine: a proof-of-concept study for interpreting symptoms and management of common cardiac conditions (AMSTELHEART-2)",

abstract = "Background: It is thought that ChatGPT, an advanced language model developed by OpenAI, may in the future serve as an AI-assisted decision support tool in medicine. Objective: To evaluate the accuracy of ChatGPT{\textquoteright}s recommendations on medical questions related to common cardiac symptoms or conditions. Methods: We tested ChatGPT{\textquoteright}s ability to address medical questions in two ways. First, we assessed its accuracy in correctly answering cardiovascular trivia questions (n = 50), based on quizzes for medical professionals. Second, we entered 20 clinical case vignettes on the ChatGPT platform and evaluated its accuracy compared to expert opinion and clinical course. Lastly, we compared the latest research version (v3.5; 27 September 2023) with a prior version (v3.5; 30 January 2023) to evaluate improvement over time. Results: We found that ChatGPT latest version correctly answered 92% of the trivia questions, with slight variation in accuracy in the domains coronary artery disease (100%), pulmonary and venous thrombotic embolism (100%), atrial fibrillation (90%), heart failure (90%) and cardiovascular risk management (80%). In the 20 case vignettes, ChatGPT{\textquoteright}s response matched in 17 (85%) of the cases with the actual advice given. Straightforward patient-to-physician questions were all answered correctly (10/10). In more complex cases, where physicians (general practitioners) asked other physicians (cardiologists) for assistance or decision support, ChatGPT was correct in 70% of cases, and otherwise provided incomplete, inconclusive, or inappropriate recommendations when compared with expert consultation. ChatGPT showed significant improvement over time; as the January version correctly answered 74% (vs 92%) of trivia questions (p = 0.031), and correctly answered a mere 50% of complex cases. 
Conclusions: Our study suggests that ChatGPT has potential as an AI-assisted decision support tool in medicine, particularly for straightforward, low-complex medical questions, but further research is needed to fully evaluate its potential.",

keywords = "Artificial intelligence, cardiovascular medicine, chatbot, ehealth",

author = "Harskamp, {Ralf E.} and {de Clercq}, Lukas",

note = "Publisher Copyright: {\textcopyright} 2024 The Author(s). Published by Informa UK Limited, trading as Taylor \& Francis Group.",

year = "2024",

doi = "10.1080/00015385.2024.2303528",

language = "English",

journal = "Acta Cardiologica",

issn = "0001-5385",

publisher = "Taylor \& Francis",

}

TY - JOUR

T1 - Performance of ChatGPT as an AI-assisted decision support tool in medicine

T2 - a proof-of-concept study for interpreting symptoms and management of common cardiac conditions (AMSTELHEART-2)

AU - Harskamp, Ralf E.

AU - de Clercq, Lukas

PY - 2024

Y1 - 2024

AB - Background: It is thought that ChatGPT, an advanced language model developed by OpenAI, may in the future serve as an AI-assisted decision support tool in medicine. Objective: To evaluate the accuracy of ChatGPT’s recommendations on medical questions related to common cardiac symptoms or conditions. Methods: We tested ChatGPT’s ability to address medical questions in two ways. First, we assessed its accuracy in correctly answering cardiovascular trivia questions (n = 50), based on quizzes for medical professionals. Second, we entered 20 clinical case vignettes on the ChatGPT platform and evaluated its accuracy compared to expert opinion and clinical course. Lastly, we compared the latest research version (v3.5; 27 September 2023) with a prior version (v3.5; 30 January 2023) to evaluate improvement over time. Results: We found that ChatGPT latest version correctly answered 92% of the trivia questions, with slight variation in accuracy in the domains coronary artery disease (100%), pulmonary and venous thrombotic embolism (100%), atrial fibrillation (90%), heart failure (90%) and cardiovascular risk management (80%). In the 20 case vignettes, ChatGPT’s response matched in 17 (85%) of the cases with the actual advice given. Straightforward patient-to-physician questions were all answered correctly (10/10). In more complex cases, where physicians (general practitioners) asked other physicians (cardiologists) for assistance or decision support, ChatGPT was correct in 70% of cases, and otherwise provided incomplete, inconclusive, or inappropriate recommendations when compared with expert consultation. ChatGPT showed significant improvement over time; as the January version correctly answered 74% (vs 92%) of trivia questions (p = 0.031), and correctly answered a mere 50% of complex cases. 
Conclusions: Our study suggests that ChatGPT has potential as an AI-assisted decision support tool in medicine, particularly for straightforward, low-complex medical questions, but further research is needed to fully evaluate its potential.

KW - Artificial intelligence

KW - cardiovascular medicine

KW - chatbot

KW - ehealth

UR - http://www.scopus.com/inward/record.url?scp=85185494224&partnerID=8YFLogxK

U2 - 10.1080/00015385.2024.2303528

DO - 10.1080/00015385.2024.2303528

M3 - Article

C2 - 38348835

SN - 0001-5385

JO - Acta Cardiologica

JF - Acta Cardiologica

ER -
