Physician Assessment of ChatGPT and Bing Answers to American Cancer Society's Questions to Ask about Your Cancer

James R. Janopaul-Naylor; Andee Koo; David C. Qian; Neal S. McCall; Yuan Liu; Sagar A. Patel

doi:10.1097/COC.0000000000001050

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Objectives: Artificial intelligence (AI) chatbots are a new, publicly available tool for patients to access health care-related information with unknown reliability related to cancer-related questions. This study assesses the quality of responses to common questions for patients with cancer. Methods: From February to March 2023, we queried chat generative pretrained transformer (ChatGPT) from OpenAI and Bing AI from Microsoft questions from the American Cancer Society's recommended "Questions to Ask About Your Cancer" customized for all stages of breast, colon, lung, and prostate cancer. Questions were, in addition, grouped by type (prognosis, treatment, or miscellaneous). The quality of AI chatbot responses was assessed by an expert panel using the validated DISCERN criteria. Results: Of the 117 questions presented to ChatGPT and Bing, the average scores for all questions were 3.9 and 3.2, respectively (P < 0.001) and the overall DISCERN scores were 4.1 and 4.4, respectively. By disease site, the average scores for ChatGPT and Bing, respectively, were 3.9 and 3.6 for prostate cancer (P = 0.02), 3.7 and 3.3 for lung cancer (P < 0.001), 4.1 and 2.9 for breast cancer (P < 0.001), and 3.8 and 3.0 for colorectal cancer (P < 0.001). By type of question, the average scores for ChatGPT and Bing, respectively, were 3.6 and 3.4 for prognostic questions (P = 0.12), 3.9 and 3.1 for treatment questions (P < 0.001), and 4.2 and 3.3 for miscellaneous questions (P = 0.001). For 3 responses (3%) by ChatGPT and 18 responses (15%) by Bing, at least one panelist rated them as having serious or extensive shortcomings. Conclusions: AI chatbots provide multiple opportunities for innovating health care. This analysis suggests a critical need, particularly around cancer prognostication, for continual refinement to limit misleading counseling, confusion, and emotional distress to patients and families.

Original language	English (US)
Pages (from-to)	17-21
Number of pages	5
Journal	American Journal of Clinical Oncology: Cancer Clinical Trials
Volume	47
Issue number	1
DOIs	https://doi.org/10.1097/COC.0000000000001050
State	Published - Jan 1 2024
Externally published	Yes

Keywords

artificial intelligence
ChatGPT
health literacy
patient information

ASJC Scopus subject areas

Oncology
Cancer Research

Cite this

@article{8d47ff4ac54b46f9a6395870ed8f0cfe,

title = "Physician Assessment of ChatGPT and Bing Answers to American Cancer Society's Questions to Ask about Your Cancer",

abstract = "Objectives: Artificial intelligence (AI) chatbots are a new, publicly available tool for patients to access health care-related information with unknown reliability related to cancer-related questions. This study assesses the quality of responses to common questions for patients with cancer. Methods: From February to March 2023, we queried chat generative pretrained transformer (ChatGPT) from OpenAI and Bing AI from Microsoft questions from the American Cancer Society's recommended {"}Questions to Ask About Your Cancer{"} customized for all stages of breast, colon, lung, and prostate cancer. Questions were, in addition, grouped by type (prognosis, treatment, or miscellaneous). The quality of AI chatbot responses was assessed by an expert panel using the validated DISCERN criteria. Results: Of the 117 questions presented to ChatGPT and Bing, the average score for all questions were 3.9 and 3.2, respectively (P < 0.001) and the overall DISCERN scores were 4.1 and 4.4, respectively. By disease site, the average score for ChatGPT and Bing, respectively, were 3.9 and 3.6 for prostate cancer (P = 0.02), 3.7 and 3.3 for lung cancer (P < 0.001), 4.1 and 2.9 for breast cancer (P < 0.001), and 3.8 and 3.0 for colorectal cancer (P < 0.001). By type of question, the average score for ChatGPT and Bing, respectively, were 3.6 and 3.4 for prognostic questions (P = 0.12), 3.9 and 3.1 for treatment questions (P < 0.001), and 4.2 and 3.3 for miscellaneous questions (P = 0.001). For 3 responses (3%) by ChatGPT and 18 responses (15%) by Bing, at least one panelist rated them as having serious or extensive shortcomings. Conclusions: AI chatbots provide multiple opportunities for innovating health care. This analysis suggests a critical need, particularly around cancer prognostication, for continual refinement to limit misleading counseling, confusion, and emotional distress to patients and families.",

keywords = "artificial intelligence, ChatGPT, health literacy, patient information",

author = "Janopaul-Naylor, {James R.} and Andee Koo and Qian, {David C.} and McCall, {Neal S.} and Yuan Liu and Patel, {Sagar A.}",

year = "2024",

month = jan,

day = "1",

doi = "10.1097/COC.0000000000001050",

language = "English (US)",

volume = "47",

pages = "17--21",

journal = "American Journal of Clinical Oncology: Cancer Clinical Trials",

issn = "0277-3732",

publisher = "Lippincott Williams and Wilkins",

number = "1",

}

TY - JOUR

T1 - Physician Assessment of ChatGPT and Bing Answers to American Cancer Society's Questions to Ask about Your Cancer

AU - Janopaul-Naylor, James R.

AU - Koo, Andee

AU - Qian, David C.

AU - McCall, Neal S.

AU - Liu, Yuan

AU - Patel, Sagar A.

PY - 2024/1/1

Y1 - 2024/1/1

AB - Objectives: Artificial intelligence (AI) chatbots are a new, publicly available tool for patients to access health care-related information with unknown reliability related to cancer-related questions. This study assesses the quality of responses to common questions for patients with cancer. Methods: From February to March 2023, we queried chat generative pretrained transformer (ChatGPT) from OpenAI and Bing AI from Microsoft questions from the American Cancer Society's recommended "Questions to Ask About Your Cancer" customized for all stages of breast, colon, lung, and prostate cancer. Questions were, in addition, grouped by type (prognosis, treatment, or miscellaneous). The quality of AI chatbot responses was assessed by an expert panel using the validated DISCERN criteria. Results: Of the 117 questions presented to ChatGPT and Bing, the average score for all questions were 3.9 and 3.2, respectively (P < 0.001) and the overall DISCERN scores were 4.1 and 4.4, respectively. By disease site, the average score for ChatGPT and Bing, respectively, were 3.9 and 3.6 for prostate cancer (P = 0.02), 3.7 and 3.3 for lung cancer (P < 0.001), 4.1 and 2.9 for breast cancer (P < 0.001), and 3.8 and 3.0 for colorectal cancer (P < 0.001). By type of question, the average score for ChatGPT and Bing, respectively, were 3.6 and 3.4 for prognostic questions (P = 0.12), 3.9 and 3.1 for treatment questions (P < 0.001), and 4.2 and 3.3 for miscellaneous questions (P = 0.001). For 3 responses (3%) by ChatGPT and 18 responses (15%) by Bing, at least one panelist rated them as having serious or extensive shortcomings. Conclusions: AI chatbots provide multiple opportunities for innovating health care. This analysis suggests a critical need, particularly around cancer prognostication, for continual refinement to limit misleading counseling, confusion, and emotional distress to patients and families.

KW - artificial intelligence

KW - ChatGPT

KW - health literacy

KW - patient information

UR - http://www.scopus.com/inward/record.url?scp=85181088199&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85181088199&partnerID=8YFLogxK

U2 - 10.1097/COC.0000000000001050

DO - 10.1097/COC.0000000000001050

M3 - Article

C2 - 37823708

AN - SCOPUS:85181088199

SN - 0277-3732

VL - 47

SP - 17

EP - 21

JO - American Journal of Clinical Oncology: Cancer Clinical Trials

JF - American Journal of Clinical Oncology: Cancer Clinical Trials

IS - 1

ER -
