Key Takeaways

Some scientists have already been testing ChatGPT in medical education. In a preprint study, ChatGPT scored over 50% accuracy on the United States Medical Licensing Exam (USMLE).

ChatGPT's training data only extends through the end of 2021. The researchers used more recent exam questions from 2022, and ChatGPT still delivered an impressive performance.

“Random guessing would be 20%. A well-trained non-medical professional would not be able to exceed 40% to 50%. But ChatGPT was getting consistently in the high 50s to mid 60s, and even 70%,” said Victor Tseng, MD, a co-author of the study and medical director of AnsibleHealth based in Atlanta.

Despite its stellar capabilities in taking exams, ChatGPT is far from being employable in the medical field.

ChatGPT is a large language model that predicts how sentences fit together based on the text data it has been fed, but that doesn't mean the tool has "good judgment or common sense," according to Ignacio Fuentes, executive director of the Jameel Clinic at MIT, an initiative focused on AI and machine learning at the intersection of health care and life sciences.

Léonard Boussioux, a final-year PhD student in operations research at MIT, said large language models are just threading information together, sometimes producing absurd mistakes delivered with absolute confidence.

“They make it look like they really know what they’re saying when, in fact, no, it’s mostly based on correlations,” Boussioux told Verywell.

How Can ChatGPT Help Improve Health Care?

While ChatGPT is still learning and improving through its interactions with users, experts say this tool can already be implemented in health care in useful ways.

For example, Tseng and his team use ChatGPT to write appeal letters to insurance agencies and to translate complex “medicalese” into a format that is easier for patients to understand.

“It is these small, incremental things that are already changing practice in many ways,” Tseng said.

ChatGPT can also help researchers brainstorm ideas and expedite workflow. In the future, Boussioux said, a more advanced version of ChatGPT might be able to accurately diagnose certain medical conditions.

It might also be able to help administrative workers and clinicians select the correct codes when billing insurance companies for care, an otherwise time-consuming and tedious task.

Why Is ChatGPT Problematic?

ChatGPT is prone to factual errors, and OpenAI states this limitation clearly on the homepage. Not only is ChatGPT occasionally wrong, but it can also “produce harmful instructions or biased content,” according to the homepage.

Like many other language models, ChatGPT is trained on text data from the internet, which means it can reflect human biases, stereotypes, and misinformation.

“You put this in a clinical setting and it can have a lot of trouble. So we need to make sure that we do this in a safe way,” Fuentes said.

A recent Time investigation revealed that the company OpenAI hired low-wage workers in Kenya to filter toxic and harmful content from ChatGPT, a reminder that AI innovation still relies on human moderation and exploitation.

In addition to the concerns about errors and misinformation, Tseng said patient privacy is something that needs to be addressed if ChatGPT is used in a medical setting. Since ChatGPT is not HIPAA compliant, it can potentially leak patient data.

“I had to resist a lot of the momentum to push it more forcefully into patient care and actually step back and say: What are the actual milestones and benchmarks of transparency and fairness we want to make sure it hits before taking the next step?” Tseng said.

What This Means For You

While ChatGPT is an innovative AI tool, it should not be used in place of medical advice from a healthcare provider. Since this chatbot pulls massive text data from the internet and it’s still in the early days of development, it’s prone to biases, stereotypes, and misinformation.

1 Source

Verywell Health uses only high-quality sources, including peer-reviewed studies, to support the facts within our articles. Read our editorial process to learn more about how we fact-check and keep our content accurate, reliable, and trustworthy.

Kung TH, Cheatham M, ChatGPT, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. medRxiv. Preprint posted online December 21, 2022. doi:10.1101/2022.12.19.22283643
