Pulse Brain · Growing Health Evidence Index
Peer-reviewed

Identifying and Extracting Rare Diseases and Their Phenotypes with Large Language Models

Cathy Shyr, Yan Hu, Lisa Bastarache, Alex Cheng, Rizwan Hamid, Paul A. Harris, Hua Xu

Journal of Healthcare Informatics Research · 2024


Summary

Purpose: Phenotyping is critical for informing rare disease diagnosis and treatment, but disease phenotypes are often embedded in unstructured text. While natural language processing (NLP) can automate extraction, a major bottleneck is developing annotated corpora. Recently, prompt learning with large language models (LLMs) has been shown to produce generalizable results with no annotated samples (zero-shot) or only a few (few-shot), but no prior work has explored this for rare diseases. Our work is the first to study prompt learning for identifying and extracting rare disease phenotypes in the zero- and few-shot settings.

Methods: We compared the performance of prompt learning with ChatGPT and fine-tuning with BioClinicalBERT. We engineered novel prompts for ChatGPT to identify and extract rare dise

Subject: Other / interdisciplinary
Source type: Peer-reviewed study
System type: Other
DOI: 10.1007/s41666-023-00155-0
Catalogue ID: BFmoso8xrl-5g03xf