Cathy Shyr, Yan Hu, Lisa Bastarache, Alex Cheng, Rizwan Hamid, Paul A. Harris, Hua Xu

doi:10.1007/s41666-023-00155-0

Summary

Purpose: Phenotyping is critical for informing rare disease diagnosis and treatment, but disease phenotypes are often embedded in unstructured text. While natural language processing (NLP) can automate extraction, a major bottleneck is the development of annotated corpora. Recently, prompt learning with large language models (LLMs) has been shown to yield generalizable results with no annotated samples (zero-shot) or only a few (few-shot), but no study has explored this for rare diseases. Our work is the first to study prompt learning for identifying and extracting rare disease phenotypes in the zero- and few-shot settings. Methods: We compared the performance of prompt learning with ChatGPT and fine-tuning with BioClinicalBERT. We engineered novel prompts for ChatGPT to identify and extract rare dise

Subject: Other / interdisciplinary
Source type: Peer-reviewed study
System type: Other
DOI: 10.1007/s41666-023-00155-0
Catalogue ID: BFmoso8xrl-5g03xf

Identifying and Extracting Rare Diseases and Their Phenotypes with Large Language Models
