Enabling Drug-Induced Liver Injury Surveillance Through Automated Medication Extraction From Clinical Notes: A Medical Information Mart for Intensive Care IV Real-World Large Language Models Validation Study

Authors

  • Thanathip Suenghataiphorn
  • Kanachai Boonpiraks
  • Vitchapong Prasitsumrit
  • Narathorn Kulthamrongsri
  • Pojsakorn Danpanichkul

DOI:

https://doi.org/10.14740/gr2062

Keywords:

Drug-induced liver injury, Artificial intelligence, Large language model, Machine learning

Abstract

Background: Drug-induced liver injury (DILI) presents a significant diagnostic challenge, often leading to delayed detection. Unstructured clinical notes contain comprehensive medication data vital for DILI surveillance but are difficult to analyze systematically. Large language models (LLMs) show promise for automated extraction but require validation on real-world clinical data to assess their feasibility for clinical applications such as DILI surveillance.

Methods: We retrospectively validated an LLM system on 100 randomly sampled Medical Information Mart for Intensive Care IV (MIMIC-IV) discharge summaries. Gold-standard lists of unique medications were derived by manual annotation followed by manual deduplication based on normalized drug names. LLM outputs underwent identical deduplication. Performance was assessed by comparing the deduplicated lists using precision, recall, and F1-score. Compliance with the MIMIC-IV data use agreement (DUA) was ensured.
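The comparison of deduplicated lists described in the Methods could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `normalize` rule (lowercasing, stripping punctuation), the function names, and the sample drug lists are all assumptions made for the example, and the abstract does not specify how drug names were actually normalized.

```python
import re


def normalize(name: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation, collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def dedupe(names):
    """Return the set of unique, non-empty normalized names."""
    return {normalize(n) for n in names if normalize(n)}


def score(gold, predicted):
    """Precision, recall, and F1 over deduplicated medication sets."""
    gold_set, pred_set = dedupe(gold), dedupe(predicted)
    tp = len(gold_set & pred_set)   # medications found in both lists
    fp = len(pred_set - gold_set)   # extracted but not in the gold standard
    fn = len(gold_set - pred_set)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Illustrative data only; not taken from MIMIC-IV.
gold = ["Acetaminophen", "Amoxicillin", "acetaminophen "]
predicted = ["acetaminophen", "amoxicillin", "amoxicillin 500"]
result = score(gold, predicted)
print(result)
```

In this toy case the unparsed dose string "amoxicillin 500" survives normalization as a separate entry and counts as a false positive, which is the kind of parsing or normalization error that lowers precision without affecting recall.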

Results: Comparison yielded a precision of 0.85, recall of 1.00, and an F1-score of 0.92 for unique medication identification. The 174 false positives resulted from parsing or normalization errors; no medication hallucinations occurred. In a separate feasibility measure, a subsequent DILI database lookup failed for approximately 6.2% of correctly identified unique medications.
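As a consistency check, the reported F1-score follows from the reported precision and recall. The second line is only a back-of-the-envelope estimate: it assumes the metrics are pooled over all notes (the abstract does not state this) and uses the rounded precision, so the implied true-positive count is approximate.

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.85 \times 1.00}{0.85 + 1.00}
    = \frac{1.70}{1.85} \approx 0.92

% Assumption: pooled counts; P is rounded, so TP is approximate.
P = \frac{TP}{TP + FP}
\;\Longrightarrow\;
TP = FP \cdot \frac{P}{1 - P}
   \approx 174 \times \frac{0.85}{0.15} \approx 986
```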

Conclusions: The LLM achieved perfect recall and a precision of 0.85 for unique medication extraction and identification from complex clinical notes, establishing technical feasibility. These results support integrating LLMs into automated tools for enhanced DILI surveillance and improved patient safety.

Author Biography

  • Thanathip Suenghataiphorn, Griffin Hospital, Derby, CT, USA

    Department of Internal Medicine, Griffin Hospital, Derby, CT, United States

Published

2025-10-09

Section

Original Articles

How to Cite

Suenghataiphorn T, Boonpiraks K, Prasitsumrit V, Kulthamrongsri N, Danpanichkul P. Enabling Drug-Induced Liver Injury Surveillance Through Automated Medication Extraction From Clinical Notes: A Medical Information Mart for Intensive Care IV Real-World Large Language Models Validation Study. Gastroenterol Res. 2025;18(5):247-253. doi:10.14740/gr2062