Enabling Drug-Induced Liver Injury Surveillance Through Automated Medication Extraction From Clinical Notes: A Medical Information Mart for Intensive Care IV Real-World Large Language Models Validation Study
DOI:
https://doi.org/10.14740/gr2062Keywords:
Drug-induced liver injury, Artificial intelligence, Large language model, Machine learningAbstract
Background: Drug-induced liver injury (DILI) presents a significant diagnostic challenge, often leading to delayed detection. Unstructured clinical notes contain comprehensive medication data vital for DILI surveillance but are difficult to analyze systematically. Large language models (LLMs) show promise for automated extraction but require real-world clinical data validation to assess feasibility for clinical applications like DILI surveillance.
Methods: We retrospectively validated an LLM system on 100 randomly sampled Medical Information Mart for Intensive Care IV (MIMIC-IV) discharge summaries. Gold standard unique medication lists were derived via manual annotation and manual deduplication based on normalized drug names. LLM outputs underwent identical deduplication. Performance was assessed using precision, recall, and F1-score comparing deduplicated lists. MIMIC-IV data use agreement (DUA) compliance was ensured.
Results: Comparison yielded a precision of 0.85, recall of 1.00, and an F1-score of 0.92 for unique medication identification. The 174 false positives resulted from parsing or normalization errors; no medication hallucinations occurred. A subsequent DILI database lookup failed for approximately 6.2% of correctly identified unique medications, evaluated as a separate feasibility measure.
Conclusions: The LLM demonstrates high accuracy and perfect recall for unique medication extraction and identification from complex clinical notes, establishing technical feasibility. This represents a feasible and possible integration of LLM towards developing automated tools for enhanced DILI surveillance and improved patient safety.

Published
Issue
Section
License
Copyright (c) 2025 The authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.