PubMed Literature Agent – Center for Applied AI Hub

on May 13, 2026

Summary

Uncovering University of Kentucky Studies that “Meet People Where They Are”

“Meeting people where they are” is a healthcare principle of providing empathetic care to patients in their homes, encampments, or other locations, with respect for each patient’s circumstances. Traditional database searching, however, relies on metadata tags and keyword matching, which are ineffective for nuanced concepts that lack standardized terminology. The Center for Applied AI (CAAI) developed a solution that uses artificial intelligence (AI) agents with large language models (LLMs) to improve searching of the PubMed database. The agent uses tailored prompt engineering to focus the LLM on specific parts of the text and applies deep semantic analysis for contextual alignment.

By contextualizing research methods within the specified principle, the agent significantly improved the precision of the extracted information, resulting in a more targeted set of results. Including the selection rationale in the standard output gives reviewers a top-level view for quicker human evaluation. This AI-driven approach has the potential to change literature search practices, particularly in identifying studies that prioritize health equity through non-traditional outreach methods. The methodology can be adapted for broader applications where nuanced contextual understanding is critical for information retrieval. The agent uses an LLM to extract information from the abstracts of University of Kentucky (UK) articles indexed in PubMed, make connections, and pull out points of interest.

Datasets/Model

The agent preprocesses articles by applying MeSH (Medical Subject Headings) terms and filter tags to pull full-text articles describing human-subject research conducted by University of Kentucky (UK) investigators. Eligible records are retrieved through the PubMed Central (PMC) application programming interface (API). Each article then passes through a two-stage natural language processing (NLP) pipeline.
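As a rough illustration of the retrieval step, the sketch below builds a search request for the public NCBI E-utilities `esearch` endpoint against the `pmc` database. The helper names (`build_pmc_query`, `build_esearch_url`) and the example search terms are assumptions made for illustration; the project's actual query terms and filter tags are not given here.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (public API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pmc_query(affiliation: str, mesh_terms: list[str]) -> str:
    """Combine an affiliation filter with MeSH terms using Entrez field tags."""
    clauses = [f'"{affiliation}"[Affiliation]']
    clauses += [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    return " AND ".join(clauses)


def build_esearch_url(query: str, retstart: int = 0, retmax: int = 100) -> str:
    """Build an esearch request URL for one page of PMC record IDs."""
    params = {
        "db": "pmc",          # search PubMed Central
        "term": query,
        "retmode": "json",
        "retstart": retstart,  # offset of the first record on this page
        "retmax": retmax,      # page size
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative values only (assumed, not the project's real search terms).
query = build_pmc_query("University of Kentucky", ["Humans"])
url = build_esearch_url(query)
print(query)
print(url)
```

Paging with `retstart` and `retmax` keeps each request small, which suits the batch-oriented processing described later in this section.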

Filtering Stage: Each article is processed by DeepSeek-R1 using an instruction built from a Jinja template. The template is filled in with a comprehensive description of the “meet people where they are” principle and its defining parameters, such as delivering healthcare services in patient-referred locations or using unconventional intervention methods.
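A minimal sketch of how such a filtering prompt might be rendered and its reply parsed is shown below. The template wording, the JSON reply format, and the function name `parse_filter_reply` are all hypothetical; the project's actual template is not reproduced here. The sketch assumes the model's reply contains a `<think>` reasoning block followed by a JSON verdict, and no model is actually called.

```python
import json
import re

from jinja2 import Template

# Hypothetical prompt wording (assumption); the real template is not published here.
FILTER_TEMPLATE = Template(
    "You are screening research articles.\n"
    "Principle: {{ principle }}\n"
    "Defining parameters:\n"
    "{% for p in parameters %}- {{ p }}\n{% endfor %}"
    "Article text:\n{{ article_text }}\n"
    'Reply with JSON: {"relevant": true or false, "rationale": "..."}'
)


def parse_filter_reply(reply: str) -> dict:
    """Drop the <think> reasoning block, then parse the JSON verdict."""
    answer = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).strip()
    verdict = json.loads(answer)
    return {
        "relevant": bool(verdict["relevant"]),
        "rationale": str(verdict["rationale"]),
    }


prompt = FILTER_TEMPLATE.render(
    principle="Meet people where they are",
    parameters=[
        "healthcare services delivered in patient-referred locations",
        "unconventional intervention methods",
    ],
    article_text="Nurses offered screenings at a county library.",  # made-up example
)

# A made-up reply standing in for real model output.
sample_reply = (
    "<think>The library setting fits the principle.</think>"
    '{"relevant": true, "rationale": "Screening offered at a library"}'
)
result = parse_filter_reply(sample_reply)
print(prompt)
print(result)
```

Keeping the rationale alongside the verdict mirrors the selection rationale that the summary describes as part of the standard output.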

Feature Extraction Stage: Retained studies are re-analyzed using a method-specific Jinja template to extract structured attributes, including:

Location: the location(s) where the study took place
Participating Organization(s): the names of the organizations involved in the study
Data Collection Site(s): where data was collected from participants
Patient Recruitment Site: the location where the patients were recruited
Community Engagement: any community engagement (e.g., county extension agents, fire departments, libraries, etc.)
Categories of Treatment: list of categories (e.g., medicine, intervention, therapy, etc.)
Types of Studies: list of study types (e.g., cross-sectional study, retrospective cohort, prospective cohort, clinical trial, none, etc.)
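One way to hold the attributes listed above is a small typed record that normalizes whatever the model returns. The sketch below assumes the model replies with a JSON object keyed by snake_case versions of the attribute names; the class name `StudyFeatures`, the key names, and the sample values are illustrative assumptions, not the project's actual schema.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class StudyFeatures:
    """Structured attributes for one retained study (hypothetical schema)."""
    location: list[str] = field(default_factory=list)
    participating_organizations: list[str] = field(default_factory=list)
    data_collection_sites: list[str] = field(default_factory=list)
    patient_recruitment_site: list[str] = field(default_factory=list)
    community_engagement: list[str] = field(default_factory=list)
    categories_of_treatment: list[str] = field(default_factory=list)
    types_of_studies: list[str] = field(default_factory=list)


def parse_features(raw_json: str) -> StudyFeatures:
    """Parse a JSON reply, keeping only known attributes as lists of strings."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(StudyFeatures)}
    cleaned = {}
    for key, value in data.items():
        if key not in known:
            continue  # ignore attributes the schema does not define
        # Accept a single string, a list, or null, and normalize to a list.
        items = [value] if isinstance(value, str) else list(value or [])
        cleaned[key] = [str(item).strip() for item in items if str(item).strip()]
    return StudyFeatures(**cleaned)


# Made-up reply used only to exercise the parser.
raw = (
    '{"location": "Lexington, KY", '
    '"community_engagement": ["libraries", " fire departments "], '
    '"types_of_studies": null, "unexpected_key": "ignored"}'
)
features = parse_features(raw)
print(features)
```

Normalizing every attribute to a list keeps the output uniform for expert review, even when the model returns a single value or omits a field.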

To handle the large number of reports in PMC, the data is processed in batches across task queues. Analysis results are saved to disk during the run to allow for checkpointing and deduplication. This approach standardizes the analysis outputs for expert validation and supports continuous updating as new PubMed articles are published, while preventing reanalysis of previously seen articles.
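The checkpointing and deduplication idea can be sketched as an append-only JSON Lines file keyed by article ID. This is a single-process simplification under stated assumptions: it does not use a task queue, and the `pmcid` key, the function names, and the placeholder IDs are hypothetical rather than taken from the project.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def load_seen_ids(checkpoint: Path) -> set[str]:
    """Read IDs of articles already analyzed in earlier runs."""
    if not checkpoint.exists():
        return set()
    with checkpoint.open(encoding="utf-8") as fh:
        return {json.loads(line)["pmcid"] for line in fh if line.strip()}


def batched(items: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_batches(articles: list[dict], analyze: Callable[[dict], str],
                checkpoint: Path, batch_size: int = 2) -> int:
    """Analyze unseen articles batch by batch, appending results to disk."""
    seen = load_seen_ids(checkpoint)
    processed = 0
    for batch in batched(articles, batch_size):
        # Reopen per batch so each finished batch is flushed to disk.
        with checkpoint.open("a", encoding="utf-8") as fh:
            for article in batch:
                if article["pmcid"] in seen:
                    continue  # skip duplicates and previously analyzed articles
                record = {"pmcid": article["pmcid"], "analysis": analyze(article)}
                fh.write(json.dumps(record) + "\n")
                seen.add(article["pmcid"])
                processed += 1
    return processed


# Placeholder IDs and a stand-in analysis function, for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.jsonl"
    articles = [{"pmcid": "PMC1"}, {"pmcid": "PMC2"}, {"pmcid": "PMC1"}]
    first_run = run_batches(articles, lambda a: "ok", path)
    second_run = run_batches(articles + [{"pmcid": "PMC3"}], lambda a: "ok", path)
    print(first_run, second_run)
```

Because the second run reloads the checkpoint before processing, only the newly added article is analyzed, which is the behavior needed for continuous updating.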

Access

The data used for this project comes from research conducted at the University of Kentucky and indexed in PubMed.

Ownership

This project has been completed.

A poster on this project was published and presented at the AMIA 2025 Annual Symposium.

Resources Utilized

Emily Collier, Sam Armstrong, Noah Perry, and Dr. Cody Bumgardner worked on the PubMed Literature Agent.

LLM Factory was used for this project.
