Summary
How LLM Factory Works
An introduction to fine-tuning, data security, API requests, and hardware
CAAI’s LLM Factory operates at the cutting edge of AI, built on Parameter-Efficient Fine-Tuning (PEFT) and Low-Rank Adaptation (LoRA). LLM Factory leverages leading open-source models, including Meta’s Llama and DeepSeek R1.
Techniques like Parameter-Efficient Fine-Tuning and Low-Rank Adaptation have changed how we leverage the pre-trained knowledge in LLMs. Instead of updating every weight, LoRA layers small trainable modules, called adapters, on top of a frozen base model, so fine-tuning touches only a small fraction of the parameters. Custom model training once required substantial computational resources; with LoRA, we can match the performance of traditional full fine-tuning with far less memory and far fewer trainable parameters. The resulting models perform comparably to fully fine-tuned models but are much cheaper to create and easier to scale.
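The core idea can be shown in a few lines. The sketch below is illustrative, not LLM Factory’s implementation: a frozen weight matrix W is adapted as W + (alpha/r)·BA, where the low-rank factors A and B are the only trainable parameters, and the dimensions are made up for the example.

```python
import numpy as np

# Minimal sketch of LoRA (illustrative only; dimensions are hypothetical).
# The pre-trained weight W stays frozen; only the low-rank factors A and B train.
d_in, d_out, r, alpha = 1024, 1024, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen base weights
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable; zero init so the model starts unchanged

def adapted_forward(x):
    # Base projection plus the scaled low-rank update (alpha/r) * B @ A @ x.
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size            # parameters updated by full fine-tuning
lora_params = A.size + B.size   # parameters updated by LoRA
print(f"trainable fraction: {lora_params / full_params:.2%}")  # → 1.56%
```

Because only A and B are stored per adapter, many adapters can share one copy of the base weights, which is what makes hosting a multitude of fine-tuned variants practical.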
Building on the efficiency of PEFT and LoRA, we take fine-tuning further by using the latest state-of-the-art models as foundational base models. These models carry an immense amount of pre-trained knowledge and demand correspondingly serious compute. CAAI runs an on-site NVIDIA DGX computing cluster with 3.2 TB of VRAM, which lets us host a variety of base models and a multitude of adapters. Rather than standing up a separate deployment for every model, users interface with the system programmatically, which keeps the platform computationally and cost efficient.
Users can create LLMs that are not only informed by the vast data foundational models were trained on but that also learn from project-specific data. With LLM Factory, users get access to cutting-edge models and the ability to customize them for their unique needs. We hope this drives unprecedented achievements and more efficient workflows.
Users can easily integrate their fine-tuned models, or the base models available through LLM Factory, with OpenAI’s ecosystem: LLM Factory’s API endpoints are OpenAI-compatible. Any library, tool, or system that speaks the OpenAI API (the industry standard) can therefore integrate seamlessly with LLM Factory. LLM Factory’s User Guide covers adapter training, tool/function calling, embeddings, and transcription in depth, and also walks you through navigating the platform. You don’t have to be an experienced developer to use LLM Factory.
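OpenAI compatibility means a chat request to LLM Factory has the same shape as one to OpenAI itself. The sketch below builds such a request; the base URL, API key, and model name are placeholders, not real LLM Factory values — substitute the ones from your account and the User Guide.

```python
import json

# Placeholders for illustration; use the endpoint, key, and model/adapter
# name provided with your LLM Factory account.
BASE_URL = "https://llm-factory.example.edu/v1"
API_KEY = "YOUR_LLM_FACTORY_API_KEY"

payload = {
    "model": "my-fine-tuned-adapter",  # a base model or your trained adapter
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize our project notes."},
    ],
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# With the official OpenAI Python client, the same request is simply:
#   from openai import OpenAI
#   client = OpenAI(base_url=BASE_URL, api_key=API_KEY)
#   client.chat.completions.create(**payload)
body = json.dumps(payload)
print(f"POST {BASE_URL}/chat/completions")
```

Because only the base URL changes, existing OpenAI-based code, agents, and tooling can be pointed at LLM Factory without rewrites.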
LLM Factory offers efficient, scalable fine-tuning on a user-friendly platform controlled by the University of Kentucky. Alternative fine-tuning options, such as OpenAI or Anthropic, don’t offer clear or customizable data-restriction policies. With LLM Factory, your data and interactions are secure: only you and your team members have access to your trained adapters, and data used for fine-tuning, conversation history in the chat interface, and API calls are all protected. Individual, private, HIPAA-compliant instances are available by request and evaluated on a case-by-case basis. If you would like to learn more, please reach out to ai@uky.edu.
Access
LLM Factory is available through collaboration with CAAI. You must be granted the necessary permissions by a CAAI administrator to access the platform. Please contact us for access, or fill out our collaboration form.
HIPAA Compliance
LLM Factory V1.3 (July 2024) is not HIPAA compliant. Private, HIPAA-compliant instances are available on an individual basis. If you would like to learn more, please reach out to ai@uky.edu.
Collaborative Projects using LLM Factory
The following tools are in development:
- One Good Choice – an LLM intervention designed to help users make healthier decisions
- OptimalCT – a system that intelligently communicates with patients in the weeks and days leading up to their scheduled surgery
- KyStats – code generation and querying databases with natural language
- AgriGuide – RAG methods and LangChain tools for community- and agriculture-specific resources, with a multi-modal chat and image interface
- Population Health Conversational AI Agents – distance learning assistant that uses conversational AI agents
- Synthetic Personas in Medical Education Training Scenarios – interactive, vignette-driven AI personas that mimic doctor-patient interactions, facilitating consent assessment (based on U-ARE criteria), and communication skill development.
- CELT Look-up – RAG methods and LangChain tools integrated into a website to help users navigate and find resources
Resources
Read more about our Institutional Platform for Secure Self-Service Large Language Model Exploration through the link.
A tutorial of LLM Factory is available on YouTube.
Explore LLM Factory through data.ai.uky.edu.