Summary

In collaboration with University Health Services and the College of Health Sciences, we are building an accessible support tool that delivers actionable, research-backed nutritional guidance to help users make healthier choices at the point of decision. Our One Good Choice (1GC) platform combines large language models, machine learning, and curated data sources to predict caloric density and Food Compass Score from a simple text description, and then offers personalized, healthier alternatives.

Unlike traditional calorie or food trackers, which passively record food intake, 1GC is a decision-making tool that proactively engages with the individual, educating them and offering personalized recommendations to better inform their dietary decisions, one meal at a time. This tool is being developed with health and wellness experts from the College of Health Sciences and University Health Services. The primary goal of this effort is to help Kentuckians improve their nutrition through an accessible, easy-to-use, informative, and judgment-free platform.

A demo of the current system is available here: One Good Choice

This project is still in development.

Datasets/Models

The Food Compass, developed by researchers at Tufts University, is a comprehensive nutrition profiling system that scores foods from 1 to 100 based on their overall nutritional value. It considers factors such as vitamins, minerals, additives, level of processing, and more to generate a single, interpretable score: higher is better.

This system was selected by our collaborators for its level of detail and ease of interpretation. However, calculating the Food Compass Score for a given item requires detailed nutritional data, while 1GC is designed to be simple to use and accepts only text input. To bridge that gap, we predict the Food Compass Score and calorie content from a food’s description alone.

To do this, we combine two public datasets:

  • Food Compass 2.0, which provides both food descriptions and their corresponding health scores
  • USDA FoodData Central, which offers calorie content and detailed nutritional metadata for thousands of foods

By aligning records from these sources, we generate a rich training set where each item includes:

  • A text-based food description
  • A Food Compass Score
  • A calorie count per 100g

This pairing enables us to train models that learn to map food descriptions to nutritional predictions.
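
The alignment step above can be sketched as a simple inner join. This is a minimal illustration, not the project's actual pipeline: it assumes both sources can be keyed on a shared identifier (the hypothetical `food_code` field), and the field names and toy records are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRow:
    description: str
    food_compass_score: float
    kcal_per_100g: float


def align_records(fcs_rows, usda_rows):
    """Inner-join two record lists on a shared (hypothetical) food code."""
    # Index the calorie records by code so each lookup is a dict access.
    usda_by_code = {row["food_code"]: row for row in usda_rows}
    training = []
    for row in fcs_rows:
        match = usda_by_code.get(row["food_code"])
        if match is None:
            continue  # No calorie data for this item: skip it rather than impute.
        training.append(
            TrainingRow(
                description=row["description"],
                food_compass_score=float(row["fcs"]),
                kcal_per_100g=float(match["kcal_per_100g"]),
            )
        )
    return training


# Toy records for illustration only; these are not real dataset values.
fcs_rows = [
    {"food_code": "A1", "description": "example food one", "fcs": 80},
    {"food_code": "B2", "description": "example food two", "fcs": 35},
]
usda_rows = [
    {"food_code": "A1", "kcal_per_100g": 120.0},
    {"food_code": "C3", "kcal_per_100g": 300.0},
]
rows = align_records(fcs_rows, usda_rows)
```

Only items present in both sources survive the join, so every training row carries a description, a score, and a calorie density.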

Using modern language models and regression techniques, we predict both Food Compass Scores and calorie density from free-text food descriptions. When a user enters a phrase like “a slice of pepperoni pizza” or “turkey burger with coleslaw,” our backend executes a seven-step workflow:

  1. Baseline profiling uses our core engine to predict calories and Food Compass Score (FCS), establishing a reference health metric.
  2. An LLM generates 15 distinct healthier alternatives across taste similarity, cooking methods, and creative substitutions, avoiding minor adjective swaps.
  3. Each alternative is evaluated in parallel through the same nutritional prediction pipeline.
  4. Suggestions are filtered by goals: improved FCS, lower calories, or both.
  5. A search tool discovers high-quality recipes from trusted sites identified by nutrition experts.
  6. An LLM ranks recipe URLs by semantic relevance to each suggested alternative.
  7. The most relevant options are displayed in a clean, intuitive interface.
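
Steps 3 and 4 of the workflow above might look roughly like the following. This is a sketch under stated assumptions: `Profile`, `evaluate_all`, `filter_by_goal`, and the goal names are hypothetical, a thread pool stands in for whatever parallelism the backend really uses, and `stub_predict` returns made-up numbers in place of the real models.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple


class Profile(NamedTuple):
    description: str
    fcs: float            # Food Compass Score, 1-100, higher is better
    kcal_per_100g: float  # predicted calorie density


def evaluate_all(descriptions, predict, max_workers=8):
    """Step 3: score every alternative concurrently with the same predictor."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order.
        return list(pool.map(predict, descriptions))


def filter_by_goal(baseline, candidates, goal):
    """Step 4: keep candidates that beat the baseline on the chosen goal."""

    def better_fcs(c):
        return c.fcs > baseline.fcs

    def fewer_kcal(c):
        return c.kcal_per_100g < baseline.kcal_per_100g

    def both(c):
        return better_fcs(c) and fewer_kcal(c)

    rules = {"fcs": better_fcs, "calories": fewer_kcal, "both": both}
    keep = rules[goal]  # Raises KeyError for an unknown goal.
    return [c for c in candidates if keep(c)]


# Made-up predictions standing in for the real models.
_FAKE = {
    "baseline meal": (40.0, 260.0),
    "alternative a": (65.0, 180.0),
    "alternative b": (55.0, 300.0),
    "alternative c": (30.0, 150.0),
}


def stub_predict(description):
    fcs, kcal = _FAKE[description]
    return Profile(description, fcs, kcal)


baseline = stub_predict("baseline meal")
candidates = evaluate_all(
    ["alternative a", "alternative b", "alternative c"], stub_predict
)
improved = filter_by_goal(baseline, candidates, "both")
```

Because the filter is a plain comparison against the baseline profile, the same inputs always produce the same shortlist, which makes this stage easy to audit.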

This process pairs the flexibility of LLM-generated suggestions with deterministic filtering, search, and presentation, so that results are auditable and suitable for clinical workflows.

The AI: Embeddings and Predictive Models

To convert food descriptions into something a computer can learn from, we first embed each food description into a high-dimensional vector using LLM-Factory, our in-house AI service at CAAI that enables fast, scalable, and secure access to advanced language model embeddings.

For each food item in the dataset, we take its plain-text description (e.g. “fried chicken sandwich with mayo”) and generate a semantic embedding that captures both linguistic structure and conceptual meaning. These embeddings become the input features for our predictive models.

We maintain two complementary PyTorch models to handle complex, user-described meals:

  1. Food Compass Score Prediction uses sentence-transformer embeddings (768 dimensions) followed by a feedforward network (768→256→128→64→1) to output a 1–100 score. This captures cooking methods and ingredient relationships more holistically than keyword-based approaches.
  2. Calorie Estimation decomposes meals into fundamental components via LLM prompts, then engineers hybrid features (TF-IDF, cooking-method multipliers, average category calories, and portion multipliers). A two-stage model encodes these features into 1024-dimensional embeddings before regressing to calories per 100g. Final calorie counts sum component predictions weighted by portion multipliers.
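
The output stages of the two models above can be sketched as follows. The layer widths come from the description; everything else is an assumption made for illustration: the ReLU activations, the sigmoid rescaling into the 1–100 range, the names `FcsRegressor` and `total_calories`, and the reading of a portion multiplier as the portion size in units of 100 g. Random tensors stand in for real embeddings.

```python
import torch
from torch import nn


class FcsRegressor(nn.Module):
    """Feedforward head: 768 -> 256 -> 128 -> 64 -> 1."""

    def __init__(self, embed_dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # Assumption: squash the raw output into the 1-100 score range.
        raw = self.net(embeddings).squeeze(-1)
        return 1.0 + 99.0 * torch.sigmoid(raw)


def total_calories(components):
    """Sum per-component predictions weighted by portion multipliers.

    components: iterable of (kcal_per_100g, portion_multiplier) pairs.
    Assumption: a multiplier of 1.5 means a 150 g portion.
    """
    return sum(kcal * multiplier for kcal, multiplier in components)


model = FcsRegressor().eval()
with torch.no_grad():
    # Random stand-in embeddings for a batch of four descriptions.
    scores = model(torch.randn(4, 768))

# Two hypothetical meal components with made-up predictions.
meal_kcal = total_calories([(200.0, 1.5), (50.0, 0.5)])
```

Bounding the output with a sigmoid is one way to guarantee valid scores; clamping an unbounded regression output at inference time would be another.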

Results

Initial results show strong correlations between the predicted and actual values, indicating that much of a food’s nutritional profile can be inferred from how we describe it. By combining structured nutrition data with unstructured language inputs, we’re making food health knowledge more accessible and scalable.
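
For readers who want to run a similar check, the agreement between predicted and actual values could be measured along these lines. This is a generic sketch rather than the project's evaluation code: the choice of Pearson correlation and mean absolute error is an assumption, and the numbers are toy values, not project results.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_a)


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predictions and ground truth."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Toy numbers for illustration only; these are not project results.
actual_fcs = [20.0, 45.0, 60.0, 85.0]
predicted_fcs = [25.0, 40.0, 65.0, 80.0]
r = pearson_r(predicted_fcs, actual_fcs)
mae = mean_absolute_error(predicted_fcs, actual_fcs)
```

Reporting an error measure alongside the correlation is useful because a model can correlate strongly with the truth while still being biased high or low.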
