SPEC-1-Nutrition-Intelligence-Runtime

Background

The project is a nutrition intelligence platform built incrementally since January, not a recipe-only application. Existing components already cover several layers end to end: a maintained EuroFIR-style nutrient table (data/eurofir_mediterranean.csv), a direct nutrition lookup utility (nutrition_lookup.py), a Chroma enrichment pipeline (rag_setup/enrich_nutrition_db.py), and a stateful multi-agent chatbot runtime (multi_agent_chatbot/agentic_chatbot.py).

A lightweight Flask viewer exists for inspecting and editing normalized ingredient lines, but that viewer is only an operational aid. The primary architectural goal is to restore and extend the site’s “smarts”: reliable nutrient retrieval, structured interpretation of user inputs, agent/tool orchestration, and delivery of nutrient information from the EuroDATA/EuroFIR-backed knowledge layer.

Requirements

Must have

  • Establish a single authoritative nutrient source and runtime lookup path.
  • Preserve data/eurofir_mediterranean.csv as the maintained source-of-truth input unless explicitly superseded.
  • Support deterministic nutrient lookup for direct food queries.
  • Support agent-accessible nutrient retrieval during stateful conversations.
  • Keep ingestion, compatibility exports, retrieval, and runtime serving clearly separated.
  • Allow ingredient or food text to be resolved into structured nutrient payloads.
  • Return a stable result shape including nutrient values and source/match metadata.

Should have

  • Reuse one shared lookup contract across chatbot, web app, and scripts.
  • Add confidence/review signaling for weak text matches.
  • Persist approved corrections or alias mappings so matching improves over time.
  • Distinguish exact nutrient lookups from semantic retrieval/RAG answers.

Could have

  • Hybrid retrieval using deterministic lookup first and Chroma fallback second.
  • Admin workflow for curating aliases and canonical food mappings.
  • Runtime observability for lookup quality, misses, and unresolved queries.

Method

1. Separate the system into four explicit layers

The current repo already implies four responsibilities, but they are not yet formalized as a runtime contract.

  1. Data maintenance layer
    • Authoritative editable file: data/eurofir_mediterranean.csv
    • Human-curated nutrient rows and metadata
  2. Index/build layer
    • rag_setup/convert_eurofir_to_calories.py for backward compatibility exports
    • rag_setup/enrich_nutrition_db.py for Chroma document/metadata upserts
  3. Nutrition intelligence layer
    • nutrition_lookup.py for deterministic structured lookup
    • Future alias matcher / confidence scorer / canonical food resolver
  4. Experience/runtime layer
    • multi_agent_chatbot/agentic_chatbot.py for conversational access
    • tools/recipe_viewer/app.py as operational inspection UI
    • Any future site/API should call the same nutrition service contract

2. Make deterministic lookup the primary runtime path

The current NutritionLookup.lookup(query) is the best candidate for the authoritative runtime primitive because it returns a structured result and is already reused by agent flows.

Recommended runtime policy:

  • Step 1: normalize input text
  • Step 2: attempt deterministic lookup against canonical food rows and aliases
  • Step 3: if confidence is high, return structured nutrient result
  • Step 4: if confidence is low or no match exists, optionally use Chroma/RAG to retrieve likely candidates
  • Step 5: return either a resolved canonical result or a review-needed response

This keeps nutrient values grounded in curated tabular data, while RAG stays a support mechanism rather than the source of truth.
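The five-step policy can be sketched as follows. This is a minimal illustration, not the implementation: the lookup backends are injected as hypothetical stand-ins for `NutritionLookup` and the Chroma collection, and the 0.90 gate matches the review threshold proposed later in this spec.

```python
CONFIDENCE_THRESHOLD = 0.90  # below this, the result is flagged for review

def normalize(text: str) -> str:
    """Step 1: minimal normalization (lowercase, collapse whitespace)."""
    return " ".join(text.lower().split())

def resolve(raw_text: str, deterministic_lookup, semantic_candidates) -> dict:
    """Steps 2-5. `deterministic_lookup` and `semantic_candidates` are
    hypothetical callables standing in for the real backends."""
    query = normalize(raw_text)
    match = deterministic_lookup(query)                # step 2
    if match and match["confidence"] >= CONFIDENCE_THRESHOLD:
        return {**match, "review_needed": False}       # step 3: high confidence
    candidates = semantic_candidates(query)            # step 4: RAG fallback
    return {                                           # step 5: review-needed
        "query": query,
        "canonical_food_name": match["canonical_food_name"] if match else None,
        "confidence": match["confidence"] if match else 0.0,
        "candidate_foods": candidates,
        "review_needed": True,
    }
```

Note that the Chroma fallback only ever produces candidates here; it never supplies nutrient values directly.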

3. Introduce a shared nutrition service contract

Create one internal service module that all front ends call.

Proposed contract:

from dataclasses import dataclass

@dataclass
class NutritionQuery:
    raw_text: str
    locale: str | None = None
    context_food_group: str | None = None

@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str  # exact | alias | prefix | substring | semantic | none
    confidence: float
    source: str      # eurofir_csv | chroma
    per_100g: dict[str, float | None]
    flags: dict[str, bool]
    notes: list[str]

Authoritative call shape:

resolve_nutrition(query: NutritionQuery) -> NutritionMatch

This should wrap the existing NutritionLookup first, instead of replacing it.

4. Add an alias/mapping table instead of overloading free-text matching

The current _match_score logic is useful but too fragile to serve as the primary long-term strategy. It should be supplemented with a curated alias table.

Proposed minimal table:

field                 | type     | purpose
----------------------|----------|-------------------------------------
alias_text            | string   | user-facing or imported variant
canonical_food_name   | string   | exact FoodName target
status                | enum     | proposed / approved / rejected
source                | string   | manual / imported / runtime_feedback
created_at            | datetime | audit
updated_at            | datetime | audit

Suggested storage for MVP: SQLite.

Resolution order:

  • exact canonical food match
  • approved alias match
  • deterministic prefix/substring match
  • semantic retrieval fallback
  • unresolved
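The resolution order can be expressed as a strategy chain; each strategy callable below is a hypothetical stand-in for the corresponding real matcher:

```python
def resolve_in_order(query, exact, aliases, fuzzy, semantic):
    """Try each matching strategy in the documented order and return
    (match_type, result); falls through to ("none", None) if all miss."""
    strategies = [
        ("exact", exact),
        ("alias", aliases),
        ("prefix/substring", fuzzy),
        ("semantic", semantic),
    ]
    for match_type, strategy in strategies:
        result = strategy(query)
        if result is not None:
            return match_type, result
    return "none", None
```

Keeping the order in one place means every consumer inherits the same policy, including the rule that semantic retrieval runs last.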

5. Keep Chroma for retrieval, not nutrient truth

The enriched Chroma collection should remain useful for:

  • semantic search
  • explanatory answers
  • broader nutrition Q&A
  • candidate generation when a direct match fails

It should not override curated nutrient fields when a canonical row exists in the CSV. Instead, Chroma should point the runtime back toward likely canonical foods.

6. Recommended component flow

@startuml
actor User
participant "Web UI / Chat UI" as UI
participant "Nutrition Service" as NS
database "Alias DB (SQLite)" as ADB
database "EuroFIR CSV Loader" as CSV
database "Chroma Nutrition DB" as CH

User -> UI : Ask for nutrient info
UI -> NS : resolve_nutrition(raw_text, context)
NS -> ADB : find approved alias
ADB --> NS : alias or none
NS -> CSV : deterministic lookup(canonical/raw)
CSV --> NS : structured nutrient row or none

alt high-confidence match
  NS --> UI : canonical nutrient payload
else low-confidence or no match
  NS -> CH : semantic candidate retrieval
  CH --> NS : likely candidates
  NS --> UI : candidate/review-needed response
end
@enduml

7. Immediate implementation direction

The repo appears ready for work on the smart layer to resume; the next work item should consolidate runtime lookup behind a single service boundary.

Recommended immediate sequence:

  1. Wrap NutritionLookup in a new resolve_nutrition() service.
  2. Add source/match/confidence metadata to the returned payload.
  3. Add an alias table in SQLite.
  4. Update chatbot and Flask/runtime entrypoints to call the same service.
  5. Use Chroma only as fallback candidate generation.

Implementation

Phase 1: Introduce a shared nutrition service

Create a new internal module, for example nutrition_service.py, that becomes the only runtime entry point for nutrient resolution.

Responsibilities:

  • accept raw food text
  • normalize the text
  • consult SQLite aliases first
  • call NutritionLookup for canonical CSV-backed lookup
  • invoke Chroma only when deterministic matching is weak or missing
  • return one stable payload shape with match/source/confidence metadata

Suggested functions:

def normalize_food_text(text: str) -> str: ...
def resolve_nutrition(raw_text: str, context: dict | None = None) -> NutritionMatch: ...
def lookup_alias(normalized_text: str) -> str | None: ...
def semantic_candidates(raw_text: str, limit: int = 5) -> list[dict]: ...
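As one sketch of the first of those functions: the descriptor set below is a hypothetical placeholder (the real list would come from curation), and quantity/unit handling is deliberately out of scope here.

```python
import re

# Hypothetical descriptor list; the real set would be curated over time.
_DESCRIPTORS = {"fresh", "large", "small", "medium", "chopped", "diced", "raw"}

def normalize_food_text(text: str) -> str:
    """Lowercase, strip punctuation and common descriptors, collapse spaces.
    Quantity/unit parsing is handled elsewhere and not attempted here."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    kept = [t for t in tokens if t not in _DESCRIPTORS]
    return " ".join(kept)
```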

Phase 2: Add SQLite alias persistence

Create a small SQLite database, for example data/nutrition_runtime.db.

Proposed schema:

CREATE TABLE IF NOT EXISTS food_aliases (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    alias_text TEXT NOT NULL,
    alias_normalized TEXT NOT NULL,
    canonical_food_name TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
    source TEXT NOT NULL DEFAULT 'manual',
    notes TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);

CREATE UNIQUE INDEX IF NOT EXISTS idx_food_alias_unique
ON food_aliases(alias_normalized, canonical_food_name);

CREATE INDEX IF NOT EXISTS idx_food_alias_status
ON food_aliases(status);

Resolution logic:

  • use only approved aliases at runtime
  • store uncertain user corrections as proposed until reviewed
  • never mutate the CSV automatically from runtime behavior
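Assuming the schema above, the persistence side can be sketched as follows; `lookup_alias` and `propose_alias` echo the Phase 1 function names, and connection handling is simplified for illustration.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS food_aliases (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    alias_text TEXT NOT NULL,
    alias_normalized TEXT NOT NULL,
    canonical_food_name TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
    source TEXT NOT NULL DEFAULT 'manual',
    notes TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
"""

def lookup_alias(conn, normalized_text):
    """Return the canonical food for an *approved* alias, or None.
    Proposed/rejected rows are invisible to the runtime by design."""
    row = conn.execute(
        "SELECT canonical_food_name FROM food_aliases "
        "WHERE alias_normalized = ? AND status = 'approved'",
        (normalized_text,),
    ).fetchone()
    return row[0] if row else None

def propose_alias(conn, alias_text, canonical, source="runtime_feedback"):
    """Record an uncertain mapping as 'proposed' pending human review."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO food_aliases (alias_text, alias_normalized, "
        "canonical_food_name, status, source, created_at, updated_at) "
        "VALUES (?, ?, ?, 'proposed', ?, ?, ?)",
        (alias_text, alias_text.lower().strip(), canonical, source, now, now),
    )
```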

Phase 3: Extend result metadata

Preserve the existing NutritionResult nutrient payload, but wrap or extend it with runtime metadata.

Suggested response shape:

from dataclasses import dataclass

@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str
    confidence: float
    source: str
    per_100g: dict[str, float | None]
    flags: dict[str, bool]
    review_needed: bool
    candidate_foods: list[str]
    notes: list[str]

Suggested confidence rules for MVP:

  • exact canonical match: 1.0
  • approved alias match: 0.98
  • prefix match: 0.85
  • substring match: 0.65
  • Chroma candidate-assisted match: 0.40-0.75
  • no match: 0.0

Set review_needed = True for anything below 0.90.

Phase 4: Refactor current entry points to use the shared service

Chatbot path

Update multi_agent_chatbot/nutrition_agent.py and related tool wrappers so food/nutrient requests call resolve_nutrition() instead of importing NutritionLookup directly.

Benefits:

  • one matching policy everywhere
  • one confidence model everywhere
  • alias reuse across all interfaces

Flask/debug UI path

If the Flask viewer remains in use, add a lightweight nutrient-resolution panel for each line that calls the same service. That preserves its role as an inspection and correction surface without making it a separate logic path.

CLI/scripts

Any script that currently imports NutritionLookup directly should either:

  • keep doing so for build-time tasks only, or
  • switch to resolve_nutrition() if it is acting as runtime/user-facing behavior

Phase 5: Keep Chroma as fallback retrieval only

Update the retrieval path so rag_setup/enrich_nutrition_db.py continues to build semantic documents from the EuroFIR CSV, but runtime usage follows this order:

  1. canonical CSV exact match
  2. approved alias match
  3. deterministic prefix/substring match
  4. Chroma semantic candidate retrieval
  5. unresolved response

Chroma output should be treated as candidate generation, not canonical truth substitution.

Phase 6: Add logging and quality measurement

For each runtime query, log:

  • raw query
  • normalized query
  • chosen canonical food
  • match type
  • confidence
  • whether review was needed
  • whether Chroma fallback was used

Suggested MVP log destination:

  • SQLite table or structured JSONL log file under data/ or logs/

This is necessary to improve alias coverage and identify systematic lookup failures.
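A JSONL sketch of that log record, assuming the file-based option; field names mirror the list above and are not yet a fixed schema.

```python
import json
import time

def log_lookup(path, *, raw, normalized, canonical, match_type,
               confidence, review_needed, used_chroma):
    """Append one structured record per runtime query to a JSONL file."""
    record = {
        "ts": time.time(),
        "raw": raw,
        "normalized": normalized,
        "canonical": canonical,
        "match_type": match_type,
        "confidence": confidence,
        "review_needed": review_needed,
        "used_chroma": used_chroma,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Append-only JSONL keeps the log trivially greppable and avoids coupling runtime writes to the alias database.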

Phase 7: Review workflow for alias promotion

Add a simple review mechanism, initially manual.

Minimal operational loop:

  • unresolved or low-confidence queries are logged
  • approved corrections are inserted into food_aliases
  • future lookups resolve through aliases before fallback matching

This gives the system a durable learning loop without introducing unsafe automatic self-modification of nutrient truth.

Milestones

Milestone 1 — Standard lookup function in place

  • Create nutrition_service.py
  • Add resolve_nutrition(raw_text: str) -> NutritionMatch
  • Keep NutritionLookup underneath it for the first version

Milestone 2 — Alias persistence working

  • Add data/nutrition_runtime.db
  • Create food_aliases table
  • Runtime uses approved aliases before loose text matching

Milestone 3 — Runtime adoption

  • Update chatbot nutrition calls to use resolve_nutrition()
  • Update Flask/debug UI to use the same function where nutrient lookup is shown

Milestone 4 — Chroma fallback integrated

  • Use Chroma only after deterministic matching fails or is weak
  • Return candidates/review-needed responses without replacing canonical nutrient truth

Milestone 5 — Review loop active

  • Log misses and low-confidence matches
  • Approve useful aliases into SQLite
  • Improve lookup quality over time without changing the curated CSV automatically

Gathering Results

The main question after launch is whether a single standard nutrition lookup function, shared by all parts of the system, is actually improving correctness, consistency, and maintainability.

1. Measure lookup success

Track these core runtime metrics:

  • resolution rate: percentage of queries that return a canonical food match
  • high-confidence resolution rate: percentage of queries resolved without review
  • review-needed rate: percentage of queries flagged for human attention
  • unresolved rate: percentage of queries that return no usable match
  • alias-assisted resolution rate: percentage of queries resolved through approved aliases
  • Chroma fallback rate: percentage of queries requiring semantic retrieval

These show whether the standard lookup path is becoming more reliable over time.
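Given log records shaped like the Phase 6 entries, these rates reduce to simple counting; the record field names below are assumptions matching that phase.

```python
def lookup_metrics(records: list[dict]) -> dict:
    """Compute the core runtime rates from logged lookup records."""
    total = len(records) or 1  # avoid division by zero on an empty log
    resolved = [r for r in records if r.get("canonical")]
    return {
        "resolution_rate": len(resolved) / total,
        "high_confidence_rate": sum(not r["review_needed"] for r in records) / total,
        "review_needed_rate": sum(r["review_needed"] for r in records) / total,
        "unresolved_rate": (len(records) - len(resolved)) / total,
        "alias_rate": sum(r.get("match_type") == "alias" for r in records) / total,
        "chroma_fallback_rate": sum(r.get("used_chroma", False) for r in records) / total,
    }
```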

2. Measure correctness

Create a small evaluation set of representative food queries drawn from real usage:

  • exact food names
  • common alternate names
  • spelling variations
  • branded or informal user phrasing
  • ambiguous foods

For each query, record:

  • expected canonical food
  • actual canonical food returned
  • match type
  • confidence
  • whether review was required

Primary quality measures:

  • top-1 match accuracy on the evaluation set
  • false-positive rate where the system returns the wrong canonical food with high confidence
  • manual correction rate after user or reviewer intervention

False positives matter more than unresolved results. It is better for the system to ask for review than to return the wrong nutrient answer confidently.
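A minimal evaluation harness under those definitions; the resolver signature and the 0.90 high-confidence cutoff are assumptions aligned with the Phase 3 rules.

```python
def evaluate(eval_set, resolver, high_conf=0.90):
    """eval_set: list of (query, expected_canonical) pairs.
    resolver: hypothetical callable returning (canonical_or_None, confidence).
    A confidently wrong answer counts as a false positive, the worst outcome."""
    correct = false_pos = 0
    for query, expected in eval_set:
        canonical, confidence = resolver(query)
        if canonical == expected:
            correct += 1
        elif canonical is not None and confidence >= high_conf:
            false_pos += 1
    n = len(eval_set) or 1
    return {"top1_accuracy": correct / n, "false_positive_rate": false_pos / n}
```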

3. Measure consistency across interfaces

The same input should resolve the same way regardless of where it is asked.

Validate that:

  • chatbot queries
  • Flask/debug UI lookups
  • future API lookups

all return the same NutritionMatch payload for the same input text and the same data version.

This confirms that the project is truly using one standard nutrition lookup function across all parts of the system, rather than quietly diverging into multiple logic paths again.

4. Measure operational improvement

The new architecture is successful if maintenance gets easier.

Operational indicators:

  • fewer duplicate lookup implementations
  • fewer one-off match fixes inside UI or agent code
  • faster correction of recurring misses through alias approval
  • reduced need to edit business logic when adding food-name variants

5. Review loop effectiveness

Evaluate the alias workflow monthly or at another regular interval.

Questions to ask:

  • which unresolved queries appear repeatedly?
  • which low-confidence queries became approved aliases?
  • how many new aliases reduced future misses?
  • are reviewers approving useful mappings or compensating for poor canonical data?

If alias growth improves resolution rate without increasing false positives, the review loop is working.

6. Production acceptance criteria

The MVP can be considered successful when:

  • most common food queries resolve to a canonical result
  • high-confidence matches clearly outnumber review-needed cases
  • unresolved queries are captured for follow-up instead of disappearing silently
  • the same lookup result is returned across chatbot and web/runtime surfaces
  • approved aliases measurably reduce repeated lookup failures
  • no runtime process silently changes canonical nutrient truth

7. Post-production evaluation outcome

After deployment, the system should produce a simple recurring report covering:

  • total query volume
  • resolution rate
  • unresolved rate
  • review-needed rate
  • top recurring misses
  • newly approved aliases
  • any confirmed false-positive matches

That report becomes the basis for deciding whether the next investment should go into better normalization, more aliases, richer candidate handling, or improvements to the canonical EuroFIR data itself.

 



Nutrition Intelligence Runtime — Terminology Alignment Note


Purpose

This note exists to prevent terminology drift and maintain alignment with the established architecture.

It defines what we call things, what we do not rename, and how to reason about the system consistently as it evolves.


1. Canonical System Description

The project is a nutrition intelligence platform, not a recipe-only application.

It is built incrementally and already contains multiple layers:

  • maintained nutrient data (data/eurofir_mediterranean.csv)
  • deterministic lookup (nutrition_lookup.py)
  • enrichment/retrieval (rag_setup/*)
  • multi-agent runtime (multi_agent_chatbot/*)
  • UI surfaces (Chainlit, Flask viewer)

The goal is:

Restore and extend the system’s “smarts” through reliable nutrient retrieval, structured input interpretation, and agent/tool orchestration.


2. Canonical Layer Names (DO NOT RENAME)

Always use these four layers:

1. Data Maintenance Layer

  • Source: data/eurofir_mediterranean.csv
  • Role: curated nutrient data
  • Status: authoritative input

2. Index / Build Layer

  • Source: rag_setup/*
  • Role:
    • enrichment
    • compatibility exports
    • Chroma preparation
  • Notes:
    • internal shaping allowed
    • not part of runtime contract

3. Nutrition Intelligence Layer

  • Source:
    • nutrition_lookup.py
    • nutrition_calculator.py
  • Role:
    • deterministic nutrient lookup
    • structured nutrient aggregation
  • Output:
    • stable runtime payloads (contracted)

4. Experience / Runtime Layer

  • Source:
    • multi_agent_chatbot/*
    • Chainlit
    • Flask viewer
  • Role:
    • user interaction
    • agent orchestration
    • formatting and delivery

3. Contract Boundary (DO NOT MOVE)

The contract boundary exists between:

Nutrition Intelligence Layer
        ↓
Experience / Runtime Layer

Meaning:

  • runtime consumers must use structured outputs
  • internal data formats must not leak upward
  • ingestion/enrichment code must not be consumed directly

4. Terminology Rules

Use these terms consistently

  • “nutrition intelligence platform”
  • “deterministic lookup”
  • “nutrition intelligence layer”
  • “experience/runtime layer”
  • “enrichment” (not “source of truth”)
  • “contract” (for runtime payload shape)
  • “consumer” (anything reading outputs)

Avoid introducing new labels for:

  • the overall system (do not rename it casually)
  • the core lookup layer (do not rebrand it midstream)
  • the architecture unless a real change occurs

5. What is NOT a Structural Change

The following are allowed evolutions, not architecture changes:

  • stabilizing result shapes
  • improving lookup matching
  • adding signals or metadata
  • refining agent formatting
  • expanding nutrition use cases (iron, carbs, etc.)
  • adding multi-agent reasoning on top

These are:

extensions of the existing architecture


6. What WOULD be a Structural Change

Treat these as significant and require explicit discussion:

  • replacing CSV as the primary nutrient source
  • bypassing deterministic lookup with direct RAG answers
  • removing separation between ingestion and runtime
  • making UI depend on raw data or Chroma outputs
  • replacing Chainlit as a primary interface

7. Ingestion vs Runtime Rule

Internal shaping code can be flexible. Runtime outputs cannot.

  • rag_setup/* may use:
    • raw rows
    • alternate schemas
    • enrichment metadata
  • Runtime must use:
    • structured outputs from lookup/calculator

8. Current State (Pinned)

  • Architecture matches the original spec
  • No structural shift has occurred
  • Runtime behavior has been stabilized
  • Consumers are aligned with the nutrition intelligence layer
  • Chainlit remains active and protected

9. Working Principle

Do not redesign. Clarify, stabilize, and extend.


10. Mental Model

Data (EuroFIR CSV)
        ↓
Nutrition Intelligence (lookup + calculator)
        ↓
Agent / Runtime (multi-agent orchestration)
        ↓
Interfaces (Chainlit, UI, CLI)
        ↓
Optional Support (Chroma / enrichment)

Final Note

When in doubt:

  • compare against this document
  • compare against the original spec
  • prefer consistency over novelty
  • avoid renaming unless meaning has truly changed

