SPEC-1-Nutrition-Intelligence-Runtime

Background

The project is a nutrition intelligence platform built incrementally since January, not a recipe-only application. Existing components already cover several layers end to end: a maintained EuroFIR-style nutrient table (data/eurofir_mediterranean.csv), a direct nutrition lookup utility (nutrition_lookup.py), a Chroma enrichment pipeline (rag_setup/enrich_nutrition_db.py), and a stateful multi-agent chatbot runtime (multi_agent_chatbot/agentic_chatbot.py).

A lightweight Flask viewer exists for inspecting and editing normalized ingredient lines, but that viewer is only an operational aid. The primary architectural goal is to restore and extend the site’s “smarts”: reliable nutrient retrieval, structured interpretation of user inputs, agent/tool orchestration, and delivery of nutrient information from the EuroDATA/EuroFIR-backed knowledge layer.

Requirements

Must have

  • Establish a single authoritative nutrient source and runtime lookup path.
  • Preserve data/eurofir_mediterranean.csv as the maintained source-of-truth input unless explicitly superseded.
  • Support deterministic nutrient lookup for direct food queries.
  • Support agent-accessible nutrient retrieval during stateful conversations.
  • Keep ingestion, compatibility exports, retrieval, and runtime serving clearly separated.
  • Allow ingredient or food text to be resolved into structured nutrient payloads.
  • Return a stable result shape including nutrient values and source/match metadata.

Should have

  • Reuse one shared lookup contract across chatbot, web app, and scripts.
  • Add confidence/review signaling for weak text matches.
  • Persist approved corrections or alias mappings so matching improves over time.
  • Distinguish exact nutrient lookups from semantic retrieval/RAG answers.

Could have

  • Hybrid retrieval using deterministic lookup first and Chroma fallback second.
  • Admin workflow for curating aliases and canonical food mappings.
  • Runtime observability for lookup quality, misses, and unresolved queries.

Method

1. Separate the system into four explicit layers

The current repo already implies four responsibilities, but they are not yet formalized as a runtime contract.

  1. Data maintenance layer
    • Authoritative editable file: data/eurofir_mediterranean.csv
    • Human-curated nutrient rows and metadata
  2. Index/build layer
    • rag_setup/convert_eurofir_to_calories.py for backward compatibility exports
    • rag_setup/enrich_nutrition_db.py for Chroma document/metadata upserts
  3. Nutrition intelligence layer
    • nutrition_lookup.py for deterministic structured lookup
    • Future alias matcher / confidence scorer / canonical food resolver
  4. Experience/runtime layer
    • multi_agent_chatbot/agentic_chatbot.py for conversational access
    • tools/recipe_viewer/app.py as operational inspection UI
    • Any future site/API should call the same nutrition service contract

2. Make deterministic lookup the primary runtime path

The current NutritionLookup.lookup(query) is the best candidate for the authoritative runtime primitive because it returns a structured result and is already reused by agent flows.

Recommended runtime policy:

  • Step 1: normalize input text
  • Step 2: attempt deterministic lookup against canonical food rows and aliases
  • Step 3: if confidence is high, return structured nutrient result
  • Step 4: if confidence is low or no match exists, optionally use Chroma/RAG to retrieve likely candidates
  • Step 5: return either a resolved canonical result or a review-needed response

This keeps nutrient values grounded in curated tabular data, while RAG stays a support mechanism rather than the source of truth.
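The five-step policy can be sketched as follows. This is a minimal illustration, not the implementation: the lookup backends are injected as hypothetical stand-ins for `NutritionLookup` and the Chroma collection, and the 0.90 gate matches the review threshold proposed later in this spec.

```python
CONFIDENCE_THRESHOLD = 0.90  # below this, the result is flagged for review

def normalize(text: str) -> str:
    """Step 1: minimal normalization (lowercase, collapse whitespace)."""
    return " ".join(text.lower().split())

def resolve(raw_text: str, deterministic_lookup, semantic_candidates) -> dict:
    """Steps 2-5. `deterministic_lookup` and `semantic_candidates` are
    hypothetical callables standing in for the real backends."""
    query = normalize(raw_text)
    match = deterministic_lookup(query)                # step 2
    if match and match["confidence"] >= CONFIDENCE_THRESHOLD:
        return {**match, "review_needed": False}       # step 3: high confidence
    candidates = semantic_candidates(query)            # step 4: RAG fallback
    return {                                           # step 5: review-needed
        "query": query,
        "canonical_food_name": match["canonical_food_name"] if match else None,
        "confidence": match["confidence"] if match else 0.0,
        "candidate_foods": candidates,
        "review_needed": True,
    }
```

Note that the Chroma fallback only ever produces candidates here; it never supplies nutrient values directly.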

3. Introduce a shared nutrition service contract

Create one internal service module that all front ends call.

Proposed contract:

from dataclasses import dataclass

@dataclass
class NutritionQuery:
    raw_text: str
    locale: str | None = None
    context_food_group: str | None = None

@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str  # exact | alias | prefix | substring | semantic | none
    confidence: float
    source: str      # eurofir_csv | chroma
    per_100g: dict[str, float | None]
    flags: dict[str, bool]
    notes: list[str]

Authoritative call shape:

resolve_nutrition(query: NutritionQuery) -> NutritionMatch

This should wrap the existing NutritionLookup first, instead of replacing it.

4. Add an alias/mapping table instead of overloading free-text matching

The current _match_score logic is useful but too fragile to serve as the primary long-term strategy. It should be supplemented with a curated alias table.

Proposed minimal table:

field                 | type     | purpose
----------------------|----------|-------------------------------------
alias_text            | string   | user-facing or imported variant
canonical_food_name   | string   | exact FoodName target
status                | enum     | proposed / approved / rejected
source                | string   | manual / imported / runtime_feedback
created_at            | datetime | audit
updated_at            | datetime | audit

Suggested storage for MVP: SQLite.

Resolution order:

  • exact canonical food match
  • approved alias match
  • deterministic prefix/substring match
  • semantic retrieval fallback
  • unresolved
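The resolution order can be expressed as a strategy chain; each strategy callable below is a hypothetical stand-in for the corresponding real matcher:

```python
def resolve_in_order(query, exact, aliases, fuzzy, semantic):
    """Try each matching strategy in the documented order and return
    (match_type, result); falls through to ("none", None) if all miss."""
    strategies = [
        ("exact", exact),
        ("alias", aliases),
        ("prefix/substring", fuzzy),
        ("semantic", semantic),
    ]
    for match_type, strategy in strategies:
        result = strategy(query)
        if result is not None:
            return match_type, result
    return "none", None
```

Keeping the order in one place means every consumer inherits the same policy, including the rule that semantic retrieval runs last.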

5. Keep Chroma for retrieval, not nutrient truth

The enriched Chroma collection should remain useful for:

  • semantic search
  • explanatory answers
  • broader nutrition Q&A
  • candidate generation when a direct match fails

It should not override curated nutrient fields when a canonical row exists in the CSV. Instead, Chroma should point the runtime back toward likely canonical foods.

6. Recommended component flow

@startuml
actor User
participant "Web UI / Chat UI" as UI
participant "Nutrition Service" as NS
database "Alias DB (SQLite)" as ADB
database "EuroFIR CSV Loader" as CSV
database "Chroma Nutrition DB" as CH

User -> UI : Ask for nutrient info
UI -> NS : resolve_nutrition(raw_text, context)
NS -> ADB : find approved alias
ADB --> NS : alias or none
NS -> CSV : deterministic lookup(canonical/raw)
CSV --> NS : structured nutrient row or none

alt high-confidence match
  NS --> UI : canonical nutrient payload
else low-confidence or no match
  NS -> CH : semantic candidate retrieval
  CH --> NS : likely candidates
  NS --> UI : candidate/review-needed response
end
@enduml

7. Immediate implementation direction

The repo appears ready for work on the smart layer to resume; the next work item should consolidate runtime lookup behind a single service boundary.

Recommended immediate sequence:

  1. Wrap NutritionLookup in a new resolve_nutrition() service.
  2. Add source/match/confidence metadata to the returned payload.
  3. Add an alias table in SQLite.
  4. Update chatbot and Flask/runtime entrypoints to call the same service.
  5. Use Chroma only as fallback candidate generation.

Implementation

Phase 1: Introduce a shared nutrition service

Create a new internal module, for example nutrition_service.py, that becomes the only runtime entry point for nutrient resolution.

Responsibilities:

  • accept raw food text
  • normalize the text
  • consult SQLite aliases first
  • call NutritionLookup for canonical CSV-backed lookup
  • invoke Chroma only when deterministic matching is weak or missing
  • return one stable payload shape with match/source/confidence metadata

Suggested functions:

def normalize_food_text(text: str) -> str: ...
def resolve_nutrition(raw_text: str, context: dict | None = None) -> NutritionMatch: ...
def lookup_alias(normalized_text: str) -> str | None: ...
def semantic_candidates(raw_text: str, limit: int = 5) -> list[dict]: ...
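As one sketch of the first of those functions: the descriptor set below is a hypothetical placeholder (the real list would come from curation), and quantity/unit handling is deliberately out of scope here.

```python
import re

# Hypothetical descriptor list; the real set would be curated over time.
_DESCRIPTORS = {"fresh", "large", "small", "medium", "chopped", "diced", "raw"}

def normalize_food_text(text: str) -> str:
    """Lowercase, strip punctuation and common descriptors, collapse spaces.
    Quantity/unit parsing is handled elsewhere and not attempted here."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    kept = [t for t in tokens if t not in _DESCRIPTORS]
    return " ".join(kept)
```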

Phase 2: Add SQLite alias persistence

Create a small SQLite database, for example data/nutrition_runtime.db.

Proposed schema:

CREATE TABLE IF NOT EXISTS food_aliases (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    alias_text TEXT NOT NULL,
    alias_normalized TEXT NOT NULL,
    canonical_food_name TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
    source TEXT NOT NULL DEFAULT 'manual',
    notes TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);

CREATE UNIQUE INDEX IF NOT EXISTS idx_food_alias_unique
ON food_aliases(alias_normalized, canonical_food_name);

CREATE INDEX IF NOT EXISTS idx_food_alias_status
ON food_aliases(status);

Resolution logic:

  • use only approved aliases at runtime
  • store uncertain user corrections as proposed until reviewed
  • never mutate the CSV automatically from runtime behavior
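Assuming the schema above, the persistence side can be sketched as follows; `lookup_alias` and `propose_alias` echo the Phase 1 function names, and connection handling is simplified for illustration.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS food_aliases (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    alias_text TEXT NOT NULL,
    alias_normalized TEXT NOT NULL,
    canonical_food_name TEXT NOT NULL,
    status TEXT NOT NULL CHECK (status IN ('proposed', 'approved', 'rejected')),
    source TEXT NOT NULL DEFAULT 'manual',
    notes TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
"""

def lookup_alias(conn, normalized_text):
    """Return the canonical food for an *approved* alias, or None.
    Proposed/rejected rows are invisible to the runtime by design."""
    row = conn.execute(
        "SELECT canonical_food_name FROM food_aliases "
        "WHERE alias_normalized = ? AND status = 'approved'",
        (normalized_text,),
    ).fetchone()
    return row[0] if row else None

def propose_alias(conn, alias_text, canonical, source="runtime_feedback"):
    """Record an uncertain mapping as 'proposed' pending human review."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO food_aliases (alias_text, alias_normalized, "
        "canonical_food_name, status, source, created_at, updated_at) "
        "VALUES (?, ?, ?, 'proposed', ?, ?, ?)",
        (alias_text, alias_text.lower().strip(), canonical, source, now, now),
    )
```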

Phase 3: Extend result metadata

Preserve the existing NutritionResult nutrient payload, but wrap or extend it with runtime metadata.

Suggested response shape:

from dataclasses import dataclass

@dataclass
class NutritionMatch:
    query: str
    canonical_food_name: str | None
    match_type: str
    confidence: float
    source: str
    per_100g: dict[str, float | None]
    flags: dict[str, bool]
    review_needed: bool
    candidate_foods: list[str]
    notes: list[str]

Suggested confidence rules for MVP:

  • exact canonical match: 1.0
  • approved alias match: 0.98
  • prefix match: 0.85
  • substring match: 0.65
  • Chroma candidate-assisted match: 0.40-0.75
  • no match: 0.0

Set review_needed = True for anything below 0.90.

Phase 4: Refactor current entry points to use the shared service

Chatbot path

Update multi_agent_chatbot/nutrition_agent.py and related tool wrappers so food/nutrient requests call resolve_nutrition() instead of importing NutritionLookup directly.

Benefits:

  • one matching policy everywhere
  • one confidence model everywhere
  • alias reuse across all interfaces

Flask/debug UI path

If the Flask viewer remains in use, add a lightweight nutrient-resolution panel for each line that calls the same service. That preserves its role as an inspection and correction surface without making it a separate logic path.

CLI/scripts

Any script that currently imports NutritionLookup directly should either:

  • keep doing so for build-time tasks only, or
  • switch to resolve_nutrition() if it is acting as runtime/user-facing behavior

Phase 5: Keep Chroma as fallback retrieval only

Update the retrieval path so rag_setup/enrich_nutrition_db.py continues to build semantic documents from the EuroFIR CSV, but runtime usage follows this order:

  1. canonical CSV exact match
  2. approved alias match
  3. deterministic prefix/substring match
  4. Chroma semantic candidate retrieval
  5. unresolved response

Chroma output should be treated as candidate generation, not canonical truth substitution.

Phase 6: Add logging and quality measurement

For each runtime query, log:

  • raw query
  • normalized query
  • chosen canonical food
  • match type
  • confidence
  • whether review was needed
  • whether Chroma fallback was used

Suggested MVP log destination:

  • SQLite table or structured JSONL log file under data/ or logs/

This is necessary to improve alias coverage and identify systematic lookup failures.
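A JSONL sketch of that log record, assuming the file-based option; field names mirror the list above and are not yet a fixed schema.

```python
import json
import time

def log_lookup(path, *, raw, normalized, canonical, match_type,
               confidence, review_needed, used_chroma):
    """Append one structured record per runtime query to a JSONL file."""
    record = {
        "ts": time.time(),
        "raw": raw,
        "normalized": normalized,
        "canonical": canonical,
        "match_type": match_type,
        "confidence": confidence,
        "review_needed": review_needed,
        "used_chroma": used_chroma,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Append-only JSONL keeps the log trivially greppable and avoids coupling runtime writes to the alias database.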

Phase 7: Review workflow for alias promotion

Add a simple review mechanism, initially manual.

Minimal operational loop:

  • unresolved or low-confidence queries are logged
  • approved corrections are inserted into food_aliases
  • future lookups resolve through aliases before fallback matching

This gives the system a durable learning loop without introducing unsafe automatic self-modification of nutrient truth.

Milestones

Milestone 1 — Standard lookup function in place

  • Create nutrition_service.py
  • Add resolve_nutrition(raw_text: str) -> NutritionMatch
  • Keep NutritionLookup underneath it for the first version

Milestone 2 — Alias persistence working

  • Add data/nutrition_runtime.db
  • Create food_aliases table
  • Runtime uses approved aliases before loose text matching

Milestone 3 — Runtime adoption

  • Update chatbot nutrition calls to use resolve_nutrition()
  • Update Flask/debug UI to use the same function where nutrient lookup is shown

Milestone 4 — Chroma fallback integrated

  • Use Chroma only after deterministic matching fails or is weak
  • Return candidates/review-needed responses without replacing canonical nutrient truth

Milestone 5 — Review loop active

  • Log misses and low-confidence matches
  • Approve useful aliases into SQLite
  • Improve lookup quality over time without changing the curated CSV automatically

Gathering Results

The main question after launch is whether a single standard nutrition lookup function, shared by all parts of the system, is actually improving correctness, consistency, and maintainability.

1. Measure lookup success

Track these core runtime metrics:

  • resolution rate: percentage of queries that return a canonical food match
  • high-confidence resolution rate: percentage of queries resolved without review
  • review-needed rate: percentage of queries flagged for human attention
  • unresolved rate: percentage of queries that return no usable match
  • alias-assisted resolution rate: percentage of queries resolved through approved aliases
  • Chroma fallback rate: percentage of queries requiring semantic retrieval

These show whether the standard lookup path is becoming more reliable over time.
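Given log records shaped like the Phase 6 entries, these rates reduce to simple counting; the record field names below are assumptions matching that phase.

```python
def lookup_metrics(records: list[dict]) -> dict:
    """Compute the core runtime rates from logged lookup records."""
    total = len(records) or 1  # avoid division by zero on an empty log
    resolved = [r for r in records if r.get("canonical")]
    return {
        "resolution_rate": len(resolved) / total,
        "high_confidence_rate": sum(not r["review_needed"] for r in records) / total,
        "review_needed_rate": sum(r["review_needed"] for r in records) / total,
        "unresolved_rate": (len(records) - len(resolved)) / total,
        "alias_rate": sum(r.get("match_type") == "alias" for r in records) / total,
        "chroma_fallback_rate": sum(r.get("used_chroma", False) for r in records) / total,
    }
```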

2. Measure correctness

Create a small evaluation set of representative food queries drawn from real usage:

  • exact food names
  • common alternate names
  • spelling variations
  • branded or informal user phrasing
  • ambiguous foods

For each query, record:

  • expected canonical food
  • actual canonical food returned
  • match type
  • confidence
  • whether review was required

Primary quality measures:

  • top-1 match accuracy on the evaluation set
  • false-positive rate where the system returns the wrong canonical food with high confidence
  • manual correction rate after user or reviewer intervention

False positives matter more than unresolved results. It is better for the system to ask for review than to return the wrong nutrient answer confidently.
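A minimal evaluation harness under those definitions; the resolver signature and the 0.90 high-confidence cutoff are assumptions aligned with the Phase 3 rules.

```python
def evaluate(eval_set, resolver, high_conf=0.90):
    """eval_set: list of (query, expected_canonical) pairs.
    resolver: hypothetical callable returning (canonical_or_None, confidence).
    A confidently wrong answer counts as a false positive, the worst outcome."""
    correct = false_pos = 0
    for query, expected in eval_set:
        canonical, confidence = resolver(query)
        if canonical == expected:
            correct += 1
        elif canonical is not None and confidence >= high_conf:
            false_pos += 1
    n = len(eval_set) or 1
    return {"top1_accuracy": correct / n, "false_positive_rate": false_pos / n}
```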

3. Measure consistency across interfaces

The same input should resolve the same way regardless of where it is asked.

Validate that:

  • chatbot queries
  • Flask/debug UI lookups
  • future API lookups

all return the same NutritionMatch payload for the same input text and the same data version.

This confirms that the project is truly using one standard nutrition lookup function across all parts of the system, rather than quietly diverging into multiple logic paths again.

4. Measure operational improvement

The new architecture is successful if maintenance gets easier.

Operational indicators:

  • fewer duplicate lookup implementations
  • fewer one-off match fixes inside UI or agent code
  • faster correction of recurring misses through alias approval
  • reduced need to edit business logic when adding food-name variants

5. Review loop effectiveness

Evaluate the alias workflow monthly or at another regular interval.

Questions to ask:

  • which unresolved queries appear repeatedly?
  • which low-confidence queries became approved aliases?
  • how many new aliases reduced future misses?
  • are reviewers approving useful mappings or compensating for poor canonical data?

If alias growth improves resolution rate without increasing false positives, the review loop is working.

6. Production acceptance criteria

The MVP can be considered successful when:

  • most common food queries resolve to a canonical result
  • high-confidence matches clearly outnumber review-needed cases
  • unresolved queries are captured for follow-up instead of disappearing silently
  • the same lookup result is returned across chatbot and web/runtime surfaces
  • approved aliases measurably reduce repeated lookup failures
  • no runtime process silently changes canonical nutrient truth

7. Post-production evaluation outcome

After deployment, the system should produce a simple recurring report covering:

  • total query volume
  • resolution rate
  • unresolved rate
  • review-needed rate
  • top recurring misses
  • newly approved aliases
  • any confirmed false-positive matches

That report becomes the basis for deciding whether the next investment should go into better normalization, more aliases, richer candidate handling, or improvements to the canonical EuroFIR data itself.

 



Nutrition Intelligence Runtime — Terminology Alignment Note


Purpose

This note exists to prevent terminology drift and maintain alignment with the established architecture.

It defines what we call things, what we do not rename, and how to reason about the system consistently as it evolves.


1. Canonical System Description

The project is a nutrition intelligence platform, not a recipe-only application.

It is built incrementally and already contains multiple layers:

  • maintained nutrient data (data/eurofir_mediterranean.csv)
  • deterministic lookup (nutrition_lookup.py)
  • enrichment/retrieval (rag_setup/*)
  • multi-agent runtime (multi_agent_chatbot/*)
  • UI surfaces (Chainlit, Flask viewer)

The goal is:

Restore and extend the system’s “smarts” through reliable nutrient retrieval, structured input interpretation, and agent/tool orchestration.


2. Canonical Layer Names (DO NOT RENAME)

Always use these four layers:

1. Data Maintenance Layer

  • Source: data/eurofir_mediterranean.csv
  • Role: curated nutrient data
  • Status: authoritative input

2. Index / Build Layer

  • Source: rag_setup/*
  • Role:
    • enrichment
    • compatibility exports
    • Chroma preparation
  • Notes:
    • internal shaping allowed
    • not part of runtime contract

3. Nutrition Intelligence Layer

  • Source:
    • nutrition_lookup.py
    • nutrition_calculator.py
  • Role:
    • deterministic nutrient lookup
    • structured nutrient aggregation
  • Output:
    • stable runtime payloads (contracted)

4. Experience / Runtime Layer

  • Source:
    • multi_agent_chatbot/*
    • Chainlit
    • Flask viewer
  • Role:
    • user interaction
    • agent orchestration
    • formatting and delivery

3. Contract Boundary (DO NOT MOVE)

The contract boundary exists between:

Nutrition Intelligence Layer
        ↓
Experience / Runtime Layer

Meaning:

  • runtime consumers must use structured outputs
  • internal data formats must not leak upward
  • ingestion/enrichment code must not be consumed directly

4. Terminology Rules

Use these terms consistently

  • “nutrition intelligence platform”
  • “deterministic lookup”
  • “nutrition intelligence layer”
  • “experience/runtime layer”
  • “enrichment” (not “source of truth”)
  • “contract” (for runtime payload shape)
  • “consumer” (anything reading outputs)

Avoid introducing new labels for:

  • the overall system (do not rename it casually)
  • the core lookup layer (do not rebrand it midstream)
  • the architecture unless a real change occurs

5. What is NOT a Structural Change

The following are allowed evolutions, not architecture changes:

  • stabilizing result shapes
  • improving lookup matching
  • adding signals or metadata
  • refining agent formatting
  • expanding nutrition use cases (iron, carbs, etc.)
  • adding multi-agent reasoning on top

These are:

extensions of the existing architecture


6. What WOULD be a Structural Change

Treat these as significant and require explicit discussion:

  • replacing CSV as the primary nutrient source
  • bypassing deterministic lookup with direct RAG answers
  • removing separation between ingestion and runtime
  • making UI depend on raw data or Chroma outputs
  • replacing Chainlit as a primary interface

7. Ingestion vs Runtime Rule

Internal shaping code can be flexible. Runtime outputs cannot.

  • rag_setup/* may use:
    • raw rows
    • alternate schemas
    • enrichment metadata
  • Runtime must use:
    • structured outputs from lookup/calculator

8. Current State (Pinned)

  • Architecture matches the original spec
  • No structural shift has occurred
  • Runtime behavior has been stabilized
  • Consumers are aligned with the nutrition intelligence layer
  • Chainlit remains active and protected

9. Working Principle

Do not redesign. Clarify, stabilize, and extend.


10. Mental Model

Data (EuroFIR CSV)
        ↓
Nutrition Intelligence (lookup + calculator)
        ↓
Agent / Runtime (multi-agent orchestration)
        ↓
Interfaces (Chainlit, UI, CLI)
        ↓
Optional Support (Chroma / enrichment)

Final Note

When in doubt:

  • compare against this document
  • compare against the original spec
  • prefer consistency over novelty
  • avoid renaming unless meaning has truly changed

