Background
The system already contains agentic chat stacks, nutrition lookup tools, RAG/data-ingestion pipelines, EuroFIR-derived assets, vector stores, and supporting utilities. The Flask viewer is only a convenience surface for inspecting normalized ingredient lines, not the core purpose. The actual goal is to restore and continue the “smarts” of the site: structured nutrition understanding, retrieval, tool use, and nutrient information delivery from EuroDATA/EuroFIR-backed sources.
Requirements
Must have
- Re-center architecture on the nutrition intelligence layer, not the preview UI
- Identify which implemented tools are already production-usable end to end
- Define the missing link between normalized inputs and nutrient-serving behavior
- Clarify how EuroDATA/EuroFIR is queried, transformed, and exposed to agents/UI
- Avoid redoing completed work
Should have
- Inventory of current toolchain by responsibility: ingestion, retrieval, matching, calculation, session orchestration, UI
- Clear boundary between experimental assets and runtime-critical components
- Decision on what becomes the primary execution path for the site
Could have
- Consolidation of duplicate nutrition logic across scripts/agents/viewers
- Persistence for user-approved corrections
- Unified service contract for all front ends
What I need from you now is just this, not a rehash of recipes:
- A concise end-to-end tool summary is exactly the right next input.
- Most useful format: for each implemented tool/service, give me
name → purpose → inputs → outputs → where used now → production-ready?
Best next slice for you to send is these four components only:
- the nutrition lookup path
- the EuroDATA/EuroFIR conversion + normalized storage path
- the agent/session runner path
- the current web app/runtime entrypoints
> The existing nutrition lookup “path” is encapsulated in nutrition_lookup.py (already in the repo
before the recent Flask work). Quick recap:
- Data source: everything reads from data/eurofir_mediterranean.csv (or any path you pass into
the constructor).
- Core class: NutritionLookup loads the CSV into memory (_read_rows), exposes a lookup(query:
str) method, and can reload() if the CSV changes.
- Lookup logic: it lowercases the query, scans the EuroFIR “FoodName” column for the best match
(exact > prefix > substring) via _match_score, and returns a NutritionResult dataclass.
- Result shape: NutritionResult carries the original query, the matched ingredient name, a
per_100g dict (kcal, protein, carbs, fat, fiber), and signals flags (protein_source if ≥7g
protein per 100g, fiber_source if ≥5g fiber).
- Helpers: _to_float safely parses strings to float; _match_score ranks matches. Everything’s
exported via __all__ = ["NutritionLookup", "NutritionResult"], so other modules (including the
agent sessions) can import it.
This is the same utility the Agentic AI demo used when “running sessions” that queried nutrition
facts. If you need to hook it back into the chatbot or expose it over a tool call, the plumbing
is already there: instantiate NutritionLookup() and call .lookup("lentils") to get the structured
nutrition info.
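The lookup behavior above can be sketched as a standalone re-implementation. This is for orientation only: the names mirror the recap (NutritionResult, the exact > prefix > substring ranking, the ≥7g/≥5g thresholds), but the actual nutrition_lookup.py may differ in detail.

```python
from dataclasses import dataclass, field

@dataclass
class NutritionResult:
    query: str
    matched_name: str
    per_100g: dict            # kcal, protein, carbs, fat, fiber
    signals: dict = field(default_factory=dict)

def match_score(query: str, food_name: str) -> int:
    # Rank candidates: exact > prefix > substring, as described above.
    q, name = query.lower().strip(), food_name.lower().strip()
    if q == name:
        return 3
    if name.startswith(q):
        return 2
    if q in name:
        return 1
    return 0

def build_signals(per_100g: dict) -> dict:
    # Thresholds from the recap: >=7 g protein, >=5 g fiber per 100 g.
    return {
        "protein_source": per_100g.get("protein", 0) >= 7,
        "fiber_source": per_100g.get("fiber", 0) >= 5,
    }
```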
---
> Here’s the euro data → normalized storage flow we’ve been using (before the recent Flask tweaks):
- Source of truth: data/eurofir_mediterranean.csv. This hand-curated EuroFIR-style table
(FoodName, FoodGroup, ENERC_kcal, macros, notes, etc.) is the file we edit whenever we add or
tweak an ingredient. README and data/README.md both call it out as “the maintained file.”
- Legacy/compat export: When we need the old calories format, we run
python rag_setup/convert_eurofir_to_calories.py.
This script reads the EuroFIR CSV, maps each row to the classic calories_mediterranean.csv
schema (FoodCategory, FoodItem, per100grams, Cals_per100grams, KJ_per100grams), and writes it to
data/calories_mediterranean.csv. It doesn’t change the source file; it just produces an updated
compatibility export for older loaders/tools.
- Normalized storage / vector path: For the nutrition RAG side, rag_setup/enrich_nutrition_db.py
loads the same EuroFIR CSV, formats each row into a natural-language “document” plus metadata
(kcal, macros, notes, etc.), and upserts them into a Chroma collection (default nutrition_db
under chroma_mediterranean/). That’s the normalized, queryable layer the agent uses at runtime.
So the workflow is: edit data/eurofir_mediterranean.csv → (optional) regenerate the legacy
calories CSV for backward compatibility → run enrich_nutrition_db.py to refresh the embedded/
normalized store.
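The two derivation steps in this flow can be sketched side by side. Output column names come from the schemas named above; the kcal→kJ factor, the "100g" literal, the macro column names, and the document wording are illustrative assumptions, not the scripts' exact behavior.

```python
def eurofir_to_calories_row(row: dict) -> dict:
    """Map one EuroFIR row to the classic calories_mediterranean.csv schema."""
    kcal = float(row["ENERC_kcal"])
    return {
        "FoodCategory": row["FoodGroup"],
        "FoodItem": row["FoodName"],
        "per100grams": "100g",                       # assumed constant literal
        "Cals_per100grams": f"{kcal:.0f} cal",
        "KJ_per100grams": f"{kcal * 4.184:.0f} kJ",  # assumed conversion factor
    }

def row_to_document(row: dict) -> tuple[str, dict]:
    """Format one EuroFIR row as a natural-language doc plus metadata,
    in the spirit of enrich_nutrition_db.py's Chroma upsert."""
    doc = (
        f"{row['FoodName']} ({row['FoodGroup']}): "
        f"{row['ENERC_kcal']} kcal per 100 g. Notes: {row.get('notes', '')}"
    )
    metadata = {"food_group": row["FoodGroup"], "kcal": float(row["ENERC_kcal"])}
    return doc, metadata
```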
---
> Agent/Session Runner Summary
- Core module: multi_agent_chatbot/agentic_chatbot.py (Chainlit-based UI). On chat start it
creates a persistent SQLiteSession and initializes the working state (recipe, ingredients,
overrides) via ensure_state. Each incoming message passes through _extract_command_and_message so
commands like set recipe=id or add ingredient=... mutate the working state before the agent runs.
- State & Overrides: conversation_state.py defines the working-state schema and helper functions
(apply_command, ensure_state, parse_command). This is where we track working_recipe_id,
working_ingredients, ingredient_overrides, and servings_override. Agents read from this state to
build context.
- Tools integrated:
• nutrition_agent.py (OpenAI Agents + MCP) handles nutrition lookups, RAG queries, EXA search
(optional).
• effective_recipe.py computes the reconciled recipe (base recipe + overrides) so commands like
“swap zucchini for broccoli” or “double servings” flow into the agent response.
• recipe_catalog.py, recipe_loader.py load the recipe corpus, and load_calories.py/
nutrition_calculator.py compute macro totals per recipe.
• conversation_state.py is also where tool commands are processed (reset, add ingredient,
etc.).
- Session model: The conversation is stateful. Commands adjust state; messages either ask for
interpretation (current ingredient list) or trigger the full runner (Runner in agents.py) which
invokes the nutrition agent with context (_augment_message adds the effective recipe summary).
Outputs include nutrition summaries (total vs per serving, missing data warnings) and optional
marketing copy (the “croissant upsell”).
- Entry points:
• run_chatbot.sh / run_local.sh launch the Chainlit UI.
• The chatbot/ directory contains earlier FastAPI/CLI variants (agentic_chatbot.py,
chainlit.md, etc.), but the multi-agent runner is the same: create session, manage state, call
nutrition tools, stream responses.
So the “session runner path” is Chainlit + SQLite-backed conversation state + nutrition agent +
recipe utilities, all orchestrated in multi_agent_chatbot/agentic_chatbot.py. Let me know if you
want details on specific commands or tools.
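The command-extraction step can be sketched as follows: peel a "set recipe=..." / "add ingredient=..." style prefix off an incoming message so it can mutate working state before the agent runs. The real _extract_command_and_message may accept different verbs and syntax; this regex is an assumption.

```python
import re

# Verbs and the key=value shape are assumptions based on the commands
# mentioned above ("set recipe=id", "add ingredient=...").
COMMAND_RE = re.compile(r"^(set|add|remove)\s+(\w+)\s*=\s*(.+)$", re.IGNORECASE)

def extract_command(message: str):
    """Return (command_dict, remaining_message); plain chat passes through."""
    m = COMMAND_RE.match(message.strip())
    if not m:
        return None, message
    verb, field, value = m.groups()
    return {"verb": verb.lower(), "field": field.lower(), "value": value.strip()}, ""
```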
---
> Current runtime entry points:
- Flask web app (debug UI): tools/recipe_viewer/app.py (run via python tools/recipe_viewer/
app.py). This is the local viewer you’ve been using for ingredient lines, overrides, and the
mailto link. It loads rag_setup/recipes_normalized.jsonl and serves at http://localhost:5000/.
- Multi‑agent chatbot (Chainlit UI): multi_agent_chatbot/agentic_chatbot.py. Launch via
chainlit run multi_agent_chatbot/agentic_chatbot.py or the provided scripts (run_chatbot.sh,
run_local.sh). This handles session state, nutrition lookups, and recipe overrides in the
conversational agent.
- Other CLI/utility entry points: Scripts like rag_setup/import_recipes.py, rag_setup/
sync_recipes.py, rag_setup/convert_eurofir_to_calories.py, multi_agent_chatbot/load_calories.py,
etc., all have standard if __name__ == "__main__": main() blocks so you can run them directly
with python ....
Let me know if you need a more exhaustive list or help wiring any of them into a launcher.
Good — this is enough to stop guessing.
You already have the pieces needed to resume the smart layer. What was missing was a clear runtime contract:
- curated nutrient truth in data/eurofir_mediterranean.csv
- deterministic lookup in nutrition_lookup.py
- semantic retrieval in Chroma as support
- stateful agent orchestration in multi_agent_chatbot/agentic_chatbot.py
I’ve turned that into the next spec slice in canvas: SPEC-1-Nutrition-Intelligence-Runtime.
The key architectural call in this draft is:
- deterministic CSV-backed lookup is the primary nutrient truth
- Chroma is fallback/candidate generation, not truth
- both chatbot and web/runtime paths should call one shared resolve_nutrition() service
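A minimal sketch of that shared contract, assuming the function name from the draft; the result fields, the 0.5 fallback confidence, and the lookup callables are proposals, since nothing like this exists in the repo yet.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ResolvedNutrition:
    query: str
    matched_name: str
    per_100g: dict
    source: str        # "csv" = canonical truth, "chroma" = fallback candidate
    confidence: float  # 1.0 for deterministic hits, lower for fallback

def resolve_nutrition(
    query: str,
    csv_lookup: Callable[[str], Optional[tuple]],
    chroma_lookup: Callable[[str], Optional[tuple]],
) -> Optional[ResolvedNutrition]:
    # Deterministic CSV-backed lookup is consulted first and always wins.
    hit = csv_lookup(query)
    if hit is not None:
        name, per_100g = hit
        return ResolvedNutrition(query, name, per_100g, "csv", 1.0)
    # Chroma only generates candidates when deterministic matching misses.
    candidate = chroma_lookup(query)
    if candidate is not None:
        name, per_100g = candidate
        return ResolvedNutrition(query, name, per_100g, "chroma", 0.5)
    return None
```

Both the Chainlit agent and any web runtime would call this one function, so the "Chroma never overrides CSV" rule lives in exactly one place.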
The biggest gap I see now is not ingestion or UI. It is this:
- your current matcher is still basically exact > prefix > substring; that is fine for demos, but weak for a site runtime
- the next durable improvement is an alias/canonical mapping table plus confidence metadata
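One possible shape for that alias table, sketched against SQLite; the schema, column names, and helper functions are proposals, not existing code.

```python
import sqlite3

def init_alias_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the proposed alias table for approved food-name corrections."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS food_aliases (
               alias TEXT PRIMARY KEY,       -- normalized user/recipe spelling
               canonical_name TEXT NOT NULL, -- FoodName in the EuroFIR CSV
               confidence REAL NOT NULL DEFAULT 1.0,
               approved_by TEXT              -- who signed off on the mapping
           )"""
    )
    return conn

def resolve_alias(conn: sqlite3.Connection, alias: str):
    """Return (canonical_name, confidence) for an approved alias, else None."""
    return conn.execute(
        "SELECT canonical_name, confidence FROM food_aliases WHERE alias = ?",
        (alias.lower().strip(),),
    ).fetchone()
```

The matcher would consult this table before falling through to exact > prefix > substring, so approved corrections win deterministically.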
Two questions to confirm this method section:
- Do you agree that Chroma should never override a canonical CSV nutrient row, and should only help when deterministic matching is weak or missing?
- Are you okay making SQLite alias mappings the first persistence layer for approved food-name corrections?