Background
The system already contains agentic chat stacks, nutrition lookup tools, RAG/data-ingestion pipelines, EuroFIR-derived assets, vector stores, and supporting utilities. The Flask viewer is only a convenience surface for inspecting normalized ingredient lines, not the core purpose. The actual goal is to restore and continue the “smarts” of the site: structured nutrition understanding, retrieval, tool use, and nutrient information delivery from EuroDATA/EuroFIR-backed sources.
Requirements
Must have
- Re-center architecture on the nutrition intelligence layer, not the preview UI
- Identify which implemented tools are already production-usable end to end
- Define the missing link between normalized inputs and nutrient-serving behavior
- Clarify how EuroDATA/EuroFIR is queried, transformed, and exposed to agents/UI
- Avoid redoing completed work
Should have
- Inventory of current toolchain by responsibility: ingestion, retrieval, matching, calculation, session orchestration, UI
- Clear boundary between experimental assets and runtime-critical components
- Decision on what becomes the primary execution path for the site
Could have
- Consolidation of duplicate nutrition logic across scripts/agents/viewers
- Persistence for user-approved corrections
- Unified service contract for all front ends
What I need from you now is just this, not a rehash of recipes:
- A concise end-to-end tool summary is exactly the right next input.
- Most useful format: for each implemented tool/service, give me
name → purpose → inputs → outputs → where used now → production-ready?
Best next slice for you to send is these four components only:
- the nutrition lookup path
- the EuroDATA/EuroFIR conversion + normalized storage path
- the agent/session runner path
- the current web app/runtime entrypoints
> The existing nutrition lookup “path” is encapsulated in nutrition_lookup.py (already in the repo
before the recent Flask work). Quick recap:
- Data source: everything reads from data/eurofir_mediterranean.csv (or any path you pass into
the constructor).
- Core class: NutritionLookup loads the CSV into memory (_read_rows), exposes a lookup(query:
str) method, and can reload() if the CSV changes.
- Lookup logic: it lowercases the query, scans the EuroFIR “FoodName” column for the best match
(exact > prefix > substring) via _match_score, and returns a NutritionResult dataclass.
- Result shape: NutritionResult carries the original query, the matched ingredient name, a
per_100g dict (kcal, protein, carbs, fat, fiber), and signals flags (protein_source if ≥7g
protein per 100g, fiber_source if ≥5g fiber).
- Helpers: _to_float safely parses strings to float; _match_score ranks matches. Everything’s
exported via __all__ = ["NutritionLookup", "NutritionResult"], so other modules (including the
agent sessions) can import it.
This is the same utility the Agentic AI demo used when “running sessions” that queried nutrition
facts. If you need to hook it back into the chatbot or expose it over a tool call, the plumbing
is already there: instantiate NutritionLookup() and call .lookup("lentils") to get the structured
nutrition info.
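The lookup behavior above can be sketched as a standalone re-implementation. This is for orientation only: the names mirror the recap (NutritionResult, the exact > prefix > substring ranking, the ≥7g/≥5g thresholds), but the actual nutrition_lookup.py may differ in detail.

```python
from dataclasses import dataclass, field

@dataclass
class NutritionResult:
    query: str
    matched_name: str
    per_100g: dict            # kcal, protein, carbs, fat, fiber
    signals: dict = field(default_factory=dict)

def match_score(query: str, food_name: str) -> int:
    # Rank candidates: exact > prefix > substring, as described above.
    q, name = query.lower().strip(), food_name.lower().strip()
    if q == name:
        return 3
    if name.startswith(q):
        return 2
    if q in name:
        return 1
    return 0

def build_signals(per_100g: dict) -> dict:
    # Thresholds from the recap: >=7 g protein, >=5 g fiber per 100 g.
    return {
        "protein_source": per_100g.get("protein", 0) >= 7,
        "fiber_source": per_100g.get("fiber", 0) >= 5,
    }
```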
---
> Here’s the euro data → normalized storage flow we’ve been using (before the recent Flask tweaks):
- Source of truth: data/eurofir_mediterranean.csv. This hand-curated EuroFIR-style table
(FoodName, FoodGroup, ENERC_kcal, macros, notes, etc.) is the file we edit whenever we add or
tweak an ingredient. README and data/README.md both call it out as “the maintained file.”
- Legacy/compat export: When we need the old calories format, we run
python rag_setup/convert_eurofir_to_calories.py.
This script reads the EuroFIR CSV, maps each row to the classic calories_mediterranean.csv
schema (FoodCategory, FoodItem, per100grams, Cals_per100grams, KJ_per100grams), and writes it to
data/calories_mediterranean.csv. It doesn’t change the source file; it just produces an updated
compatibility export for older loaders/tools.
- Normalized storage / vector path: For the nutrition RAG side, rag_setup/enrich_nutrition_db.py
loads the same EuroFIR CSV, formats each row into a natural-language “document” plus metadata
(kcal, macros, notes, etc.), and upserts them into a Chroma collection (default nutrition_db
under chroma_mediterranean/). That’s the normalized, queryable layer the agent uses at runtime.
So the workflow is: edit data/eurofir_mediterranean.csv → (optional) regenerate the legacy
calories CSV for backward compatibility → run enrich_nutrition_db.py to refresh the embedded/
normalized store.
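The two derivation steps in this flow can be sketched side by side. Output column names come from the schemas named above; the kcal→kJ factor, the "100g" literal, the macro column names, and the document wording are illustrative assumptions, not the scripts' exact behavior.

```python
def eurofir_to_calories_row(row: dict) -> dict:
    """Map one EuroFIR row to the classic calories_mediterranean.csv schema."""
    kcal = float(row["ENERC_kcal"])
    return {
        "FoodCategory": row["FoodGroup"],
        "FoodItem": row["FoodName"],
        "per100grams": "100g",                       # assumed constant literal
        "Cals_per100grams": f"{kcal:.0f} cal",
        "KJ_per100grams": f"{kcal * 4.184:.0f} kJ",  # assumed conversion factor
    }

def row_to_document(row: dict) -> tuple[str, dict]:
    """Format one EuroFIR row as a natural-language doc plus metadata,
    in the spirit of enrich_nutrition_db.py's Chroma upsert."""
    doc = (
        f"{row['FoodName']} ({row['FoodGroup']}): "
        f"{row['ENERC_kcal']} kcal per 100 g. Notes: {row.get('notes', '')}"
    )
    metadata = {"food_group": row["FoodGroup"], "kcal": float(row["ENERC_kcal"])}
    return doc, metadata
```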
---
> Agent/Session Runner Summary
- Core module: multi_agent_chatbot/agentic_chatbot.py (Chainlit-based UI). On chat start it
creates a persistent SQLiteSession and initializes the working state (recipe, ingredients,
overrides) via ensure_state. Each incoming message passes through _extract_command_and_message so
commands like set recipe=id or add ingredient=... mutate the working state before the agent runs.
- State & Overrides: conversation_state.py defines the working-state schema and helper functions
(apply_command, ensure_state, parse_command). This is where we track working_recipe_id,
working_ingredients, ingredient_overrides, and servings_override. Agents read from this state to
build context.
- Tools integrated:
• nutrition_agent.py (OpenAI Agents + MCP) handles nutrition lookups, RAG queries, EXA search
(optional).
• effective_recipe.py computes the reconciled recipe (base recipe + overrides) so commands like
“swap zucchini for broccoli” or “double servings” flow into the agent response.
• recipe_catalog.py, recipe_loader.py load the recipe corpus, and load_calories.py/
nutrition_calculator.py compute macro totals per recipe.
• conversation_state.py is also where tool commands are processed (reset, add ingredient,
etc.).
- Session model: The conversation is stateful. Commands adjust state; messages either ask for
interpretation (current ingredient list) or trigger the full runner (Runner in agents.py) which
invokes the nutrition agent with context (_augment_message adds the effective recipe summary).
Outputs include nutrition summaries (total vs per serving, missing data warnings) and optional
marketing copy (the “croissant upsell”).
- Entry points:
• run_chatbot.sh / run_local.sh launch the Chainlit UI.
• The chatbot/ directory contains earlier FastAPI/CLI variants (agentic_chatbot.py,
chainlit.md, etc.), but the multi-agent runner is the same: create session, manage state, call
nutrition tools, stream responses.
So the “session runner path” is Chainlit + SQLite-backed conversation state + nutrition agent +
recipe utilities, all orchestrated in multi_agent_chatbot/agentic_chatbot.py. Let me know if you
want details on specific commands or tools.
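The command-extraction step can be sketched as follows: peel a "set recipe=..." / "add ingredient=..." style prefix off an incoming message so it can mutate working state before the agent runs. The real _extract_command_and_message may accept different verbs and syntax; this regex is an assumption.

```python
import re

# Verbs and the key=value shape are assumptions based on the commands
# mentioned above ("set recipe=id", "add ingredient=...").
COMMAND_RE = re.compile(r"^(set|add|remove)\s+(\w+)\s*=\s*(.+)$", re.IGNORECASE)

def extract_command(message: str):
    """Return (command_dict, remaining_message); plain chat passes through."""
    m = COMMAND_RE.match(message.strip())
    if not m:
        return None, message
    verb, field, value = m.groups()
    return {"verb": verb.lower(), "field": field.lower(), "value": value.strip()}, ""
```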
---
> Current runtime entry points:
- Flask web app (debug UI): tools/recipe_viewer/app.py (run via python tools/recipe_viewer/
app.py). This is the local viewer you’ve been using for ingredient lines, overrides, and the
mailto link. It loads rag_setup/recipes_normalized.jsonl and serves at http://localhost:5000/.
- Multi‑agent chatbot (Chainlit UI): multi_agent_chatbot/agentic_chatbot.py. Launch via
chainlit run multi_agent_chatbot/agentic_chatbot.py or the provided scripts (run_chatbot.sh,
run_local.sh). This handles session state, nutrition lookups, and recipe overrides in the
conversational agent.
- Other CLI/utility entry points: Scripts like rag_setup/import_recipes.py, rag_setup/
sync_recipes.py, rag_setup/convert_eurofir_to_calories.py, multi_agent_chatbot/load_calories.py,
etc., all have standard if __name__ == "__main__": main() blocks so you can run them directly
with python ....
Let me know if you need a more exhaustive list or help wiring any of them into a launcher.
Good — this is enough to stop guessing.
You already have the pieces needed to resume the smart layer. What was missing was a clear runtime contract:
- curated nutrient truth in data/eurofir_mediterranean.csv
- deterministic lookup in nutrition_lookup.py
- semantic retrieval in Chroma as support
- stateful agent orchestration in multi_agent_chatbot/agentic_chatbot.py
I’ve turned that into the next spec slice in canvas: SPEC-1-Nutrition-Intelligence-Runtime.
The key architectural call in this draft is:
- deterministic CSV-backed lookup is the primary nutrient truth
- Chroma is fallback/candidate generation, not truth
- both chatbot and web/runtime paths should call one shared resolve_nutrition() service
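A minimal sketch of that shared contract, assuming the function name from the draft; the result fields, the 0.5 fallback confidence, and the lookup callables are proposals, since nothing like this exists in the repo yet.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ResolvedNutrition:
    query: str
    matched_name: str
    per_100g: dict
    source: str        # "csv" = canonical truth, "chroma" = fallback candidate
    confidence: float  # 1.0 for deterministic hits, lower for fallback

def resolve_nutrition(
    query: str,
    csv_lookup: Callable[[str], Optional[tuple]],
    chroma_lookup: Callable[[str], Optional[tuple]],
) -> Optional[ResolvedNutrition]:
    # Deterministic CSV-backed lookup is consulted first and always wins.
    hit = csv_lookup(query)
    if hit is not None:
        name, per_100g = hit
        return ResolvedNutrition(query, name, per_100g, "csv", 1.0)
    # Chroma only generates candidates when deterministic matching misses.
    candidate = chroma_lookup(query)
    if candidate is not None:
        name, per_100g = candidate
        return ResolvedNutrition(query, name, per_100g, "chroma", 0.5)
    return None
```

Both the Chainlit agent and any web runtime would call this one function, so the "Chroma never overrides CSV" rule lives in exactly one place.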
The biggest gap I see now is not ingestion or UI. It is this:
- your current matcher is still basically exact > prefix > substring; that is fine for demos, but weak for a site runtime
- the next durable improvement is an alias/canonical mapping table plus confidence metadata
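One possible shape for that alias table, sketched against SQLite; the schema, column names, and helper functions are proposals, not existing code.

```python
import sqlite3

def init_alias_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the proposed alias table for approved food-name corrections."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS food_aliases (
               alias TEXT PRIMARY KEY,       -- normalized user/recipe spelling
               canonical_name TEXT NOT NULL, -- FoodName in the EuroFIR CSV
               confidence REAL NOT NULL DEFAULT 1.0,
               approved_by TEXT              -- who signed off on the mapping
           )"""
    )
    return conn

def resolve_alias(conn: sqlite3.Connection, alias: str):
    """Return (canonical_name, confidence) for an approved alias, else None."""
    return conn.execute(
        "SELECT canonical_name, confidence FROM food_aliases WHERE alias = ?",
        (alias.lower().strip(),),
    ).fetchone()
```

The matcher would consult this table before falling through to exact > prefix > substring, so approved corrections win deterministically.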
Two questions to confirm this method section:
- Do you agree that Chroma should never override a canonical CSV nutrient row, and should only help when deterministic matching is weak or missing?
- Are you okay making SQLite alias mappings the first persistence layer for approved food-name corrections?