The clean source is:

an array of current ingredient lines, with raw lines preserved separately

Recommended runtime shape

  • ingredient_lines_raw: original preserved lines
  • ingredient_lines_current: current editable lines after substitutions
  • optional later: ingredient_lines_normalized or parsed structured ingredient objects

For nutrition querying, use:

  • original recipe nutrition request → ingredient_lines_raw
  • post-substitution nutrition request → ingredient_lines_current

That fits your contract cleanly because the Flask layer remains an editor/viewer, while the Nutrition Intelligence Layer consumes a stable ingredient array and resolves foods against data/eurofir_mediterranean.csv through deterministic lookup.

Why the array is the right source

An array gives you one ingredient per unit of work:

  • easier parsing
  • easier substitution tracking
  • easier herb/non-herb filtering
  • easier per-line nutrition resolution
  • easier comparison between original and modified recipe states

This is much cleaner than re-splitting paragraphs or scraping rendered text back out of the UI.
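
To illustrate the comparison point, here is a minimal sketch of a per-line diff between the two arrays. The helper name diff_ingredient_lines is hypothetical, and it assumes substitutions keep the lines position-aligned (line i in the current array replaces line i in the raw array), which the design above does not strictly guarantee.

```python
def diff_ingredient_lines(raw: list[str], current: list[str]) -> list[tuple[int, str, str]]:
    """Return (index, raw_line, current_line) for every position where the arrays differ.

    Assumes position-aligned arrays; a missing line on either side is
    reported as an empty string.
    """
    changes = []
    for i in range(max(len(raw), len(current))):
        before = raw[i] if i < len(raw) else ""
        after = current[i] if i < len(current) else ""
        if before != after:
            changes.append((i, before, after))
    return changes


raw_lines = ["1 onion, chopped", "2 zucchini, sliced", "1 tsp cumin"]
current_lines = ["1 onion, chopped", "2 broccoli crowns, cut into florets", "1 tsp cumin"]
changes = diff_ingredient_lines(raw_lines, current_lines)
```

With arrays, this kind of comparison is a simple loop; with paragraphs or rendered UI text it would first need a fragile re-splitting step.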

Clean boundary

Keep the responsibilities like this:

Experience / Runtime Layer

  • stores and edits the ingredient lines
  • decides whether the user is asking about original or modified recipe state
  • passes the selected array downward

Nutrition Intelligence Layer

  • receives the selected array and processes it one ingredient line at a time
  • extracts food candidate text
  • filters out herbs/aromatics if needed
  • resolves against deterministic nutrition lookup
  • returns structured nutrient results

That preserves the contract boundary instead of letting Flask become a nutrition parser.
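
As a rough sketch of what the Nutrition Intelligence Layer side could look like: the column names (food_name, energy_kcal), the unit list, the herb set, and every function name below are assumptions for illustration. The real layout of data/eurofir_mediterranean.csv is not shown here, and the numbers in the sample table are placeholders, not nutrition data.

```python
import csv
import io
import re

# Stand-in for data/eurofir_mediterranean.csv. Column names are assumed and
# the values are placeholders, not real nutrition figures.
SAMPLE_CSV = "food_name,energy_kcal\nonion,111\nzucchini,222\n"

UNITS = {"tsp", "tbsp", "cup", "cups", "g", "kg", "ml"}  # hypothetical unit list
HERBS_AND_AROMATICS = {"cumin", "basil", "oregano"}      # hypothetical filter set


def load_table(handle) -> dict[str, dict[str, str]]:
    """Index CSV rows by normalized food name for exact-match lookup."""
    return {row["food_name"].strip().lower(): row for row in csv.DictReader(handle)}


def extract_food_candidate(line: str) -> str:
    """Drop preparation notes after the first comma, then leading quantity and unit."""
    text = line.split(",", 1)[0].strip().lower()
    text = re.sub(r"^[\d\s/.]+", "", text)
    words = text.split()
    if words and words[0] in UNITS:
        words = words[1:]
    return " ".join(words)


def resolve_line(line: str, table: dict[str, dict[str, str]]):
    """Return None for filtered herbs/aromatics, else a structured result.

    The lookup is a plain dictionary hit: same input, same output, with no
    retrieval or fuzzy matching. An unmatched food gets match=None.
    """
    food = extract_food_candidate(line)
    if food in HERBS_AND_AROMATICS:
        return None
    return {"line": line, "food": food, "match": table.get(food)}


table = load_table(io.StringIO(SAMPLE_CSV))
results = [resolve_line(line, table) for line in ["1 onion, chopped", "1 tsp cumin"]]
```

Note that an exact-match lookup leaves a line like "2 broccoli crowns, cut into florets" unresolved unless the table has that exact name; how to normalize such candidates is a separate decision for the nutrition layer, not for Flask.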

Practical data model

Something like this is enough for now:

recipe_record = {
    "ingredient_lines_raw": [
        "1 onion, chopped",
        "2 zucchini, sliced",
        "1 tsp cumin",
    ],
    "ingredient_lines_current": [
        "1 onion, chopped",
        "2 broccoli crowns, cut into florets",
        "1 tsp cumin",
    ]
}

Then the nutrition path operates on whichever array is active.
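
One possible way to keep the raw array untouched while recording a substitution is sketched below. The helper apply_substitution is hypothetical, and it assumes a substitution replaces a single line by index.

```python
import copy


def apply_substitution(recipe: dict, index: int, new_line: str) -> dict:
    """Return a copy of the record with one current line replaced.

    ingredient_lines_raw is never modified. If no current array exists yet,
    it is seeded from the raw lines first.
    """
    updated = copy.deepcopy(recipe)
    if not updated.get("ingredient_lines_current"):
        updated["ingredient_lines_current"] = list(updated["ingredient_lines_raw"])
    updated["ingredient_lines_current"][index] = new_line
    return updated


original = {
    "ingredient_lines_raw": ["1 onion, chopped", "2 zucchini, sliced", "1 tsp cumin"],
}
modified = apply_substitution(original, 1, "2 broccoli crowns, cut into florets")
```

Returning a copy rather than mutating in place is a design choice for the sketch; an in-place update would work equally well as long as the raw array is left alone.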

Suggested rule

Use this simple rule:

lines_for_nutrition = ingredient_lines_current or ingredient_lines_raw

Better still, make it explicit:

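def get_ingredient_lines(recipe: dict, mode: str = "current") -> list[str]:
    """Return the ingredient lines to use for a nutrition request."""
    # Use the edited lines when they are requested and non-empty.
    if mode == "current" and recipe.get("ingredient_lines_current"):
        return recipe["ingredient_lines_current"]
    # Any other mode, or a missing/empty current array, falls back to raw.
    return recipe.get("ingredient_lines_raw", [])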

That avoids ambiguity later.
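
If you want to go one step further, a stricter variant could reject unknown modes instead of silently falling back to the raw lines. This is only a sketch; the name get_ingredient_lines_strict and the mode values "raw" and "current" are assumptions.

```python
VALID_MODES = ("raw", "current")  # assumed mode names


def get_ingredient_lines_strict(recipe: dict, mode: str = "current") -> list[str]:
    """Like the selector above, but a misspelled mode raises instead of returning raw lines."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    raw = recipe.get("ingredient_lines_raw", [])
    if mode == "raw":
        return list(raw)
    # Fall back to raw only when no edited lines exist yet.
    return list(recipe.get("ingredient_lines_current") or raw)


recipe = {
    "ingredient_lines_raw": ["2 zucchini, sliced"],
    "ingredient_lines_current": ["2 broccoli crowns, cut into florets"],
}
original_lines = get_ingredient_lines_strict(recipe, "raw")
edited_lines = get_ingredient_lines_strict(recipe, "current")
```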

Best next step

Do not query nutrition directly from the Flask textarea contents unless those contents are first stored as the current ingredient-line array.

So the sequence should be:

  1. user edits/substitutes ingredients in Flask
  2. Flask updates ingredient_lines_current
  3. runtime sends that array to shared nutrition resolution
  4. each non-herb ingredient is resolved against data/eurofir_mediterranean.csv via deterministic lookup, not retrieval
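
Steps 1 and 2 of that sequence might look roughly like this. Both helper names are hypothetical, the sketch assumes one ingredient per textarea line, and it deliberately leaves out the Flask route itself so that it stays independent of how your views are wired.

```python
def textarea_to_lines(text: str) -> list[str]:
    """Split submitted textarea text into trimmed, non-empty ingredient lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def update_current_lines(recipe: dict, textarea_text: str) -> list[str]:
    """Store the edited lines as ingredient_lines_current and return them.

    The returned array, not the textarea text, is what gets passed down to
    shared nutrition resolution.
    """
    recipe["ingredient_lines_current"] = textarea_to_lines(textarea_text)
    return recipe["ingredient_lines_current"]


recipe = {"ingredient_lines_raw": ["1 onion, chopped", "2 zucchini, sliced"]}
submitted = "1 onion, chopped\r\n\r\n  2 broccoli crowns, cut into florets \n"
lines_to_send = update_current_lines(recipe, submitted)
```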

Minimal design conclusion

The clean source is:

an array of current ingredient lines, with raw lines preserved separately

That gives you both:

  • original recipe integrity
  • post-substitution nutrition lookup

And it stays aligned with the existing architecture instead of sneaking parsing logic into the UI.

A good naming set would be:

  • ingredient_lines_raw
  • ingredient_lines_current
  • ingredient_lines_for_nutrition(mode="current")

That’s the neatest path through the maze.