Ingest & Curate

Lerim has two runtime paths that keep the shared context store accurate and clean:

- Trace ingestion (hot path) -- processes supported traces or custom clean traces and extracts context records
- Context curation (cold path) -- refines existing records offline

Both paths run automatically in the daemon loop and can also be triggered manually. Both use the configured agent model for structured review and synthesis.

Trace Ingestion Path

The ingestion path turns raw agent traces into structured context records:

1. Discover -- adapters scan supported session directories; custom projects scan clean `.jsonl` folders directly
2. Index -- new sessions are cataloged in `sessions.sqlite3`
3. Match to project -- sessions matching a registered project are enqueued; unmatched sessions are indexed but not extracted
4. Prepare trace -- supported traces are compacted and cached; custom traces are read directly because they are already canonical
5. Trace-to-context flow -- the ingest flow reads deterministic trace windows, observes typed findings, filters for durable signal, writes the final context payload, and stores one episode record plus zero or more durable records
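
The "Match to project" step above can be illustrated with a small sketch. How Lerim actually matches sessions is not described here, so this assumes path-prefix matching on a hypothetical `cwd` field; `Session`, `match_project`, and `plan_ingest` are illustrative names, not Lerim APIs.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Session:
    run_id: str
    cwd: str  # hypothetical field: working directory recorded in the trace


def match_project(session, projects):
    """Return the registered project whose root contains the session, or None.

    `projects` maps project name -> root path. Path-prefix matching is an
    assumption made for this sketch.
    """
    cwd = PurePosixPath(session.cwd)
    best, best_depth = None, -1
    for name, root in projects.items():
        root_path = PurePosixPath(root)
        if cwd == root_path or root_path in cwd.parents:
            # Prefer the deepest (most specific) matching root.
            if len(root_path.parts) > best_depth:
                best, best_depth = name, len(root_path.parts)
    return best


def plan_ingest(sessions, projects):
    """Split sessions into (enqueued for extraction, indexed only)."""
    enqueued, indexed_only = [], []
    for session in sessions:
        project = match_project(session, projects)
        if project is None:
            indexed_only.append(session.run_id)
        else:
            enqueued.append((project, session.run_id))
    return enqueued, indexed_only
```

In this sketch every session is still indexed; only matched sessions move on to extraction, which mirrors the "indexed but not extracted" behavior described above.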

Record quality contract

Durable records should store one reusable rule, decision, fact, preference, constraint, or reference.
Durable records should not read like meeting notes or session recap prose.
Episode records are short session recaps only. They should stay compact and should not become the main place where durable context lives.
Good durable writing is closer to "what is true, why it matters, how to apply it" than to "the user asked, then the agent did X".
A clean extraction can produce no durable records. Lerim should remember less but better.
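
One way to picture the contract is as a payload with exactly one episode recap and a possibly empty list of durable records. The sketch below is illustrative only: the class and field names (`DurableRecord`, `IngestPayload`, `statement`, `rationale`, `application`) are assumptions, not Lerim's schema. Only the six record kinds come from the text above.

```python
from dataclasses import dataclass, field

# The six durable kinds named in the contract above.
DURABLE_KINDS = {"rule", "decision", "fact", "preference", "constraint", "reference"}


@dataclass
class DurableRecord:
    kind: str
    statement: str    # what is true
    rationale: str    # why it matters
    application: str  # how to apply it

    def __post_init__(self):
        if self.kind not in DURABLE_KINDS:
            raise ValueError(f"not a durable kind: {self.kind!r}")


@dataclass
class IngestPayload:
    episode_summary: str  # short session recap only
    durable: list[DurableRecord] = field(default_factory=list)  # may stay empty
```

Under this shape, an extraction with no durable signal is simply `IngestPayload("short recap")` with an empty `durable` list.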

Time window

| Config key | Default | Description |
| --- | --- | --- |
| `ingest_window_days` | `7` | How far back to look for sessions |
| `ingest_max_sessions` | `50` | Maximum sessions per ingest cycle |

Override with CLI flags:

```shell
lerim ingest --window 14d              # last 14 days
lerim ingest --window all              # all sessions ever
lerim ingest --max-sessions 10         # limit batch size
```
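
The interaction between the window and the batch cap can be sketched as follows. This is not Lerim's implementation: `parse_window` and `select_batch` are hypothetical helpers, and only the `Nd` and `all` window forms shown above are handled.

```python
from datetime import datetime, timedelta, timezone


def parse_window(value):
    """Turn '14d' into timedelta(days=14); 'all' means no lower bound (None)."""
    if value == "all":
        return None
    if value.endswith("d") and value[:-1].isdigit():
        return timedelta(days=int(value[:-1]))
    raise ValueError(f"unsupported window in this sketch: {value!r}")


def select_batch(sessions, now, window=timedelta(days=7), max_sessions=50):
    """Pick the sessions for one ingest cycle.

    `sessions` is an iterable of (run_id, started_at) tuples. The defaults
    mirror `ingest_window_days = 7` and `ingest_max_sessions = 50`.
    """
    cutoff = None if window is None else now - window
    eligible = [s for s in sessions if cutoff is None or s[1] >= cutoff]
    eligible.sort(key=lambda s: s[1], reverse=True)  # newest first
    return eligible[:max_sessions]
```

For example, `select_batch(sessions, datetime.now(timezone.utc), window=parse_window("14d"), max_sessions=10)` corresponds to combining the `--window 14d` and `--max-sessions 10` flags.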

Processing order

Normal backlog ingest claims the newest available session per project first so a fresh install surfaces recent corrections quickly. Historical replay paths can still request oldest-first ordering from the catalog API when chronological reconstruction is required.
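
A minimal sketch of the two orderings, assuming pending sessions are available as `(project, run_id, started_at)` tuples. The `claim_next` function and its `oldest_first` flag are illustrative and do not describe the actual catalog API.

```python
def claim_next(pending, oldest_first=False):
    """Return one claimed run_id per project.

    By default the newest pending session in each project is claimed.
    Pass oldest_first=True to sketch a chronological replay instead.
    """
    pick = min if oldest_first else max
    by_project = {}
    for project, run_id, started_at in pending:
        by_project.setdefault(project, []).append((started_at, run_id))
    # Tuples compare by started_at first, so min/max select by time.
    return {project: pick(items)[1] for project, items in by_project.items()}
```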

Context Curation Path

The curation path runs offline refinement over stored context records, iterating over all registered projects:

1. Load inventory -- read a bounded set of active records in one project scope
2. Build candidates -- use semantic search to connect likely-neighbor records into small clusters
3. Review clusters -- the context curator decides whether clustered records are duplicates, replacements, complementary, or false-positive neighbors
4. Review health batches -- the context curator inspects records not already targeted by cluster actions for routine episodes, obsolete rows, or verbose session-report style
5. Apply validated actions -- apply only safe archive, revise, or supersede operations through `ContextStore`
6. Keep the store lean -- prefer stronger durable records over a noisy pile of routine episodes
7. Compress weak records -- rewrite useful but verbose records into compact reusable context instead of preserving recap style
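
The "Build candidates" and "Apply validated actions" steps can be sketched roughly as below. This assumes semantic search has already produced neighbor pairs and groups them with a simple union-find. The real clustering strategy, the action format (`op`, `record_id`), and the validation rules are not specified here, so treat all names as hypothetical.

```python
ALLOWED_OPS = {"archive", "revise", "supersede"}  # the three operations named above


def build_clusters(record_ids, neighbor_pairs):
    """Group records linked by neighbor pairs into clusters (union-find)."""
    parent = {r: r for r in record_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in neighbor_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = {}
    for r in record_ids:
        groups.setdefault(find(r), []).append(r)
    # Singletons are not clusters; they would go to health review instead.
    return [sorted(g) for g in groups.values() if len(g) > 1]


def validate_actions(actions, active_ids):
    """Keep only actions with an allowed op that target an active record."""
    return [
        a for a in actions
        if a.get("op") in ALLOWED_OPS and a.get("record_id") in active_ids
    ]
```

Filtering proposed actions before they reach the store is one way to keep a model's mistakes, such as an unknown operation or a stale record id, from being applied.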

Agent Budgets

Context curation treats its configured budget as a cap on model calls. Context answering treats its configured budget as a cap on retrieval actions, applied after the answer planner returns a plan:

| Flow | Config key | Purpose |
| --- | --- | --- |
| Context curation | `curate_max_llm_calls` | Caps curation model calls per run |
| Context answering | `answer_max_retrieval_actions` | Caps planned retrieval actions per query |
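
Both budgets behave like a counter that is checked before each unit of work. The sketch below shows the answering case. `Budget` and `run_plan` are hypothetical names, and stopping quietly once the cap is reached is an assumption, since the behavior on exhaustion is not described here.

```python
class Budget:
    """A simple spend-once counter with a hard cap."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def try_spend(self):
        """Consume one unit if any remain; return False once the cap is hit."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def run_plan(plan, retrieve, max_actions):
    """Execute planned retrieval actions in order until the budget runs out.

    `max_actions` plays the role of `answer_max_retrieval_actions`.
    """
    budget = Budget(max_actions)
    results = []
    for action in plan:
        if not budget.try_spend():
            break  # assumption: remaining actions are skipped, not errors
        results.append(retrieve(action))
    return results
```

The same `Budget` object could wrap model calls during curation, with `curate_max_llm_calls` as the limit.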

Automatic scheduling

The daemon runs ingest, curate, and Context Brief on independent schedules:

| Path | Config key | Default (see `default.toml`) |
| --- | --- | --- |
| Ingest | `ingest_interval_minutes` | `30` |
| Curate | `curate_interval_minutes` | `60` |
| Context Brief | none (built-in daily pass) | `24h` |

Ingest and curate trigger immediately on daemon startup, then repeat at their configured intervals. Context Brief also runs from the daemon loop, but it skips projects with no records changed since the current artifact was generated.

Curate also triggers Context Brief for a project when it changed records for that project. Ingest does not directly trigger Context Brief.
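
The scheduling rules above can be sketched as two small checks. Times are plain minute counters to keep the example self-contained. `due_jobs` and `brief_needed` are illustrative names, and the handling of a project with no artifact yet is an assumption.

```python
def due_jobs(now_min, last_run, intervals):
    """Return the jobs that should run at `now_min`.

    A job with no recorded run is due immediately, which models the
    run-on-startup behavior. Afterwards it is due once per interval.
    """
    return [
        job for job, interval in intervals.items()
        if last_run.get(job) is None or now_min - last_run[job] >= interval
    ]


def brief_needed(last_record_change, brief_generated_at):
    """Decide whether a project's Context Brief should be regenerated."""
    if last_record_change is None:
        return False  # assumption: nothing to summarize yet
    if brief_generated_at is None:
        return True   # assumption: no artifact exists, so build one
    return last_record_change > brief_generated_at
```

With `intervals = {"ingest": 30, "curate": 60}`, both jobs are due at startup, ingest is due again after 30 minutes, and both are due after 60.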

Local model memory management

When using Ollama, Lerim automatically loads the model into RAM before each cycle and unloads it afterward (`auto_unload = true` in `[providers]`), so the model only occupies memory during active processing.
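
The load/unload pattern can be sketched as a context manager. This is not Lerim's code. It relies on documented Ollama behavior: a request to `/api/generate` with only a model name loads that model, and `keep_alive: 0` unloads it. The `loaded_model` helper, the injectable `post` parameter, and the default local URL are assumptions made for the example.

```python
import json
from contextlib import contextmanager
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def http_post(payload):
    """Send one JSON POST to the local Ollama server and discard the reply."""
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        resp.read()


@contextmanager
def loaded_model(model, post=http_post, auto_unload=True):
    """Keep `model` in memory only for the duration of the with-block."""
    post({"model": model})  # a prompt-less request loads the model
    try:
        yield
    finally:
        if auto_unload:
            post({"model": model, "keep_alive": 0})  # unload immediately
```

A cycle would then run inside `with loaded_model("your-model-name"):`, and the unload request is sent even if the cycle raises.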

Manual triggers

```shell
lerim ingest                       # ingest with default settings
lerim ingest --run-id <id>         # ingest a specific session
lerim ingest --dry-run             # preview without writing
lerim curate                       # run curate cycle
lerim curate --dry-run             # preview without writing
lerim context-brief status         # check generated startup context
lerim context-brief refresh        # refresh only if records changed
```

How It Works -- Architecture overview and deployment model.
Context Model -- Types, layout, and lifecycle.
Configuration -- Full TOML config reference including daemon intervals.