Changelog¶

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Historical entries below are point-in-time snapshots. The current runtime architecture is PydanticAI-only.

[Unreleased]¶

No entries yet.

[0.1.72] - 2026-04-13¶

Fixed¶

Restored build_test_ctx compatibility helper in lerim.agents.tools, fixing tests/unit/test_tools.py import errors in CI.
Cleaned up Ruff violations in unit tests (unused imports and ambiguous loop variable names), so CI / unit-tests is green again.
Preserved queue project fallback semantics while removing an unused exception binding in API project resolution.

[0.1.71] - 2026-04-13¶

Added¶

Unified status dashboard output for lerim status and lerim status --live (same sections, live mode only refreshes).
Per-project stream visibility in status (projects[]) plus timeline activity (recent_activity[]) from sync + maintain runs.
New lerim unscoped command to inspect indexed sessions without project mapping.
Queue filtering split into exact project match (--project) and substring match (--project-like).

Changed¶

Default read scope for ask, status, and memory list is now all registered projects, with explicit --scope project --project ... for narrowing.
Status output now includes stronger action hints with full commands (lerim retry ..., lerim skip ..., lerim queue --project ...).
Canonical run telemetry now stored in service_runs.details_json with normalized metrics (metrics_version=1, sync/maintain totals, per-project metrics, events), while keeping compatibility fields for older consumers.

Fixed¶

Live status activity now surfaces currently running sync work so the activity panel no longer appears frozen during long cycles.
Fixed maintain runtime failure caused by undefined index_path.

[0.2.0] - 2026-03-25¶

Historical note: this release snapshot was later superseded by the PydanticAI-only runtime noted above.

Changed¶

Migrated to a prior ReAct stack -- the lead agent moved off the original Pydantic flow onto role-specific ReAct modules at that time.
Removed explorer subagent -- search, read, and writes moved onto lead-agent tool functions (read_file, list_files, scan_memory_manifest, write_memory) instead of a nested explorer.
Removed max_explorers config option (no longer applicable).
Removed [roles.explorer] config section.
Runtime module reorganized: agent.py replaced by runtime.py, tools.py/subagents.py replaced by tools.py, providers.py, context.py, helpers.py.

Removed¶

PydanticAI dependency and all PydanticAI-specific code.

[0.1.65] - 2026-03-14¶

Added¶

Evaluation framework — dataset pipeline to build benchmarks from real traces, LLM-as-judge scoring with Claude CLI / Codex / OpenCode, per-model eval configs (MiniMax, Ollama Qwen 3.5 4B/9B/35B), and benchmark script for model comparison.
Trace compaction — Claude and Codex adapters strip noise lines (progress updates, file snapshots, context dumps) before extraction. Claude ~80% reduction, Codex ~65%. Cached in ~/.lerim/cache/{claude,codex}/.
Parallel window processing — extraction and summarization pipelines process transcript windows in parallel via ThreadPoolExecutor. Controlled by max_workers (default: 4).
JSONL-boundary windowing — transcript windows split on line boundaries instead of mid-JSON.
max_explorers config — controls parallel explorer subagents per lead turn (default: 4).
max_workers config — controls parallel window processing threads (default: 4).
thinking config — controls model reasoning mode on all four roles (default: true).
Fallback model support in extraction/summarization pipelines (e.g. MiniMax -> Z.AI on rate limits).
455 unit tests, all passing.

Changed¶

Pipeline optimization — switched from ChainOfThought to Predict for fewer failures and lower latency. XMLAdapter hardcoded (94% success rate).
Simplified extraction signatures — removed metadata/metrics from LLM inputs, slimmed output schemas.
ID-based session skip — run ID membership check instead of SHA-256 content hashing.
Extraction pipelines use Refine(N=2) for retry on validation failure.
Fixed conftest skip scoping, _toml_value list serialization, integration test provider config.
Reduced xdist workers from auto(16) to 4 for API rate limit safety.

[0.1.60] - 2026-03-05¶

Added¶

Ollama lifecycle management — automatic model load/unload before and after each sync/maintain cycle. Controlled by auto_unload in [providers] (default: true).
vllm-mlx provider — Apple Silicon local model support via provider = "mlx".
Proxy bridge integration — routes Ollama think-off requests through a proxy bridge for PydanticAI compatibility.
Docker networking for host Ollama access (host.docker.internal).
Eval runners for sync and maintain pipelines with judge prompts and trace files.
Eval configs organized under evals/configs/.

[0.1.55] - 2026-03-02¶

Added¶

lerim skill install command — installs Lerim skill files (SKILL.md, cli-reference.md) directly into coding agent directories. No npx, no git clone needed — skill files are bundled with the pip package.
Bundled skill files in src/lerim/skills/ included as package data.
Updated skill CLI reference with missing daemon --max-sessions and dashboard --port flags.

[0.1.54] - 2026-03-02¶

Added¶

MiniMax provider support — MiniMax Coding Plan (https://api.minimax.io/v1) now available as a provider. MiniMax-M2.5 is the new default model for all roles.
Z.AI Coding Plan endpoint — Z.AI provider now uses the Coding Plan API endpoint for subscription-based pricing.
Fallback model chains — all roles default to MiniMax (primary) with Z.AI fallback.

Changed¶

Default provider switched from OpenRouter to MiniMax across all four roles.
Default fallback models switched to Z.AI (glm-4.7 for lead/explorer, glm-4.5-air for extract/summarize).
Documentation restructured with Material for MkDocs components.

[0.1.53] - 2026-03-01¶

Fixed¶

Daemon loop: maintain never triggered on startup in Docker containers where time.monotonic() reflected VM uptime smaller than the maintain interval.
Daemon loop: sync/maintain cycles produced zero log output, making lerim logs appear idle.
Session queue: NULL repo_path jobs clogged the claim queue. Added filter and guard.
DB migration: orphaned NULL repo_path pending/failed jobs purged on schema init.
Explorer subagent: switched from structured output to plain str to avoid validation failures.
Explorer failures no longer crash the lead agent; returns empty evidence with warning.
Maintain action path validation: handle list-valued paths from LLM output.
run_maintain_once accepts a trigger parameter instead of hardcoding "manual".

[0.1.52] - 2026-03-01¶

Changed¶

Chronological memory processing — sync and maintain process memories in strict oldest-first order.
Adapters sort sessions by start_time before returning.
Job queue uses start_time ASC (was DESC).
Jobs run sequentially instead of parallel to preserve ordering.
Removed sync_max_workers setting (no longer applicable).

[0.1.51] - 2026-03-01¶

Fixed¶

Stream Docker pull/build output to terminal so users see real-time progress during lerim up.

[0.1.5] - 2026-03-01¶

Added¶

Per-run LLM cost tracking via OpenRouter's usage.cost response field. Model calls were captured from both HTTP hooks and provider histories. Cost logged in activity.log and returned in API responses.
Activity log format: timestamp | op | project | stats | $cost | duration.
Multi-project maintain (iterates all registered projects).

Changed¶

Replaced raw markdown write tool with structured write_memory tool. LLM passes fields; Python builds markdown. Eliminates frontmatter format errors.
write_file_tool raises ModelRetry for memory paths, directing LLM to write_memory.
Sync prompt restructured with explicit batching instructions.
_process_claimed_jobs runs sequentially (was parallel) for chronological consistency.

Fixed¶

lerim down checks if container is running before attempting stop.
Docker restart policy changed from unless-stopped to "no" — no silent auto-restart.

Infrastructure¶

Added pytest-xdist for parallel test execution (~2x speedup for e2e).

[0.1.4] - 2026-02-28¶

Fixed¶

Multi-platform Docker build (amd64 + arm64).

[0.1.3] - 2026-02-28¶

Fixed¶

Add OCI source label to Dockerfile for GHCR repo linking.
Install ripgrep in CI for memory search tests.

[0.1.2] - 2026-02-28¶

Added¶

GHCR Docker publishing — container images published to GitHub Container Registry.
Per-project memory isolation — each registered project gets its own .lerim/ directory.

[0.1.1] - 2026-02-28¶

Changed¶

Renamed lerim chat to lerim ask across CLI, API, dashboard, tests, and docs.
lerim memory add sets source: cli in frontmatter and uses canonical filename format.
grep_files_tool uses ripgrep (rg) subprocess instead of Python regex.
Simplified tools.py by removing 11 over-engineered helper functions (~150 lines removed).
run_daemon_once accepts max_sessions parameter for bounded processing.

Fixed¶

Config _parse_string_table handles dict values from TOML tables.
Provider settings deep merge no longer re-adds removed keys.
Daemon crash on malformed session traces.
Agent tool call failures from overly complex tool signatures.
Ask missing memories due to search scope issues.
Memory search returning stale results.
Sync pipeline errors on edge-case transcripts.

Removed¶

lerim memory export command and handler.
search_memory, read_memory_frontmatter, list_memory_files from api.py.

Infrastructure¶

Dockerfile updated with ripgrep package.

[0.1.0] - 2026-02-25¶

Added¶

Continual learning layer for coding agents and projects.
Platform adapters for Claude Code, Codex CLI, Cursor, and OpenCode.
Memory extraction pipeline with transcript windowing.
Trace summarization pipeline with transcript windowing.
PydanticAI lead agent with read-only explorer subagent.
Three CLI flows: sync, maintain, and ask.
Daemon mode for continuous sync and maintain loop.
Local web dashboard.
Session catalog with SQLite FTS5.
Job queue with stale job reclamation.
TOML-layered configuration.
OpenTelemetry tracing via Logfire.
Multi-provider LLM support: MiniMax, Z.AI, OpenRouter, Ollama, OpenAI, vllm-mlx.
File-first memory model using markdown files with YAML frontmatter.
Memory primitives: decisions, learnings, and summaries.
Comprehensive test suite across unit, smoke, integration, and e2e layers.