Market Comparison¶

This is the public page for Lerim vs other agent-memory and context systems. It is intentionally market-wide. No single competitor is the organizing frame for Lerim's benchmark story.

Rows get numbers only when the number is tied to a raw artifact, an official benchmark page, a paper, or a clearly cited public report. Rows are not win/loss claims unless the benchmark boundary matches.

Where To Look¶

Need	Location
Human-readable market table	This page
Lerim-only results and commands	Lerim Results
Benchmark hub	Benchmark Overview
Raw Lerim benchmark artifacts	`benchmarks/results/raw/` in this repo
Audit/provenance reports	`benchmarks/results/reports/` in this repo
Third-party source manifest	`benchmarks/results/market-sources.json`
Imported competitor artifacts	`benchmarks/results/raw/*-baseline/` when available

The current repo includes one normalized imported competitor artifact. Other competitor rows are linked to public docs/posts until they are normalized into raw artifacts too.

Imported market baselines are pinned upstream source rows. Their local wrapper records when Lerim normalized the source material; that wrapper is not a fresh Lerim benchmark run and is exempt from first-party clean-run publication gates.

Current Market Snapshot¶

Benchmark Numbers¶

System	Product type	Benchmark number tracked today	Benchmark boundary	Source / provenance	Comparable to Lerim LongMemEval-S retrieval?
Lerim	Source-session context compiler	Hybrid R@5 96.2%, R@10 98.6%, NDCG@10 88.4%, MRR 88.1%; lexical R@5 77.0%, R@10 82.0%, NDCG@10 62.7%, MRR 64.0%	LongMemEval-S retrieval-only over 500 questions	First-party raw artifacts: `benchmarks/results/raw/longmemeval-hybrid-full/report.json` and `benchmarks/results/raw/longmemeval-lexical-full/report.json`; clean `v0.3.0` release worktree	Yes, first-party baseline
AgentMemory	Local memory engine plus MCP server	Hybrid R@5 95.2%, R@10 98.6%, NDCG@10 87.9%, MRR 88.2%; BM25 R@5 86.2%, R@10 94.6%, NDCG@10 73.0%, MRR 71.5%	LongMemEval-S retrieval-only over 500 questions	Pinned upstream raw artifact normalized in this repo at commit `68fddd418e1bbcc41d32a1c61b7a78d91eb7c4dc`; pinned public docs at https://github.com/rohitg00/agentmemory/blob/68fddd418e1bbcc41d32a1c61b7a78d91eb7c4dc/benchmark/LONGMEMEVAL.md, accessed 2026-05-19	Pinned upstream artifact, not local rerun
MemPalace	Memory system	Pinned public docs report raw ChromaDB full-500 R@5 96.6%; later held-out-450 hybrid_v4 no-rerank R@5 98.4%, R@10 99.8%; neither row is normalized locally	LongMemEval retrieval recall, but raw artifacts and method are not normalized here	MemPalace benchmark docs: https://github.com/MemPalace/mempalace/blob/1b94f4efb4949765d6965936476c236df13fd108/benchmarks/BENCHMARKS.md, develop commit checked 2026-05-20; not normalized in this repo yet	Not yet
Mem0	Memory API / cloud platform	Official Mem0 docs report LongMemEval overall 93.4 and LoCoMo overall 91.6	Official answer/judge metrics, not Lerim's retrieval-only boundary	Mem0 official evaluation docs: https://docs.mem0.ai/core-concepts/memory-evaluation, accessed 2026-05-19; not pinned or normalized locally yet	No
Letta	Agent runtime	Official Letta post reports a filesystem LoCoMo result of 74.0	LoCoMo filesystem-agent benchmark, not LongMemEval-S retrieval-only	Letta benchmark post dated 2025-08-12: https://www.letta.com/blog/benchmarking-ai-agent-memory, accessed 2026-05-19	No
Zep / Graphiti	Temporal knowledge graph memory	No number tracked in this repo yet	Pending	Not available in this repo yet	No
Supermemory	Memory infrastructure	No number tracked in this repo yet	Pending	Not available in this repo yet	No
Khoj / claude-mem / Hippo / other systems	Mixed memory systems	No number tracked in this repo yet	Pending	Not available in this repo yet	No

Third-Party Feature Snapshot¶

These are feature and retrieval claims from one public market table that covers several systems, not normalized benchmark artifacts in this repo. They are useful for market awareness, but they are not independent measurements, fresh local reruns, or Lerim raw report.json artifacts. The market-row source is https://www.agent-memory.dev/, accessed 2026-05-19.

System	Retrieval R@5	External deps	REST endpoints	MCP tools	Auto-hooks	Native plugins	Open source	Source / provenance
AgentMemory	95.2%	0	121	51	12	6	Yes, Apache-2.0	Competitor-maintained market-row source: https://www.agent-memory.dev/, accessed 2026-05-19
Mem0	81.4%	2 (Qdrant, Neo4j)	Not reported	12	0	Not reported	Yes	Competitor-maintained market-row source: https://www.agent-memory.dev/, accessed 2026-05-19
Letta	73.8%	1 (Postgres)	Not reported	18	0	Not reported	Yes	Competitor-maintained market-row source: https://www.agent-memory.dev/, accessed 2026-05-19
Cognee	78.1%	1 (Neo4j)	Not reported	9	0	Not reported	Yes	Competitor-maintained market-row source: https://www.agent-memory.dev/, accessed 2026-05-19

LongMemEval-S Retrieval¶

Retrieval-only rows. Do not treat these as answer-generation or extraction scores.

System	Mode	Questions	R@5	R@10	R@20	NDCG@10	MRR	Evidence
Lerim	Hybrid	500	96.2%	98.6%	99.6%	88.4%	88.1%	`benchmarks/results/raw/longmemeval-hybrid-full/report.json`
Lerim	Lexical	500	77.0%	82.0%	89.8%	62.7%	64.0%	`benchmarks/results/raw/longmemeval-lexical-full/report.json`
AgentMemory	BM25+Vector	500	95.2%	98.6%	99.4%	87.9%	88.2%	`benchmarks/results/raw/imported-market-baselines/report.json`
AgentMemory	BM25-only	500	86.2%	94.6%	98.6%	73.0%	71.5%	`benchmarks/results/raw/imported-market-baselines/report.json`
MemPalace	Raw ChromaDB full set	500	96.6%	Not available	Not available	Not available	Not available	Public docs only; not normalized locally
MemPalace	hybrid_v4 no-rerank held-out set	450	98.4%	99.8%	Not available	Not available	Not available	Public docs only; not normalized locally

These rows are retrieval-only. They are not official LongMemEval QA scores, do not call an LLM judge, and do not score generated answers. Imported competitor rows are pinned upstream raw artifacts normalized locally; they are not fresh local reruns. The imported LongMemEval-S rows currently available in this repo come from commit 68fddd418e1bbcc41d32a1c61b7a78d91eb7c4dc.

Extraction Comparison Status¶

Lerim's trace-to-context extraction eval is a private first-party benchmark. No competitor row below has been run on that private extraction eval yet.

System	Status on Lerim's private extraction eval
Lerim	Internal diagnostic aggregate exists; see Lerim Results for the current extraction-quality and false-positive numbers.
AgentMemory	Not available yet; not run on this private eval.
Cognee	Not available yet; not run on this private eval.
Letta	Not available yet; not run on this private eval.
Mem0	Not available yet; not run on this private eval.

Do not substitute LongMemEval-S retrieval numbers, LoCoMo answer scores, feature tables, or public marketing rows for extraction-quality scores. A fair extraction comparison requires a runner that feeds the same traces to each system, collects the system's saved memories/context records, and scores those records with the same labels and judge.

Not-Yet-Comparable Rows¶

Mem0 and Letta have useful public benchmark numbers, but they do not share the same boundary as Lerim's current retrieval-only artifacts:

Mem0 reports managed-platform answer/judge results for LoCoMo, LongMemEval, and BEAM. Its docs also describe memory extraction from submitted conversation payloads, so do not describe Mem0 as retrieval-only infrastructure.
Letta reports a LoCoMo filesystem-agent result. That is an agent/tool-use benchmark, not a LongMemEval-S retrieval-only benchmark.
Cognee has a cited third-party market row at https://www.agent-memory.dev/, but its raw benchmark artifact is not normalized locally yet.
MemPalace is close enough to track for LongMemEval-S retrieval, but the row is a pinned public source citation and is not normalized locally yet.
Zep/Graphiti, Supermemory, Khoj, claude-mem, Hippo, and other systems are listed as watchlist rows until a cited public number or local run is added.

Rows marked as not normalized locally are source citations only. Competitor-maintained market-row metrics are labeled as such; they are included for market awareness, not as pinned reproducibility artifacts.

Next Normalization Work¶

Keep Lerim public artifacts regenerated from clean release worktrees before each launch.
Normalize MemPalace if raw artifacts are available.
Add reproducible importers or fresh local runs for Mem0, Letta, Zep/Graphiti, Supermemory, Khoj, claude-mem, Hippo, and any other serious memory system.
Keep extraction-quality numbers separate from retrieval-only numbers.
Publish no market-ranking claim until rows share the same benchmark boundary.

Sources¶

Lerim raw artifacts: benchmarks/results/raw/
Normalized imported LongMemEval-S baseline currently available: benchmarks/results/raw/imported-market-baselines/, upstream commit 68fddd418e1bbcc41d32a1c61b7a78d91eb7c4dc.
Public benchmark docs for that imported baseline: https://github.com/rohitg00/agentmemory/blob/68fddd418e1bbcc41d32a1c61b7a78d91eb7c4dc/benchmark/LONGMEMEVAL.md, accessed 2026-05-19.
Public market-row source for source-reported feature metrics: https://www.agent-memory.dev/, accessed 2026-05-19.
MemPalace benchmark docs: https://github.com/MemPalace/mempalace/blob/1b94f4efb4949765d6965936476c236df13fd108/benchmarks/BENCHMARKS.md, develop commit checked 2026-05-20; not normalized locally yet.
Mem0 official evaluation docs: https://docs.mem0.ai/core-concepts/memory-evaluation, accessed 2026-05-19; not pinned or normalized locally yet.
Letta benchmark post: https://www.letta.com/blog/benchmarking-ai-agent-memory, dated 2025-08-12 and accessed 2026-05-19; not pinned or normalized locally yet.