Custom Trace Folders¶
Custom trace folders are for agents or business workflows that Lerim does not support with a built-in adapter yet.
The boundary is intentionally simple:
- Supported agents use Lerim adapters and compaction.
- Custom agents provide already-clean Lerim canonical JSONL traces.
- Lerim scans the folder as one project with type
custom. - Lerim does not compact, rewrite, normalize, or clean custom traces.
User Journey¶
- Export raw traces from your agent, ticket workflow, research workflow, or internal automation.
- Write your own cleaner that converts those raw traces into Lerim canonical JSONL.
- Put the cleaned
.jsonlfiles in one folder. - Register that folder as a custom project.
- Run Lerim ingest. Lerim indexes the clean files and extracts reusable context.
mkdir -p ~/lerim-traces/support-clean
lerim project add ~/lerim-traces/support-clean --type custom
lerim ingest --agent custom
Each .jsonl file is treated as one source session. Nested folders are fine:
Canonical JSONL Schema¶
Each non-empty line must be one JSON object with exactly these keys:
{"type":"user","message":{"role":"user","content":"Customer asked for renewal approval."},"timestamp":"2026-05-16T09:00:00Z"}
{"type":"assistant","message":{"role":"assistant","content":"Agent found approval is required above EUR 500."},"timestamp":"2026-05-16T09:02:00Z"}
Rules:
typeisuserorassistantmessage.roleisuserorassistantmessage.contentis a string or a list of structured content blockstimestampis an ISO timestamp string ornull- one file equals one agent/workflow session
Invalid files are skipped and logged. Lerim does not try to repair custom traces because cleaning belongs to the source owner.
Paste This Prompt Into Your Coding Agent¶
Use this prompt with Codex, Claude Code, or another coding agent in the folder that contains your raw trace samples.
You are helping me create a trace cleaner for Lerim.
Goal:
Convert raw agent or workflow traces into Lerim canonical JSONL files.
Important boundary:
Lerim custom mode expects already-clean traces. Lerim will not compact, rewrite,
normalize, redact, or repair these files. The cleaning script we write here is
the source-specific adapter and privacy boundary.
Input:
- Inspect the raw trace files in this folder.
- Identify what represents one completed agent/workflow run.
- Write one output .jsonl file per run.
Output schema:
Each non-empty JSONL line must be exactly:
{
"type": "user" | "assistant",
"message": {
"role": "user" | "assistant",
"content": string | list
},
"timestamp": string | null
}
Mapping guidance:
- Use "user" for the human, customer, requester, system trigger, ticket text,
workflow request, or external input.
- Use "assistant" for the agent, automation, analyst, support bot, tool-using
workflow, or generated response/action.
- Preserve useful chronology.
- Preserve decisions, constraints, evidence, assumptions, approvals, open
questions, handoffs, tool results, source links, ticket ids, account ids,
incident ids, and workflow ids when they are useful for future context.
- Drop binary blobs, screenshots, huge raw payloads, duplicate logs, progress
noise, stack traces that add no future context, and vendor metadata that does
not help future agents.
- Redact secrets, access tokens, private keys, passwords, session cookies,
regulated personal data, and any fields our retention policy forbids.
- Do not invent missing facts.
- Do not use keyword matching as the main cleaning strategy. Parse the source
structure and map its fields deliberately.
Script requirements:
- Create a Python script named clean_to_lerim_jsonl.py.
- The script should accept:
--input <raw-trace-folder-or-file>
--output <clean-output-folder>
- It should create the output folder if needed.
- It should validate every output line against the schema above.
- It should fail loudly on unknown source shapes instead of silently producing
bad traces.
- It should print a summary with files read, sessions written, rows written,
skipped items, and redaction count if redaction is implemented.
After writing the script:
1. Run it on the sample traces.
2. Show me the generated output tree.
3. Show me two short sample output lines.
4. Explain any source fields you dropped and why.
After the cleaner runs, register the clean output folder:
python clean_to_lerim_jsonl.py \
--input ./raw-traces \
--output ~/lerim-traces/support-clean
lerim project add ~/lerim-traces/support-clean --type custom
lerim ingest --agent custom
How This Differs From Supported Agents¶
Supported sources such as Claude Code, Codex CLI, Cursor, and OpenCode use Lerim adapters. Those adapters know where the source stores sessions, compact source-specific events, and place canonical files under Lerim's cache.
Custom mode skips that adapter path. It reads your cleaned .jsonl files
directly from the registered folder and indexes them as agent_type=custom.
Operational Checks¶
lerim project list
lerim ingest --agent custom --no-extract
lerim queue --status pending
lerim status --live
Use --no-extract when you want to verify that the folder is discovered before
running model-backed extraction.