MemoryEngine - MemoryOS

MemoryEngine is the runtime behind MemoryOS. It turns conversations into durable memories, retrieves the right ones at the right time, and returns prompt-ready context for your LLM. It is more than a vector search layer: it combines extraction, quality gates, conflict resolution, importance scoring, lifecycle management, version history, and graceful degradation.

Write path

POST /v1/memories/add triggers this sequence:

1. Authenticate tenant and resolve external_user_id
2. Run quality gate (L1 rate → L2 quality → L3 dedup → L4 budget)
3. Create extraction job
4. Extract durable memories via LLM
5. Resolve conflicts against existing memories
6. Store in PostgreSQL
7. Write embeddings to Qdrant
8. Record version history
9. Run domain overlay if domain_schema is enabled

The API returns a job_id immediately; extraction runs asynchronously. With the EdTech schema enabled, the domain overlay also updates edtech_memories with grade level, weak topics, learning style, and exam context, alongside the normal memory path.
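The quality gate order in the write path (L1 rate → L2 quality → L3 dedup → L4 budget) can be read as a short-circuiting chain: the first gate that rejects stops the request. The sketch below is illustrative only. `GateContext`, the four check functions, and every threshold are assumptions made for this example, not MemoryEngine's actual checks.

```python
from dataclasses import dataclass, field


@dataclass
class GateContext:
    """Hypothetical per-request state; field names are illustrative only."""
    text: str
    requests_this_minute: int = 0
    known_texts: set = field(default_factory=set)
    memories_stored: int = 0


# Each gate returns None to pass, or a short rejection reason.
def l1_rate(ctx):
    return "rate_limited" if ctx.requests_this_minute > 60 else None  # assumed limit


def l2_quality(ctx):
    return "low_quality" if len(ctx.text.split()) < 3 else None  # assumed heuristic


def l3_dedup(ctx):
    return "duplicate" if ctx.text.strip().lower() in ctx.known_texts else None


def l4_budget(ctx):
    return "over_budget" if ctx.memories_stored >= 1000 else None  # assumed cap


GATES = [("L1", l1_rate), ("L2", l2_quality), ("L3", l3_dedup), ("L4", l4_budget)]


def run_quality_gate(ctx):
    """Run gates in order and stop at the first rejection."""
    for name, gate in GATES:
        reason = gate(ctx)
        if reason is not None:
            return {"accepted": False, "gate": name, "reason": reason}
    return {"accepted": True, "gate": None, "reason": None}
```

Ordering the cheap checks first (rate, then a length heuristic) means the more expensive comparisons only run on input that has a chance of being stored.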

Read path

POST /v1/memories/retrieve triggers this sequence:

1. Resolve tenant and user
2. Check quota and dependency mode
3. Load hot-tier memories from Redis
4. Search Qdrant for remaining slots
5. Rank by semantic relevance, importance, and recency
6. Filter archived and out-of-scope memories
7. Build system_prompt_addition
8. Prepend domain-aware context if a domain schema is active
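To make the ranking and filtering steps concrete, here is a minimal sketch of one way to blend semantic relevance, importance, and recency into a single score. The `Candidate` type, the weights, and the 30-day half-life are assumptions for illustration; the source does not state how MemoryEngine weighs these signals.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Candidate:
    """Hypothetical search hit; fields mirror the ranking inputs named above."""
    text: str
    similarity: float        # semantic relevance in [0, 1]
    importance_score: float  # live importance in [0, 1]
    last_accessed_at: datetime
    archived: bool = False


def recency(last_accessed_at, now, half_life_days=30.0):
    """Map age to (0, 1]; the value halves every half_life_days (assumed curve)."""
    age_days = max((now - last_accessed_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank(candidates, now, limit=5, weights=(0.6, 0.3, 0.1)):
    """Drop archived memories, then sort by a weighted blend (assumed weights)."""
    w_sim, w_imp, w_rec = weights

    def score(c):
        return (w_sim * c.similarity
                + w_imp * c.importance_score
                + w_rec * recency(c.last_accessed_at, now))

    live = [c for c in candidates if not c.archived]
    return sorted(live, key=score, reverse=True)[:limit]


# Usage: an important, recently used memory can outrank a closer but stale match.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
hits = [
    Candidate("Comfortable with FastAPI and PostgreSQL.", 0.9, 0.2, now - timedelta(days=60)),
    Candidate("Prefers concise technical explanations.", 0.5, 0.9, now),
]
top = rank(hits, now)
```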

Context formats

system_prompt_addition comes in three formats:

| Format | Use when |
| --- | --- |
| `bullets` | You want prompt-ready natural language (the default) |
| `json` | Your app parses memory context programmatically |
| `xml` | Your model parses XML tags precisely |

Example bullets output:

What you know about this user:
Skills & expertise:
- Comfortable with FastAPI and PostgreSQL.
Preferences:
- Prefers concise technical explanations.
- Likes Python-first examples.
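As a rough illustration of how the bullets layout above could be assembled, the sketch below groups memories by category and emits one heading per group. The `(category, text)` input shape and the `render_bullets` name are assumptions for this example, not part of the MemoryOS API.

```python
def render_bullets(memories):
    """Group (category, text) pairs under headings, preserving first-seen order."""
    groups = {}
    for category, text in memories:
        groups.setdefault(category, []).append(text)

    lines = ["What you know about this user:"]
    for category, texts in groups.items():
        lines.append(f"{category}:")
        lines.extend(f"- {t}" for t in texts)
    return "\n".join(lines)


# Usage: reproduces the layout of the example output above.
context = render_bullets([
    ("Skills & expertise", "Comfortable with FastAPI and PostgreSQL."),
    ("Preferences", "Prefers concise technical explanations."),
    ("Preferences", "Likes Python-first examples."),
])
```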

Conflict resolution

When a new memory contradicts an existing one, MemoryEngine resolves the conflict rather than storing both blindly.

| Old memory | New memory | Resolution |
| --- | --- | --- |
| User prefers JavaScript examples. | User now prefers Python examples. | Old archived, new stored |
| User works at Acme. | User left Acme and joined Northstar. | Newer factual memory wins |

All conflict changes appear in version history. Routing is domain-aware: personal student facts go to user-session clarification, and workspace-level facts go to tenant review. The full set of resolution paths:

| Path | Trigger | Outcome |
| --- | --- | --- |
| Automatic update | New memory supersedes older one | Old archived, new stored with `previous_version_id` |
| Merge | Two memories are compatible but incomplete | Merged memory stored, old archived |
| Reject | Duplicate or lower quality | Incoming not stored |
| Keep both | Both still valid | Both kept with version history |
| Cross-user recency | One claim is much newer | Older claim weighted down |
| Cross-user confidence | One claim has much higher confidence | Lower-confidence claim weighted down |
| User clarification | Personal conflict, only user can confirm | Question surfaced on next `get()` via `clarification_question` |
| Tenant review | Workspace or institution truth | Tenant resolves in dashboard Conflicts page |
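The automatic update path can be pictured as a linked list of versions: the old memory is archived, and the new one points back to it through `previous_version_id`. The following is a minimal in-memory sketch. The `Memory` class, the dict-based store, and the `supersede` and `history` helpers are hypothetical; only the `previous_version_id` field name comes from the table above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Memory:
    text: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    archived: bool = False
    previous_version_id: Optional[str] = None


def supersede(store, old_id, new_text):
    """Archive the old memory and store its replacement, linked by previous_version_id."""
    old = store[old_id]
    old.archived = True
    new = Memory(text=new_text, previous_version_id=old.id)
    store[new.id] = new
    return new


def history(store, memory_id):
    """Walk previous_version_id links from newest to oldest."""
    chain = []
    current = store.get(memory_id)
    while current is not None:
        chain.append(current.text)
        current = store.get(current.previous_version_id)
    return chain


# Usage: the JavaScript preference is archived, not deleted, so it stays in history.
store = {}
old = Memory("User prefers JavaScript examples.")
store[old.id] = old
new = supersede(store, old.id, "User now prefers Python examples.")
```

Archiving rather than deleting is what lets version history show every conflict change.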

Importance scoring

Each memory tracks:

- `original_importance_score`: set at extraction time
- `importance_score`: live score, changes with usage and decay
- `access_count`: how many times this memory has been retrieved
- `last_accessed_at`: last retrieval timestamp

Higher-importance memories rank higher. Stale low-value memories decay over time.
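One plausible way to derive a live score from the four fields above is exponential decay on idle time plus a small bonus for usage. This is a sketch under stated assumptions: the half-life, the logarithmic bonus, and the `live_importance` function are invented for illustration, and the source does not document the formula MemoryEngine uses.

```python
import math
from datetime import datetime, timedelta, timezone


def live_importance(original_importance_score, access_count, last_accessed_at,
                    now, half_life_days=90.0, access_boost=0.05):
    """Decay the original score by idle time, add a logarithmic usage bonus, cap at 1.0."""
    idle_days = max((now - last_accessed_at).total_seconds() / 86400.0, 0.0)
    decayed = original_importance_score * 0.5 ** (idle_days / half_life_days)
    bonus = access_boost * math.log1p(access_count)  # diminishing returns per access
    return min(1.0, decayed + bonus)


# Usage: the same memory, untouched for one half-life, keeps half its score.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
fresh = live_importance(0.8, 0, now, now)
stale = live_importance(0.8, 0, now - timedelta(days=90), now)
```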

Multi-provider extraction

Provider order is configurable:

LLM_PROVIDER_ORDER=gemini,openai,anthropic

If a provider fails with a timeout, 5xx, rate limit, or auth error, MemoryEngine falls through to the next one automatically.
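The fall-through behavior can be sketched as a loop over the configured order that moves on whenever a provider raises. In this example, `ProviderError`, `provider_order`, and `extract_with_fallback` are hypothetical names, and the provider callables are stand-ins rather than real SDK clients; only the `LLM_PROVIDER_ORDER` variable and its comma-separated format come from the text above.

```python
import os


class ProviderError(Exception):
    """Stand-in for a timeout, 5xx, rate-limit, or auth failure."""


def provider_order(default="gemini,openai,anthropic"):
    """Read LLM_PROVIDER_ORDER and return a clean list of provider names."""
    raw = os.environ.get("LLM_PROVIDER_ORDER", default)
    return [name.strip() for name in raw.split(",") if name.strip()]


def extract_with_fallback(text, providers, order):
    """Try each provider in order; fall through to the next on ProviderError."""
    errors = {}
    for name in order:
        try:
            return name, providers[name](text)
        except ProviderError as exc:
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

Collecting the per-provider errors before raising keeps the final failure debuggable when every provider in the list is down.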
