MemoryEngine is the runtime behind MemoryOS. It turns conversations into durable memories, retrieves the right ones at the right time, and returns prompt-ready context for your LLM. It is more than a vector search layer: it combines extraction, quality gates, conflict resolution, importance scoring, lifecycle management, version history, and graceful degradation.

Write path

POST /v1/memories/add triggers this sequence:
  1. Authenticate tenant and resolve external_user_id
  2. Run quality gate (L1 rate → L2 quality → L3 dedup → L4 budget)
  3. Create extraction job
  4. Extract durable memories via LLM
  5. Resolve conflicts against existing memories
  6. Store in PostgreSQL
  7. Write embeddings to Qdrant
  8. Record version history
  9. Run domain overlay if domain_schema is enabled
The API returns a job_id immediately; extraction runs asynchronously. With the EdTech schema enabled, the domain overlay also updates edtech_memories with grade level, weak topics, learning style, and exam context, in addition to the normal memory path.
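
The quality gate in step 2 can be pictured as a chain of checks that stops at the first failure. The sketch below is illustrative only: the `Candidate` type, the gate function names, the rejection strings, and every threshold are assumptions, not MemoryEngine's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """Hypothetical view of one incoming add request."""
    text: str
    requests_this_minute: int
    stored_texts: frozenset  # normalized texts already stored for this user
    memories_used: int


# Each gate returns a rejection reason, or None to let the request through.
# All thresholds here are made up for illustration.
def l1_rate(c: Candidate) -> Optional[str]:
    return "rate_limited" if c.requests_this_minute > 60 else None


def l2_quality(c: Candidate) -> Optional[str]:
    return "low_quality" if len(c.text.split()) < 3 else None


def l3_dedup(c: Candidate) -> Optional[str]:
    return "duplicate" if c.text.strip().lower() in c.stored_texts else None


def l4_budget(c: Candidate) -> Optional[str]:
    return "over_budget" if c.memories_used >= 1000 else None


GATES: List[Callable[[Candidate], Optional[str]]] = [
    l1_rate, l2_quality, l3_dedup, l4_budget,
]


def run_quality_gate(c: Candidate) -> Optional[str]:
    """Run L1 through L4 in order; return the first rejection reason, if any."""
    for gate in GATES:
        reason = gate(c)
        if reason is not None:
            return reason
    return None
```

Ordering the cheap checks first (rate, then quality) means the more expensive dedup and budget lookups only run for requests that are likely to be stored.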

Read path

POST /v1/memories/retrieve triggers this sequence:
  1. Resolve tenant and user
  2. Check quota and dependency mode
  3. Load hot-tier memories from Redis
  4. Search Qdrant for remaining slots
  5. Rank by semantic relevance, importance, and recency
  6. Filter archived and out-of-scope memories
  7. Build system_prompt_addition
  8. Prepend domain-aware context if a domain schema is active
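
Steps 5 and 6 can be sketched as a weighted score followed by a filter. The weights, the 30-day half-life, and the `Hit` type below are assumptions chosen for illustration; the source does not state how MemoryEngine actually combines the three signals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """Hypothetical candidate memory returned by the hot tier or vector search."""
    text: str
    similarity: float   # semantic relevance in [0, 1]
    importance: float   # live importance_score in [0, 1]
    age_days: float     # days since the memory was last accessed
    archived: bool = False


def score(hit: Hit, w_sim: float = 0.6, w_imp: float = 0.3,
          w_rec: float = 0.1, half_life_days: float = 30.0) -> float:
    """Blend relevance, importance, and recency (assumed weights)."""
    recency = 0.5 ** (hit.age_days / half_life_days)  # 1.0 when fresh, halves every half-life
    return w_sim * hit.similarity + w_imp * hit.importance + w_rec * recency


def rank(hits: List[Hit], limit: int) -> List[Hit]:
    """Drop archived memories, then return the top `limit` by score."""
    live = [h for h in hits if not h.archived]
    return sorted(live, key=score, reverse=True)[:limit]
```

The sketch filters before sorting for simplicity; as long as truncation to `limit` happens after the filter, the result matches ranking first and filtering second.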

Context formats

system_prompt_addition comes in three formats:
| Format | Use when |
| --- | --- |
| bullets | Prompt-ready natural language (the default) |
| json | Your app parses memory context programmatically |
| xml | Models that parse XML tags precisely |
Example bullets output:
What you know about this user:
Skills & expertise:
- Comfortable with FastAPI and PostgreSQL.
Preferences:
- Prefers concise technical explanations.
- Likes Python-first examples.
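
To make the bullets layout concrete, here is a minimal sketch that groups memories by category and renders them in the shape shown above. The `render_bullets` function and the `(category, text)` input shape are hypothetical; MemoryEngine builds this string for you in system_prompt_addition.

```python
from typing import Dict, List, Tuple


def render_bullets(memories: List[Tuple[str, str]]) -> str:
    """Render (category, text) pairs as a grouped bullet list.

    Categories appear in the order they are first seen.
    """
    groups: Dict[str, List[str]] = {}
    for category, text in memories:
        groups.setdefault(category, []).append(text)

    lines = ["What you know about this user:"]
    for category, texts in groups.items():
        lines.append(f"{category}:")
        lines.extend(f"- {t}" for t in texts)
    return "\n".join(lines)


example = render_bullets([
    ("Skills & expertise", "Comfortable with FastAPI and PostgreSQL."),
    ("Preferences", "Prefers concise technical explanations."),
    ("Preferences", "Likes Python-first examples."),
])
```

In practice you would append the returned text to your own system prompt before calling the LLM.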

Conflict resolution

When a new memory contradicts an existing one, MemoryEngine resolves the conflict rather than storing both blindly.
| Old memory | New memory | Resolution |
| --- | --- | --- |
| User prefers JavaScript examples. | User now prefers Python examples. | Old archived, new stored |
| User works at Acme. | User left Acme and joined Northstar. | Newer factual memory wins |
All conflict changes appear in version history. Routing is domain-aware — personal student facts go to user-session clarification, workspace-level facts go to tenant review. Resolution paths in full:
| Path | Trigger | Outcome |
| --- | --- | --- |
| Automatic update | New memory supersedes older one | Old archived, new stored with previous_version_id |
| Merge | Two memories are compatible but incomplete | Merged memory stored, old archived |
| Reject | Duplicate or lower quality | Incoming not stored |
| Keep both | Both still valid | Both kept with version history |
| Cross-user recency | One claim is much newer | Older claim weighted down |
| Cross-user confidence | One claim has much higher confidence | Lower-confidence claim weighted down |
| User clarification | Personal conflict, only user can confirm | Question surfaced on next get() via clarification_question |
| Tenant review | Workspace or institution truth | Tenant resolves in dashboard Conflicts page |
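
The routing logic can be sketched as a small decision function. This covers only a subset of the paths (it omits merge and the cross-user weighting paths), and the `Conflict` fields and the order of the checks are assumptions for illustration rather than MemoryEngine's real rules.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    AUTOMATIC_UPDATE = "automatic_update"
    REJECT = "reject"
    KEEP_BOTH = "keep_both"
    USER_CLARIFICATION = "user_clarification"
    TENANT_REVIEW = "tenant_review"


@dataclass
class Conflict:
    """Hypothetical summary of how an incoming memory relates to a stored one."""
    contradicts: bool  # the new memory contradicts the stored one
    duplicate: bool    # the new memory repeats the stored one or is lower quality
    ambiguous: bool    # the system cannot tell which claim is correct
    scope: str         # "personal" or "workspace"


def choose_path(c: Conflict) -> Path:
    """Pick a resolution path (assumed ordering of checks)."""
    if c.duplicate:
        return Path.REJECT
    if not c.contradicts:
        return Path.KEEP_BOTH
    if c.ambiguous:
        # Domain-aware routing: personal facts go to the user, shared facts to the tenant.
        if c.scope == "personal":
            return Path.USER_CLARIFICATION
        return Path.TENANT_REVIEW
    return Path.AUTOMATIC_UPDATE
```

For example, "User now prefers Python examples." against "User prefers JavaScript examples." is a clear, unambiguous contradiction, so this sketch would route it to the automatic update path.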

Importance scoring

Each memory tracks:
  • original_importance_score — set at extraction time
  • importance_score — live score, changes with usage and decay
  • access_count — how many times this memory has been retrieved
  • last_accessed_at — last retrieval timestamp
Higher-importance memories rank higher. Stale low-value memories decay over time.
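
One way to picture how the four fields interact is exponential decay on idle time plus a small boost on each retrieval. The formula, the 30-day half-life, and the boost size below are assumptions; the source only says that the live score changes with usage and decay.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Memory:
    original_importance_score: float  # set at extraction time, never changed here
    importance_score: float           # live score
    access_count: int
    last_accessed_at: datetime


def record_access(m: Memory, now: datetime, boost: float = 0.05) -> None:
    """Update usage fields on retrieval and nudge the live score up (assumed boost)."""
    m.access_count += 1
    m.last_accessed_at = now
    m.importance_score = min(1.0, m.importance_score + boost)


def decayed_score(m: Memory, now: datetime, half_life_days: float = 30.0) -> float:
    """Live score halved for every `half_life_days` since the last access."""
    idle_days = (now - m.last_accessed_at).total_seconds() / 86400
    return m.importance_score * 0.5 ** (idle_days / half_life_days)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
stale = Memory(0.8, 0.8, 0, now - timedelta(days=30))
stale_score = decayed_score(stale, now)  # one half-life idle, so half of 0.8
```

Keeping original_importance_score separate from the live score makes it possible to see how far usage and decay have moved a memory from its extraction-time value.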

Multi-provider extraction

Provider order is configurable:
LLM_PROVIDER_ORDER=gemini,openai,anthropic
If a provider fails with a timeout, 5xx, rate limit, or auth error, MemoryEngine falls through to the next one automatically.
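
The fall-through behavior can be sketched as a loop over the configured order. `ProviderError`, `parse_provider_order`, `extract_with_fallback`, and the shape of the provider callables are hypothetical names used for illustration; only the LLM_PROVIDER_ORDER value format comes from the text above.

```python
from typing import Callable, Dict, List, Tuple


class ProviderError(Exception):
    """Stand-in for a retryable failure: timeout, 5xx, rate limit, or auth error."""


def parse_provider_order(value: str) -> List[str]:
    """Split a value such as 'gemini,openai,anthropic' into an ordered list."""
    return [name.strip() for name in value.split(",") if name.strip()]


def extract_with_fallback(
    text: str,
    providers: Dict[str, Callable[[str], List[str]]],
    order: List[str],
) -> Tuple[str, List[str]]:
    """Try each provider in order; return (provider_name, memories) from the first success."""
    errors: Dict[str, str] = {}
    for name in order:
        try:
            return name, providers[name](text)
        except ProviderError as exc:
            errors[name] = str(exc)  # remember why this provider failed, then move on
    raise RuntimeError(f"all providers failed: {errors}")
```

Only `ProviderError` triggers the fall-through in this sketch, so unexpected exceptions (for example a bug in a provider wrapper) still surface instead of being silently skipped.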