
Concepts

Cortex provides four distinct memory primitives, each designed for specific use cases in AI agent workflows. Understanding when to use each primitive is key to building effective memory systems.

Conversation Memory stores agent dialogue history with semantic search capabilities. It’s designed for maintaining context across multi-turn interactions.

Key Features:

  • Thread-based storage - Messages grouped by conversation thread
  • Role tracking - Distinguishes user, assistant, system, and tool messages
  • Semantic search - Find relevant past exchanges by meaning
  • Auto-summarization - Compress long histories to fit token limits

When to Use:

  • Chat applications where context matters
  • Agents that need to reference previous interactions
  • Debugging conversation flow
# Append a message to a thread
cortex conversation append \
  --thread-id support-ticket-123 \
  --role user \
  --content "I can't access my account"

# Retrieve recent history
cortex conversation history --thread-id support-ticket-123 --limit 10

# Search across all conversations
cortex conversation search "account access issues"

Data Model:

Field       Description
thread_id   Unique identifier for the conversation
role        Message author: user, assistant, system, tool
content     Message text
metadata    Optional JSON for tool calls, citations, etc.
timestamp   When the message was created
embedding   Vector representation for semantic search
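The data model above can be sketched as a Python dataclass. This is illustrative only — the field names come from the table, but the `Message` type and the in-memory thread index are not Cortex's actual implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Message:
    thread_id: str
    role: str  # "user", "assistant", "system", or "tool"
    content: str
    metadata: dict = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    embedding: Optional[list] = None  # filled in once an embedding is generated

# Thread-based storage: messages grouped by conversation thread,
# mirroring what `cortex conversation append` does.
threads: dict[str, list[Message]] = {}

def append(msg: Message) -> None:
    threads.setdefault(msg.thread_id, []).append(msg)

append(Message("support-ticket-123", "user", "I can't access my account"))
```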

Knowledge Store provides vector-indexed document storage with hybrid search. It’s the foundation for RAG (Retrieval-Augmented Generation) pipelines.

Key Features:

  • Collection-based organization - Group related documents
  • Intelligent chunking - Split documents by sentence, paragraph, or semantic boundaries
  • Hybrid search - Combine vector similarity with full-text matching
  • Metadata filtering - Filter by source, date, tags

When to Use:

  • RAG pipelines for grounded responses
  • Documentation search
  • Knowledge bases and FAQs
# Create a collection
cortex knowledge create-collection --name docs --description "Product documentation"

# Ingest with chunking
cortex knowledge ingest \
  --collection docs \
  --title "User Guide" \
  --file user-guide.md

# Search with hybrid mode
cortex knowledge search "password reset" --collection docs --mode hybrid

Chunking Strategies:

Strategy    Description
fixed       Fixed token count per chunk
sentence    Split on sentence boundaries
paragraph   Split on paragraph breaks
semantic    Split on topic changes (requires embeddings)
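The fixed and sentence strategies can be sketched in a few lines. This is a simplification: tokens are approximated here by whitespace-separated words, whereas Cortex's actual tokenizer may count differently:

```python
import re

def chunk_fixed(text: str, max_tokens: int = 50) -> list[str]:
    """Fixed strategy: cut every max_tokens words, ignoring sentence boundaries."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def chunk_sentences(text: str, max_tokens: int = 50) -> list[str]:
    """Sentence strategy: pack whole sentences until the token budget is hit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_tokens:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Note the trade-off: fixed chunks have predictable size but may split mid-sentence; sentence chunks keep each sentence intact at the cost of variable chunk length.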

Search Modes:

Mode     Description
vector   Pure semantic similarity
fts      Full-text search with BM25/ts_rank
hybrid   Combines both using Reciprocal Rank Fusion
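Reciprocal Rank Fusion, which hybrid mode uses, scores each document as the sum of 1/(k + rank) over both result lists, so documents ranked well by both vector and full-text search rise to the top. A minimal sketch (k = 60 is the conventional constant from the RRF literature; whether Cortex uses exactly this value is an assumption):

```python
def rrf(vector_ranking: list[str], fts_ranking: list[str], k: int = 60) -> list[str]:
    """Fuse two ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in (vector_ranking, fts_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" and "c" appear in both rankings, so they beat "a" and "d",
# each of which is ranked by only one mode.
fused = rrf(["a", "b", "c"], ["b", "c", "d"])
```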

Workflow Context provides key-value storage for structured state that persists across agent runs. Think of it as a typed, versioned configuration store.

Key Features:

  • Run isolation - Optionally scope values to specific run IDs
  • TTL support - Auto-expire values after a duration
  • Version history - Track changes over time
  • Merge strategies - Handle concurrent updates

When to Use:

  • Storing agent configuration
  • Caching intermediate results
  • Cross-step state in workflows
  • Feature flags and settings
# Set a value with TTL
cortex context set "cache/user-prefs" '{"theme": "dark"}' --ttl 24h

# Get a value
cortex context get "cache/user-prefs"

# List keys by prefix
cortex context list --prefix "cache/"

# View version history
cortex context history "cache/user-prefs"

Merge Strategies:

When updating existing values, you can specify how to handle conflicts:

Strategy       Description
replace        Overwrite completely (default)
merge_shallow  Merge top-level keys
merge_deep     Recursive merge
append         Append to arrays
cortex context merge "settings" '{"new_key": "value"}' --strategy merge_shallow
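The difference between merge_shallow and merge_deep can be sketched as follows (illustrative Python, not Cortex's internal code):

```python
def merge_shallow(old: dict, new: dict) -> dict:
    """Top-level keys in `new` replace those in `old` wholesale."""
    return {**old, **new}

def merge_deep(old: dict, new: dict) -> dict:
    """Nested dicts are merged recursively; other values are replaced."""
    out = dict(old)
    for key, value in new.items():
        if isinstance(out.get(key), dict) and isinstance(value, dict):
            out[key] = merge_deep(out[key], value)
        else:
            out[key] = value
    return out

old = {"ui": {"theme": "dark", "font": 12}}
new = {"ui": {"theme": "light"}}
shallow = merge_shallow(old, new)  # the whole "ui" dict is replaced, "font" is lost
deep = merge_deep(old, new)        # only "theme" changes, "font" survives
```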

Entity Memory automatically extracts and tracks entities (people, organizations, concepts) from your data, building a knowledge graph of relationships.

Key Features:

  • Auto-extraction - Entities identified from conversations and documents
  • Relationship tracking - Connect entities with typed relationships
  • Alias handling - Multiple names for the same entity
  • Co-occurrence analysis - Entities mentioned together

When to Use:

  • CRM-style contact tracking
  • Understanding domain concepts
  • Relationship mapping
  • Entity disambiguation
# List entities by type
cortex entity list --type person

# Search entities semantically
cortex entity search "software engineer"

# Get relationships
cortex entity relationships entity-123 --direction both

# Merge duplicates
cortex entity merge keep-id remove-id

Entity Types:

Type          Description
person        Individual people
organization  Companies, teams, groups
location      Places, addresses
concept       Abstract ideas, topics
product       Products, services
event         Dates, occurrences

Extraction Modes:

Mode       Description
off        No extraction
sampled    Extract from a sample of the data
whitelist  Only extract specified types
full       Extract all entity types

All Cortex data is isolated by namespace. Namespaces provide:

  • Multi-tenancy - Separate data for different users or organizations
  • Environment isolation - Keep dev/staging/prod data separate
  • Project organization - Group related data together
# All commands support --namespace
cortex knowledge search "query" --namespace acme/research

# Restrict MCP server to a namespace
cortex serve --namespace acme/research

# View namespace statistics
cortex namespace stats --namespace acme/research

Namespaces use the format org/project or just project; the default namespace is default.
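The stated format can be checked with a small validator. The exact character set Cortex permits in namespace segments is an assumption here; only the "one optional slash" shape comes from the text above:

```python
import re

# "project" or "org/project" -- at most one slash, non-empty segments.
NAMESPACE_RE = re.compile(r"^[\w-]+(/[\w-]+)?$")

def validate_namespace(ns: str) -> bool:
    """Return True if ns looks like a valid Cortex namespace."""
    return bool(NAMESPACE_RE.fullmatch(ns))
```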


Cortex supports two storage backends:

SQLite is the default backend for local development and single-node deployments:

  • Zero infrastructure requirements
  • vec0 extension provides vector operations
  • FTS5 for full-text search
  • Single-file database, easy backup

PostgreSQL is the production backend for scale and multi-instance deployments:

  • pgvector extension for vector similarity
  • ts_rank for full-text search
  • Connection pooling
  • Horizontal read scaling

Both backends provide identical functionality through the storage abstraction layer.


Cortex implements the Model Context Protocol (MCP), exposing memory primitives as discoverable tools. Any MCP-compatible client can:

  1. Discover available tools via the MCP handshake
  2. Call tools with typed parameters
  3. Receive structured responses

This means Cortex works with:

  • Claude Desktop
  • MCP-enabled IDEs
  • Custom agents using MCP SDKs
  • Any future MCP-compatible tooling

Vector operations (semantic search, similarity matching) require embeddings. Cortex uses the Iris SDK to generate embeddings directly via provider APIs:

Text → Cortex → Iris SDK → Provider API (OpenAI, Anthropic, etc.) → Vector → Storage

An LRU cache sits in front of the provider API to reduce redundant calls for repeated text.

Example configuration:

embedding:
  provider: openai         # openai, anthropic, voyageai, gemini, ollama
  model: text-embedding-3-small
  dimensions: 1536
  batch_size: 100          # texts per API call
  cache_size: 1000         # LRU cache entries
Provider   Best For                               API Key
openai     General-purpose, production workloads  OPENAI_API_KEY
anthropic  Claude ecosystem integration           ANTHROPIC_API_KEY
voyageai   Specialized retrieval models           VOYAGEAI_API_KEY
gemini     Google Cloud environments              GEMINI_API_KEY
ollama     Local/self-hosted, no API costs        OLLAMA_BASE_URL

Cortex maintains an LRU cache for embeddings using SHA-256 hashes of input text as keys. This:

  • Reduces API costs for repeated text
  • Improves latency for cached content
  • Returns copies to prevent caller mutations

If embedding generation fails:

  • Chunks are stored without embeddings
  • Full-text search remains available
  • Semantic features are disabled for affected content
  • Errors are logged but don’t fail operations

By default, Cortex retains all data indefinitely. Control retention with:

  • TTL on context values - Auto-expire after duration
  • Conversation summarization - Compress old messages
  • Garbage collection - Remove orphaned data
# Set TTL when storing
cortex context set "temp/data" '{}' --ttl 1h

# Summarize old conversations
cortex conversation summarize --thread-id old-thread

# Run garbage collection
cortex gc --expired-ttl --old-runs
For backup and export:

# Full database backup
cortex backup --output cortex-backup.db

# Export namespace to JSON
cortex export --namespace my-project --output export.json