# Core Concepts

Cortex provides four distinct memory primitives, each designed for specific use cases in AI agent workflows. Understanding when to use each primitive is key to building effective memory systems.
## Memory Primitives

### Conversation Memory

Conversation Memory stores agent dialogue history with semantic search capabilities. It’s designed for maintaining context across multi-turn interactions.
Key Features:
- Thread-based storage - Messages grouped by conversation thread
- Role tracking - Distinguishes user, assistant, system, and tool messages
- Semantic search - Find relevant past exchanges by meaning
- Auto-summarization - Compress long histories to fit token limits
When to Use:
- Chat applications where context matters
- Agents that need to reference previous interactions
- Debugging conversation flow
```sh
# Append a message to a thread
cortex conversation append \
  --thread-id support-ticket-123 \
  --role user \
  --content "I can't access my account"

# Retrieve recent history
cortex conversation history --thread-id support-ticket-123 --limit 10

# Search across all conversations
cortex conversation search "account access issues"
```

Data Model:
| Field | Description |
|---|---|
| `thread_id` | Unique identifier for the conversation |
| `role` | Message author: `user`, `assistant`, `system`, or `tool` |
| `content` | Message text |
| `metadata` | Optional JSON for tool calls, citations, etc. |
| `timestamp` | When the message was created |
| `embedding` | Vector representation for semantic search |
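The fields in the data model table map naturally onto a record type. The sketch below is illustrative only; the class name and defaults are assumptions, not Cortex’s internal schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class Message:
    """One entry in a conversation thread (mirrors the data model table)."""
    thread_id: str
    role: str                      # "user", "assistant", "system", or "tool"
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    embedding: Optional[list[float]] = None  # filled in by the embedding pipeline


msg = Message(thread_id="support-ticket-123", role="user",
              content="I can't access my account")
```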
### Knowledge Store

Knowledge Store provides vector-indexed document storage with hybrid search. It’s the foundation for RAG (Retrieval-Augmented Generation) pipelines.
Key Features:
- Collection-based organization - Group related documents
- Intelligent chunking - Split documents by sentence, paragraph, or semantic boundaries
- Hybrid search - Combine vector similarity with full-text matching
- Metadata filtering - Filter by source, date, tags
When to Use:
- RAG pipelines for grounded responses
- Documentation search
- Knowledge bases and FAQs
```sh
# Create a collection
cortex knowledge create-collection --name docs --description "Product documentation"

# Ingest with chunking
cortex knowledge ingest \
  --collection docs \
  --title "User Guide" \
  --file user-guide.md

# Search with hybrid mode
cortex knowledge search "password reset" --collection docs --mode hybrid
```

Chunking Strategies:
| Strategy | Description |
|---|---|
| `fixed` | Fixed token count per chunk |
| `sentence` | Split on sentence boundaries |
| `paragraph` | Split on paragraph breaks |
| `semantic` | Split on topic changes (requires embeddings) |
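The `sentence` and `paragraph` strategies can be approximated with simple text splitting. The regexes below are naive stand-ins for illustration, not Cortex’s actual chunker:

```python
import re


def chunk(text: str, strategy: str = "paragraph") -> list[str]:
    """Illustrative chunking: paragraph and sentence splitting only."""
    if strategy == "paragraph":
        # Split on one or more blank lines
        parts = re.split(r"\n\s*\n", text)
    elif strategy == "sentence":
        # Naive sentence boundary: punctuation followed by whitespace
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unsupported strategy: {strategy}")
    return [p.strip() for p in parts if p.strip()]


doc = "First paragraph. Two sentences here.\n\nSecond paragraph."
chunk(doc, "paragraph")  # 2 chunks
chunk(doc, "sentence")   # 3 chunks
```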
Search Modes:
| Mode | Description |
|---|---|
| `vector` | Pure semantic similarity |
| `fts` | Full-text search with BM25/`ts_rank` |
| `hybrid` | Combines both using Reciprocal Rank Fusion |
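Reciprocal Rank Fusion merges ranked lists by scoring each document as the sum of 1/(k + rank) over the lists it appears in. A minimal sketch follows; k = 60 is the conventional default from the RRF literature, not necessarily what Cortex uses:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked result lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)


vector_hits = ["doc-a", "doc-b", "doc-c"]
fts_hits = ["doc-b", "doc-a", "doc-d"]
rrf([vector_hits, fts_hits])
```

Documents that rank well in both lists float to the top even when neither searcher puts them first, which is why hybrid mode tends to beat either mode alone.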
### Workflow Context

Workflow Context provides key-value storage for structured state that persists across agent runs. Think of it as a typed, versioned configuration store.
Key Features:
- Run isolation - Optionally scope values to specific run IDs
- TTL support - Auto-expire values after a duration
- Version history - Track changes over time
- Merge strategies - Handle concurrent updates
When to Use:
- Storing agent configuration
- Caching intermediate results
- Cross-step state in workflows
- Feature flags and settings
```sh
# Set a value with TTL
cortex context set "cache/user-prefs" '{"theme": "dark"}' --ttl 24h

# Get a value
cortex context get "cache/user-prefs"

# List keys by prefix
cortex context list --prefix "cache/"

# View version history
cortex context history "cache/user-prefs"
```

Merge Strategies:
When updating existing values, you can specify how to handle conflicts:
| Strategy | Description |
|---|---|
| `replace` | Overwrite completely (default) |
| `merge_shallow` | Merge top-level keys |
| `merge_deep` | Recursive merge |
| `append` | Append to arrays |
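The difference between `merge_shallow` and `merge_deep` is easiest to see in code. This sketch shows the general semantics of each, not Cortex’s implementation:

```python
def merge_shallow(base: dict, update: dict) -> dict:
    """Top-level keys from update win; nested dicts are replaced wholesale."""
    return {**base, **update}


def merge_deep(base: dict, update: dict) -> dict:
    """Recursively merge nested dicts; all other values are overwritten."""
    result = dict(base)
    for key, value in update.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = merge_deep(result[key], value)
        else:
            result[key] = value
    return result


base = {"ui": {"theme": "dark", "font": 12}, "lang": "en"}
merge_shallow(base, {"ui": {"theme": "light"}})  # "font" is lost
merge_deep(base, {"ui": {"theme": "light"}})     # "font" survives
```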
```sh
cortex context merge "settings" '{"new_key": "value"}' --strategy merge_shallow
```

### Entity Memory

Entity Memory automatically extracts and tracks entities (people, organizations, concepts) from your data, building a knowledge graph of relationships.
Key Features:
- Auto-extraction - Entities identified from conversations and documents
- Relationship tracking - Connect entities with typed relationships
- Alias handling - Multiple names for the same entity
- Co-occurrence analysis - Entities mentioned together
When to Use:
- CRM-style contact tracking
- Understanding domain concepts
- Relationship mapping
- Entity disambiguation
```sh
# List entities by type
cortex entity list --type person

# Search entities semantically
cortex entity search "software engineer"

# Get relationships
cortex entity relationships entity-123 --direction both

# Merge duplicates
cortex entity merge keep-id remove-id
```

Entity Types:
| Type | Description |
|---|---|
| `person` | Individual people |
| `organization` | Companies, teams, groups |
| `location` | Places, addresses |
| `concept` | Abstract ideas, topics |
| `product` | Products, services |
| `event` | Dates, occurrences |
Extraction Modes:
| Mode | Description |
|---|---|
| `off` | No extraction |
| `sampled` | Extract from a sample of the data |
| `whitelist` | Only extract specified types |
| `full` | Extract all entity types |
## Namespaces

All Cortex data is isolated by namespace. Namespaces provide:
- Multi-tenancy - Separate data for different users or organizations
- Environment isolation - Keep dev/staging/prod data separate
- Project organization - Group related data together
```sh
# All commands support --namespace
cortex knowledge search "query" --namespace acme/research

# Restrict MCP server to a namespace
cortex serve --namespace acme/research

# View namespace statistics
cortex namespace stats --namespace acme/research
```

Namespace format: `org/project` or just `project`. The default namespace is `default`.
## Storage Architecture

Cortex supports two storage backends:

### SQLite + vec0

The default backend for local development and single-node deployments:
- Zero infrastructure requirements
- vec0 extension provides vector operations
- FTS5 for full-text search
- Single-file database, easy backup
### PostgreSQL + pgvector

Production backend for scale and multi-instance deployments:
- pgvector extension for vector similarity
- ts_rank for full-text search
- Connection pooling
- Horizontal read scaling
Both backends provide identical functionality through the storage abstraction layer.
## MCP Integration

Cortex implements the Model Context Protocol (MCP), exposing memory primitives as discoverable tools. Any MCP-compatible client can:
- Discover available tools via the MCP handshake
- Call tools with typed parameters
- Receive structured responses
This means Cortex works with:
- Claude Desktop
- MCP-enabled IDEs
- Custom agents using MCP SDKs
- Any future MCP-compatible tooling
## Embedding Pipeline

Vector operations (semantic search, similarity matching) require embeddings. Cortex uses the Iris SDK to generate embeddings directly via provider APIs:

```
Text → Cortex → Iris SDK → Provider API (OpenAI, Anthropic, etc.) → Vector → Storage
                    ↓
                LRU Cache (reduces redundant API calls)
```

### Configuration

```yaml
embedding:
  provider: openai            # openai, anthropic, voyageai, gemini, ollama
  model: text-embedding-3-small
  dimensions: 1536
  batch_size: 100             # texts per API call
  cache_size: 1000            # LRU cache entries
```

### Provider Selection

| Provider | Best For | API Key |
|---|---|---|
| `openai` | General-purpose, production workloads | `OPENAI_API_KEY` |
| `anthropic` | Claude ecosystem integration | `ANTHROPIC_API_KEY` |
| `voyageai` | Specialized retrieval models | `VOYAGEAI_API_KEY` |
| `gemini` | Google Cloud environments | `GEMINI_API_KEY` |
| `ollama` | Local/self-hosted, no API costs | `OLLAMA_BASE_URL` |
## Caching

Cortex maintains an LRU cache for embeddings, using SHA-256 hashes of input text as keys. This:
- Reduces API costs for repeated text
- Improves latency for cached content
- Returns copies to prevent caller mutations
## Graceful Degradation

If embedding generation fails:
- Chunks are stored without embeddings
- Full-text search remains available
- Semantic features are disabled for affected content
- Errors are logged but don’t fail operations
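That fallback behavior can be sketched as follows; the chunk shape and the `embed_fn` callback are illustrative, not Cortex internals:

```python
import logging

logger = logging.getLogger("cortex.embeddings")


def store_chunk(chunk: dict, embed_fn) -> dict:
    """Attach an embedding if possible; store the chunk either way."""
    try:
        chunk["embedding"] = embed_fn(chunk["text"])
    except Exception as exc:
        # Log and continue: full-text search still works,
        # semantic search simply skips this chunk
        logger.warning("embedding failed, storing without vector: %s", exc)
        chunk["embedding"] = None
    return chunk


def flaky_embed(text: str) -> list[float]:
    raise ConnectionError("provider unreachable")


store_chunk({"text": "hello"}, flaky_embed)  # stored with embedding=None
```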
## Data Lifecycle

### Retention

By default, Cortex retains all data indefinitely. Control retention with:
- TTL on context values - Auto-expire after duration
- Conversation summarization - Compress old messages
- Garbage collection - Remove orphaned data
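TTL flags such as `--ttl 1h` and `--ttl 24h` imply a small duration grammar. This sketch parses that style of string; the exact set of units Cortex accepts is an assumption:

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_ttl(spec: str) -> timedelta:
    """Parse a duration like '30s', '15m', '24h', or '7d'."""
    match = re.fullmatch(r"(\d+)([smhd])", spec)
    if not match:
        raise ValueError(f"invalid TTL: {spec!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


parse_ttl("24h")  # timedelta(days=1)
```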
```sh
# Set TTL when storing
cortex context set "temp/data" '{}' --ttl 1h

# Summarize old conversations
cortex conversation summarize --thread-id old-thread

# Run garbage collection
cortex gc --expired-ttl --old-runs
```

### Backup and Export
Section titled “Backup and Export”# Full database backupcortex backup --output cortex-backup.db
# Export namespace to JSONcortex export --namespace my-project --output export.jsonNext Steps
Section titled “Next Steps”MCP Tools
Knowledge Guide
Entity Guide