Cortex is a memory and knowledge service for AI agents. It provides persistent context, vector-backed knowledge retrieval, and conversation memory through the Model Context Protocol (MCP).
```shell
# Start the MCP server
cortex serve

# Ingest documentation
cortex knowledge ingest --collection docs --title "README" --file README.md

# Search your knowledge base
cortex knowledge search "how to configure"
```

AI agents need memory that persists across sessions, scales with usage, and integrates seamlessly with existing tooling. Cortex solves this with four purpose-built memory primitives, all accessible through a unified MCP interface.
Conversation Memory
Store and retrieve agent dialogue history with semantic search and automatic summarization. Keep context available without exceeding token limits.
Knowledge Store
Vector-indexed documents with hybrid search combining semantic similarity and full-text matching. Ingest files, chunk intelligently, and retrieve relevant context.
Workflow Context
Key-value state that persists across tasks and runs. Store structured data with optional TTL, versioning, and merge strategies.
Entity Memory
Auto-extracted knowledge graph of people, organizations, and concepts. Track relationships and co-occurrences across your data.
Flexible Storage
SQLite + vec0 for single-node deployments with zero infrastructure, or PostgreSQL + pgvector for production scale.
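As a rough sketch, backend selection might look like the following config fragment. The keys shown (`storage.backend`, `path`, `dsn`) are illustrative assumptions, not confirmed Cortex configuration; check the CLI Reference for the real schema:

```yaml
# Single node: SQLite + vec0, zero infrastructure (hypothetical keys)
storage:
  backend: sqlite
  path: ./cortex.db

# Production: PostgreSQL + pgvector (hypothetical keys)
# storage:
#   backend: postgres
#   dsn: postgres://cortex@db:5432/cortex
```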
Observable
Prometheus metrics, structured logging, and health endpoints for production monitoring.
MCP Native
Works with any MCP-compatible client—Claude Desktop, IDEs, custom agents. Standard protocol, universal compatibility.
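For example, registering Cortex with Claude Desktop would be a standard `mcpServers` entry in `claude_desktop_config.json`. This sketch assumes `cortex serve` speaks MCP over stdio; verify the exact invocation against the Getting Started guide:

```json
{
  "mcpServers": {
    "cortex": {
      "command": "cortex",
      "args": ["serve"]
    }
  }
}
```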
Multiple Interfaces
Web dashboard for browsing, terminal UI for quick inspection, comprehensive CLI for automation.
Cortex is built on principles that make it suitable for AI agent deployments:
Memory shouldn’t be embedded in your agent code. Cortex runs as a separate service, allowing multiple agents to share context and enabling persistence across agent restarts.
MCP provides a standardized way for AI tools to expose capabilities. Cortex implements 16+ MCP tools that any compatible client can discover and use without custom integration code.
Vector search alone misses keyword matches; full-text search alone misses semantic relationships. Cortex combines both using Reciprocal Rank Fusion for comprehensive retrieval.
All data is isolated by namespace, enabling multi-tenant deployments, environment separation (dev/staging/prod), and project isolation without infrastructure complexity.
```mermaid
flowchart TB
    subgraph Clients["MCP Clients"]
        Agent[AI Agents]
        IDE[IDEs]
        Tools[Custom Tools]
    end

    subgraph Cortex["Cortex Service"]
        MCP[MCP Server]
        subgraph Primitives["Memory Primitives"]
            Conv[Conversation Memory]
            Know[Knowledge Store]
            Ctx[Workflow Context]
            Ent[Entity Memory]
        end
        Storage[Storage Backend]
        subgraph IrisSDK["Iris SDK"]
            Emb[Embedding Provider]
            LLM[LLM Provider]
            Cache[Embedding Cache]
        end
    end

    subgraph Backends["Provider APIs"]
        OpenAI[OpenAI API]
        Anthropic[Anthropic API]
        Ollama[Ollama Server]
        DB[(SQLite/PostgreSQL)]
    end

    Agent --> MCP
    IDE --> MCP
    Tools --> MCP

    MCP --> Conv
    MCP --> Know
    MCP --> Ctx
    MCP --> Ent

    Conv --> Storage
    Know --> Storage
    Ctx --> Storage
    Ent --> Storage

    Know --> Emb
    Conv --> Emb
    Ent --> LLM
    Emb --> Cache
    Cache --> OpenAI
    Cache --> Anthropic
    Cache --> Ollama
    LLM --> Anthropic
    Storage --> DB
```

Cortex exposes 16+ MCP tools organized by memory primitive:
| Primitive | Tools | Purpose |
|---|---|---|
| Conversation | 4 | Append, history, search, summarize |
| Knowledge | 4 | Ingest, bulk ingest, search, collections |
| Context | 5 | Get, set, merge, list, history |
| Entity | 6 | Query, search, relationships, update, merge, list |
See the MCP Tools Reference for complete documentation.
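Over the wire, each of these tools is invoked the same way: a standard MCP `tools/call` JSON-RPC request. The tool name and argument shape below are illustrative assumptions, not Cortex's documented schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "conversation_append",
    "arguments": {
      "thread_id": "thread-123",
      "role": "user",
      "content": "How do I configure storage?"
    }
  }
}
```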
Cortex integrates with other Petal Labs projects:
Cortex uses the Iris SDK directly as a Go library for embeddings, summarization, and entity extraction. No separate Iris service required—Cortex calls provider APIs directly.
Supported Embedding Providers:
| Provider | API Key Environment Variable | Notes |
|---|---|---|
| OpenAI | OPENAI_API_KEY | Default, supports text-embedding-3-small/large |
| Anthropic | ANTHROPIC_API_KEY | Claude-based embeddings |
| VoyageAI | VOYAGEAI_API_KEY | Specialized embedding models |
| Gemini | GEMINI_API_KEY or GOOGLE_API_KEY | Google’s embedding API |
| Ollama | OLLAMA_BASE_URL | Local models, no API key required |
Supported LLM Providers (for summarization and entity extraction):
| Provider | Models | Use Cases |
|---|---|---|
| Anthropic | claude-sonnet-4-6, claude-haiku-4-5 | Conversation summarization, entity extraction |
| OpenAI | gpt-5.4, gpt-5-mini | Alternative LLM provider |
| Ollama | Local models | Self-hosted inference |
Configure providers in your config:
```yaml
embedding:
  provider: openai
  model: text-embedding-3-small
  dimensions: 1536
  batch_size: 100
  cache_size: 1000

summarization:
  provider: anthropic
  model: claude-sonnet-4-6
  max_tokens: 1024
```

Use Cortex as a memory backend for PetalFlow workflows:
```go
// Store workflow state in Cortex
ctx.Set("user_preferences", prefs, cortex.WithTTL(24*time.Hour))

// Retrieve relevant knowledge
docs, _ := ctx.KnowledgeSearch("billing questions", cortex.Hybrid)

// Continue conversation
history, _ := ctx.ConversationHistory(threadID, 10)
```

Agents that remember context across sessions:
```shell
# Agent stores insights as it works
cortex context set "project/insights" '{"key_files": ["main.go", "handler.go"]}'

# Later sessions retrieve context
cortex context get "project/insights"
```

Assistants grounded in your documentation:
```shell
# Ingest your docs
cortex knowledge ingest-dir --collection docs --dir ./docs --pattern "*.md"

# Search returns relevant chunks
cortex knowledge search "authentication setup" --mode hybrid
```

Track entities and relationships across interactions:
```shell
# Entities auto-extracted from conversations
cortex entity list --type person

# Query relationships
cortex entity relationships "user-123" --direction both
```

Getting Started
Concepts
CLI Reference
MCP Tools
Guides
Examples