PetalTrace Overview

PetalTrace is an agent observability platform that provides deep visibility into AI agent workflows. It captures the full execution lifecycle — LLM prompts and completions, tool calls, token usage, costs, and execution timelines — and exposes them through a CLI, HTTP API, and MCP server.

# Start capturing traces
petaltrace serve

# View recent runs
petaltrace runs list

# Inspect a run with full prompt details
petaltrace prompt run-01JK3ABC researcher_agent --completion

Why PetalTrace

Debugging AI agents is hard. Prompts are long, completions are unpredictable, and costs add up fast. PetalTrace addresses this by recording every prompt, completion, tool call, and cost an agent produces, and by making that record searchable and inspectable.

Full Prompt Capture

See exactly what was sent to LLMs — system prompts, message history, tool definitions — and what came back.

Cost Tracking

Automatic cost calculation for all major providers. Track spending by workflow, provider, or model.
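
As a rough illustration of how a per-call cost can be derived from token counts, the sketch below multiplies input and output tokens by per-million-token rates and sums the result per model. The price table, model names, and function names are hypothetical placeholders, not PetalTrace's actual pricing data or API.

```python
from decimal import Decimal

# Hypothetical USD prices per one million tokens; real rates vary by provider.
PRICES_PER_MILLION = {
    "example-small": {"input": Decimal("0.50"), "output": Decimal("1.50")},
    "example-large": {"input": Decimal("5.00"), "output": Decimal("15.00")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one LLM call."""
    rates = PRICES_PER_MILLION[model]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / Decimal(1_000_000)


def total_by_model(calls):
    """Sum estimated cost per model over (model, input_tokens, output_tokens) tuples."""
    totals = {}
    for model, input_tokens, output_tokens in calls:
        cost = estimate_cost(model, input_tokens, output_tokens)
        totals[model] = totals.get(model, Decimal(0)) + cost
    return totals
```

`Decimal` is used instead of `float` so that many small per-call amounts add up without rounding drift.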

Run Comparison

Diff two runs to see what changed — prompt differences, output variations, cost deltas.
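
To make the idea concrete, here is a minimal sketch of a run diff built on Python's standard `difflib`. The dictionary shape (`id`, `prompt`, `output`, `cost_usd`) is an assumption made for illustration; it is not the PetalTrace run schema.

```python
import difflib


def diff_runs(run_a: dict, run_b: dict) -> dict:
    """Compare two run records: prompt text, output, and cost."""
    prompt_diff = list(
        difflib.unified_diff(
            run_a["prompt"].splitlines(),
            run_b["prompt"].splitlines(),
            fromfile=run_a["id"],
            tofile=run_b["id"],
            lineterm="",
        )
    )
    return {
        "prompt_diff": prompt_diff,  # unified diff lines, empty if identical
        "output_changed": run_a["output"] != run_b["output"],
        "cost_delta_usd": round(run_b["cost_usd"] - run_a["cost_usd"], 6),
    }
```

A positive `cost_delta_usd` means the second run was more expensive than the first.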

Replay Execution

Re-execute captured runs with different models or temperatures, or in mocked mode for testing.
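
The difference between a mocked and a live replay can be sketched as follows. This is an assumed, simplified model: `recorded_calls` and the `live_llm` callable are hypothetical stand-ins, not the PetalTrace replay engine.

```python
from typing import Callable, Optional


def replay(
    recorded_calls: list,
    live_llm: Optional[Callable[[str], str]] = None,
) -> list:
    """Replay recorded LLM calls.

    With live_llm=None (mocked mode) the recorded completions are returned
    unchanged, so no provider is contacted. Otherwise every recorded prompt
    is sent to live_llm again and the fresh completions are returned.
    """
    outputs = []
    for call in recorded_calls:
        if live_llm is None:
            outputs.append(call["completion"])
        else:
            outputs.append(live_llm(call["prompt"]))
    return outputs
```

Mocked mode is deterministic and free, which makes it suitable for tests; a live replay is what lets you compare models or temperatures.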

Additional Capabilities

MCP Integration

Agents can query their own execution history via MCP tools — enabling self-reflective debugging patterns.
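
For orientation, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message; the tool name `petaltrace.trace.get` and its `run_id` argument are hypothetical examples under the `petaltrace.trace.*` namespace, so check the server's tool listing for the real names.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# Hypothetical tool name and arguments, for illustration only.
message = build_tool_call("petaltrace.trace.get", {"run_id": "run-01JK3ABC"})
```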

OpenTelemetry Native

Accepts standard OTLP traces. Works with any OTel-instrumented application, not just PetalFlow.
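
To show what a standard OTLP trace looks like on the wire, the sketch below builds a minimal OTLP/HTTP JSON payload with one span using only the standard library. The endpoint `http://localhost:4318/v1/traces` is the conventional OTLP/HTTP default and is an assumption here, as is JSON (rather than protobuf) encoding being accepted; the attribute keys are made up for the example.

```python
import json
import secrets
import time
import urllib.request


def build_otlp_payload(service_name: str, span_name: str, attributes: dict) -> dict:
    """Build a minimal OTLP/HTTP JSON trace payload containing one span."""
    end_ns = time.time_ns()
    start_ns = end_ns - 250_000_000  # pretend the work took 250 ms
    span = {
        "traceId": secrets.token_hex(16),  # 16 random bytes, 32 hex characters
        "spanId": secrets.token_hex(8),    # 8 random bytes, 16 hex characters
        "name": span_name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        "startTimeUnixNano": str(start_ns),  # 64-bit integers travel as strings
        "endTimeUnixNano": str(end_ns),
        "attributes": [
            {"key": key, "value": {"stringValue": str(value)}}
            for key, value in attributes.items()
        ],
    }
    resource = {
        "attributes": [{"key": "service.name", "value": {"stringValue": service_name}}]
    }
    scope_spans = [{"scope": {"name": "example-instrumentation"}, "spans": [span]}]
    return {"resourceSpans": [{"resource": resource, "scopeSpans": scope_spans}]}


def send(payload: dict, endpoint: str = "http://localhost:4318/v1/traces") -> int:
    """POST the payload and return the HTTP status (endpoint is an assumed default)."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In practice an OpenTelemetry SDK with an OTLP exporter produces this payload for you; the hand-built version is only meant to show the structure.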

Full-Text Search

Search across prompts and completions to find specific runs or patterns.
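
The following sketch shows one way full-text search over prompts and completions can work with SQLite's FTS5 extension. The table name `llm_text` and its columns are hypothetical, not the PetalTrace schema, and the example assumes your Python build ships SQLite with FTS5 enabled.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index over (run_id, prompt, completion) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE llm_text USING fts5(run_id UNINDEXED, prompt, completion)"
    )
    conn.executemany("INSERT INTO llm_text VALUES (?, ?, ?)", rows)
    return conn


def search(conn, query):
    """Return the IDs of runs whose prompt or completion matches the query."""
    cursor = conn.execute(
        "SELECT DISTINCT run_id FROM llm_text WHERE llm_text MATCH ? ORDER BY run_id",
        (query,),
    )
    return [row[0] for row in cursor]


conn = build_index(
    [
        ("run-a", "Summarize the quarterly report", "Revenue grew by ten percent."),
        ("run-b", "Call the weather tool", "The request failed with a timeout."),
        ("run-c", "Retry the weather tool", "Sunny, 21 degrees."),
    ]
)
```

Calling `search(conn, "timeout")` finds the run whose completion mentions a timeout; a multi-word query such as `"weather tool"` requires every word to appear in the same row.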

Streaming Support

Real-time SSE feeds for monitoring active runs as they execute.
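
A client consuming an SSE feed has to split the stream into events. The parser below is a small sketch of the standard Server-Sent Events framing rules; the event names in the usage note are invented for illustration and may not match what PetalTrace emits.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip it if it carried no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Feeding it the lines `event: span.started`, `data: {"span_id": "s1"}`, and an empty line yields one `("span.started", '{"span_id": "s1"}')` tuple. Events without an `event:` field default to the type `message`.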

Core Concepts

Runs

A Run represents a single execution of a workflow. It contains metadata, timing, token counts, cost estimates, and user-defined tags. Runs group related spans together for analysis.

Spans

A Span represents a unit of work within a run. PetalTrace classifies spans into five kinds:

Kind	Description
`node`	Graph node execution (inputs, outputs, config)
`llm`	LLM API call (full prompt, completion, tokens, latency)
`tool`	Tool invocation (inputs, outputs)
`edge`	Data transfer between nodes
`custom`	Any other span type
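
The relationship between runs and spans can be sketched as a small data model. The five kind values come from the table above; the class and field names are assumptions for illustration, not PetalTrace's actual types.

```python
from dataclasses import dataclass, field
from enum import Enum


class SpanKind(str, Enum):
    NODE = "node"
    LLM = "llm"
    TOOL = "tool"
    EDGE = "edge"
    CUSTOM = "custom"


def classify(kind: str) -> SpanKind:
    """Map a raw kind string to a SpanKind, falling back to CUSTOM."""
    try:
        return SpanKind(kind)
    except ValueError:
        return SpanKind.CUSTOM


@dataclass
class Span:
    span_id: str
    kind: SpanKind
    name: str
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class Run:
    run_id: str
    tags: dict = field(default_factory=dict)
    spans: list = field(default_factory=list)

    def total_tokens(self) -> int:
        """Sum input and output tokens over the run's LLM spans."""
        return sum(
            span.input_tokens + span.output_tokens
            for span in self.spans
            if span.kind is SpanKind.LLM
        )
```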

Capture Modes

When using the PetalFlow integration, you control how much data is captured:

Mode	What’s Captured	Use Case
`minimal`	Latency, status, token counts	Production monitoring
`standard`	Everything in `minimal`, plus full prompts, completions, and tool I/O	Development, debugging
`full`	Everything in `standard`, plus graph snapshots and all edge data	Replay-capable runs
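
Because each mode extends the previous one, capture modes can be modeled as nested sets of allowed fields. The sketch below illustrates that idea; the field names are hypothetical and chosen only to mirror the table above.

```python
# Each mode is a superset of the one before it (field names are illustrative).
MODE_FIELDS = {
    "minimal": {"latency_ms", "status", "input_tokens", "output_tokens"},
}
MODE_FIELDS["standard"] = MODE_FIELDS["minimal"] | {
    "prompt",
    "completion",
    "tool_input",
    "tool_output",
}
MODE_FIELDS["full"] = MODE_FIELDS["standard"] | {"graph_snapshot", "edge_data"}


def apply_capture_mode(span: dict, mode: str) -> dict:
    """Keep only the span fields that the given capture mode allows."""
    allowed = MODE_FIELDS[mode]
    return {key: value for key, value in span.items() if key in allowed}
```

Filtering a span with `minimal` drops prompt and completion text, which is the property that makes that mode attractive for production monitoring.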

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    PetalFlow / Any OTel App                     │
└─────────────────────────────┬───────────────────────────────────┘
                              │  OTLP/gRPC or OTLP/HTTP
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         PetalTrace                              │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐  │
│  │  Collector   │  │  Trace Store │  │  Replay + Diff       │  │
│  │  (OTLP)      │  │  (SQLite)    │  │  Engines             │  │
│  └──────────────┘  └──────────────┘  └──────────────────────┘  │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                    HTTP API + SSE                        │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                      MCP Server                          │  │
│  │  petaltrace.trace.* · petaltrace.prompt.* · ...         │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Component	Responsibility
Collector	Receives OTLP spans, classifies by kind, enriches with costs
Trace Store	Persists runs, spans, and LLM interactions in SQLite with full-text search (FTS)
Replay Engine	Re-executes runs in live, mocked, or hybrid mode
Diff Engine	Compares runs structurally, by content, and by cost
HTTP API	REST + SSE endpoints for querying and operations
MCP Server	Exposes trace capabilities to agents via MCP