
PetalTrace Overview

PetalTrace is an agent observability platform that provides deep visibility into AI agent workflows. It captures the full execution lifecycle (LLM prompts and completions, tool calls, token usage, costs, and execution timelines) and exposes that data through a CLI, an HTTP API, and an MCP server.

# Start capturing traces
petaltrace serve
# View recent runs
petaltrace runs list
# Inspect a run with full prompt details
petaltrace prompt run-01JK3ABC researcher_agent --completion

Debugging AI agents is hard. Prompts are long, completions are unpredictable, and costs add up fast. PetalTrace addresses these problems by recording each prompt, completion, tool call, and cost an agent generates, and by making that record searchable and inspectable.

Full Prompt Capture

See exactly what was sent to LLMs — system prompts, message history, tool definitions — and what came back.

Cost Tracking

Automatic cost calculation for all major providers. Track spending by workflow, provider, or model.

Run Comparison

Diff two runs to see what changed — prompt differences, output variations, cost deltas.

Replay Execution

Re-execute captured runs with a different model or temperature, or in mocked mode for testing.

MCP Integration

Agents can query their own execution history via MCP tools — enabling self-reflective debugging patterns.

OpenTelemetry Native

Accepts standard OTLP traces. Works with any OTel-instrumented application, not just PetalFlow.

Full-Text Search

Search across prompts and completions to find specific runs or patterns.

Streaming Support

Real-time SSE feeds for monitoring active runs as they execute.
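
Because the streaming feeds use Server-Sent Events, any client that understands the SSE wire format should be able to consume them. The sketch below is a minimal SSE parser using only the Python standard library. The event names and JSON fields in the sample feed are hypothetical, since the actual feed schema is not described here.

```python
import json
from typing import Iterable, Iterator


def _field_value(line: str, name: str) -> str:
    """Return the value of an SSE field, dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines.

    Handles the `event:` and `data:` fields; `id:` and `retry:` are
    ignored in this sketch.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


# Hypothetical feed content: event names and fields are illustrative only.
SAMPLE_FEED = [
    ": keep-alive\n",
    "event: span.started\n",
    'data: {"run_id": "run-01JK3ABC", "kind": "llm"}\n',
    "\n",
    "event: span.finished\n",
    'data: {"run_id": "run-01JK3ABC", "kind": "llm", "tokens": 412}\n',
    "\n",
]

events = [
    {"event": e["event"], "data": json.loads(e["data"])}
    for e in parse_sse(SAMPLE_FEED)
]
```

For a live feed, the lines would come from an HTTP response decoded as UTF-8 instead of a list. The endpoint path depends on the PetalTrace HTTP API and is not assumed here.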

A Run represents a single execution of a workflow. It contains metadata, timing, token counts, cost estimates, and user-defined tags. Runs group related spans together for analysis.

A Span represents a unit of work within a run. PetalTrace classifies spans into five kinds:

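| Kind   | Description                                             |
|--------|---------------------------------------------------------|
| node   | Graph node execution (inputs, outputs, config)          |
| llm    | LLM API call (full prompt, completion, tokens, latency) |
| tool   | Tool invocation (inputs, outputs)                       |
| edge   | Data transfer between nodes                             |
| custom | Any other span type                                     |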
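
To make the Run and Span relationship concrete, here is a minimal Python sketch of the data model. It is not PetalTrace's actual schema: the class names, field names (`input_tokens`, `output_tokens`, `tags`), and helper functions are assumptions for illustration. Only the five kind values come from the table above.

```python
from dataclasses import dataclass, field
from enum import Enum


class SpanKind(str, Enum):
    """The five span kinds listed in the table."""
    NODE = "node"
    LLM = "llm"
    TOOL = "tool"
    EDGE = "edge"
    CUSTOM = "custom"


def parse_kind(value: str) -> SpanKind:
    """Map a raw string to a kind; anything unrecognised becomes CUSTOM."""
    try:
        return SpanKind(value)
    except ValueError:
        return SpanKind.CUSTOM


@dataclass
class Span:
    # Hypothetical fields: a real span carries far more detail.
    name: str
    kind: SpanKind
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class Run:
    # A run groups the spans of one workflow execution.
    run_id: str
    tags: dict[str, str] = field(default_factory=dict)
    spans: list[Span] = field(default_factory=list)

    def spans_of(self, kind: SpanKind) -> list[Span]:
        """Return the spans of a single kind."""
        return [s for s in self.spans if s.kind is kind]

    def total_tokens(self) -> int:
        """Sum input and output tokens over LLM spans only."""
        return sum(s.input_tokens + s.output_tokens
                   for s in self.spans_of(SpanKind.LLM))


run = Run(
    run_id="run-01JK3ABC",
    tags={"env": "dev"},
    spans=[
        Span("researcher_agent", SpanKind.NODE),
        Span("chat_completion", SpanKind.LLM, input_tokens=1200, output_tokens=350),
        Span("web_search", SpanKind.TOOL),
    ],
)
```

Here `run.total_tokens()` evaluates to 1550, and `parse_kind("retriever")` falls back to `SpanKind.CUSTOM`.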

When using PetalFlow integration, you control how much data is captured:

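| Mode     | What’s Captured                                                  | Use Case               |
|----------|------------------------------------------------------------------|------------------------|
| minimal  | Latency, status, token counts                                    | Production monitoring  |
| standard | Everything in minimal, plus full prompts, completions, tool I/O  | Development, debugging |
| full     | Everything in standard, plus graph snapshots and all edge data   | Replay-capable runs    |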
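
The capture modes are cumulative: each mode records what the previous one does, plus more. The following Python sketch models that layering. The field names are illustrative labels derived from the table, not PetalTrace configuration keys, and `redact` is a hypothetical helper.

```python
MODE_ORDER = ["minimal", "standard", "full"]

# What each mode adds on top of the one before it (illustrative names).
MODE_ADDS = {
    "minimal": {"latency", "status", "token_counts"},
    "standard": {"prompts", "completions", "tool_io"},
    "full": {"graph_snapshots", "edge_data"},
}


def captured_fields(mode: str) -> set[str]:
    """Return every field captured at `mode`, including lower modes."""
    if mode not in MODE_ORDER:
        raise ValueError(f"unknown capture mode: {mode!r}")
    fields: set[str] = set()
    for name in MODE_ORDER[: MODE_ORDER.index(mode) + 1]:
        fields |= MODE_ADDS[name]
    return fields


def redact(span: dict, mode: str) -> dict:
    """Drop any span field that the given mode would not capture."""
    allowed = captured_fields(mode)
    return {key: value for key, value in span.items() if key in allowed}


sample_span = {
    "latency": 840,
    "status": "ok",
    "token_counts": {"input": 1200, "output": 350},
    "prompts": ["You are a research assistant."],
    "graph_snapshots": {"nodes": 4},
}

minimal_view = redact(sample_span, "minimal")
standard_view = redact(sample_span, "standard")
```

With the sample above, `minimal_view` keeps only latency, status, and token counts, while `standard_view` also keeps the prompts but still drops the graph snapshot.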

┌─────────────────────────────────────────────────────────────────┐
│                    PetalFlow / Any OTel App                     │
└─────────────────────────────┬───────────────────────────────────┘
                              │ OTLP/gRPC or OTLP/HTTP
┌─────────────────────────────────────────────────────────────────┐
│                           PetalTrace                            │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │  Collector   │  │ Trace Store  │  │    Replay + Diff     │   │
│  │    (OTLP)    │  │   (SQLite)   │  │       Engines        │   │
│  └──────────────┘  └──────────────┘  └──────────────────────┘   │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                      HTTP API + SSE                      │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                        MCP Server                        │   │
│  │      petaltrace.trace.* · petaltrace.prompt.* · ...      │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

| Component     | Responsibility                                              |
|---------------|-------------------------------------------------------------|
| Collector     | Receives OTLP spans, classifies by kind, enriches with costs |
| Trace Store   | Persists runs, spans, LLM interactions in SQLite with FTS   |
| Replay Engine | Re-executes runs with live, mocked, or hybrid modes         |
| Diff Engine   | Compares runs structurally, by content, and by cost         |
| HTTP API      | REST + SSE endpoints for querying and operations            |
| MCP Server    | Exposes trace capabilities to agents via MCP protocol       |
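
The Collector's cost enrichment step can be pictured as a lookup of per-token prices followed by a multiplication. The Python sketch below illustrates that idea only. The provider and model names, the prices, and the span field names are all made up, and PetalTrace's real pricing tables and rounding rules may differ.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Optional

# Hypothetical prices in USD per one million tokens: (input, output).
PRICES = {
    ("example-provider", "example-model-small"): (Decimal("0.50"), Decimal("1.50")),
    ("example-provider", "example-model-large"): (Decimal("5.00"), Decimal("15.00")),
}
PER_MILLION = Decimal(1_000_000)


def estimate_cost(provider: str, model: str,
                  input_tokens: int, output_tokens: int) -> Optional[Decimal]:
    """Return the estimated cost in USD, or None if the model is unpriced."""
    price = PRICES.get((provider, model))
    if price is None:
        return None
    input_price, output_price = price
    return (input_price * input_tokens + output_price * output_tokens) / PER_MILLION


def spend_by(spans: list[dict], key: str) -> dict[str, Decimal]:
    """Total estimated cost grouped by a span field such as 'model'."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for span in spans:
        cost = estimate_cost(span["provider"], span["model"],
                             span["input_tokens"], span["output_tokens"])
        if cost is not None:  # skip spans with no known price
            totals[span[key]] += cost
    return dict(totals)


llm_spans = [
    {"provider": "example-provider", "model": "example-model-large",
     "input_tokens": 1200, "output_tokens": 350},
    {"provider": "example-provider", "model": "example-model-small",
     "input_tokens": 4000, "output_tokens": 1000},
]

by_model = spend_by(llm_spans, "model")
by_provider = spend_by(llm_spans, "provider")
```

`Decimal` is used instead of `float` so that small per-call amounts add up without binary rounding drift. Returning `None` for an unknown model keeps missing prices visible instead of silently reporting zero.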

| Service   | Port | Protocol |
|-----------|------|----------|
| HTTP API  | 8090 | REST     |
| OTLP gRPC | 4317 | gRPC     |
| OTLP HTTP | 4318 | HTTP     |
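
As a rough illustration of the OTLP HTTP port in use, the sketch below builds a single-span OTLP/JSON payload by hand and shows how it could be posted with the Python standard library. It rests on several assumptions: PetalTrace runs on `localhost`, it accepts the JSON encoding of OTLP on the standard `/v1/traces` path, and the service name and the `example.span.kind` attribute are placeholders, not keys PetalTrace is known to read. In practice an OpenTelemetry SDK exporter would do this work.

```python
import json
import secrets
import time
import urllib.request

# Port 4318 comes from the table; host and path are assumptions.
OTLP_HTTP_ENDPOINT = "http://localhost:4318/v1/traces"


def build_otlp_payload(span_name: str, attributes: dict[str, str],
                       duration_ms: int = 250) -> dict:
    """Build an OTLP/JSON export request containing one finished span."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "demo-agent"}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-example"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                    "attributes": [
                        {"key": key, "value": {"stringValue": value}}
                        for key, value in attributes.items()
                    ],
                }],
            }],
        }]
    }


def send(payload: dict, endpoint: str = OTLP_HTTP_ENDPOINT) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_otlp_payload("web_search", {"example.span.kind": "tool"})
# send(payload)  # uncomment once a collector is listening on port 4318
```

The `send` call is left commented out so the snippet runs without a server. With `petaltrace serve` running, enabling it should make the span appear as part of a new run, provided the assumptions above hold.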