Voyage AI

Voyage AI is a specialized provider for embeddings and reranking. It does not support chat or streaming — use it alongside a chat provider for RAG pipelines, semantic search, and information retrieval.

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/petal-labs/iris/core"
	"github.com/petal-labs/iris/providers/voyageai"
)

func main() {
	provider := voyageai.New(os.Getenv("VOYAGE_API_KEY"))

	resp, err := provider.Embeddings(context.Background(), &core.EmbeddingRequest{
		Model: "voyage-3-large",
		Input: []core.EmbeddingInput{
			{Text: "Iris is a Go SDK for LLM providers."},
			{Text: "Voyage AI specializes in embeddings."},
		},
	})
	if err != nil {
		panic(err)
	}

	fmt.Printf("Generated %d embeddings\n", len(resp.Embeddings))
	fmt.Printf("Dimensions: %d\n", len(resp.Embeddings[0].Values))
}
# Store in the encrypted keystore (recommended)
iris keys set voyageai
# Prompts for: Enter API key for voyageai: pa-...
import "github.com/petal-labs/iris/providers/voyageai"
// From an API key string
provider := voyageai.New("pa-...")
// From the VOYAGE_API_KEY environment variable
provider, err := voyageai.NewFromEnv()
if err != nil {
log.Fatal("VOYAGE_API_KEY not set:", err)
}
// From the Iris keystore
provider, err := voyageai.NewFromKeystore()
| Option | Description | Default |
| --- | --- | --- |
| WithBaseURL(url) | Override the API base URL | https://api.voyageai.com/v1 |
| WithHTTPClient(client) | Use a custom *http.Client | Default client |
| WithHeader(key, value) | Add a custom HTTP header | None |
| WithTimeout(duration) | Set the request timeout | 60 seconds |
provider := voyageai.New("pa-...",
voyageai.WithTimeout(120 * time.Second),
)
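
Options can be combined. A minimal sketch; the gateway URL, header name, and client settings below are illustrative placeholders, not values the SDK requires:

// Hypothetical setup routing requests through a proxy with a tagged header.
httpClient := &http.Client{Timeout: 90 * time.Second}
provider := voyageai.New("pa-...",
	voyageai.WithBaseURL("https://llm-gateway.example.com/v1"), // illustrative proxy URL
	voyageai.WithHTTPClient(httpClient),
	voyageai.WithHeader("X-Request-Source", "search-service"), // illustrative header
)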
| Feature | Supported | Notes |
| --- | --- | --- |
| Embeddings | Yes | General and specialized models |
| Contextualized embeddings | Yes | Document context for retrieval |
| Reranking | Yes | Result reordering for RAG |
| Chat | No | Not supported |
| Streaming | No | Not supported |
| Vision | No | Not supported |
| Model | Dimensions | Best For |
| --- | --- | --- |
| voyage-3-large | 1024 | Highest quality general embeddings |
| voyage-3 | 1024 | Balanced quality and speed |
| voyage-3-lite | 512 | Fast, cost-effective |
| voyage-code-3 | 1024 | Code and documentation |
| voyage-finance-2 | 1024 | Financial documents |
| voyage-law-2 | 1024 | Legal documents |
| voyage-multilingual-2 | 1024 | Cross-lingual retrieval |
| Model | Best For |
| --- | --- |
| rerank-2 | General reranking |
| rerank-2-lite | Fast reranking |
Generate embeddings for a batch of texts:

resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{
		{Text: "Go is a statically typed language."},
		{Text: "Python is dynamically typed."},
		{Text: "Rust has strict memory safety."},
	},
})
if err != nil {
	log.Fatal(err)
}

for i, emb := range resp.Embeddings {
	fmt.Printf("Embedding %d: %d dimensions\n", i, len(emb.Values))
}

Improve retrieval quality by providing document context:

resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{
		{
			Text:    "The function returns an error.",
			Context: "This is from the Go error handling documentation.",
		},
		{
			Text:    "Errors should be checked immediately.",
			Context: "This is from a Go best practices guide.",
		},
	},
})

Optimize embeddings for queries vs documents:

// For search queries
queryResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model:     "voyage-3-large",
	Input:     []core.EmbeddingInput{{Text: "How do I handle errors in Go?"}},
	InputType: core.InputTypeQuery,
})

// For documents being indexed
docResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{
		{Text: "Error handling in Go follows a simple pattern..."},
		{Text: "Always check returned errors immediately..."},
	},
	InputType: core.InputTypeDocument,
})

Reorder search results for better relevance:

results, err := provider.Rerank(ctx, &core.RerankRequest{
	Model: "rerank-2",
	Query: "How do I implement error handling in Go?",
	Documents: []string{
		"Go uses explicit error values instead of exceptions.",
		"Python uses try/except for error handling.",
		"The error interface in Go has a single Error() method.",
		"JavaScript uses Promise.catch for async errors.",
		"Always check if err != nil after function calls.",
	},
	TopK: 3, // Return top 3 results
})
if err != nil {
	log.Fatal(err)
}

fmt.Println("Top reranked results:")
for _, r := range results.Results {
	fmt.Printf("Score: %.4f - %s\n", r.Score, r.Document)
}
To include the original document text in each result, set ReturnDocuments:

results, err := provider.Rerank(ctx, &core.RerankRequest{
	Model:           "rerank-2",
	Query:           "Go error handling",
	Documents:       documents, // a previously collected []string
	TopK:            5,
	ReturnDocuments: true, // Include document text in results
})
if err != nil {
	log.Fatal(err)
}

for _, r := range results.Results {
	fmt.Printf("[%d] %.4f: %s\n", r.Index, r.Score, r.Document)
}

Combine Voyage AI embeddings with a chat provider:

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/petal-labs/iris/core"
	"github.com/petal-labs/iris/providers/openai"
	"github.com/petal-labs/iris/providers/voyageai"
)

func main() {
	ctx := context.Background()
	userQuery := "How do I handle errors in Go?"

	// Embedding provider
	embedProvider := voyageai.New(os.Getenv("VOYAGE_API_KEY"))

	// Chat provider
	chatProvider := openai.New(os.Getenv("OPENAI_API_KEY"))
	chatClient := core.NewClient(chatProvider)

	// 1. Embed the query
	queryEmb, err := embedProvider.Embeddings(ctx, &core.EmbeddingRequest{
		Model:     "voyage-3-large",
		Input:     []core.EmbeddingInput{{Text: userQuery}},
		InputType: core.InputTypeQuery,
	})
	if err != nil {
		log.Fatal(err)
	}

	// 2. Search a vector database (pseudo-code; vectorDB is your own store)
	results := vectorDB.Search(queryEmb.Embeddings[0].Values, 10)

	// 3. Rerank the candidates
	reranked, err := embedProvider.Rerank(ctx, &core.RerankRequest{
		Model:     "rerank-2",
		Query:     userQuery,
		Documents: extractTexts(results), // helper that extracts document text
		TopK:      3,
	})
	if err != nil {
		log.Fatal(err)
	}

	// 4. Build context from the top results
	docContext := buildContext(reranked.Results)

	// 5. Generate the answer with the chat model
	resp, err := chatClient.Chat("gpt-4o").
		System("Answer based on the provided context.").
		User(fmt.Sprintf("Context:\n%s\n\nQuestion: %s", docContext, userQuery)).
		GetResponse(ctx)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Output)
}

Process large document sets efficiently:

func embedDocuments(ctx context.Context, provider *voyageai.Provider, docs []string) ([][]float64, error) {
	const batchSize = 128 // Voyage AI max batch size

	var allEmbeddings [][]float64
	for i := 0; i < len(docs); i += batchSize {
		end := min(i+batchSize, len(docs))
		batch := docs[i:end]

		inputs := make([]core.EmbeddingInput, len(batch))
		for j, doc := range batch {
			inputs[j] = core.EmbeddingInput{Text: doc}
		}

		resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
			Model:     "voyage-3-large",
			Input:     inputs,
			InputType: core.InputTypeDocument,
		})
		if err != nil {
			return nil, err
		}
		for _, emb := range resp.Embeddings {
			allEmbeddings = append(allEmbeddings, emb.Values)
		}
	}
	return allEmbeddings, nil
}
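
A usage sketch for the helper above; loadDocuments is a hypothetical loader standing in for your own data source:

// loadDocuments (hypothetical) returns the raw []string corpus to embed.
docs := loadDocuments()
embeddings, err := embedDocuments(context.Background(), provider, docs)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("Embedded %d documents\n", len(embeddings))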

Compute similarity between embeddings:

func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// Usage
resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{
		{Text: "Go programming language"},
		{Text: "Golang tutorials"},
		{Text: "Python programming"},
	},
})
if err != nil {
	log.Fatal(err)
}

sim1 := cosineSimilarity(resp.Embeddings[0].Values, resp.Embeddings[1].Values)
sim2 := cosineSimilarity(resp.Embeddings[0].Values, resp.Embeddings[2].Values)
fmt.Printf("'Go' vs 'Golang': %.4f\n", sim1) // High similarity
fmt.Printf("'Go' vs 'Python': %.4f\n", sim2) // Lower similarity
Handle failures by inspecting the returned error value:

resp, err := provider.Embeddings(ctx, req)
if err != nil {
	var apiErr *core.APIError
	if errors.As(err, &apiErr) {
		switch apiErr.StatusCode {
		case 400:
			log.Printf("Bad request: %s", apiErr.Message)
		case 401:
			log.Fatal("Invalid API key")
		case 429:
			log.Printf("Rate limited. Retry after: %s", apiErr.RetryAfter)
		case 500, 503:
			log.Printf("Voyage AI service error: %s", apiErr.Message)
		}
	}

	// Check for unsupported operations
	if errors.Is(err, core.ErrNotSupported) {
		log.Fatal("Chat is not supported by Voyage AI")
	}
}

Use specialized models for better results:

// For code and documentation
codeResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-code-3",
	Input: []core.EmbeddingInput{
		{Text: "func main() { fmt.Println(\"Hello\") }"},
		{Text: "The main function is the entry point in Go."},
	},
})

// For legal documents
legalResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-law-2",
	Input: []core.EmbeddingInput{
		{Text: "The defendant shall be liable for damages..."},
		{Text: "Pursuant to Section 5(a) of the Act..."},
	},
})

// For financial documents
financeResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-finance-2",
	Input: []core.EmbeddingInput{
		{Text: "Q3 revenue increased by 15% YoY..."},
		{Text: "The company's P/E ratio stands at 25.3..."},
	},
})
| Use Case | Recommended Model |
| --- | --- |
| General retrieval | voyage-3 or voyage-3-large |
| Cost-sensitive | voyage-3-lite |
| Code search | voyage-code-3 |
| Legal docs | voyage-law-2 |
| Financial docs | voyage-finance-2 |
| Multilingual | voyage-multilingual-2 |
| Reranking | rerank-2 |
// Use contextualized embeddings for better retrieval
resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{
		{Text: text, Context: documentTitle + " - " + section},
	},
	InputType: core.InputTypeDocument,
})

// Always use query type for search queries
queryResp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
	Model: "voyage-3-large",
	Input: []core.EmbeddingInput{{Text: query}},
	InputType: core.InputTypeQuery,
})
// Voyage AI has rate limits - implement retry logic.
// (Snippet body of a helper that returns the response and an error.)
var lastErr error
for attempt := 0; attempt < 3; attempt++ {
	resp, err := provider.Embeddings(ctx, req)
	if err == nil {
		return resp, nil
	}
	var apiErr *core.APIError
	if errors.As(err, &apiErr) && apiErr.StatusCode == 429 {
		time.Sleep(time.Duration(attempt+1) * time.Second)
		lastErr = err
		continue
	}
	return nil, err
}
return nil, lastErr
  • Calling Chat() or StreamChat() returns core.ErrNotSupported
  • Use Voyage AI alongside a chat provider for RAG pipelines
  • Contextualized embeddings significantly improve retrieval quality
  • Domain-specific models outperform general models for specialized content
  • Maximum batch size is 128 inputs per request
  • The provider is safe for concurrent use after construction (see the sketch below)
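
Because a constructed provider is safe for concurrent use, independent batches can be embedded in parallel against a single instance. A minimal sketch, assuming the embedDocuments helper defined earlier and an illustrative docSets ([][]string):

// Embed several document sets concurrently against one shared provider.
var wg sync.WaitGroup
results := make([][][]float64, len(docSets))
errs := make([]error, len(docSets))
for i, docs := range docSets {
	wg.Add(1)
	go func(i int, docs []string) {
		defer wg.Done()
		results[i], errs[i] = embedDocuments(ctx, provider, docs)
	}(i, docs)
}
wg.Wait()
for _, err := range errs {
	if err != nil {
		log.Fatal(err)
	}
}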