
Gemini

The Gemini provider connects Iris to Google's Gemini model family. Gemini excels at multimodal tasks, long-context processing (up to 2M tokens on Gemini 1.5 Pro), and reasoning, and it offers competitive performance at attractive price points.

package main

import (
    "context"
    "fmt"
    "os"

    "github.com/petal-labs/iris/core"
    "github.com/petal-labs/iris/providers/gemini"
)

func main() {
    provider := gemini.New(os.Getenv("GEMINI_API_KEY"))
    client := core.NewClient(provider)

    resp, err := client.Chat("gemini-2.5-pro").
        System("You are a helpful assistant.").
        User("What are Go's strengths for backend services?").
        GetResponse(context.Background())
    if err != nil {
        panic(err)
    }

    fmt.Println(resp.Output)
}
Authentication requires a Gemini API key:
# Store in the encrypted keystore (recommended)
iris keys set gemini
# Prompts for: Enter API key for gemini: AI...
import "github.com/petal-labs/iris/providers/gemini"
// From an API key string
provider := gemini.New("AI...")
// From GEMINI_API_KEY or GOOGLE_API_KEY environment variable
provider, err := gemini.NewFromEnv()
if err != nil {
log.Fatal("GEMINI_API_KEY or GOOGLE_API_KEY not set:", err)
}
// From the Iris keystore (falls back to environment)
provider, err := gemini.NewFromKeystore()
Option | Description | Default
WithBaseURL(url) | Override the API base URL | https://generativelanguage.googleapis.com/v1beta
WithHTTPClient(client) | Use a custom *http.Client | Default client
WithHeader(key, value) | Add a custom HTTP header | None
WithTimeout(duration) | Set the request timeout | 60 seconds
provider := gemini.New("AI...",
    gemini.WithTimeout(90 * time.Second),
    gemini.WithHeader("X-Custom-Header", "value"),
)
Feature | Supported | Notes
Chat | ✅ | All Gemini models
Streaming | ✅ | Real-time token streaming
Tool calling | ✅ | Function calling with JSON Schema
Vision | ✅ | Image and video analysis
Reasoning | ✅ | Thinking mode with Gemini 2.0+
Image generation | ✅ | Imagen integration
Embeddings | ✅ | text-embedding-004
Code execution | ✅ | Built-in code interpreter
Grounding | ✅ | Google Search grounding
Gemini 2.5
Model | Context | Best For
gemini-2.5-pro | 1M | Complex reasoning, long documents
gemini-2.5-flash | 1M | Fast, cost-effective

Gemini 2.0
Model | Context | Best For
gemini-2.0-pro | 1M | Production workloads
gemini-2.0-flash | 1M | Speed and efficiency
gemini-2.0-flash-thinking | 32K | Explicit reasoning

Gemini 1.5
Model | Context | Best For
gemini-1.5-pro | 2M | Maximum context window
gemini-1.5-flash | 1M | Balanced performance
gemini-1.5-flash-8b | 1M | Most cost-effective

Embedding models
Model | Dimensions | Best For
text-embedding-004 | 768 | General purpose embeddings
embedding-001 | 768 | Legacy embeddings
Send a chat request with sampling parameters:

resp, err := client.Chat("gemini-2.5-pro").
    System("You are a helpful coding assistant.").
    User("Explain Go's interface composition pattern.").
    Temperature(0.3).
    MaxTokens(1000).
    GetResponse(ctx)
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.Output)
fmt.Printf("Tokens: %d input, %d output\n", resp.Usage.InputTokens, resp.Usage.OutputTokens)

Stream responses for real-time output:

stream, err := client.Chat("gemini-2.5-flash").
    System("You are a helpful assistant.").
    User("Explain Go's memory model.").
    GetStream(ctx)
if err != nil {
    log.Fatal(err)
}

for chunk := range stream.Ch {
    fmt.Print(chunk.Content)
}
fmt.Println()

// Check for streaming errors
if err := <-stream.Err; err != nil {
    log.Fatal(err)
}

// Get final response with usage stats
final := <-stream.Final
fmt.Printf("Total tokens: %d\n", final.Usage.TotalTokens)

Gemini has strong multimodal capabilities for analyzing images and videos:

// Image from URL
resp, err := client.Chat("gemini-2.5-pro").
    System("You are a helpful image analyst.").
    UserMultimodal().
    Text("What's in this image? Describe it in detail.").
    ImageURL("https://example.com/photo.jpg").
    Done().
    GetResponse(ctx)

// Image from a local file, base64-encoded
imageData, err := os.ReadFile("diagram.png")
if err != nil {
    log.Fatal(err)
}
base64Data := base64.StdEncoding.EncodeToString(imageData)

resp, err := client.Chat("gemini-2.5-pro").
    UserMultimodal().
    Text("Explain this architecture diagram.").
    ImageBase64(base64Data, "image/png").
    Done().
    GetResponse(ctx)

// Multiple images in one request
resp, err := client.Chat("gemini-2.5-pro").
    UserMultimodal().
    Text("Compare these UI mockups and suggest improvements.").
    ImageURL("https://example.com/mockup-v1.png").
    ImageURL("https://example.com/mockup-v2.png").
    Done().
    GetResponse(ctx)

Gemini can analyze video content:

// Video from URL (YouTube, Google Drive, or direct link)
resp, err := client.Chat("gemini-2.5-pro").
    UserMultimodal().
    Text("Summarize the key points in this video.").
    VideoURL("https://www.youtube.com/watch?v=...").
    Done().
    GetResponse(ctx)
Type | Formats | Max Size
Images | PNG, JPEG, GIF, WebP, HEIC, HEIF | 20 MB
Video | MP4, MOV, MPEG, AVI, MKV, WEBM | 2 GB
Audio | MP3, WAV, AIFF, AAC, OGG, FLAC | 40 MB
Documents | PDF, TXT, HTML, CSS, JS, PY, etc. | 50 MB
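
When loading media from disk, it can help to check the size limit and derive the MIME type before encoding. The helper below is a small convenience sketch, not part of the Iris API; it uses only the standard library (encoding/base64, mime, os, path/filepath) and hard-codes the 20 MB image limit from the table above.

// encodeImage reads a local image, enforces the 20 MB image limit,
// and returns base64 data plus a MIME type suitable for
// ImageBase64(data, mimeType).
func encodeImage(path string) (data, mimeType string, err error) {
    const maxImageBytes = 20 << 20 // 20 MB

    raw, err := os.ReadFile(path)
    if err != nil {
        return "", "", err
    }
    if len(raw) > maxImageBytes {
        return "", "", fmt.Errorf("%s is %d bytes, over the 20 MB image limit", path, len(raw))
    }

    mimeType = mime.TypeByExtension(filepath.Ext(path))
    if mimeType == "" {
        mimeType = "application/octet-stream"
    }
    return base64.StdEncoding.EncodeToString(raw), mimeType, nil
}

// Usage: pass the results to ImageBase64(data, mimeType) as shown above.
data, mimeType, err := encodeImage("diagram.png")
if err != nil {
    log.Fatal(err)
}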

Gemini 2.0+ models support thinking mode for explicit reasoning:

resp, err := client.Chat("gemini-2.0-flash-thinking").
System("You are a world-class problem solver.").
User("Solve this logic puzzle step by step...").
GetResponse(ctx)
// Access the reasoning process
if resp.Thinking != "" {
fmt.Println("=== Reasoning Process ===")
fmt.Println(resp.Thinking)
fmt.Println()
}
fmt.Println("=== Final Answer ===")
fmt.Println(resp.Output)
resp, err := client.Chat("gemini-2.5-pro").
User("Analyze this complex algorithm...").
Thinking(true).
ThinkingBudget(5000). // Max tokens for reasoning
GetResponse(ctx)

Define and use tools with Gemini:

// Define a tool
searchTool := core.Tool{
    Name:        "search_products",
    Description: "Search the product catalog for matching items",
    Parameters: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "query": map[string]interface{}{
                "type":        "string",
                "description": "Search query for products",
            },
            "category": map[string]interface{}{
                "type":        "string",
                "enum":        []string{"electronics", "clothing", "home"},
                "description": "Product category filter",
            },
            "max_price": map[string]interface{}{
                "type":        "number",
                "description": "Maximum price in USD",
            },
        },
        "required": []string{"query"},
    },
}

// First request - Gemini decides to call the tool
resp, err := client.Chat("gemini-2.5-pro").
    System("You are a helpful shopping assistant.").
    User("Find me a wireless keyboard under $100").
    Tools(searchTool).
    GetResponse(ctx)

if len(resp.ToolCalls) > 0 {
    call := resp.ToolCalls[0]

    // Execute the search (your implementation; see the sketch below)
    results := searchProducts(call.Arguments)

    // Continue with the tool result
    finalResp, err := client.Chat("gemini-2.5-pro").
        System("You are a helpful shopping assistant.").
        User("Find me a wireless keyboard under $100").
        Tools(searchTool).
        Assistant(resp.Output).
        ToolCall(call.ID, call.Name, call.Arguments).
        ToolResult(call.ID, results).
        GetResponse(ctx)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(finalResp.Output)
}
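
The searchProducts helper above is yours to provide. A minimal sketch, assuming call.Arguments arrives as a JSON-encoded string and that a JSON string is an acceptable value for ToolResult; adapt the types to how Iris actually exposes them:

// searchProducts is a stand-in for your catalog lookup. The argument and
// return types are assumptions for illustration only.
func searchProducts(arguments string) string {
    var args struct {
        Query    string  `json:"query"`
        Category string  `json:"category"`
        MaxPrice float64 `json:"max_price"`
    }
    if err := json.Unmarshal([]byte(arguments), &args); err != nil {
        return `{"error": "invalid arguments"}`
    }

    // Query your real product database here; this returns a canned result.
    results := []map[string]interface{}{
        {"name": "Example Wireless Keyboard", "price": 79.99, "category": args.Category},
    }
    out, _ := json.Marshal(results)
    return string(out)
}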

Gemini can call multiple tools simultaneously:

resp, err := client.Chat("gemini-2.5-pro").
    User("Check the weather in Tokyo and find flights from NYC").
    Tools(weatherTool, flightTool).
    ToolChoice(core.ToolChoiceAuto).
    GetResponse(ctx)

// Process all tool calls
for _, call := range resp.ToolCalls {
    switch call.Name {
    case "get_weather":
        // Handle weather
    case "search_flights":
        // Handle flights
    }
}
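
When Gemini returns several tool calls, you can execute them concurrently before sending the results back. A minimal sketch using sync.WaitGroup; handleWeather and handleFlights are hypothetical stand-ins for your own handlers, and the string result type is an assumption:

var wg sync.WaitGroup
results := make([]string, len(resp.ToolCalls))

for i := range resp.ToolCalls {
    wg.Add(1)
    go func(i int) {
        defer wg.Done()
        call := resp.ToolCalls[i]
        switch call.Name {
        case "get_weather":
            results[i] = handleWeather(call.Arguments) // hypothetical helper
        case "search_flights":
            results[i] = handleFlights(call.Arguments) // hypothetical helper
        }
    }(i)
}
wg.Wait()

// Send each result back with ToolCall/ToolResult on the follow-up request,
// as shown in the single-tool example above.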

Use Gemini’s built-in code interpreter:

resp, err := client.Chat("gemini-2.5-pro").
    User("Calculate the compound interest on $10,000 at 5% for 10 years").
    CodeExecution(true).
    GetResponse(ctx)

// Access executed code and output
if resp.CodeExecution != nil {
    fmt.Println("Code:", resp.CodeExecution.Code)
    fmt.Println("Output:", resp.CodeExecution.Output)
}
fmt.Println("Answer:", resp.Output)

Ground responses in real-time search results:

resp, err := client.Chat("gemini-2.5-pro").
    User("What are the latest Go releases and their features?").
    Grounding(core.GroundingGoogleSearch).
    GetResponse(ctx)

// Access grounding sources
for _, source := range resp.GroundingSources {
    fmt.Printf("Source: %s - %s\n", source.Title, source.URL)
}
fmt.Println("Answer:", resp.Output)

Generate images using Imagen:

resp, err := provider.GenerateImage(ctx, &core.ImageGenerateRequest{
    Model:  "imagen-3.0-generate-001",
    Prompt: "A futuristic city skyline at sunset with flying cars",
    Size:   core.ImageSize1024x1024,
    N:      1,
})
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Image URL: %s\n", resp.Images[0].URL)
Size | Aspect Ratio
1024x1024 | 1:1 (square)
1536x1024 | 3:2 (landscape)
1024x1536 | 2:3 (portrait)
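
The GenerateImage response above returns a URL for each image. A minimal sketch for saving one to disk, assuming the URL is directly fetchable over HTTP (if your deployment returns base64 image data instead, decode that and skip the download); it uses only net/http, io, and os:

// saveImage downloads a generated image URL and writes it to path.
func saveImage(ctx context.Context, url, path string) error {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return err
    }
    httpResp, err := http.DefaultClient.Do(req)
    if err != nil {
        return err
    }
    defer httpResp.Body.Close()
    if httpResp.StatusCode != http.StatusOK {
        return fmt.Errorf("unexpected status %s", httpResp.Status)
    }

    data, err := io.ReadAll(httpResp.Body)
    if err != nil {
        return err
    }
    return os.WriteFile(path, data, 0o644)
}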

Generate embeddings for semantic search:

resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
    Model: "text-embedding-004",
    Input: []core.EmbeddingInput{
        {Text: "Go is a statically typed language."},
        {Text: "Python is dynamically typed."},
    },
})
if err != nil {
    log.Fatal(err)
}
for i, emb := range resp.Embeddings {
    fmt.Printf("Embedding %d: %d dimensions\n", i, len(emb.Values))
}
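
For semantic search, compare embedding vectors with cosine similarity. A small helper using the standard math package, assuming emb.Values is a []float64 as read above (adjust the element type if the SDK uses float32):

// cosineSimilarity returns a value in [-1, 1]; higher means more similar.
func cosineSimilarity(a, b []float64) float64 {
    var dot, normA, normB float64
    for i := range a {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    if normA == 0 || normB == 0 {
        return 0
    }
    return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// Example: similarity between the two embeddings from the request above.
score := cosineSimilarity(resp.Embeddings[0].Values, resp.Embeddings[1].Values)
fmt.Printf("Similarity: %.4f\n", score)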

Optimize embeddings for specific tasks:

resp, err := provider.Embeddings(ctx, &core.EmbeddingRequest{
    Model:    "text-embedding-004",
    Input:    inputs,
    TaskType: core.TaskTypeRetrieval, // Optimized for RAG
})
// Task types: RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY,
// CLASSIFICATION, CLUSTERING

Leverage Gemini’s massive context window:

// Process an entire codebase (see the loader sketch below)
codebase := loadEntireCodebase() // Could be 500K+ tokens
resp, err := client.Chat("gemini-1.5-pro"). // 2M context
    System("You are a senior software architect.").
    User(fmt.Sprintf(`Analyze this entire codebase and provide:
1. Architecture overview
2. Design patterns used
3. Potential improvements
4. Security concerns
Codebase:
%s`, codebase)).
    GetResponse(ctx)

// Analyze a long document
document := loadLegalContract() // 100K+ tokens
resp, err := client.Chat("gemini-2.5-pro").
    System("You are a legal analyst.").
    User(fmt.Sprintf(`Analyze this contract and identify:
1. Key terms and conditions
2. Potential risks
3. Unusual clauses
4. Missing protections
Contract:
%s`, document)).
    GetResponse(ctx)
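
loadEntireCodebase and loadLegalContract above are placeholders for your own loaders. A rough sketch of the codebase loader, walking a directory with the standard library (path/filepath, io/fs, os, strings) and concatenating Go source files; the root path and the .go filter are assumptions to adapt:

func loadEntireCodebase() string {
    var sb strings.Builder
    root := "." // hypothetical: point this at your repository root

    _ = filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
        if err != nil || d.IsDir() || filepath.Ext(path) != ".go" {
            return err
        }
        src, readErr := os.ReadFile(path)
        if readErr != nil {
            return readErr
        }
        // Label each file so the model can reference it by path.
        fmt.Fprintf(&sb, "// File: %s\n%s\n\n", path, src)
        return nil
    })
    return sb.String()
}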

Force JSON output with schema:

type ProductAnalysis struct {
    Name     string   `json:"name"`
    Category string   `json:"category"`
    Features []string `json:"features"`
    Pros     []string `json:"pros"`
    Cons     []string `json:"cons"`
    Rating   float64  `json:"rating"`
}

resp, err := client.Chat("gemini-2.5-pro").
    System("Analyze products and respond in JSON format.").
    User("Analyze the iPhone 15 Pro").
    ResponseFormat(core.ResponseFormatJSON).
    GetResponse(ctx)
if err != nil {
    log.Fatal(err)
}

var analysis ProductAnalysis
if err := json.Unmarshal([]byte(resp.Output), &analysis); err != nil {
    log.Fatal(err)
}
Carry context across turns by replaying the conversation history:

// First turn
resp1, _ := client.Chat("gemini-2.5-pro").
    System("You are a helpful Go programming tutor.").
    User("What are channels in Go?").
    GetResponse(ctx)

// Second turn with history
resp2, _ := client.Chat("gemini-2.5-pro").
    System("You are a helpful Go programming tutor.").
    User("What are channels in Go?").
    Assistant(resp1.Output).
    User("How do buffered channels differ?").
    GetResponse(ctx)

Configure content safety filters:

resp, err := client.Chat("gemini-2.5-pro").
    User(prompt).
    SafetySettings([]core.SafetySetting{
        {Category: core.HarmCategoryHarassment, Threshold: core.BlockMediumAndAbove},
        {Category: core.HarmCategoryDangerousContent, Threshold: core.BlockOnlyHigh},
    }).
    GetResponse(ctx)
resp, err := client.Chat("gemini-2.5-pro").User(prompt).GetResponse(ctx)
if err != nil {
var apiErr *core.APIError
if errors.As(err, &apiErr) {
switch apiErr.StatusCode {
case 400:
log.Printf("Bad request: %s", apiErr.Message)
case 401:
log.Fatal("Invalid API key")
case 403:
log.Printf("API key doesn't have permission: %s", apiErr.Message)
case 429:
log.Printf("Rate limited. Retry after: %s", apiErr.RetryAfter)
case 500, 503:
log.Printf("Gemini service error: %s", apiErr.Message)
default:
log.Printf("API error %d: %s", apiErr.StatusCode, apiErr.Message)
}
return
}
if errors.Is(err, context.DeadlineExceeded) {
log.Println("Request timed out")
} else if errors.Is(err, context.Canceled) {
log.Println("Request canceled")
}
}
Task | Recommended Model
General chat | gemini-2.5-flash
Complex reasoning | gemini-2.5-pro
Long documents | gemini-1.5-pro (2M context)
Fast responses | gemini-2.5-flash
Code analysis | gemini-2.5-pro
Explicit reasoning | gemini-2.0-flash-thinking
// Use smaller context models for simple tasks
resp, err := client.Chat("gemini-2.5-flash"). // Cheaper for short contexts
    User("What is 2+2?").
    GetResponse(ctx)

// Use large context models when needed
resp, err := client.Chat("gemini-1.5-pro"). // 2M context
    User(veryLongDocument).
    GetResponse(ctx)
Configure automatic retries for transient failures:

client := core.NewClient(provider,
    core.WithRetryPolicy(&core.RetryPolicy{
        MaxRetries:        3,
        InitialInterval:   1 * time.Second,
        MaxInterval:       30 * time.Second,
        BackoffMultiplier: 2.0,
        RetryOn:           []int{429, 500, 503},
    }),
)
// For questions about recent events
resp, err := client.Chat("gemini-2.5-pro").
    User("What happened in tech news this week?").
    Grounding(core.GroundingGoogleSearch).
    GetResponse(ctx)
  • NewFromEnv() checks GEMINI_API_KEY first, then falls back to GOOGLE_API_KEY
  • Authentication uses the x-goog-api-key header
  • Gemini 1.5 Pro supports up to 2 million tokens of context
  • Video analysis processes frames and counts against the context window
  • The provider is safe for concurrent use after construction

Tools Guide

Learn advanced tool calling patterns. Tools →

Images Guide

Work with vision and image generation. Images →

Providers Overview

Compare all available providers. Providers →