
Z.ai

The Z.ai provider connects Iris to the GLM (General Language Model) family. GLM models offer strong multilingual capabilities, reasoning, vision, and image generation at competitive prices.

package main

import (
    "context"
    "fmt"
    "os"

    "github.com/petal-labs/iris/core"
    "github.com/petal-labs/iris/providers/zai"
)

func main() {
    provider := zai.New(os.Getenv("ZAI_API_KEY"))
    client := core.NewClient(provider)

    resp, err := client.Chat("glm-4").
        User("Compare REST and gRPC for microservices.").
        GetResponse(context.Background())
    if err != nil {
        panic(err)
    }

    fmt.Println(resp.Output)
}
# Store in the encrypted keystore (recommended)
iris keys set zai
import "github.com/petal-labs/iris/providers/zai"
// From an API key string
provider := zai.New("...")
// From the ZAI_API_KEY environment variable
provider, err := zai.NewFromEnv()
if err != nil {
log.Fatal("ZAI_API_KEY not set:", err)
}
// From the Iris keystore
provider, err := zai.NewFromKeystore()
| Option | Description | Default |
| --- | --- | --- |
| WithBaseURL(url) | Override the API base URL | https://open.bigmodel.cn/api/paas/v4 |
| WithHTTPClient(client) | Use a custom *http.Client | Default client |
| WithHeader(key, value) | Add a custom HTTP header | None |
| WithTimeout(duration) | Set the request timeout | 60 seconds |
provider := zai.New("...",
zai.WithTimeout(90 * time.Second),
)
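The options compose, so several can be passed in one call. A sketch combining the documented options (the header name and values here are illustrative, not something Z.ai requires):

provider := zai.New(os.Getenv("ZAI_API_KEY"),
    zai.WithBaseURL("https://open.bigmodel.cn/api/paas/v4"), // default endpoint, shown explicitly
    zai.WithHeader("X-Request-Source", "my-service"),        // illustrative custom header
    zai.WithTimeout(90 * time.Second),
)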
| Feature | Supported | Notes |
| --- | --- | --- |
| Chat | ✅ | All GLM models |
| Streaming | ✅ | Real-time token streaming |
| Tool calling | ✅ | Function calling |
| Vision | ✅ | Image analysis with GLM-4V |
| Reasoning | ✅ | Deep thinking mode |
| Image generation | ✅ | CogView models |
| Embeddings | ❌ | Not supported |
| Model | Context | Best For |
| --- | --- | --- |
| glm-4 | 128K | Complex reasoning, general tasks |
| glm-4-plus | 128K | Enhanced capabilities |
| glm-4-air | 128K | Fast, cost-effective |
| glm-4-airx | 8K | Ultra-fast responses |
| glm-4-flash | 128K | Balanced speed and quality |
| glm-4-long | 1M | Long document processing |
| Model | Context | Best For |
| --- | --- | --- |
| glm-4v | 8K | Image understanding |
| glm-4v-plus | 8K | Enhanced vision |
| Model | Sizes | Best For |
| --- | --- | --- |
| cogview-3 | 1024x1024 | Image generation |
| cogview-3-plus | Multiple | Enhanced generation |
resp, err := client.Chat("glm-4").
System("You are a helpful technical assistant.").
User("Explain microservices architecture.").
Temperature(0.7).
MaxTokens(1000).
GetResponse(ctx)
if err != nil {
log.Fatal(err)
}
fmt.Println(resp.Output)
stream, err := client.Chat("glm-4").
System("You are a helpful assistant.").
User("Write a detailed analysis of cloud computing trends.").
GetStream(ctx)
if err != nil {
log.Fatal(err)
}
for chunk := range stream.Ch {
fmt.Print(chunk.Content)
}
fmt.Println()
if err := <-stream.Err; err != nil {
log.Fatal(err)
}

Analyze images with GLM-4V:

imageData, err := os.ReadFile("diagram.png")
if err != nil {
    log.Fatal(err)
}
base64Data := base64.StdEncoding.EncodeToString(imageData)

resp, err := client.Chat("glm-4v").
    UserMultimodal().
    Text("Explain this architecture diagram.").
    ImageBase64(base64Data, "image/png").
    Done().
    GetResponse(ctx)
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.Output)

Generate images with CogView:

resp, err := provider.GenerateImage(ctx, &core.ImageGenerateRequest{
    Model:  "cogview-3",
    Prompt: "A futuristic cityscape with neon lights and flying vehicles",
    Size:   core.ImageSize1024x1024,
    N:      1,
})
if err != nil {
    log.Fatal(err)
}

fmt.Printf("Image URL: %s\n", resp.Images[0].URL)
resp, err := client.Chat("glm-4").
User("Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 1?").
Thinking(true).
ThinkingBudget(3000).
GetResponse(ctx)
if resp.Thinking != "" {
fmt.Println("Reasoning:", resp.Thinking)
}
fmt.Println("Answer:", resp.Output)
searchTool := core.Tool{
    Name:        "search_database",
    Description: "Search the product database",
    Parameters: map[string]interface{}{
        "type": "object",
        "properties": map[string]interface{}{
            "query": map[string]interface{}{
                "type":        "string",
                "description": "Search query",
            },
        },
        "required": []string{"query"},
    },
}

resp, err := client.Chat("glm-4").
    User("Find products related to wireless keyboards.").
    Tools(searchTool).
    GetResponse(ctx)
if err != nil {
    log.Fatal(err)
}

if len(resp.ToolCalls) > 0 {
    // Handle tool calls
}
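The loop below is a minimal sketch of acting on those calls. The ToolCall field names (Name, Arguments) are assumptions about Iris's core types, not confirmed API; check the core package before relying on them.

for _, call := range resp.ToolCalls {
    // Field names below are assumed for illustration only.
    if call.Name != "search_database" {
        continue
    }
    var args struct {
        Query string `json:"query"`
    }
    if err := json.Unmarshal([]byte(call.Arguments), &args); err != nil {
        log.Fatal(err)
    }
    fmt.Println("Model requested a search for:", args.Query)
}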

Use GLM-4-Long for documents up to 1M tokens:

longDocument := loadLongDocument() // Could be 500K+ tokens

resp, err := client.Chat("glm-4-long").
    System("You are a document analyst.").
    User(fmt.Sprintf("Summarize this document:\n\n%s", longDocument)).
    GetResponse(ctx)

GLM excels at multilingual tasks:

// Chinese ("Explain the basic principles of quantum computing in Chinese")
resp, err := client.Chat("glm-4").
    User("用中文解释量子计算的基本原理").
    GetResponse(ctx)

// English
resp, err = client.Chat("glm-4").
    User("Explain quantum computing basics in English").
    GetResponse(ctx)

// Translation
resp, err = client.Chat("glm-4").
    System("You are a professional translator.").
    User("Translate this to Japanese: Hello, how are you?").
    GetResponse(ctx)
resp, err := client.Chat("glm-4").User(prompt).GetResponse(ctx)
if err != nil {
var apiErr *core.APIError
if errors.As(err, &apiErr) {
switch apiErr.StatusCode {
case 401:
log.Fatal("Invalid API key")
case 429:
log.Printf("Rate limited. Retry after: %s", apiErr.RetryAfter)
case 500, 503:
log.Printf("Z.ai service error: %s", apiErr.Message)
}
}
}
| Task | Recommended Model |
| --- | --- |
| General chat | glm-4-flash |
| Complex reasoning | glm-4-plus |
| Fast responses | glm-4-airx |
| Long documents | glm-4-long |
| Image analysis | glm-4v-plus |
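One way to keep these recommendations close to the call site is a small lookup table. The task keys below are arbitrary labels for illustration, not part of the Iris API:

// Map task categories to the recommended GLM model.
var modelForTask = map[string]string{
    "chat":      "glm-4-flash",
    "reasoning": "glm-4-plus",
    "fast":      "glm-4-airx",
    "long-doc":  "glm-4-long",
    "vision":    "glm-4v-plus",
}

resp, err := client.Chat(modelForTask["reasoning"]).
    User("Walk through the trade-offs of event sourcing.").
    GetResponse(ctx)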
client := core.NewClient(provider,
    core.WithRetryPolicy(&core.RetryPolicy{
        MaxRetries:        3,
        InitialInterval:   1 * time.Second,
        MaxInterval:       30 * time.Second,
        BackoffMultiplier: 2.0,
        RetryOn:           []int{429, 500, 503},
    }),
)
  • Requests include an Accept-Language: en-US,en header by default (see the sketch after this list for adjusting headers)
  • Uses Authorization: Bearer for authentication
  • GLM-4-Long supports up to 1 million tokens of context
  • Strong multilingual support, especially for Chinese and English
  • The provider is safe for concurrent use after construction
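If you need a different Accept-Language value, WithHeader from the configuration options is the documented hook. Whether it replaces the default header or is sent alongside it is not confirmed here, so treat this as a sketch:

provider := zai.New(os.Getenv("ZAI_API_KEY"),
    zai.WithHeader("Accept-Language", "zh-CN,zh"), // value shown for illustration
)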

Tools Guide

Learn advanced tool calling. Tools →

Images Guide

Work with vision and generation. Images →

Providers Overview

Compare all available providers. Providers →