LLM Provider Module

The llm_provider module provides a unified interface for integrating with multiple AI providers through the genai crate. It enables real-time streaming responses, automatic model discovery, and consistent API behavior across different LLM services.

Core Philosophy

The module is designed around these principles:

  1. Modern GenAI Integration: Built on the genai crate with support for recent models such as o1-mini, Gemini 2.0, and Claude 3.5

  2. Real-time Streaming: Advanced streaming with proper event handling and reasoning chunk support

  3. Zero-Configuration: Automatic environment variable detection with manual override options

  4. Developer-Friendly: Comprehensive logging, error handling, and debugging capabilities

  5. Production-Ready: Thread-safe, async-first design with proper resource management

Supported Providers

The module supports multiple LLM providers through the genai crate (v0.3.5):

  • OpenAI: GPT-4, GPT-3.5, GPT-4o, o1-mini, o1-preview, o3-mini, o4-mini models

  • Anthropic: Claude 3 (Opus, Sonnet, Haiku), Claude 3.5 models

  • Google: Gemini Pro, Gemini 1.5 Pro/Flash, Gemini 2.0 models

  • Groq: Llama 3.x models with ultra-fast inference

  • Cohere: Command R/R+ models

  • XAI: Grok models (grok-3-beta, grok-3-fast-beta, etc.)

  • DeepSeek: DeepSeek chat and reasoning models (deepseek-chat, deepseek-reasoner)

  • Ollama: Local model hosting (requires local setup)

Architecture

The provider uses the genai crate’s Client as the underlying interface, which handles:

  • Authentication via environment variables

  • Provider-specific API endpoints and protocols

  • Request/response serialization

  • Rate limiting and retry logic
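
For illustration, here is a minimal sketch of the unified-client idea: one genai Client serves every provider, with the adapter resolved from the model name (the model identifiers shown are illustrative):

use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

// One client; genai picks the adapter (and API key) per model name.
let client = Client::default();
let req = ChatRequest::default().append_message(ChatMessage::user("Hello"));

// The same request shape works across providers.
for model in ["gpt-4o-mini", "claude-3-5-sonnet-20241022", "gemini-1.5-flash"] {
    let res = client.exec_chat(model, req.clone(), None).await?;
    println!("{}: {}", model, res.content_text_as_str().unwrap_or(""));
}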

Core Types

GenAIProvider

pub struct GenAIProvider {
    client: Client,
}

Main LLM provider implementation using the genai crate for unified access to multiple AI providers.

Design Philosophy:

The provider is designed around the principle of “configure once, use everywhere”. It automatically handles provider-specific authentication requirements, API endpoints, and response formats while presenting a consistent interface to the application.

Configuration Methods:

  1. Auto-configuration: Uses environment variables (recommended)

  2. Explicit configuration: API keys and provider types via constructor

  3. Runtime configuration: Dynamic provider switching (future enhancement)

Thread Safety: The provider is thread-safe and can be shared across async tasks using Arc<GenAIProvider>. The underlying genai client handles concurrent requests efficiently.
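
As a minimal sketch of this pattern (the prompts and model name are illustrative):

use std::sync::Arc;

// Share one provider across concurrent tasks.
let provider = Arc::new(GenAIProvider::new()?);

let mut handles = Vec::new();
for prompt in ["First question", "Second question"] {
    let provider = Arc::clone(&provider);
    handles.push(tokio::spawn(async move {
        provider.generate_response_simple("gpt-4o-mini", prompt).await
    }));
}

for handle in handles {
    println!("AI: {}", handle.await??);
}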

Methods:

new()

pub fn new() -> Result<Self>

Creates a new GenAI provider with automatic configuration.

This constructor creates a provider instance using the genai client’s default configuration, which automatically detects and uses environment variables for authentication. This is the recommended approach for production use.

Environment Variables:

The client will automatically detect and use these environment variables:

  • OPENAI_API_KEY: For OpenAI models

  • ANTHROPIC_API_KEY: For Anthropic Claude models

  • GEMINI_API_KEY: For Google Gemini models

  • GROQ_API_KEY: For Groq models

  • COHERE_API_KEY: For Cohere models

  • XAI_API_KEY: For XAI Grok models

  • DEEPSEEK_API_KEY: For DeepSeek models

  • OLLAMA_API_BASE: For Ollama local models (optional, defaults to http://localhost:11434)

Returns:

  • Result<Self> - A configured provider instance or configuration error

Errors:

This method can fail if:

  • The genai client cannot be initialized

  • Required system dependencies are missing

  • Network configuration prevents client creation

Example:

// Set environment variable first
std::env::set_var("OPENAI_API_KEY", "sk-your-key");

// Create provider with auto-configuration
let provider = GenAIProvider::new()?;

new_with_config()

pub fn new_with_config(provider_type: Option<&str>, api_key: Option<&str>) -> Result<Self>

Creates a new GenAI provider with explicit configuration.

This constructor allows explicit specification of provider type and API key, which is useful for CLI applications, testing, or when configuration needs to be provided at runtime rather than through environment variables.

Arguments:

  • provider_type - Optional provider identifier (e.g., “openai”, “anthropic”)

  • api_key - Optional API key for authentication

Provider Type Mapping:

  • "openai" → Sets OPENAI_API_KEY

  • "anthropic" → Sets ANTHROPIC_API_KEY

  • "google" or "gemini" → Sets GEMINI_API_KEY

  • "groq" → Sets GROQ_API_KEY

  • "cohere" → Sets COHERE_API_KEY

  • "xai" → Sets XAI_API_KEY

  • "deepseek" → Sets DEEPSEEK_API_KEY

  • "ollama" → Sets OLLAMA_API_BASE (optional)

Example:

// Create provider with explicit configuration
let provider = GenAIProvider::new_with_config(
    Some("openai"),
    Some("sk-your-api-key")
)?;

get_available_models()

pub async fn get_available_models(&self, provider: &str) -> Result<Vec<String>>

Retrieves all available models for a specific provider.

This method queries the specified provider’s API to get a list of all available models that can be used for chat completion. The list includes both current and legacy models, allowing users to choose the most appropriate model for their needs.

Arguments:

  • provider - The provider identifier (e.g., “openai”, “anthropic”, “google”)

Provider Support:

Model listing is supported for:

  • OpenAI: GPT-4, GPT-3.5, GPT-4o, o1 series, o3-mini, o4-mini models

  • Anthropic: Claude 3/3.5 series (Opus, Sonnet, Haiku)

  • Google: Gemini Pro, Gemini 1.5/2.0 series

  • Groq: Llama 3.x series with various sizes

  • Cohere: Command R/R+ models

  • XAI: Grok models (grok-3-beta, grok-3-fast-beta, etc.)

  • DeepSeek: DeepSeek chat and reasoning models

  • Ollama: Requires local setup and running instance

Returns:

  • Result<Vec<String>> - List of model identifiers or error

Errors:

This method can fail if:

  • The provider name is not recognized by genai

  • Network connectivity issues prevent API access

  • Authentication credentials are invalid or missing

  • The provider’s API is temporarily unavailable

  • Rate limits are exceeded

Example:

let provider = GenAIProvider::new()?;

// Get OpenAI models
let openai_models = provider.get_available_models("openai").await?;
for model in openai_models {
    println!("Available: {}", model);
}

// Get Anthropic models
let claude_models = provider.get_available_models("anthropic").await?;

generate_response_simple()

pub async fn generate_response_simple(&self, model: &str, prompt: &str) -> Result<String>

Generates a simple text response without streaming.

This method provides a straightforward way to get a complete response from an LLM without the complexity of streaming. It’s ideal for simple Q&A scenarios, testing, or when the entire response is needed before processing.

Arguments:

  • model - The model identifier (e.g., “gpt-4o-mini”, “claude-3-5-sonnet-20241022”)

  • prompt - The user’s message or prompt text

Model Compatibility:

Supports all models available through the genai crate:

  • OpenAI: gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1-mini, o1-preview

  • Anthropic: claude-3-5-sonnet-20241022, claude-3-opus-20240229, etc.

  • Google: gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash

  • Groq: llama-3.1-70b-versatile, mixtral-8x7b-32768, etc.

Returns:

  • Result<String> - The complete response text or error

Example:

let provider = GenAIProvider::new_with_config(
    Some("openai"),
    Some("sk-your-key")
)?;

let response = provider.generate_response_simple(
    "gpt-4o-mini",
    "What is the capital of France?"
).await?;

println!("AI: {}", response);

generate_response_stream_to_channel()

pub async fn generate_response_stream_to_channel(
    &self,
    model: &str,
    prompt: &str,
    tx: mpsc::UnboundedSender<String>
) -> Result<()>

Generates a streaming response and sends chunks via mpsc channel.

This is the core streaming method that provides real-time response generation, essential for creating responsive chat interfaces. It properly handles the genai crate’s streaming events and manages the async communication with the UI layer.

Streaming Architecture:

The method uses an async stream from the genai crate and processes different types of events:

  • Start: Indicates the beginning of response generation

  • Chunk: Contains incremental text content (main response text)

  • ReasoningChunk: Contains reasoning steps (for models like o1)

  • End: Indicates completion of response generation

Arguments:

  • model - The model identifier to use for generation

  • prompt - The user’s input prompt or message

  • tx - Unbounded mpsc sender for streaming response chunks to the UI

Channel Communication:

The method sends content chunks through the provided channel as they arrive. The receiving end (typically the UI) should listen for messages and handle:

  • Regular text chunks for incremental display

  • End-of-transmission signal (EOT_SIGNAL) indicating completion

  • Error messages prefixed with “Error: ” for failure cases

Event Processing:

  1. ChatStreamEvent::Start - Logs stream initiation, no content sent

  2. ChatStreamEvent::Chunk - Sends content immediately to channel

  3. ChatStreamEvent::ReasoningChunk - Logs reasoning (future: may send to channel)

  4. ChatStreamEvent::End - Logs completion, caller should send EOT signal

Error Handling:

Stream errors are handled gracefully:

  • Errors are logged with full context

  • Error messages are sent through the channel

  • The method returns the error for caller handling

  • Channel send failures are logged but don’t halt processing

Returns:

  • Result<()> - Success (content sent via channel) or error details

Example:

use tokio::sync::mpsc;
use perspt::EOT_SIGNAL;

let provider = GenAIProvider::new()?;
let (tx, mut rx) = mpsc::unbounded_channel();

// Start streaming in background task
let provider_clone = provider.clone();
tokio::spawn(async move {
    match provider_clone.generate_response_stream_to_channel(
        "gpt-4o-mini",
        "Tell me about Rust programming",
        tx.clone()
    ).await {
        Ok(()) => {
            let _ = tx.send(EOT_SIGNAL.to_string());
        }
        Err(e) => {
            let _ = tx.send(format!("Error: {}", e));
            let _ = tx.send(EOT_SIGNAL.to_string());
        }
    }
});

// Receive and process chunks
while let Some(chunk) = rx.recv().await {
    if chunk == EOT_SIGNAL {
        break;
    } else if chunk.starts_with("Error: ") {
        eprintln!("Stream error: {}", chunk);
        break;
    } else {
        print!("{}", chunk); // Display incremental content
    }
}

generate_response_with_history()

pub async fn generate_response_with_history(&self, model: &str, messages: Vec<ChatMessage>) -> Result<String>

Generate response with conversation history.

Arguments:

  • model - The model identifier

  • messages - Vector of ChatMessage objects representing conversation history

Returns:

  • Result<String> - Complete response text or error
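
Example:

A minimal sketch, assuming ChatMessage is the genai chat message type used elsewhere in this module (the conversation content is illustrative):

use genai::chat::ChatMessage;

let provider = GenAIProvider::new()?;

// Build up a multi-turn conversation
let messages = vec![
    ChatMessage::system("You are a concise assistant."),
    ChatMessage::user("What is ownership in Rust?"),
    ChatMessage::assistant("Ownership is Rust's compile-time memory model."),
    ChatMessage::user("How does borrowing relate to it?"),
];

let response = provider.generate_response_with_history("gpt-4o-mini", messages).await?;
println!("AI: {}", response);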

get_supported_providers()

pub fn get_supported_providers() -> Vec<&'static str>

Get a list of supported providers.

Returns:

  • Vec<&'static str> - List of supported provider identifiers

Supported Providers:

[
    "openai",
    "anthropic",
    "gemini",
    "groq",
    "cohere",
    "ollama",
    "xai"
]

test_model()

pub async fn test_model(&self, model: &str) -> Result<bool>

Test if a model is available and working.

Arguments:

  • model - The model identifier to test

Returns:

  • Result<bool> - True if model is working, false otherwise
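
Example:

A short sketch of probing a model before committing to it:

let provider = GenAIProvider::new()?;

if provider.test_model("gpt-4o-mini").await? {
    println!("Model is ready to use");
} else {
    println!("Model unavailable; consider a fallback");
}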

validate_model()

pub async fn validate_model(&self, model: &str, provider_type: Option<&str>) -> Result<String>

Validate and get the best available model for a provider.

Arguments:

  • model - The model identifier to validate

  • provider_type - Optional provider type for fallback model selection

Returns:

  • Result<String> - Validated model identifier or fallback model
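
Example:

A sketch assuming the method falls back to a provider default when the requested model is unavailable, as described above:

let provider = GenAIProvider::new()?;

// Returns the requested model if valid, otherwise a fallback for "openai"
let model = provider.validate_model("gpt-4o-mini", Some("openai")).await?;
println!("Using model: {}", model);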

Utility Functions

str_to_adapter_kind()

fn str_to_adapter_kind(provider: &str) -> Result<AdapterKind>

Convert a provider string to genai AdapterKind.

Arguments:

  • provider - Provider string identifier

Returns:

  • Result<AdapterKind> - Corresponding AdapterKind enum variant

Provider Mapping:

  • "openai" → AdapterKind::OpenAI

  • "anthropic" → AdapterKind::Anthropic

  • "gemini", "google" → AdapterKind::Gemini

  • "groq" → AdapterKind::Groq

  • "cohere" → AdapterKind::Cohere

  • "ollama" → AdapterKind::Ollama

  • "xai" → AdapterKind::Xai
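
A sketch of how this mapping might be implemented (the actual source may differ):

use anyhow::{anyhow, Result};
use genai::adapter::AdapterKind;

fn str_to_adapter_kind(provider: &str) -> Result<AdapterKind> {
    match provider.to_lowercase().as_str() {
        "openai" => Ok(AdapterKind::OpenAI),
        "anthropic" => Ok(AdapterKind::Anthropic),
        "gemini" | "google" => Ok(AdapterKind::Gemini),
        "groq" => Ok(AdapterKind::Groq),
        "cohere" => Ok(AdapterKind::Cohere),
        "ollama" => Ok(AdapterKind::Ollama),
        "xai" => Ok(AdapterKind::Xai),
        other => Err(anyhow!("Unsupported provider: {}", other)),
    }
}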

Usage Examples

Basic Chat Interaction

use perspt::llm_provider::GenAIProvider;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize provider with environment variables
    let provider = GenAIProvider::new()?;

    // Simple question-answer
    let response = provider.generate_response_simple(
        "gpt-4o-mini",
        "Explain async programming in Rust"
    ).await?;

    println!("AI: {}", response);
    Ok(())
}

Streaming Chat Interface

use perspt::llm_provider::GenAIProvider;
use tokio::sync::mpsc;
use perspt::EOT_SIGNAL;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let provider = GenAIProvider::new()?;
    let (tx, mut rx) = mpsc::unbounded_channel();

    // Start streaming; send EOT when the stream finishes so the
    // receiver below can detect completion
    tokio::spawn(async move {
        let _ = provider.generate_response_stream_to_channel(
            "claude-3-5-sonnet-20241022",
            "Write a haiku about programming",
            tx.clone()
        ).await;
        let _ = tx.send(EOT_SIGNAL.to_string());
    });

    // Display results in real-time
    while let Some(chunk) = rx.recv().await {
        if chunk == EOT_SIGNAL {
            println!("\n[Stream Complete]");
            break;
        }
        print!("{}", chunk);
    }

    Ok(())
}

Error Handling Best Practices

use perspt::llm_provider::GenAIProvider;
use anyhow::{Context, Result};

async fn robust_llm_call() -> Result<String> {
    let provider = GenAIProvider::new()
        .context("Failed to initialize LLM provider")?;

    // Test model availability first
    let model = "gpt-4o-mini";
    if !provider.test_model(model).await? {
        return Err(anyhow::anyhow!("Model {} is not available", model));
    }

    // Make the actual request with proper error context
    let response = provider.generate_response_simple(
        model,
        "Hello, world!"
    )
    .await
    .context(format!("Failed to generate response using model {}", model))?;

    Ok(response)
}

Provider Selection

use perspt::llm_provider::GenAIProvider;

async fn choose_best_provider() -> Result<(), Box<dyn std::error::Error>> {
    let provider = GenAIProvider::new()?;

    // Get all supported providers
    let providers = GenAIProvider::get_supported_providers();

    for provider_name in providers {
        println!("Checking provider: {}", provider_name);

        // Get available models for each provider
        if let Ok(models) = provider.get_available_models(provider_name).await {
            println!("  Available models: {:?}", models);

            // Test the first model
            if !models.is_empty() {
                let works = provider.test_model(&models[0]).await.unwrap_or(false);
                println!("  Model {} works: {}", models[0], works);
            }
        }
    }

    Ok(())
}

Implementation Details

GenAI Crate Integration

The module is built on the modern genai crate which provides:

Unified Client Interface:

use genai::adapter::AdapterKind;
use genai::Client;

// Single client handles all providers
let client = Client::default();
let models = client.all_model_names(AdapterKind::OpenAI).await?;

Automatic Authentication:

// Environment variables are automatically detected:
// OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, etc.
let client = Client::default();

Streaming Support:

use genai::chat::{ChatRequest, ChatMessage};

let chat_req = ChatRequest::default()
    .append_message(ChatMessage::user("Hello"));

let mut stream = client.exec_chat_stream("gpt-4o-mini", chat_req, None).await?;

Event Processing:

use futures::StreamExt; // provides .next() on the event stream
use genai::chat::ChatStreamEvent;

while let Some(event) = stream.stream.next().await {
    match event? {
        ChatStreamEvent::Start => println!("Stream started"),
        ChatStreamEvent::Chunk(chunk) => print!("{}", chunk.content),
        ChatStreamEvent::ReasoningChunk(chunk) => println!("Reasoning: {}", chunk.content),
        ChatStreamEvent::End(_) => println!("Stream ended"),
    }
}

Error Handling

The module uses anyhow::Result for comprehensive error handling:

  • Configuration Errors: Missing API keys, invalid provider types

  • Network Errors: Connection timeouts, API rate limits

  • Model Errors: Invalid model names, unavailable models

  • Stream Errors: Interrupted streams, malformed responses

  • Authentication Errors: Invalid API keys, expired tokens

Example Error Handling:

use anyhow::{Context, Result};

async fn safe_llm_call() -> Result<String> {
    let provider = GenAIProvider::new()
        .context("Failed to create provider")?;

    let response = provider.generate_response_simple(
        "gpt-4o-mini",
        "Hello"
    )
    .await
    .context("Failed to generate response")?;

    Ok(response)
}

Advanced Error Recovery:

// Graceful fallback to alternative models
async fn robust_generation(provider: &GenAIProvider, prompt: &str) -> Result<String> {
    let preferred_models = ["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"];

    for model in preferred_models {
        match provider.generate_response_simple(model, prompt).await {
            Ok(response) => return Ok(response),
            Err(e) => {
                log::warn!("Model {} failed: {}, trying next", model, e);
                continue;
            }
        }
    }

    Err(anyhow::anyhow!("All models failed"))
}

Performance Considerations

Async Streaming:

The streaming implementation is designed for optimal performance:

  • Non-blocking async operations

  • Immediate chunk forwarding (no batching delays)

  • Minimal memory footprint

  • Proper backpressure handling

Memory Management:

// Unbounded channels for streaming (careful with memory)
let (tx, rx) = mpsc::unbounded_channel();

// Alternative: bounded channels with backpressure
let (tx, rx) = mpsc::channel(1000);
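
To see why a bounded channel provides backpressure, consider this standalone tokio sketch (not tied to the provider API, which currently takes an UnboundedSender): send().await parks the producer when the buffer is full instead of letting memory grow.

use tokio::sync::mpsc;

let (tx, mut rx) = mpsc::channel::<String>(1000);

tokio::spawn(async move {
    for i in 0..10_000 {
        // Suspends when the 1000-slot buffer is full; stops if receiver drops.
        if tx.send(format!("chunk {}", i)).await.is_err() {
            break;
        }
    }
});

while let Some(chunk) = rx.recv().await {
    // A slow consumer here (e.g., UI rendering) throttles the producer.
    let _ = chunk;
}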

Logging and Debugging:

Comprehensive logging is built-in for performance monitoring:

# Enable debug logging to track stream performance
RUST_LOG=debug ./perspt

# Logs include:
# - Chunk counts and timing
# - Content length tracking
# - Stream start/end events
# - Error conditions and recovery

Testing

Unit Tests:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_str_to_adapter_kind() {
        assert!(str_to_adapter_kind("openai").is_ok());
        assert!(str_to_adapter_kind("invalid").is_err());
    }

    #[tokio::test]
    async fn test_provider_creation() {
        let provider = GenAIProvider::new();
        assert!(provider.is_ok());
    }
}

Integration Tests:

// Test with real API keys (requires environment setup)
#[tokio::test]
#[ignore] // Only run with --ignored
async fn test_live_openai() -> Result<()> {
    let provider = GenAIProvider::new()?;
    let response = provider.generate_response_simple(
        "gpt-3.5-turbo",
        "Say hello"
    ).await?;
    assert!(!response.is_empty());
    Ok(())
}

See Also

Related Files:

  • src/llm_provider.rs - Source implementation

  • src/config.rs - Configuration and provider setup

  • src/main.rs - Provider initialization and usage

  • tests/ - Integration and unit tests