Local Models with Ollama¶
Run Perspt with local models for privacy and offline usage.
Prerequisites¶
- Ollama installed
- Sufficient RAM for your chosen model (7B models: 8GB+; 70B models: 64GB+); see the quick check below
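To confirm both prerequisites, here is a quick sanity check (macOS shown; on Linux, substitute `free -h` for the `sysctl` call):

# Verify the Ollama CLI is on your PATH
ollama --version
# Report installed RAM in GB (macOS)
sysctl -n hw.memsize | awk '{ printf "%.0f GB\n", $1 / 1073741824 }'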
Setup¶
# Install Ollama (macOS)
brew install ollama
# Start the Ollama service
ollama serve
# Pull a model
ollama pull llama3.2
ollama pull codellama # For coding tasks
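After pulling, you can confirm the service is reachable and the models are installed:

# List locally available models
ollama list
# Or query the REST API directly (Ollama listens on port 11434 by default)
curl http://localhost:11434/api/tags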
Using Ollama with Perspt¶
Perspt auto-detects Ollama if no cloud API keys are set:
# Unset any cloud keys
unset OPENAI_API_KEY ANTHROPIC_API_KEY GEMINI_API_KEY
# Launch Perspt — auto-detects Ollama
perspt
# Or specify a model explicitly
perspt chat --model llama3.2
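If Perspt still routes to a cloud provider, double-check that no provider keys remain in your shell environment:

# Prints "no cloud keys set" once all provider keys are removed
env | grep -E 'OPENAI_API_KEY|ANTHROPIC_API_KEY|GEMINI_API_KEY' || echo "no cloud keys set"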
Agent Mode with Local Models¶
perspt agent --model codellama -w ./my-project "Create a Python utility"
Performance Note
Local models are slower and less capable than cloud models for complex agent tasks. For best results in agent mode, either run a capable model (70B+ parameters) or assign cloud models to the Architect and Verifier tiers:
export GEMINI_API_KEY="your-key"
perspt agent \
  --architect-model gemini-pro-latest \
  --actuator-model codellama \
  -w ./project "Create a utility"
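If your hardware allows it (roughly 64GB of RAM, per the prerequisites above), you can instead keep everything local with a larger model. As one example, `llama3.1:70b` is a 70B-class tag from the Ollama library:

# Pull a 70B-class model (large download, needs ~64GB RAM)
ollama pull llama3.1:70b
# Use it for agent tasks entirely offline
perspt agent --model llama3.1:70b -w ./project "Create a utility"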
Available Models¶
Popular Ollama models for use with Perspt:
| Model | Size | Best For |
|---|---|---|
| llama3.2 / llama3.3 | 3B/70B | General chat |
| codellama | 7B/34B | Code generation |
| deepseek-coder | 6.7B/33B | Code generation |
| mistral | 7B | General purpose |
| phi3 | 3.8B | Lightweight tasks |
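Specific sizes are selected with a `model:size` tag when pulling; for example (exact tag names come from the Ollama library and may change over time):

# Pull specific size variants by tag
ollama pull codellama:34b
ollama pull deepseek-coder:6.7b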