Local Models with Ollama

Run language models locally, with no API keys or internet connection required.

Why Local Models?

🔒 Privacy: All data stays on your machine
💰 Cost: No API fees or usage limits
📴 Offline: Works without internet
🧪 Experimentation: Test models freely

Install Ollama

# macOS (Homebrew)
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

Or download the installer directly from ollama.ai.
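
Verify the install before moving on:

# Confirm the CLI is on your PATH
ollama --version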

Start Ollama

# Serves the API on localhost:11434 by default
ollama serve
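
From another terminal, confirm the server is up (this is the same health check used under Troubleshooting below):

# Should return a JSON list of installed models
curl http://localhost:11434/api/tags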

Pull a Model

# Recommended models
ollama pull llama3.2        # General purpose
ollama pull codellama       # Code-focused
ollama pull deepseek-coder  # Coding specialist
ollama pull qwen2.5-coder   # Code completion
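
Before wiring a model into Perspt, a quick smoke test confirms it answers:

# One-shot prompt; Ollama exits after the reply
ollama run llama3.2 "Say hello in one sentence"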

Use with Perspt

# Chat mode
perspt chat --model llama3.2

# Agent mode
perspt agent --model codellama "Create a Python script"
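
Putting the steps together, a first session might look like this (qwen2.5-coder is one of the models from the pull list above):

# Pull once, then chat against the local server
ollama pull qwen2.5-coder
perspt chat --model qwen2.5-coder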

Model Recommendations

Task               Model                Notes
-----------------  -------------------  --------------------
General chat       llama3.2             Best all-around
Code generation    codellama:13b        Good for agent mode
Code completion    qwen2.5-coder        Fast, accurate
Reasoning          deepseek-coder:33b   Complex tasks
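
Size tags matter: an untagged pull such as ollama pull codellama resolves to the default build, which is usually one of the smaller variants. Pull the exact tags from the table (confirm tag availability against the Ollama model library):

ollama pull codellama:13b
ollama pull deepseek-coder:33b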

Agent Mode with Local Models

Local models can power all four SRBN tiers (Architect, Actuator, Verifier, Speculator), with some considerations:

# Use local for all tiers
perspt agent \
  --architect-model deepseek-coder:33b \
  --actuator-model codellama:13b \
  --verifier-model llama3.2 \
  --speculator-model llama3.2 \
  "Create a web scraper"

Performance Note

Local models are slower than cloud APIs. For complex agent tasks, consider using a capable cloud model for the Architect tier.

Hybrid Approach

Use cloud for planning, local for execution:

perspt agent \
  --architect-model gpt-5.2 \
  --actuator-model codellama:13b \
  "Build an API"

GPU Acceleration

For faster inference:

# Check whether loaded models run on GPU or CPU
ollama ps

# Ollama auto-detects available GPUs
# For manual control over offloaded layers:
OLLAMA_GPU_LAYERS=35 ollama serve
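
ollama ps reports whether each loaded model sits on GPU or CPU; on NVIDIA systems you can also watch utilization directly while a prompt runs:

# GPU utilization and VRAM usage (NVIDIA)
nvidia-smi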

Troubleshooting

Model not found:

ollama list          # Show installed models
ollama pull <model>  # Install missing model
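
Names must match the pulled tag exactly, including any size suffix: a model pulled as codellama:13b is not found under plain codellama (assuming Perspt passes the name straight through to Ollama):

# Pull and reference the same tag
ollama pull codellama:13b
perspt agent --model codellama:13b "Create a Python script"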

Slow performance:

  • Use smaller models (7B instead of 13B; see the example below)
  • Ensure the GPU is being used
  • Increase OLLAMA_NUM_PARALLEL if concurrent requests are queuing up
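
Dropping to a smaller variant is usually the quickest win on modest hardware:

# Swap the 13B model for its 7B sibling
ollama pull codellama:7b
perspt agent --model codellama:7b "Create a Python script"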

Connection refused:

# Ensure Ollama is running
ollama serve

# Check port (default 11434)
curl http://localhost:11434/api/tags
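
If the server must listen on a non-default address or port, the standard OLLAMA_HOST variable controls both where ollama serve binds and which endpoint the ollama CLI targets (whether Perspt also honors it is an assumption; check its configuration docs):

# Serve on a non-default port...
OLLAMA_HOST=127.0.0.1:11500 ollama serve

# ...and point the CLI at the same address
OLLAMA_HOST=127.0.0.1:11500 ollama list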

See Also