OpenClaw + Ollama lets you run a fully local AI agent — no API keys, no monthly costs, no data leaving your machine. Useful for private tasks, high-volume automation, or just not wanting to pay per token. This guide gets you from zero to a working local setup.
What is Ollama?
Ollama is an open-source tool that makes it easy to download and run large language models locally. You pull a model like you would a Docker image, and Ollama serves it via a local API at http://127.0.0.1:11434. OpenClaw connects to that API directly.
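Because Ollama exposes a plain HTTP API, you can talk to it from any language. A minimal sketch using only the Python standard library (it assumes Ollama is running locally and the model has already been pulled; the endpoint and payload fields follow Ollama's documented `/api/generate` API):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default local endpoint


def build_generate_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Send a one-shot prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama instance with the model pulled):
#   print(generate("llama3.3", "Hello, are you working?"))
```

This is the same API OpenClaw itself talks to; nothing here requires a cloud key.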
The result: you type a message in Telegram (or iMessage, or Discord), OpenClaw routes it to Ollama, Ollama runs the model on your GPU/CPU, and the response comes back — entirely within your machine.
Requirements
- OpenClaw installed and the gateway running (setup guide →)
- RAM: at least 8GB for small models, 16GB+ for mid-size, 32GB+ for larger models. Apple Silicon (M1/M2/M3/M4) uses unified memory efficiently — a 16GB Mac Mini runs mid-size models well.
- Ollama installed (covered below)
Step 1: Install Ollama
Go to ollama.ai and download the installer for your platform. On macOS you can also use Homebrew:
```
brew install ollama
```

On Linux:
```
curl -fsSL https://ollama.ai/install.sh | sh
```

Verify the install:
```
ollama --version
```

Step 2: Pull a model
Ollama has a library of open-weight models. Here are the best options for use with OpenClaw in 2026, depending on your hardware:
| Model | Min RAM | Best for |
|---|---|---|
| llama3.3 | 16GB | General use, fast responses |
| deepseek-r1:32b | 32GB | Reasoning, math, code |
| qwen2.5-coder:32b | 32GB | Coding tasks |
| qwen2.5:7b | 8GB | Low-RAM machines, fast |
Pull your chosen model:
```
ollama pull llama3.3
```

This downloads the model weights (several GB); pull time depends on your connection. Once complete, verify it runs:
```
ollama run llama3.3
>>> Hello, are you working?
```

Type `/bye` to exit the Ollama shell.
Step 3: Configure OpenClaw to use Ollama
OpenClaw auto-discovers Ollama models when you set an API key. Ollama doesn't use real keys — any string works:
```
openclaw config set models.providers.ollama.apiKey "ollama-local"
```

That's it. OpenClaw queries your local Ollama instance at http://127.0.0.1:11434, discovers available models, and makes them selectable.
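For reference, the dot-path in that command maps onto nested keys in OpenClaw's JSON configuration. The fragment below is a sketch of what the resulting config entry would look like; the exact file location and surrounding schema may differ between OpenClaw versions:

```json
{
  "models": {
    "providers": {
      "ollama": {
        "apiKey": "ollama-local"
      }
    }
  }
}
```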
To set Ollama as your default model:
```
openclaw config set agents.defaults.model.primary "ollama/llama3.3"
```

Restart the gateway:
```
openclaw gateway restart
```

Step 4: Verify it works
```
openclaw dashboard
```

Open the Control UI and send a test message. The response should come from your local model: the model shown in the dashboard should be your Ollama model, not a cloud provider.
If you have a channel connected (e.g., Telegram), send a message there. The AI response should appear with no cloud API calls being made.
Running Ollama as a background service
By default, Ollama runs only when you start it manually. On macOS, the Ollama app runs as a menu bar item and starts automatically at login. On Linux, enable it as a systemd service:
```
sudo systemctl enable ollama
sudo systemctl start ollama
```

OpenClaw's gateway needs Ollama to be running when it starts. If Ollama is not available, the provider is simply unavailable; your cloud providers (if configured) will still work.
Frequently asked questions
- Can I use both Ollama and Claude at the same time?
- Yes. Configure both providers and set a primary model. You can have OpenClaw default to Ollama for most tasks and fall back to Claude for complex requests, or switch models per conversation.
- Does it work on Apple Silicon?
- Very well. Ollama uses Apple's Metal GPU acceleration on M-series chips. A Mac Mini with 16GB unified memory runs Llama 3.3 comfortably and handles dozens of requests per minute.
- What if my RAM is limited?
- Start with `qwen2.5:7b`; it runs on 8GB and is genuinely capable for most conversational tasks. For code or heavy reasoning you'll want 32B+ models and matching RAM.
- Is Ollama compatible with OpenClaw tool calling?
- OpenClaw filters for Ollama models that report the `tools` capability in their metadata. Models that support function/tool calling will show up automatically. Llama 3.3 and Qwen 2.5 both support this.
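For the curious, tool definitions in Ollama's `/api/chat` endpoint use the OpenAI-style function schema. The sketch below builds such a request payload; the `get_weather` tool is purely illustrative, and sending the payload requires a running, tool-capable model:

```python
def build_chat_request(model: str, user_message: str, tools: list) -> dict:
    """Payload for Ollama's /api/chat endpoint with tool definitions attached."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "stream": False,
    }


# Hypothetical tool definition in the OpenAI-style schema Ollama accepts:
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_chat_request("llama3.3", "What's the weather in Oslo?", [weather_tool])
# A tool-capable model answers with a `tool_calls` entry in its message
# instead of plain text, which OpenClaw then executes.
```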