OpenClaw + Ollama lets you run a fully local AI agent — no API keys, no monthly costs, no data leaving your machine. Useful for private tasks, high-volume automation, or just not wanting to pay per token. This guide gets you from zero to a working local setup.
What is Ollama?
Ollama is an open-source tool that makes it easy to download and run large language models locally. You pull a model like you would a Docker image, and Ollama serves it via a local API at http://127.0.0.1:11434. OpenClaw connects to that API directly.
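Because Ollama exposes a plain HTTP API, you can talk to it from any language. A minimal sketch using only the Python standard library (it assumes Ollama is running locally and the model has already been pulled; the endpoint and payload fields follow Ollama's documented `/api/generate` API):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default local endpoint


def build_generate_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Send a one-shot prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama instance with the model pulled):
#   print(generate("llama3.3", "Hello, are you working?"))
```

This is the same API OpenClaw itself talks to; nothing here requires a cloud key.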
The result: you type a message in Telegram (or iMessage, or Discord), OpenClaw routes it to Ollama, Ollama runs the model on your GPU/CPU, and the response comes back — entirely within your machine.
Requirements
- OpenClaw installed and the gateway running (setup guide →)
- RAM: at least 8GB for small models, 16GB+ for mid-size, 32GB+ for larger models. Apple Silicon (M1/M2/M3/M4) uses unified memory efficiently — a 16GB Mac Mini runs mid-size models well.
- Ollama installed (covered below)
Step 1: Install Ollama
Go to ollama.ai and download the installer for your platform. On macOS you can also use Homebrew:
```
brew install ollama
```

On Linux:
```
curl -fsSL https://ollama.ai/install.sh | sh
```

Verify the install:
```
ollama --version
```

Step 2: Pull a model
Ollama has a library of open-weight models. Here are the best options for use with OpenClaw in 2026, depending on your hardware:
| Model | Min RAM | Best for |
|---|---|---|
| llama3.3 | 16GB | General use, fast responses |
| deepseek-r1:32b | 32GB | Reasoning, math, code |
| qwen2.5-coder:32b | 32GB | Coding tasks |
| qwen2.5:7b | 8GB | Low-RAM machines, fast |
Pull your chosen model:
```
ollama pull llama3.3
```

This downloads the model weights (several GB); pull time depends on your connection. Once complete, verify it runs:
```
ollama run llama3.3
>>> Hello, are you working?
```

Type `/bye` to exit the Ollama shell.
Step 3: Configure OpenClaw to use Ollama
OpenClaw auto-discovers Ollama models when you set an API key. Ollama doesn't use real keys — any string works:
```
openclaw config set models.providers.ollama.apiKey "ollama-local"
```

That's it. OpenClaw queries your local Ollama instance at http://127.0.0.1:11434, discovers available models, and makes them selectable.
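For reference, the dot-path in that command maps onto nested keys in OpenClaw's JSON configuration. The fragment below is a sketch of what the resulting config entry would look like; the exact file location and surrounding schema may differ between OpenClaw versions:

```json
{
  "models": {
    "providers": {
      "ollama": {
        "apiKey": "ollama-local"
      }
    }
  }
}
```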
To set Ollama as your default model:
```
openclaw config set agents.defaults.model.primary "ollama/llama3.3"
```

Restart the gateway:
```
openclaw gateway restart
```

Step 4: Verify it works
```
openclaw dashboard
```

Open the Control UI and send a test message. The response should come from your local model: the model shown in the dashboard should be your Ollama model, not a cloud provider.
If you have a channel connected (e.g., Telegram), send a message there. The AI response should appear with no cloud API calls being made.
Running Ollama as a background service
By default, Ollama runs only when you start it manually. On macOS, the Ollama app runs as a menu bar item and starts automatically at login. On Linux, enable it as a systemd service:
```
sudo systemctl enable ollama
sudo systemctl start ollama
```

OpenClaw's gateway needs Ollama to be running when it starts. If Ollama is not available, the provider is simply unavailable; your cloud providers (if configured) will still work.
Frequently asked questions
- Can I use both Ollama and Claude at the same time?
- Yes. Configure both providers and set a primary model. You can have OpenClaw default to Ollama for most tasks and fall back to Claude for complex requests, or switch models per conversation.
- Does it work on Apple Silicon?
- Very well. Ollama uses Apple's Metal GPU acceleration on M-series chips. A Mac Mini with 16GB unified memory runs Llama 3.3 comfortably and handles dozens of requests per minute.
- What if my RAM is limited?
- Start with `qwen2.5:7b`; it runs on 8GB and is genuinely capable for most conversational tasks. For code or heavy reasoning you'll want 32B+ models and matching RAM.
- Is Ollama compatible with OpenClaw tool calling?
- OpenClaw filters for Ollama models that report the `tools` capability in their metadata. Models that support function/tool calling will show up automatically. Llama 3.3 and Qwen 2.5 both support this.
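For the curious, tool definitions in Ollama's `/api/chat` endpoint use the OpenAI-style function schema. The sketch below builds such a request payload; the `get_weather` tool is purely illustrative, and sending the payload requires a running, tool-capable model:

```python
def build_chat_request(model: str, user_message: str, tools: list) -> dict:
    """Payload for Ollama's /api/chat endpoint with tool definitions attached."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "stream": False,
    }


# Hypothetical tool definition in the OpenAI-style schema Ollama accepts:
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_chat_request("llama3.3", "What's the weather in Oslo?", [weather_tool])
# A tool-capable model answers with a `tool_calls` entry in its message
# instead of plain text, which OpenClaw then executes.
```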