
Getting started with Open WebUI and Ollama

2 min read · AI

Running LLMs locally has gotten surprisingly easy. You no longer need a cloud subscription or an API key to have a capable AI assistant. With Ollama and Open WebUI, you can run models like Llama 3, Mistral, and others right on your own hardware.

Why run models locally?

Privacy is the obvious one. Your prompts and data never leave your machine. But there are practical reasons too. No rate limits, no API costs, and you can experiment with different models without worrying about billing. If you are working on anything sensitive or just want to tinker without constraints, local is the way to go.

Setting up Ollama

Ollama makes it dead simple to download and run open-source models. Install it from ollama.com and then pull a model:

ollama pull llama3.1

That is it. You now have a working LLM. You can chat with it directly in the terminal:

ollama run llama3.1

But a terminal chat is not the best experience for longer conversations. That is where Open WebUI comes in.
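
The terminal is not the only way in, either: Ollama also serves a local HTTP API, by default on port 11434. Here is a minimal sketch using only the Python standard library; the `/api/generate` endpoint and its `model`/`prompt`/`stream` fields come from Ollama's API, but double-check them against the docs for your version:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address

def build_payload(prompt, model="llama3.1"):
    """Request body for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, model="llama3.1"):
    """Send a prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_payload(prompt, model)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `generate("Why is the sky blue?")` returns the model's answer as a string, which makes it easy to script against your local model.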

Setting up Open WebUI

Open WebUI gives you a ChatGPT-style interface for your local models. The easiest way to run it is with Docker; the --add-host flag below lets the container reach the Ollama server running on your host:

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
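
If you prefer Docker Compose, the same container can be written declaratively. This is a sketch that mirrors the command above (the file name docker-compose.yml and this translation are mine; verify against Open WebUI's own install docs):

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:
```

Run it with docker compose up -d; the named volume keeps your chat history across container upgrades.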

Open http://localhost:3000 in your browser and create an account (accounts are stored locally, and the first one you create becomes the admin), and you will see your Ollama models available in the model dropdown.

What you get

The interface supports conversation history, multiple model switching, system prompts, and even document uploads for RAG-style workflows. It is genuinely good, not just "good for a self-hosted tool."

I use this setup daily for quick questions, code review, and drafting. It is fast enough on a decent GPU, and the privacy aspect means I do not hesitate to paste in proprietary code or internal docs.

Picking models

Start with llama3.1 for general use. If you want something smaller and faster, mistral is solid. For coding tasks, codellama or deepseek-coder are worth trying. Ollama's model library keeps growing, so browse ollama.com/library for what you can pull, and check what you already have installed with:

ollama list
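
The same list is also available programmatically from Ollama's /api/tags endpoint. A small sketch (the endpoint name and response shape are from Ollama's API docs; verify against your version):

```python
import json
import urllib.request

def model_names(tags):
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in tags.get("models", [])]

def installed_models(host="http://localhost:11434"):
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_names(json.loads(resp.read()))
```

This is handy if you want a script to pick a model dynamically or fail early when one is missing.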

The whole setup takes about 10 minutes and gives you a private AI assistant that is surprisingly capable.
