Open WebUI with Ollama
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners such as Ollama and OpenAI-compatible APIs, and includes a built-in inference engine for RAG, making it a powerful AI deployment solution.
Ollama is a tool designed to help users run, create, and share large language models (LLMs) locally on their machines. It simplifies the process of working with open-source LLMs by providing an easy-to-use interface for downloading, running, and managing models offline or in private environments.
Installation
Install the app through Rancher.
Configuration
Open WebUI:
- Configure URL (ingress): You can use our wildcard subdomain *.icedc.se, for example mynamespace-openwebui.icedc.se. Alternatively, you can use any FQDN, as long as it resolves to the same IP addresses as, for instance, np.icedc.se (see the sketch after this list).
- Persistent storage for Open WebUI: the default size is sufficient for most applications.
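
If you bring your own FQDN, you can verify the DNS setup before deploying. A minimal sketch, assuming `dig` is available and that mynamespace-openwebui.icedc.se is your chosen hostname:

```sh
# Resolve the reference hostname and your chosen hostname,
# then compare the returned IP addresses.
dig +short np.icedc.se
dig +short mynamespace-openwebui.icedc.se

# Both commands should print the same set of addresses;
# if they differ, fix your DNS record before deploying.
```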
Ollama:
Specify how many resources your Ollama instance will reserve in terms of CPU, memory, storage, and GPUs. Note that the memory request must match the memory limit, and the amount of RAM must be large enough to hold the biggest model you plan to load. The same applies to persistent storage: make sure it is large enough to hold all the models you plan to download. A rough sizing sketch is shown below.
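
As a rule of thumb, a model needs at least its download size in both RAM and persistent storage. The figures below are assumptions for illustration; check the model page on ollama.com for exact sizes:

```sh
# Example sizing (assumed figures):
# llama3.3:70b at the default quantization is roughly 43 GB on disk, so plan for
#   - memory request = memory limit of at least ~48Gi (model plus runtime overhead)
#   - a persistent volume of at least ~50Gi for the model files
# Smaller models such as llama3.2:3b (~2 GB) need far less.
# After pulling, this shows the actual on-disk size of each model:
ollama list
```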
Access
The web interface will be accessible on your chosen subdomain, for example:
https://mynamespace-openwebui.icedc.se/
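
Once the ingress is up, you can sanity-check the endpoint from any machine. A minimal sketch, assuming the example hostname above:

```sh
# Expect an HTTP 200 (or a redirect to the login page) once the deployment
# is ready; a 404 or 502 usually means the pods are still starting.
curl -I https://mynamespace-openwebui.icedc.se/
```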
Pulling models
After the deployment is done, click on Deployments and find the one named "open-webui-ollama". Start a shell by clicking the "..." symbol and choosing "Execute Shell". In the shell you can download models by running ollama commands such as `ollama pull llama3.3:70b`.
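
If you prefer the command line over the Rancher UI, the same can be done with kubectl. A minimal sketch, assuming your namespace is mynamespace and the deployment is named open-webui-ollama as above:

```sh
# Pull a model inside the Ollama container (progress is shown interactively).
kubectl -n mynamespace exec -it deploy/open-webui-ollama -- ollama pull llama3.3:70b

# Verify the model is available and check its on-disk size.
kubectl -n mynamespace exec deploy/open-webui-ollama -- ollama list
```

Pulled models are stored on the persistent volume, so they survive pod restarts and appear in the Open WebUI model selector once the download completes.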