With the increasing use of language models in both private and business contexts, data security and privacy have become fundamental concerns, and it is essential to question how much of either these tools can actually guarantee.
The risk is that we end up handing private or confidential information to companies that will then use that data to train their models or for other purposes.
ChatGPT, for instance, uses chats to train its models by default, even on the paid ChatGPT Plus plan (25 EUR per month). The ChatGPT Team plan (25 EUR per user per month, minimum 2 users) and the APIs available to developers are excluded from training.
For this reason, I believe it is important to consider using alternative interfaces that allow for a private and controlled environment. This article will explore how to implement a Chat AI interface using OpenWebUI.
OpenWebUI
OpenWebUI is an open-source framework designed to simplify the creation of user interfaces for AI applications. It provides developers with a solid foundation to build upon, allowing anyone to implement their own solution.
OpenWebUI includes an intuitive user interface enabling the management of contextual and personalized conversations, with the possibility of uploading documents or searching the web.
Moreover, OpenWebUI is equipped with tools for user and session management, also supporting integration with various models, thus allowing developers to choose the technology that best fits their needs.
Integration with LLMs
OpenWebUI can interface with LLMs in two ways: locally by running a model on the machine or via external APIs compatible with OpenAI’s standard.
To run a model locally, you can use Ollama, a tool that lets you download and run a wide range of LLMs on your own machine (a sample session is shown after the list below).
The hardware requirements should not be underestimated, even for a small model; optimal performance requires a GPU with enough video memory (VRAM) for the size of the model:
- A 7B model requires ~4 GB of VRAM
- A 13B model requires ~8 GB of VRAM
- A 30B model requires ~16 GB of VRAM
- A 65B model requires ~32 GB of VRAM
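As a sketch of the local workflow, assuming Ollama is already installed on the host and using llama3.1:8b purely as an example model name:

# Download an example model from the Ollama library
ollama pull llama3.1:8b

# Start an interactive chat with the downloaded model
ollama run llama3.1:8b

# List the models available locally
ollama list

Once Ollama is running, OpenWebUI can talk to it through its local API (port 11434 by default, the same port used in the Docker setup later in this article).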
The other method is to use “OpenAI compatible” APIs to access a model hosted by someone else; the provider can be OpenAI itself or another company such as OpenRouter, which offers access to a wide range of LLMs, not only GPT.
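As a minimal sketch, assuming an OpenRouter account and API key (the model identifier below is only an example), a request to an OpenAI-compatible chat completions endpoint looks like this:

# Call an OpenAI-compatible endpoint; the API key is read from an environment variable
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d '{
        "model": "meta-llama/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'

OpenWebUI speaks this same protocol, so any provider exposing an endpoint in this format can be configured as a connection.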
The LLMs available on the market do not come only from OpenAI: various open-source models are developed by other companies. Meta, for example, makes Llama freely available.
Models differ not only in accuracy and performance but also in price when accessed through APIs provided by third-party services. Interactions with an LLM are billed in tokens; one token corresponds to roughly four characters (about three-quarters of an English word), and tokens are counted in both input and output. Here are some current price examples per 1M tokens (a worked example follows the list):
- OpenAI: GPT-4o-mini: Input $0.15, Output $0.60
- Meta: Llama 3.1 70B Instruct: Input $0.40, Output $0.40
- Meta: Llama 3.1 8B Instruct: Input $0.055, Output $0.055
- OpenAI: GPT-4o (2024-08-06): Input $2.50, Output $10.00
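As a rough calculation with the rates above, an example workload of 100,000 input tokens and 20,000 output tokens on GPT-4o-mini costs 0.1 × $0.15 + 0.02 × $0.60 ≈ $0.03, while the same workload on GPT-4o would cost 0.1 × $2.50 + 0.02 × $10.00 = $0.45.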
Setup
OpenWebUI can be installed by following the official instructions in the documentation; here I will use Docker Compose.
You just need to create a docker-compose.yaml file with the content you find in the project repository:
services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}

  open-webui:
    build:
      context: .
      args:
        OLLAMA_BASE_URL: '/ollama'
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - ${OPEN_WEBUI_PORT-3000}:8080
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}
You can start the containers with sudo docker compose up -d and access the interface at http://localhost:3000.
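To check that both containers came up correctly (the service name open-webui comes from the compose file above):

# Show the status of the services defined in the compose file
sudo docker compose ps

# Follow the OpenWebUI logs
sudo docker compose logs -f open-webui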
The first account created will be the administrator; through the Admin Panel, in the “Connections” section, you can choose whether to use a model via Ollama or via external APIs.
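If you go the Ollama route, a model also needs to be downloaded into the ollama container; this can be done from the OpenWebUI interface itself or, as a sketch, from the command line (llama3.1:8b is only an example name, and ollama is the container name from the compose file above):

# Pull an example model inside the running Ollama container
sudo docker exec -it ollama ollama pull llama3.1:8b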
Reverse proxy
In a production environment, it is always recommended to serve applications with a web server like Nginx configured as a reverse proxy:
server {
    server_name openwebui.site.it;

    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_buffering off;
        proxy_set_header Origin '';
        proxy_set_header Referer '';
    }

    listen 80;
}
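Assuming a Debian/Ubuntu-style Nginx layout (the file paths and the optional certbot step are assumptions, not part of the setup above), the configuration can be enabled and, optionally, secured with a Let’s Encrypt certificate like this:

# Enable the site (assuming the config was saved as /etc/nginx/sites-available/openwebui)
sudo ln -s /etc/nginx/sites-available/openwebui /etc/nginx/sites-enabled/

# Validate and reload the configuration
sudo nginx -t
sudo systemctl reload nginx

# Optional: obtain and install a TLS certificate for the domain
sudo certbot --nginx -d openwebui.site.it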