ChatOllama

A Docker Compose setup to run a local ChatGPT-like application using Ollama and Open WebUI, with two models to try out by default: Mistral NeMo & DeepSeek-R1-Distill-Llama-8B.

Usage

Simply run:

docker compose up
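
If you prefer to run everything in the background, add the -d flag:

docker compose up -d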

The ollama-pull-* services trigger an API call to Ollama to pull the associated models, then shut down once the download completes. You should see the progress in the logs of those services, which should end with:

{"status":"verifying sha256 digest"}
{"status":"writing manifest"}
{"status":"removing any unused layers"}
{"status":"success"}

To verify the list of downloaded models, you can call Ollama at http://localhost:11434/api/tags.
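
For example, with curl (and optionally jq to print just the model names from the returned models array):

curl http://localhost:11434/api/tags
curl -s http://localhost:11434/api/tags | jq '.models[].name'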

The models are stored in a volume to avoid downloading them again each time Ollama restarts.
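
You can check the volume with the Docker CLI. The exact volume name is prefixed with the Compose project name; chatollama_ollama below is an assumption, so confirm it with docker volume ls:

docker volume ls
docker volume inspect chatollama_ollama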

Once the models are downloaded, you can go to http://localhost. By default, the port mapped to the host is 80, but you can change it by editing the docker-compose.yml file. Next, sign up to create an account (everything stays local) and log in. At the top of the page, open the Select a model dropdown menu and select mistral:latest. After selecting it, click the Set as default link so you don't have to select it again each time you start a new discussion.
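
For example, to expose the web UI on port 3000 instead of 80, the port mapping in docker-compose.yml would look something like this (the service name open-webui and the container port 8080, Open WebUI's default, are assumptions; adapt them to the actual file):

open-webui:
  ports:
    - "3000:8080" # host port 3000 -> Open WebUI inside the container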

Requirements

By default, Ollama is set to use 1 NVIDIA GPU:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [ gpu ]

If you want to run on CPU instead, comment out the lines shown above in the docker-compose.yml file and run docker compose up again.

If you want to run on an NVIDIA GPU, make sure that your Docker daemon configuration file (typically /etc/docker/daemon.json on Linux) contains the following:

"runtimes": {
  "nvidia": {
    "path": "nvidia-container-runtime",
    "runtimeArgs": []
  }
}

You can also set "default-runtime": "nvidia" so the NVIDIA runtime is used even when it isn't specified explicitly.
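
After changing the daemon configuration, restart Docker so it takes effect (on a systemd-based Linux host):

sudo systemctl restart docker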

Also, you should have the NVIDIA Container Toolkit installed, as it provides the nvidia-container-runtime used above. To verify that Docker can access your GPU, you can run:

docker run --runtime nvidia --rm nvidia/cuda:12.3.1-base-ubuntu20.04 nvidia-smi

Note

Make sure that the version of the nvidia/cuda image is aligned with the CUDA version supported by your host's driver.
