Torch update #93

Draft
wants to merge 20 commits into
base: develop
26 changes: 26 additions & 0 deletions docs/docker_examples/torch/dockerfile_from_dev_ubuntu
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
ARG ROCM_VERSION=5.7
FROM rocm/dev-ubuntu-22.04:$ROCM_VERSION

# Re-declare ROCM_VERSION so it is available after FROM
# (Dockerfiles do not support trailing comments on an instruction line)
ARG ROCM_VERSION

ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
# Add the deadsnakes repo for python
&& apt-get install -y software-properties-common && add-apt-repository ppa:deadsnakes/ppa \
# Install Python 3.11 and other deps
&& apt-get update && apt-get install -y \
python3.11 \
python3.11-venv \
wget \
# Clean up the cache
&& rm -rf /var/lib/apt/lists/*

# Create a virtual environment, and put it at the front of our PATH
RUN python3.11 -m venv /venv
ENV PATH=/venv/bin:$PATH

# Install PyTorch
RUN pip3 install torch==2.2.1 torchvision==0.17.1 --index-url https://download.pytorch.org/whl/rocm$ROCM_VERSION/

# Install additional python dependencies
RUN pip3 install transformers==4.38.2
2 changes: 1 addition & 1 deletion docs/how-to/3rd-party/pytorch-install.rst
Original file line number Diff line number Diff line change
@@ -64,7 +64,7 @@ to `pytorch.org/get-started/locally/ <https://pytorch.org/get-started/locally/>`
wheels command, you must select 'Linux', 'Python', 'pip', and 'ROCm' in the matrix.

.. note::
- The available ROCm release varies between the 'Pytorch Build' of ``Stable`` or ``Nightly``. More recent releases are generally available through the Nightly builds.
+ The available ROCm release varies between the 'PyTorch Build' of ``Stable`` or ``Nightly``. More recent releases are generally available through the Nightly builds.

1. Choose one of the following three options:

90 changes: 90 additions & 0 deletions docs/how-to/3rd-party/pytorch/docker.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
PyTorch+ROCm in Docker
=======================

Using Docker to run your PyTorch + ROCm application is one of the best ways to get consistent and reproducible environments.

Additional PyTorch Docker arguments
-----------------------------------

Regardless of which image you use or build, running PyTorch Docker images requires several arguments in addition to those discussed in :ref:`docker-access-gpus-in-container`.

* ``--ipc=host`` OR ``--shm-size=Xg``

PyTorch uses shared memory to pass data between processes (for example, the worker processes of a multi-worker data loader). Docker's default shared memory size of 64 MB is usually too small for this, so you must increase it. This can be done in two ways:

* ``--ipc=host`` shares the host's IPC namespace with the container, which gives the container access to the host's shared memory. For most applications, this is sufficient.

* ``--shm-size`` sets an explicit shared memory size for the container, which allows more granular control over resources. When multiple containers run simultaneously, setting this value appropriately can help prevent memory errors.

See the `PyTorch docs on using Docker images <https://github.com/pytorch/pytorch?tab=readme-ov-file#using-pre-built-images>`_ for more information on these options.

.. code-block:: bash

   docker run -it --device=/dev/kfd --device=/dev/dri \
       --security-opt seccomp=unconfined --ipc=host \
       <image>

Alternatively, you can use the equivalent ``docker-compose.yaml``:

.. code-block:: yaml

   version: "3.7"
   services:
     my-service:
       image: <image>
       devices:
         - /dev/kfd
         - /dev/dri
       security_opt:
         - seccomp:unconfined
       ipc: host
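
If you prefer ``--shm-size`` over ``--ipc=host``, the invocation might look like the following sketch. The ``8g`` value is an assumption for illustration only, not a recommendation; size it to your own data-loading workload. The trailing ``df -h /dev/shm`` command prints the shared memory size that the container actually received.

.. code-block:: bash

   # Hypothetical size: 8g is a placeholder, tune it for your workload
   docker run --rm --device=/dev/kfd --device=/dev/dri \
       --security-opt seccomp=unconfined --shm-size=8g \
       <image> df -h /dev/shm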


Pre-built PyTorch+ROCm Docker images
--------------------------------------

The easiest way to run PyTorch+ROCm is to use a pre-built image from `AMD ROCm on Docker Hub <https://hub.docker.com/u/rocm>`_, which contains both ROCm and PyTorch. You can select an image from either of the following repositories, with your desired OS, ROCm, Python, and PyTorch versions.

* `rocm/pytorch <https://hub.docker.com/r/rocm/pytorch>`_ - latest stable builds
* `rocm/pytorch-nightly <https://hub.docker.com/r/rocm/pytorch-nightly>`_ - latest nightly builds

For example, to run a specific ``rocm/pytorch`` image:

.. code-block:: bash

   docker run -it --device=/dev/kfd --device=/dev/dri \
       --security-opt seccomp=unconfined --ipc=host \
       rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2

.. tip::

   Always use a specific tag (e.g. ``rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2``) over ``latest``! Otherwise, the image may change without your knowledge.
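
To check that the container can actually see your GPU, a quick sanity check along these lines may help. It is only a sketch and assumes ``python3`` is on the image's ``PATH``. ROCm builds of PyTorch report GPUs through the ``torch.cuda`` API, so ``torch.cuda.is_available()`` should print ``True`` when the device is visible.

.. code-block:: bash

   # Print the PyTorch version and whether a GPU is visible to PyTorch
   docker run --rm --device=/dev/kfd --device=/dev/dri \
       --security-opt seccomp=unconfined --ipc=host \
       rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2 \
       python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"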

Custom Docker Images
--------------------

Your specific requirements may not be met by any of the pre-built PyTorch+ROCm images. For example, you may need additional Python dependencies, a different version of PyTorch, or even a completely different combination of OS, ROCm, Python, and PyTorch.

To meet these needs, you can build your own ROCm Docker images.

From ROCm dev container
.......................

We provide several dev containers, which contain just the base OS + ROCm. These containers are a great place to start when building custom images, as you don't have to install ROCm into the image yourself.

* Select a base image that meets your needs. To find a list of base images, `search rocm/dev <https://hub.docker.com/search?q=rocm%2Fdev>`_ on Docker Hub.
* Write your ``dockerfile``:

  * Install the required Python version
  * Install PyTorch and other Python dependencies

Below is an example ``dockerfile`` based on ``rocm/dev-ubuntu-22.04:5.7``:

.. literalinclude:: ../../../docker_examples/torch/dockerfile_from_dev_ubuntu
   :language: dockerfile

.. tip::

   * Using ``ARG`` instructions can help simplify your dockerfiles and prevent version mismatches.
   * Use specific versions, rather than ``latest``, to help keep builds reproducible. This also applies to Python packages.
   * It is always a good idea to use virtual environments, *even inside Docker*!
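
As a rough sketch, the example ``dockerfile`` above could be built and run as follows. The image name ``my-pytorch-rocm`` is hypothetical, and the ``-f`` path assumes you run the command from the repository root.

.. code-block:: bash

   # Build the image; --build-arg overrides the ROCM_VERSION default
   docker build \
       -f docs/docker_examples/torch/dockerfile_from_dev_ubuntu \
       --build-arg ROCM_VERSION=5.7 \
       -t my-pytorch-rocm:5.7 .

   # Run it with the same arguments used for the pre-built images
   docker run -it --device=/dev/kfd --device=/dev/dri \
       --security-opt seccomp=unconfined --ipc=host \
       my-pytorch-rocm:5.7

Passing ``ROCM_VERSION`` once as a build argument keeps the base image tag and the PyTorch wheel index in sync, which is the point of the ``ARG`` tip above.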
29 changes: 29 additions & 0 deletions docs/how-to/3rd-party/pytorch/index.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
.. meta::
   :description: PyTorch+ROCm
   :keywords: PyTorch, ROCm

.. _pytorch-home:

****************************************************************
PyTorch
****************************************************************

`PyTorch <https://pytorch.org/>`_ is an open-source tensor library designed for deep learning. PyTorch on
ROCm provides mixed-precision and large-scale training using our
`MIOpen <https://github.com/ROCmSoftwarePlatform/MIOpen>`_ and
`RCCL <https://github.com/ROCmSoftwarePlatform/rccl>`_ libraries.

.. grid:: 2
   :gutter: 1

   .. grid-item-card:: Install PyTorch
      :link: how-to/3rd-party/pytorch/install
      :link-type: doc

      PyTorch installation instructions

   .. grid-item-card:: PyTorch + Docker
      :link: how-to/3rd-party/pytorch/docker
      :link-type: doc

      Running PyTorch in Docker