diff --git a/docs/docker_examples/torch/dockerfile_from_dev_ubuntu b/docs/docker_examples/torch/dockerfile_from_dev_ubuntu new file mode 100644 index 00000000..a4ddc03d --- /dev/null +++ b/docs/docker_examples/torch/dockerfile_from_dev_ubuntu @@ -0,0 +1,26 @@ +ARG ROCM_VERSION=5.7 +FROM rocm/dev-ubuntu-22.04:$ROCM_VERSION + +ARG ROCM_VERSION # Re-declare for use in this scope + +ARG DEBIAN_FRONTEND=noninteractive +RUN apt-get update \ + # Add the deadsnakes repo for Python + && apt-get install -y software-properties-common && add-apt-repository ppa:deadsnakes/ppa \ + # Install Python 3.11 and other deps + && apt-get update && apt-get install -y \ + python3.11 \ + python3.11-venv \ + wget \ + # Clean up the cache + && rm -rf /var/lib/apt/lists/* + +# Create a virtual environment, and put it at the front of our PATH +RUN python3.11 -m venv venv +ENV PATH=/venv/bin:$PATH + +# Install PyTorch +RUN pip3 install torch==2.2.1 torchvision==0.17.1 --index-url https://download.pytorch.org/whl/rocm$ROCM_VERSION/ + +# Install additional Python dependencies +RUN pip3 install transformers==4.38.2 diff --git a/docs/how-to/3rd-party/pytorch-install.rst b/docs/how-to/3rd-party/pytorch-install.rst index 1b5d6d42..9405ba8b 100644 --- a/docs/how-to/3rd-party/pytorch-install.rst +++ b/docs/how-to/3rd-party/pytorch-install.rst @@ -64,7 +64,7 @@ to `pytorch.org/get-started/locally/ ` wheels command, you must select 'Linux', 'Python', 'pip', and 'ROCm' in the matrix. .. note:: - The available ROCm release varies between the 'Pytorch Build' of ``Stable`` or ``Nightly``. More recent releases are generally available through the Nightly builds. + The available ROCm release varies between the 'PyTorch Build' of ``Stable`` or ``Nightly``. More recent releases are generally available through the Nightly builds. 1. 
Choose one of the following three options: diff --git a/docs/how-to/3rd-party/pytorch/docker.rst b/docs/how-to/3rd-party/pytorch/docker.rst new file mode 100644 index 00000000..79fce425 --- /dev/null +++ b/docs/how-to/3rd-party/pytorch/docker.rst @@ -0,0 +1,90 @@ +PyTorch+ROCm in Docker +======================= + +Using Docker to run your PyTorch + ROCm application is one of the best ways to get consistent and reproducible environments. + +Additional PyTorch Docker arguments +----------------------------------- + +Regardless of which image you use or build, running PyTorch Docker images requires several arguments in addition to those discussed in :ref:`docker-access-gpus-in-container`. + +* ``--ipc=host`` OR ``--shm-size=Xg`` + + PyTorch uses shared memory to share data between processes (such as multi-threaded data loaders). As such, you must increase the shared memory size, which defaults to 64M. This can be done in two ways: + + * ``--ipc=host`` shares the host's IPC namespace with the container, which gives it access to the host's shared memory. For most applications, this is sufficient. + + * ``--shm-size`` allows more granular control over resourcing for a container. In applications with multiple containers running simultaneously, setting this value appropriately can help prevent memory errors. + + See the `PyTorch docs on using docker images `_ for information on these options. + +.. code-block:: bash + + docker run -it --device=/dev/kfd --device=/dev/dri \ + --security-opt seccomp=unconfined --ipc=host \ + + +Alternatively, you can use the equivalent ``docker-compose.yaml``: + +.. code-block:: yaml + + version: "3.7" + services: + my-service: + image: + devices: + - /dev/kfd + - /dev/dri + security_opt: + - seccomp:unconfined + ipc: host + + +Pre-built PyTorch+ROCm Docker images +-------------------------------------- + +The easiest method to run PyTorch+ROCm is to use a pre-built image from `AMD ROCm on Docker Hub `_, which contains ROCm as well as PyTorch. 
You can select an image from either of the following sources, with your desired OS, ROCm, Python, and PyTorch versions. + +* `rocm/pytorch `_ - latest stable builds +* `rocm/pytorch-nightly `_ - latest nightly builds + +For example, to run a specific rocm/pytorch image: + +.. code-block:: bash + + docker run -it --device=/dev/kfd --device=/dev/dri \ + --security-opt seccomp=unconfined --ipc=host \ + rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2 + +.. Tip:: + + Always use a specific tag (e.g. ``rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2``) over ``latest``! Otherwise, the image may change without your knowledge. + +Custom Docker Images +-------------------- + +Your specific requirements may not be met by one of the pre-built PyTorch+ROCm images. For example, you may need additional Python dependencies, a different version of PyTorch, or even a completely different combination of OS, ROCm, Python, and PyTorch. + +To meet these needs, you can build your own ROCm Docker images. + +From ROCm dev container +....................... + +We provide several dev containers, which contain just the base OS + ROCm. These containers are a great place to start when building custom images, as you don't have to install ROCm into the image yourself. + +* Select a base image that meets your needs. To find a list of base images, `search rocm/dev `_ on Docker Hub. +* Build your ``dockerfile`` + + * Install the required Python version + * Install PyTorch and other Python dependencies + +Below is an example ``dockerfile`` based on ``rocm/dev-ubuntu-22.04:5.7``: + +.. literalinclude:: ../../../docker_examples/torch/dockerfile_from_dev_ubuntu + :language: dockerfile + +.. tip:: + + * Using ARGs can help simplify your dockerfiles and prevent version mismatches. + * Use specific versions, rather than ``latest``, to help keep builds reproducible. This also applies to Python packages. + * It is always a good idea to use virtual environments, *even inside Docker*! 
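The tips above mention preventing version mismatches between ROCm and the PyTorch wheels. As a rough illustration only, the sketch below compares the major and minor parts of two ROCm version strings and builds a wheel index URL following the ``https://download.pytorch.org/whl/rocm<major>.<minor>`` pattern used in these docs. The function names (``parse_major_minor``, ``rocm_versions_compatible``, ``wheel_index_url``) are hypothetical helpers written for this example; they are not part of PyTorch, ROCm, or Docker.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '6.0.1' or '5.7'."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'major.minor', got {version!r}")
    return int(parts[0]), int(parts[1])


def rocm_versions_compatible(wheel_rocm: str, installed_rocm: str) -> bool:
    """True when major and minor match; the patch version is ignored."""
    return parse_major_minor(wheel_rocm) == parse_major_minor(installed_rocm)


def wheel_index_url(rocm_version: str) -> str:
    """Build the pip index URL for a ROCm version (hypothetical helper)."""
    major, minor = parse_major_minor(rocm_version)
    return f"https://download.pytorch.org/whl/rocm{major}.{minor}"


if __name__ == "__main__":
    # A wheel built for ROCm 6.0 is compared against three installed versions.
    for installed in ("6.0.1", "6.1", "5.7"):
        print(installed, rocm_versions_compatible("6.0", installed))
    print(wheel_index_url("5.7.0"))
```

A check like this could run in a build script before ``docker build``, so that the ``ROCM_VERSION`` build argument and the wheel index stay in sync. Whether a given index URL actually exists still has to be confirmed on the PyTorch site.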
diff --git a/docs/how-to/3rd-party/pytorch/index.rst b/docs/how-to/3rd-party/pytorch/index.rst new file mode 100644 index 00000000..26987cd3 --- /dev/null +++ b/docs/how-to/3rd-party/pytorch/index.rst @@ -0,0 +1,29 @@ +.. meta:: + :description: PyTorch+ROCm + :keywords: PyTorch, ROCm + +.. _pytorch-home: + +**************************************************************** +PyTorch +**************************************************************** + +`PyTorch `_ is an open-source tensor library designed for deep learning. PyTorch on +ROCm provides mixed-precision and large-scale training using our +`MIOpen `_ and +`RCCL `_ libraries. + +.. grid:: 2 + :gutter: 1 + + .. grid-item-card:: Install PyTorch + :link: how-to/3rd-party/pytorch/install + :link-type: doc + + PyTorch installation instructions + + .. grid-item-card:: PyTorch + Docker + :link: how-to/3rd-party/pytorch/docker + :link-type: doc + + Running PyTorch in Docker diff --git a/docs/how-to/3rd-party/pytorch/install.rst b/docs/how-to/3rd-party/pytorch/install.rst new file mode 100644 index 00000000..ca095f01 --- /dev/null +++ b/docs/how-to/3rd-party/pytorch/install.rst @@ -0,0 +1,288 @@ +.. meta:: + :description: PyTorch with ROCm + :keywords: installation instructions, PyTorch, AMD, ROCm + +********************************************************************************** +Installing PyTorch for ROCm +********************************************************************************** + +`PyTorch `_ is an open-source tensor library designed for deep learning. PyTorch on +ROCm provides mixed-precision and large-scale training using our +`MIOpen `_ and +`RCCL `_ libraries. + +Install from pre-built wheels +============================= + +PyTorch supports the ROCm platform by providing pre-built wheel packages for a variety of PyTorch and ROCm versions. + +.. 
Warning:: + + **Make sure the PyTorch version you install was compiled for your ROCm version!** If there is a mismatch, you may experience + performance degradation or errors. + +The major and minor versions must match, but different patch versions are acceptable. For example, PyTorch 2.2.1+ROCm6.0 will work with ROCm 6.0.1, but not with ROCm 6.1 or 5.7. + +To check your ROCm version: + +.. code-block:: shell + + $ rocm-smi -V + ROCM-SMI-LIB version: 5.7.0 + +Latest Stable +------------- + +To install the latest version: + +* Navigate to `pytorch.org/get-started/locally/ `_. +* Select ``Stable``, ``Linux``, ``Pip``, ``Python``, and ``ROCm`` + + * The ``ROCm`` box will indicate which ROCm version these wheels were built for + +You should see a ``pip3 install`` command: + +.. code-block:: shell + + pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.7 + + +.. Note:: + + **PyTorch typically only builds the latest stable builds against a single version of ROCm.** If this ROCm version + does not match your needs, see :ref:`torch-nightly` or :ref:`torch-additional-versions` below. + +.. Tip:: + + You should use a virtual environment when installing Python packages in order to separate dependencies + from the system and other projects. + +.. _torch-nightly: + +Nightly +------- + +In addition to stable wheels, PyTorch also publishes nightly builds, which *may* be built against a newer version of ROCm. Nightly builds can be a great way +to get access to newer features. + +To install from nightly: + +* Navigate to `pytorch.org/get-started/locally/ `_. +* Select ``Nightly``, ``Linux``, ``Pip``, ``Python``, and ``ROCm`` + +.. 
_torch-additional-versions: + +Additional Versions +------------------- + +In addition to the latest stable and nightly wheels, you can search for previous versions on `PyTorch's docs here `_. + +**Make sure to search for the appropriate ROCm version.** + +Install From Source +=================== + +If a pre-built wheel is not available to match your specific Python, PyTorch, and ROCm versions, +you can build and install PyTorch from source. See the `official build instructions `_ for details. + +Other Package Managers +======================= + +You can also use more sophisticated dependency management tools like PDM and Poetry. These tools provide several benefits over ``pip``, including +automatic creation of virtual environments, complex dependency resolution, lockfiles, and more. + +`PDM `_ +------------------------------------------- + +* Add the ``index-url`` from above as a `source `_ by adding the following lines to your ``pyproject.toml`` file: + + .. code-block:: + + [[tool.pdm.source]] + name = "torch-index" # You can give this any name + url = "https://download.pytorch.org/whl/rocm5.7/" + +* Add dependencies + + .. code-block:: shell + + pdm add torch ... + +PDM will then first look in the provided source to install any package, before falling back to `pypi.org `_. + +`Poetry `_ +---------------------------------------------- + +* Add the ``index-url`` from above as a `source `_: + + .. code-block:: + + poetry source add torch-index https://download.pytorch.org/whl/rocm5.7 + +* Add dependencies, and specify the source: + + .. code-block:: + + poetry add torch --source torch-index + +[Optional] Installing pre-compiled MIOpen kernels +=================================================== + +PyTorch uses `MIOpen `_ for machine learning +primitives, which are compiled into kernels at runtime. Runtime compilation causes a small warm-up +phase when starting PyTorch, and MIOpen kdb files contain precompiled kernels that can speed up +application warm-up phases. 
+ +MIOpen kdb files can be used with ROCm PyTorch wheels. However, the kdb files need to be placed in +a specific location with respect to the PyTorch installation path. A helper script simplifies this task by +taking the ROCm version and GPU architecture as inputs. This works for Ubuntu and CentOS. + +.. note:: + + Installing pre-compiled MIOpen kernels can speed up warm-up, but will not affect performance after the + initial warm-up. Additionally, as MIOpen caches kernels, this warm-up cost is only paid once. + +To install MIOpen kdb files for PyTorch, run: + +.. code-block:: shell + + wget https://raw.githubusercontent.com/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh + + # Optional; replace 'gfx90a' with your architecture and 5.6 with your preferred ROCm version + export GFX_ARCH=gfx90a + export ROCM_VERSION=5.6 + + ./install_kdb_files_for_pytorch_wheels.sh + +Further reading: + +* `MIOpen Docs `_ +* `MIOpen repo `_ +* `Installing pre-compiled MIOpen kernels `_ +* `Using MIOpen kdb files with PyTorch Wheels `_ + +Testing the PyTorch installation +================================= + +You can use PyTorch unit tests to validate your PyTorch installation. + +If you want to manually run unit tests to validate your PyTorch installation fully, follow these steps: + +1. Import the torch package in Python to test if PyTorch is installed and accessible. + + .. note:: + + Do not run the following command in the PyTorch git folder. + + .. code-block:: bash + + python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure' + +2. Check if the GPU is accessible from PyTorch. In the PyTorch framework, ``torch.cuda`` is a generic way + to access the GPU. On ROCm builds, it reports an AMD GPU only if one is available. + + .. code-block:: bash + + python3 -c 'import torch; print(torch.cuda.is_available())' + + +3. Run unit tests to validate the PyTorch installation fully. + + .. 
note:: + + You must run the following command from the PyTorch home directory. + + .. code-block:: bash + + PYTORCH_TEST_WITH_ROCM=1 python3 test/run_test.py --verbose \ + --include test_nn test_torch test_cuda test_ops \ + test_unary_ufuncs test_binary_ufuncs test_autograd + + This command ensures that the required environment variable is set to skip certain unit tests for + ROCm. This also applies to wheel installs in a non-controlled environment. + + .. note:: + + Make sure your PyTorch source code corresponds to the PyTorch wheel or the installation in the + Docker image. Incompatible PyTorch source code can give errors when running unit tests. + + Some tests may be skipped, as appropriate, based on your system configuration. ROCm doesn't + support all PyTorch features; tests that evaluate unsupported features are skipped. Other tests might + be skipped, depending on the host or GPU memory and the number of available GPUs. + + If the compilation and installation are correct, all tests will pass. + +4. Run individual unit tests. + + .. code-block:: bash + + PYTORCH_TEST_WITH_ROCM=1 python3 test/test_nn.py --verbose + + You can replace ``test_nn.py`` with any other test set. + +Running a basic PyTorch example +================================ + +The PyTorch examples repository provides basic examples that exercise the functionality of your +framework. + +Two of our favorite testing databases are: + +* **MNIST** (Modified National Institute of Standards and Technology): A database of handwritten + digits that can be used to train a Convolutional Neural Network for **handwriting recognition**. +* **ImageNet**: A database of images that can be used to train a network for + **visual object recognition**. + +MNIST PyTorch example +------------------------ + +1. Clone the PyTorch examples repository. + + .. code-block:: bash + + git clone https://github.com/pytorch/examples.git + +2. Go to the MNIST example folder. + + .. code-block:: bash + + cd examples/mnist + +3. 
Follow the instructions in the ``README.md`` file in this folder to install the requirements. Then run: + + .. code-block:: bash + + python3 main.py + + This generates the following output: + + .. code-block:: + + ... + Train Epoch: 14 [58240/60000 (97%)] Loss: 0.010128 + Train Epoch: 14 [58880/60000 (98%)] Loss: 0.001348 + Train Epoch: 14 [59520/60000 (99%)] Loss: 0.005261 + + Test set: Average loss: 0.0252, Accuracy: 9921/10000 (99%) + +ImageNet PyTorch example +---------------------------- + +1. Clone the PyTorch examples repository (if you didn't already do this in the preceding MNIST + example). + + .. code-block:: bash + + git clone https://github.com/pytorch/examples.git + +2. Go to the ImageNet example folder. + + .. code-block:: bash + + cd examples/imagenet + +3. Follow the instructions in the ``README.md`` file in this folder to install the Requirements. Then run: + + .. code-block:: bash + + python3 main.py diff --git a/docs/how-to/native-install/post-install.rst b/docs/how-to/native-install/post-install.rst index cfeb04c6..3c77ccd3 100644 --- a/docs/how-to/native-install/post-install.rst +++ b/docs/how-to/native-install/post-install.rst @@ -6,9 +6,9 @@ Post-installation instructions ************************************************************************* -1. Configure the system linker. +#. Configure the system linker. - Instruct the system linker where to find shared objects (``.so`` files) for ROCm applications. + Instruct the system linker where to find shared objects (``.so``-files) for ROCm applications. .. code-block:: bash @@ -17,8 +17,13 @@ Post-installation instructions /opt/rocm/lib64 EOF sudo ldconfig + sudo tee --append /etc/ld.so.conf.d/rocm.conf <`. + * Confirm that your Linux distribution matches a :ref:`supported distribution`. **Example:** Running the preceding command on an Ubuntu system produces the following output: .. 
code-block:: shell - x86_64 - DISTRIB_ID=Ubuntu - DISTRIB_RELEASE=20.04 - DISTRIB_CODENAME=focal - DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS" + x86_64 + DISTRIB_ID=Ubuntu + DISTRIB_RELEASE=20.04 + DISTRIB_CODENAME=focal + DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS" -2. Verify the kernel version. +#. Verify the kernel version. * To check the kernel version of your Linux system, type the following command: .. code-block:: shell - uname -srmv + uname -srmv **Example:** The preceding command lists the kernel version in the following format: .. code-block:: shell - Linux 5.15.0-46-generic #44~20.04.5-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022 x86_64 + Linux 5.15.0-46-generic #44~20.04.5-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022 x86_64 * Confirm that your kernel version matches the system requirements, as listed in :ref:`supported_distributions`. @@ -72,7 +72,7 @@ instructions for your distribution. .. tab-item:: RHEL/OL {{ os_release }} - .. code-block:: shell + .. code-block:: shell wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-{{ os_release }}.noarch.rpm sudo rpm -ivh epel-release-latest-{{ os_release }}.noarch.rpm diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in index bf85b9b8..4caecdab 100644 --- a/docs/sphinx/_toc.yml.in +++ b/docs/sphinx/_toc.yml.in @@ -33,16 +33,26 @@ subtrees: - caption: Install entries: - - file: how-to/3rd-party/pytorch-install + - file: how-to/3rd-party/pytorch/index + title: PyTorch + subtrees: + - entries: + - file: how-to/3rd-party/pytorch/install + title: Install PyTorch + - file: how-to/3rd-party/pytorch/docker + title: PyTorch + Docker + + + - file: how-to/3rd-party/pytorch-install-old title: PyTorch - file: how-to/3rd-party/tensorflow-install title: TensorFlow - file: how-to/3rd-party/jax-install title: JAX - file: how-to/3rd-party/magma-install - title: Magma - -- caption: How-to + title: MAGMA + +- caption: How to entries: - file: how-to/docker title: Run Docker containers diff --git 
a/docs/sphinx/requirements.txt b/docs/sphinx/requirements.txt index 178b92c7..e35cf773 100644 --- a/docs/sphinx/requirements.txt +++ b/docs/sphinx/requirements.txt @@ -1,5 +1,5 @@ # -# This file is autogenerated by pip-compile with Python 3.8 +# This file is autogenerated by pip-compile with Python 3.10 # by the following command: # # pip-compile requirements.in @@ -49,10 +49,6 @@ idna==3.4 # via requests imagesize==1.4.1 # via sphinx -importlib-metadata==7.0.0 - # via sphinx -importlib-resources==6.1.1 - # via rocm-docs-core jinja2==3.1.2 # via # myst-parser @@ -158,7 +154,3 @@ urllib3==1.26.13 # via requests wrapt==1.14.1 # via deprecated -zipp==3.17.0 - # via - # importlib-metadata - # importlib-resources