Docker tweaks (#716)
* Omit the checkpoints from the Docker context

This speeds up the build of the Docker image.

Signed-off-by: Alastair D'Silva <[email protected]>

* Build the megatron fused kernels during the Docker build

Signed-off-by: Alastair D'Silva <[email protected]>

* Downgrade protobuf

This solves the following problem:
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.

Signed-off-by: Alastair D'Silva <[email protected]>
deece authored Nov 18, 2022
1 parent fe21c3e commit 028df0a
Showing 2 changed files with 8 additions and 1 deletion.
1 change: 1 addition & 0 deletions .dockerignore
@@ -0,0 +1 @@
+20B_checkpoints/
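
To illustrate why excluding `20B_checkpoints/` shrinks the build context, here is a minimal sketch that sums file sizes under a directory with and without an ignored top-level directory. It is only an approximation: the function names `context_size` and `demo` are hypothetical, the file sizes are made up for the demonstration, and real `.dockerignore` files support glob patterns and negation that this sketch does not handle.

```python
import os
import tempfile


def context_size(root, ignored_dirs=()):
    """Sum file sizes under root, skipping the named top-level directories.

    Assumption: only plain top-level directory names (such as
    '20B_checkpoints/') are supported, not full .dockerignore patterns.
    """
    ignored = {name.rstrip("/") for name in ignored_dirs}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Prune ignored directories in place so os.walk never enters them.
            dirnames[:] = [d for d in dirnames if d not in ignored]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def demo():
    """Build a throwaway tree and return (size without ignore, size with ignore)."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "20B_checkpoints"))
        with open(os.path.join(root, "20B_checkpoints", "weights.bin"), "wb") as f:
            f.write(b"\0" * 1000)  # stand-in for a large checkpoint file
        with open(os.path.join(root, "Dockerfile"), "wb") as f:
            f.write(b"x" * 10)
        return context_size(root), context_size(root, ["20B_checkpoints/"])
```

Running `context_size(".", ["20B_checkpoints/"])` from a checkout that contains downloaded checkpoints would give a rough idea of how much data no longer has to be sent to the Docker daemon.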
8 changes: 7 additions & 1 deletion Dockerfile
@@ -78,11 +78,17 @@ RUN pip install torch==1.8.1+cu111 -f https://download.pytorch.org/whl/torch_sta
COPY requirements/requirements.txt .
COPY requirements/requirements-onebitadam.txt .
COPY requirements/requirements-sparseattention.txt .
-RUN pip install -r requirements.txt && pip install -r requirements-onebitadam.txt && pip install -r requirements-sparseattention.txt && pip cache purge
+RUN pip install -r requirements.txt && pip install -r requirements-onebitadam.txt && \
+pip install -r requirements-sparseattention.txt && \
+pip install protobuf==3.20.* && \
+pip cache purge

## Install APEX
RUN pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" git+https://github.com/NVIDIA/apex.git@a651e2c24ecf97cbf367fd3f330df36760e1c597

+COPY megatron/ megatron
+RUN python megatron/fused_kernels/setup.py install

# Clear staging
RUN mkdir -p /tmp && chmod 0777 /tmp
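
As a rough way to confirm that the `protobuf==3.20.*` pin took effect inside the image, the sketch below compares an installed version string against a prefix pin. The helper names `matches_pin` and `installed_protobuf_ok` are hypothetical, the check assumes Python 3.8 or newer (for `importlib.metadata`), and it only approximates pip's prefix matching: it ignores pre-release and version-normalization rules.

```python
from importlib import metadata


def matches_pin(version, pin="3.20.*"):
    """Return True if version falls under a prefix pin such as '3.20.*'."""
    if not pin.endswith(".*"):
        raise ValueError("only prefix pins ending in '.*' are supported")
    prefix = pin[:-2].split(".")  # '3.20.*' -> ['3', '20']
    return version.split(".")[: len(prefix)] == prefix


def installed_protobuf_ok(pin="3.20.*"):
    """Check the installed protobuf package; return None if it is not installed."""
    try:
        return matches_pin(metadata.version("protobuf"), pin)
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_protobuf_ok()` in the built container should return `True` if the downgrade was applied; a `False` result would suggest that a later install step pulled in a different protobuf release.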
