Adds neuron support #486

Merged
merged 5 commits into from
Dec 3, 2024

Conversation

michaelfeil
Owner

Related Issue

Checklist

  • I have read the CONTRIBUTING guidelines.
  • I have added tests to cover my changes.
  • I have updated the documentation (docs folder) accordingly.

Additional Notes

Add any other context about the PR here.

Contributor

@greptile-apps greptile-apps bot left a comment

PR Summary

This PR adds AWS Neuron support to the Infinity project, enabling deployment on AWS Inferentia hardware through Docker containers and ECS tasks with optimized model inference.

  • Added NeuronOptimumEmbedder in /libs/infinity_emb/infinity_emb/transformer/embedder/neuron.py with dynamic batch size support and proper core detection
  • Added AWS Neuron base Dockerfile /infra/aws_neuron/Dockerfile.base with neuronx packages and runtime configuration
  • Added deployment instructions in /infra/aws_neuron/README.md for EC2 and ECS with the Hugging Face AMI
  • Added proper device mounting and IPC configuration in ECS task definition for Neuron accelerator access
  • Added version-pinned dependencies in reqs_frozen.txt for reproducible Neuron builds

7 file(s) reviewed, 19 comment(s)

Comment on lines +48 to +53
# RUN pip3 install \
# neuronx-cc==2.15.143.0 \
# torch-neuronx==2.1.2.2.3.2 \
# transformers-neuronx==0.12.313 \
# libneuronxla==2.0.5347.0 \
# --extra-index-url=https://pip.repos.neuron.amazonaws.com

style: Remove the commented-out code block; it is obsolete and could cause confusion

RUN pip3 install --upgrade \
neuronx-cc==2.* \
libneuronxla==2.0.5347.0 \
torch-neuronx==2.1.2.2.2.0 \

logic: torch-neuronx version 2.1.2.2.2.0 is older than the commented-out version 2.1.2.2.3.2 above; verify that this downgrade is intentional
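
The ordering claimed in this comment can be checked by comparing the two version strings component by component. The following is a minimal sketch, assuming purely numeric, dot-separated versions (true for the two strings quoted here, not for version schemes in general); the helper name `version_key` is hypothetical and not part of the PR:

```python
def version_key(version: str) -> tuple[int, ...]:
    """Turn a purely numeric dotted version into a tuple that sorts correctly."""
    return tuple(int(part) for part in version.split("."))


pinned = "2.1.2.2.2.0"      # version installed by the active RUN line
commented = "2.1.2.2.3.2"   # version in the commented-out block

# True when the active pin sorts below the commented-out one.
is_downgrade = version_key(pinned) < version_key(commented)
print(is_downgrade)
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits, since the string "10" sorts before "9".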

Comment on lines +68 to +72
# COPY reqs_frozen.txt reqs_frozen.txt
# RUN pip3 install -r reqs_frozen.txt
# Install optimum-neuron
#14 19.70 Successfully installed aiohappyeyeballs-2.4.4 aiohttp-3.11.9 aiosignal-1.3.1 async-timeout-5.0.1 attrs-24.2.0 coloredlogs-15.0.1 datasets-3.1.0 dill-0.3.8 frozenlist-1.5.0 fsspec-2024.9.0 humanfriendly-10.0 multidict-6.1.0 multiprocess-0.70.16 optimum-1.18.0 optimum-neuron-0.0.1 pandas-2.2.3 propcache-0.2.1 pyarrow-18.1.0 pytz-2024.2 requests-2.32.3 sentencepiece-0.2.0 tokenizers-0.15.2 transformers-4.39.3 tzdata-2024.2 xxhash-3.5.0 yarl-1.18.3
# RUN pip3 install optimum[neuronx] --extra-index-url=https://pip.repos.neuron.amazonaws.com

style: Remove commented installation commands and build output logs

ENV PATH="/opt/bin/:/opt/aws/neuron/bin:${PATH}"

FROM neuron AS infinity
RUN apt-get update -y && apt-get install -y nano

style: Installing the nano editor increases image size unnecessarily; remove it unless it is needed in production

Comment on lines +1 to +2
# Is an mirror of
# 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference-neuronx:2.1.2-transformers4.43.2-neuronx-py310-sdk2.20.0-ubuntu20.04

syntax: typo in comment: 'Is an mirror' should be 'Is a mirror'

optimum[neuronx]==1.22.0
orjson==3.10.7
overrides==7.7.0
packaging

logic: the packaging package is missing a version pin, which could cause dependency conflicts

Comment on lines +23 to +25
hf-transfer
httptools==0.6.4 ; python_version >= "3.9" and python_version < "4"
huggingface-hub

logic: the hf-transfer and huggingface-hub packages are missing version constraints. Pin specific versions to ensure reproducible builds.

mpmath==1.3.0 ; python_version >= "3.9" and python_version < "4"
multidict==6.1.0 ; python_version >= "3.9" and python_version < "4"
multiprocess==0.70.15 ; python_version >= "3.9" and python_version < "4"
numpy

logic: the numpy package is missing a version constraint. It should be pinned to ensure compatibility with the other dependencies.

Comment on lines +55 to +56
scikit-learn
scipy

logic: the scikit-learn and scipy packages are missing version constraints. They should be pinned for reproducibility.
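
Unpinned entries like the ones flagged in the comments on reqs_frozen.txt can be found mechanically. The sketch below is one possible check, not tooling from this PR: the function name `find_unpinned` and the regular expression are assumptions, and it only treats an exact `==` pin as "pinned". The sample text reuses lines quoted in the review.

```python
import re

# Matches an exact pin such as "orjson==3.10.7"; extras like "[neuronx]" are allowed.
PIN_PATTERN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;]+")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return the names of requirements that carry no '==' pin."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):      # skip blanks and pip options
            continue
        if not PIN_PATTERN.match(line):
            # Keep only the package name for the report.
            name = re.split(r"[\s;<>=!~\[]", line, maxsplit=1)[0]
            unpinned.append(name)
    return unpinned


sample = """\
optimum[neuronx]==1.22.0
orjson==3.10.7
packaging
hf-transfer
httptools==0.6.4 ; python_version >= "3.9" and python_version < "4"
huggingface-hub
numpy
scikit-learn
scipy
"""
print(find_unpinned(sample))
```

A check of this kind could run in CI against the frozen file, so that a regenerated lock file with missing pins fails early instead of surfacing as a drifting Docker build.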

aiohappyeyeballs==2.4.3 ; python_version >= "3.9" and python_version < "4"
aiohttp==3.10.10 ; python_version >= "3.9" and python_version < "4"
aiosignal==1.3.1 ; python_version >= "3.9" and python_version < "4"
async-timeout==4.0.3 ; python_version >= "3.9" and python_version < "3.11"

style: async-timeout is restricted to Python < 3.11 while the other packages support up to Python 4. This may cause issues in Python 3.11+ environments.
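
The text after the semicolon on each of these lines is an environment marker: pip installs the requirement only on interpreters where the marker evaluates to true and skips it otherwise. A plausible reason for the narrower range on async-timeout (an assumption, not confirmed anywhere in this PR) is that `asyncio.timeout` entered the standard library in Python 3.11, so the backport is not needed there. The sketch below evaluates only the simple `python_version <op> "X.Y"` clauses joined by `and` that appear in the quoted lines; the function name `marker_applies` is hypothetical, and real tooling should rely on a full marker parser.

```python
import operator
import re

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}
CLAUSE = re.compile(r'python_version\s*(<=|>=|==|!=|<|>)\s*"([0-9.]+)"')


def marker_applies(marker: str, python_version: str) -> bool:
    """Evaluate 'python_version <op> "X.Y"' clauses joined by 'and'."""
    current = tuple(int(p) for p in python_version.split("."))
    for clause in marker.split(" and "):
        match = CLAUSE.fullmatch(clause.strip())
        if match is None:
            raise ValueError(f"unsupported marker clause: {clause!r}")
        op, bound = match.groups()
        if not OPS[op](current, tuple(int(p) for p in bound.split("."))):
            return False
    return True


marker = 'python_version >= "3.9" and python_version < "3.11"'
for version in ("3.9", "3.10", "3.11", "3.12"):
    print(version, marker_applies(marker, version))
```

Under this reading, the line is skipped on Python 3.11 and later rather than causing an install failure, although that is worth verifying against the dependency that pulls in async-timeout.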

@michaelfeil michaelfeil merged commit 1bc513b into main Dec 3, 2024
36 checks passed
@michaelfeil michaelfeil deleted the neuron branch December 3, 2024 05:58
@codecov-commenter

codecov-commenter commented Dec 3, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.48%. Comparing base (dd72f23) to head (6165161).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #486      +/-   ##
==========================================
- Coverage   79.54%   79.48%   -0.06%     
==========================================
  Files          41       41              
  Lines        3422     3422              
==========================================
- Hits         2722     2720       -2     
- Misses        700      702       +2     

2 participants