Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update deprecated/unused dependencies 🧹 🧹 #36419

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

gante
Copy link
Member

@gante gante commented Feb 26, 2025

What does this PR do?

Updates deprecated (because of our minimum versions, like python>=3.9 or torch>=2.0) and unused dependency-related code.

  • Removes dependencies/maximum version pins from _deps (each change is commented in this PR with further information about why we can make the change)
  • Removes unused is_xxx_available and requires_xxx functions
  • Updates code that can no longer be reached or is redundant due to newer versions (like code to check that we are using torch>=2.0 or dynamo+nvfuser)
  • Other minor dependency-related changes

✅ CI is green on dev docker images (failing TF tests are unrelated, and a result of [test-all]). dev image PR: #36427


Post merge:

  • Update CI image docs on notion (commit message now doesn't need to be exactly dev-ci, it only needs to include dev-ci)

@gante gante changed the title up until blobfile Update deprecated/unused dependencies Feb 26, 2025
Comment on lines -78 to -91
# Remove stale transformers.egg-info directory to avoid https://github.com/pypa/pip/issues/5466
stale_egg_info = Path(__file__).parent / "transformers.egg-info"
if stale_egg_info.exists():
print(
(
"Warning: {} exists.\n\n"
"If you recently updated transformers to 3.0 or later, this is expected,\n"
"but it may prevent transformers from installing in editable mode.\n\n"
"This directory is automatically generated by Python's packaging tools.\n"
"I will remove it now.\n\n"
"See https://github.com/pypa/pip/issues/5466 for details.\n"
).format(stale_egg_info)
)
shutil.rmtree(stale_egg_info)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The root issue was sorted and added to pip 20.1, released in early 2020. Our minimum python version is 3.9, released in late 2020, and it comes with a pip version more recent than 20.1.

# IMPORTANT:
# 1. all dependencies should be listed here with their version requirements if any
# 2. once modified, run: `make deps_table_update` to update src/transformers/dependency_versions_table.py
_deps = [
"Pillow>=10.0.1,<=15.0",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(moved, out of alphabetic order)

@@ -345,6 +327,7 @@ def run(self):
)
+ extras["retrieval"]
+ extras["modelcreation"]
+ extras["tiktoken"]
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed to test py.test tests/models/llama/test_tokenization_llama.py::TikTokenIntegrationTests, so it makes sense to be in the testing extra

Some of our CI images rely on installing testing (e.g. see scheduled daily torch tests), and were not running those tests as a result

Comment on lines +847 to +848
@require_tiktoken
@require_read_token
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@require_read_token behaves slightly different from other @require_xxx: it doesn't add a @unittest.skipUnless. On my machine these tests were not being run, and I can't find them on CI either (but perhaps I'm not looking in the right place)

"accelerate>=0.26.0",
"av",
"beautifulsoup4",
"blobfile",
"codecarbon>=2.8.1",
"cookiecutter==1.7.3",
"dataclasses",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataclasses is part of the standard library since python 3.7: https://peps.python.org/pep-0557/

"datasets!=2.5.0",
"deepspeed>=0.9.3",
"diffusers",
"dill<0.3.5",
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dill was pinned to sort example tests (#17368) -- removing to test whether it is still needed

"evaluate>=0.2.0",
"faiss-cpu",
"fastapi",
"filelock",
"flax>=0.4.1,<=0.7.0",
"fsspec<2023.10.0",
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fsspec was pinned to fix CI (#27010) -- removing to test whether it is still needed

"ftfy",
"fugashi>=1.0",
"GitPython<3.1.19",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pin on GitPython was added to fix CI (#12858) -- removing to test whether it is still needed

@@ -130,11 +108,10 @@
"keras>2.9,<2.16",
"keras-nlp>=0.3.1,<0.14.0", # keras-nlp 0.14 doesn't support keras 2, see pin on keras.
"librosa",
"natten>=0.14.6,<0.15.0",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pin on GitPython was added to fix CI (#28432) -- removing to test whether it is still needed

"nltk<=3.8.1",
"num2words",
"numpy>=1.17",
"onnxconverter-common",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onnxconverter-common (library name) nor onnxconverter_common (corresponding import) are present in our library

"nltk<=3.8.1",
"num2words",
"numpy>=1.17",
"onnxconverter-common",
"onnxruntime-tools>=1.4.2",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

"protobuf",
"psutil",
"pyyaml>=5.1",
"pydantic",
"pytest>=7.2.0,<8.0.0",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pin on pytest was added to fix CI (#28758) -- removing to test whether it is still needed

@@ -158,7 +135,6 @@
"requests",
"rhoknp>=1.1.0,<1.3.1",
"rjieba",
"rouge-score!=0.0.7,!=0.0.8,!=0.1,!=0.1.1",
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no references to rouge-score (library name) nor rouge_score (import name) in our library

It was pinned at some point (#18247), but the actual dependency predates this PR

Comment on lines -196 to -197
"libcst",
"rich",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(moved, out of alphabetical order)

@@ -192,9 +169,6 @@
"unidic_lite>=1.0.7",
"urllib3<2.0.0",
"uvicorn",
"pytest-rich",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no references to pytest-rich (library name) nor --rich (the flag it enables) in our library

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante changed the title Update deprecated/unused dependencies Update deprecated/unused dependencies 🧹 🧹 Feb 26, 2025
@@ -67,7 +66,6 @@ def check_forward(test_module, model, batch_size=1, context_size=1024):
test_module.assertEqual(out.shape[1], context_size)


@require_torchao
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require_torchao_version_greater_or_equal implies require_torchao

optimizing memory utilization, speeding up the training, or both. If you'd like to understand how GPU is utilized during
training, please refer to the [Model training anatomy](model_memory_anatomy) conceptual guide first. This guide
focuses on practical techniques.
This guide demonstrates practical techniques that you can use to increase the efficiency of your model's training by
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry, my IDE clears trailing whitespeces 🙈 the actual diff is L392-393)

Comment on lines -392 to -393
* `dynamo.optimize("nvfuser")` - nvFuser with TorchScript. [Read more](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
* `dynamo.optimize("aot_nvfuser")` - nvFuser with AotAutograd. [Read more](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvfuser was removed: pytorch/pytorch#105789

Comment on lines -319 to -320
* `dynamo.optimize("nvfuser")` - nvFuser with TorchScriptを使用します。 [詳細はこちら](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
* `dynamo.optimize("aot_nvfuser")` - nvFuser with AotAutogradを使用します。 [詳細はこちら](https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-1-nvfuser-and-its-primitives/593)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvfuser was removed: pytorch/pytorch#105789

@@ -733,7 +733,7 @@ class TrainingArguments:
distributed training. Important: this will negatively impact the performance, so only use it for debugging.
torchdynamo (`str`, *optional*):
If set, the backend compiler for TorchDynamo. Possible choices are `"eager"`, `"aot_eager"`, `"inductor"`,
`"nvfuser"`, `"aot_nvfuser"`, `"aot_cudagraphs"`, `"ofi"`, `"fx2trt"`, `"onnxrt"` and `"ipex"`.
Copy link
Member Author

@gante gante Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvfuser was removed: pytorch/pytorch#105789

(note that this does not remove support, only removes it from the docs -- this feature was deprecated in torch 2.1 and removed in torch 2.2)

@@ -3990,91 +3988,12 @@ def test_torchdynamo_full_eval(self):
del trainer
torchdynamo.reset()

# 3. TorchDynamo nvfuser
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvfuser was removed: pytorch/pytorch#105789

@unittest.skip(reason="torch 2.0.0 gives `ModuleNotFoundError: No module named 'torchdynamo'`.")
@require_torch_non_multi_gpu
@require_torchdynamo
def test_torchdynamo_memory(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test compares nvfuser to eager, nvfuser was removed: pytorch/pytorch#105789

(the test was also being skipped 👀 )

@@ -84,7 +84,7 @@ def __post_init__(self):
else:
# BIG HACK WILL REMOVE ONCE FETCHER IS UPDATED
print(os.environ.get("GIT_COMMIT_MESSAGE"))
if "[build-ci-image]" in os.environ.get("GIT_COMMIT_MESSAGE", "") or os.environ.get("GIT_COMMIT_MESSAGE", "") == "dev-ci":
if "[build-ci-image]" in os.environ.get("GIT_COMMIT_MESSAGE", "") or "dev-ci" in os.environ.get("GIT_COMMIT_MESSAGE", ""):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can do [test-all] dev-ci 🤗

@@ -431,7 +395,7 @@ def run(self):
deps["numpy"],
deps["packaging"], # utilities from PyPA to e.g., compare versions
deps["pyyaml"], # used for the model cards metadata
deps["regex"], # for OpenAI GPT
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inaccurate comment, it is now used in more places

@gante gante marked this pull request as ready for review February 26, 2025 18:04
@gante gante requested a review from ydshieh February 26, 2025 18:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants