
ZeroShotClassificationArgumentHandler should be explicit that it has somewhat unsafe internal behaviour #35874

Open
nicolasdalsass opened this issue Jan 24, 2025 · 4 comments · May be fixed by #35886
Labels
Feature request (Request for a new feature)

Comments

@nicolasdalsass

nicolasdalsass commented Jan 24, 2025

Feature request

Currently, ZeroShotClassificationArgumentHandler.__call__ will execute https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/zero_shot_classification.py#L41 , that is, it will call Python's str.format() on the provided hypothesis template to insert the label into it, while allowing the full str.format() placeholder syntax, which is quite broad.

For example, passing hypothesis_template = "{:>9999999999}" with any label will happily eat 100 GB of RAM, because the whole scope of Python formatting is allowed.
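
For illustration, here is a minimal sketch of the mechanism, using a small width so it is safe to run (the label "politics" is just an example):

```python
# str.format() honours the full format spec, including padding widths.
template = "{:>40}"                      # right-align the label in a field of width 40
print(len(template.format("politics")))  # -> 40: the label plus padding spaces

# With hypothesis_template = "{:>9999999999}", the same .format() call tries to
# build a string of roughly 10^10 padding characters, which is what exhausts RAM.
```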

This is not made clear anywhere, but library users need to know that they have to sanitize these inputs very carefully.

I think that at least the docstring of the class, and ideally the reference documentation for "hypothesis_template" at https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.zero_shot_classification , should be updated to mention this. It is quite important for users of the library, in particular because this parameter will naturally tend to be user-facing in the end.

Alternatively, this call could accept only {} as a placeholder; it is hard to see a legitimate use case for exotic formatting of labels in the hypothesis template.
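
As a rough sketch of what "only {}" could look like (not the actual change, and the helper name is made up), string.Formatter can be used to reject anything other than a bare placeholder:

```python
import string

def check_hypothesis_template(template: str) -> None:
    """Hypothetical helper: reject anything other than plain '{}' placeholders."""
    for _, field_name, format_spec, conversion in string.Formatter().parse(template):
        if field_name is None:
            continue  # a chunk of literal text with no placeholder
        if field_name != "" or format_spec or conversion:
            raise ValueError("hypothesis_template may only contain plain '{}' placeholders")

check_hypothesis_template("This example is {}.")  # ok
# check_hypothesis_template("{:>9999999999}")     # would raise ValueError
```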

Thanks :-)

Motivation

I think it's good to help the internet be a safer place in general :-)

Your contribution

It's unclear to me whether I can contribute to the documentation on huggingface.co.

I could contribute a fix to transformers that is stricter about the allowed hypothesis_template, though, if you want to take this route (I'm pretty sure even an AI model could contribute the two lines needed...).

@nicolasdalsass added the Feature request label on Jan 24, 2025
@Rocketknight1
Member

Hi @nicolasdalsass, this is a good point! I don't think the intention here was to allow the full range of Python formatting behaviour; it was just intended as a simple string insertion. If you'd like to make the PR to tighten up security here, I think we'd be happy to review/accept it. It's up to you whether you use an AI or write it yourself 😅

@sambhavnoobcoder

@Rocketknight1 saw this issue and raised a fix in PR #35886. Hope you can review it and merge it if the issue is resolved. I'll address any comments you have on it as well if required.

Thank you @nicolasdalsass for raising the issue. 🤗

@nicolasdalsass
Author

@sambhavnoobcoder Thanks for the very quick PR on the issue :-) I had a look; only allowing {} seems good and safe to me :-)

@sambhavnoobcoder

sambhavnoobcoder commented Jan 25, 2025

It was my pleasure. I've been actively looking through open issues in the transformers library and contributing PRs where I can help. Looking forward to seeing this merged :-)
