-
-
Notifications
You must be signed in to change notification settings - Fork 5.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bugfix] Multi-sequence broken #11898
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
vllm/sequence.py
Outdated
n = self.sampling_params.n | ||
assert isinstance(n, int) | ||
if n > self.num_seqs(): | ||
# At prompt stage, the sequence group is not yet filled up | ||
# and only have one sequence running. However, in the | ||
# generation stage, we will have `n` sequences | ||
# running. | ||
return n | ||
# At sampling stages, return the number of actual sequences | ||
# that are not finished yet. | ||
return self.num_seqs() - self.num_finished_seqs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when will we hit this? I think the engine will only see single-sequence request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you construct the output when n > 1
you access the "master group".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, you construct the RequestOutput with multiple sequences here:
Line 180 in ef725fe
return cls.from_seq_group(assembled_seq_group, use_cache, |
Then call master_seq_group.is_finished()
here:
Line 170 in ef725fe
finished = seq_group.is_finished() |
Which currently already becomes True
when the first sequence terminates (regardless of whether the other sequences has terminated)
params = copy.deepcopy(original_params) | ||
params.n = 1 | ||
if params.seed is not None: | ||
params.seed += i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part makes sense to me.
@andylolu2 thanks for the fix! can you add a test case for |
7c31b9c
to
debff7f
Compare
@youkaichao I added new asserts in the current tests to ensure each sample in the same parallel-sampling group gives different results. |
@andylolu2 please fix the format. |
Woops, thanks for reminding |
Signed-off-by: Andy Lo <[email protected]>
Signed-off-by: Andy Lo <[email protected]>
Signed-off-by: Andy Lo <[email protected]>
@youkaichao Should be good now |
cases with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for fixing it!
|
||
# verify generations within the same parallel sampling group differ | ||
for output in outputs: | ||
for sub_output_a, sub_output_b in combinations(output, 2): | ||
assert sub_output_a != sub_output_b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this test be flaky? e.g. when we generate with n=10
, if two sequences happen to generate the same answer ...
@andylolu2 We have switched to use pre-commit, can you please update this PR with the latest changes in main? Thanks! |
Signed-off-by: Andy Lo <[email protected]> Signed-off-by: Bowen Wang <[email protected]>
Signed-off-by: Andy Lo <[email protected]>
Sorry about that, was caught up in other matters. Glad that it got merged :) |
Fixes the bugs introduced in #9569
SequenceGroup
does not necessarily contain only one sequence (e.g. whenn
> 1), so many of the optimisations don't make sense.n
> 1 with seed set, all completions give the same output.Andy@MistralAI