[Misc] Implement Singleton Design Pattern for EngineStat Scraper, RequestStat Monitor, and Router #131

Merged: gaocegege merged 2 commits into vllm-project:main on Feb 20, 2025


sitloboi2012
Contributor

FILL IN THE PR DESCRIPTION HERE

Enhance Singleton Design Pattern for EngineStat Scraper, RequestStat Monitor, and Router as mentioned in #86

BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE


  • Make sure the code changes pass the pre-commit checks.
  • Sign-off your commit by using -s when doing git commit
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].
  • Add test cases to test the router behavior after implementing the singleton design pattern
Detailed Checklist (Click to Expand)

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

  • [Bugfix] for bug fixes.
  • [CI/Build] for build or continuous integration improvements.
  • [Doc] for documentation fixes and improvements.
  • [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
  • [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
  • [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR needs to meet the following code quality standards:

  • Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
  • The code needs to be well-documented to ensure future contributors can easily understand the code.
  • Please include sufficient tests to ensure the change stays correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11, Shaoting-Feng or ApostaC.

@sitloboi2012
Contributor Author

@gaocegege please help me review this PR as mentioned in the Issue #86

@gaocegege
Collaborator

Hi, thanks for the PR. I'm curious if we could utilize the FastAPI state.

A state object for the application. This is the same object for the entire application; it doesn't change from request to request.

I applied it in this pull request. By doing so, we can avoid complex singleton code and leverage FastAPI's built-in mechanism.

@sitloboi2012
Contributor Author

Hmm, that's true, given that our project is a dedicated FastAPI-based production stack. I will re-implement this with the FastAPI state approach. Do you see any potential pitfalls, or a scenario where the custom singleton might still offer benefits? @gaocegege

@gaocegege
Collaborator

I think we can keep the singleton implementation and initialize it with state. This way, we keep the flexibility to replace FastAPI in the future #26 (comment)
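One way the hybrid idea could look is sketched below. It is an illustration under stated assumptions, not the code merged in this PR: `SingletonMeta`, `EngineStatsScraper`, `scrape_interval`, and `initialize_all` are hypothetical names, and `state` is a plain `SimpleNamespace` standing in for FastAPI's `app.state` so the sketch has no third-party dependency.

```python
import threading
from types import SimpleNamespace


class SingletonMeta(type):
    """Metaclass that returns one shared instance per class (framework-agnostic)."""

    _instances = {}
    _lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        # The lock keeps concurrent first calls from creating two instances.
        with cls._lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class EngineStatsScraper(metaclass=SingletonMeta):
    """Hypothetical stand-in for the engine stats scraper."""

    def __init__(self, scrape_interval: float = 10.0) -> None:
        self.scrape_interval = scrape_interval


def initialize_all(state) -> None:
    # Build the singleton once and also expose it on the state object,
    # so framework code can reach it without importing the class.
    state.engine_stats_scraper = EngineStatsScraper(scrape_interval=30.0)


# Assumption: in a FastAPI app this would be app.state.
state = SimpleNamespace()
initialize_all(state)

# Later calls ignore their arguments and return the existing instance.
same = EngineStatsScraper()
```

Because the singleton lives in plain Python, swapping the web framework would only change where `initialize_all` is called and what is passed as `state`.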

@sitloboi2012
Contributor Author

Yep, that's right. I just looked back at the roadmap, same as you 😆, and saw that we might have to work with Rust and Go as well. If we go with the current method, we won't need to come back and refactor the code later. It should also give people using the framework the flexibility to adapt and customize it if needed. @gaocegege

@sitloboi2012
Contributor Author

Let's go with the hybrid method. I will add the FastAPI state part to the PR as well.

@sitloboi2012 sitloboi2012 force-pushed the enh/singleton-stats-tracker branch from 73cd76e to 81e33fd Compare February 17, 2025 17:00
@sitloboi2012
Contributor Author

Hi @gaocegege, I updated the code to use both FastAPI state and the custom singleton, as we discussed. Please help me review the code again. Cheers 😄

@sitloboi2012
Contributor Author

Hi @gaocegege, any update or comment on this PR? 😄

gaocegege
gaocegege previously approved these changes Feb 19, 2025
Collaborator

@gaocegege gaocegege left a comment


Thanks! It is super helpful to implement the batch API support. /cc @Shaoting-Feng PTAL

@gaocegege
Collaborator

Please resolve the conflicts

@sitloboi2012
Contributor Author

@gaocegege Hi, I just updated the branch and fixed the conflicts. Please help me review again and merge; all tests pass.

@gaocegege gaocegege merged commit 5247c69 into vllm-project:main Feb 20, 2025
9 checks passed
@gaocegege
Collaborator

LGTM 👍
/lgtm

2 participants