feat: Offline batched inference based on OpenAI offline batching API #47

Open
gaocegege opened this issue Jan 31, 2025 · 3 comments
Labels
feature request New feature or request

Comments

@gaocegege
Collaborator

This issue is created to track the conversation about the offline batch API.

gaocegege@

I was wondering whether there are any plans to add OpenAI API-compatible offline batch support to the router. I saw a comment about it in vllm-project/vllm#1636 (comment), and it looks like this feature needs file interfaces for uploads, which might not fit well in vLLM itself. It could be a nice addition to the router, which would act as a bridge between users and vLLM.

ApostaC@

It seems there are multiple design choices for handling file uploads. I will add this to the roadmap (which will be released as an issue in the project tomorrow). Maybe we can have a more detailed discussion there?

simon-mo@

I also agree this can be a lightweight optional component in the stack, given that in K8s you can easily provision a persistent volume or mount s3-fuse.

@gaocegege
Collaborator Author

We could start by keeping the files in the local file system, especially since our router currently doesn't support multi-instance deployment. We can design a robust abstraction for file uploads to ensure extensibility, allowing us to support other storage backends (e.g., MinIO, S3) in the future, particularly for k8s environments.
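To make the idea concrete, here is a minimal sketch of what such an abstraction might look like. Everything in it is an assumption for discussion purposes: the `FileStorage` and `LocalFileStorage` names, the `save`/`load`/`delete` methods, and the `file-<hex>` ID scheme are hypothetical and not part of the router today.

```python
import os
import uuid
from abc import ABC, abstractmethod


class FileStorage(ABC):
    """Hypothetical storage interface; not the router's actual API."""

    @abstractmethod
    def save(self, content: bytes) -> str:
        """Store the content and return a generated file ID."""

    @abstractmethod
    def load(self, file_id: str) -> bytes:
        """Return the content stored under file_id, or raise KeyError."""

    @abstractmethod
    def delete(self, file_id: str) -> None:
        """Remove the file stored under file_id, or raise KeyError."""


class LocalFileStorage(FileStorage):
    """Keeps uploaded files as plain files under a single base directory."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir
        os.makedirs(base_dir, exist_ok=True)

    def _path(self, file_id: str) -> str:
        # Only accept IDs of the form generated by save(), so a caller
        # cannot escape base_dir with something like "../secret".
        if not file_id.startswith("file-") or not file_id[5:].isalnum():
            raise ValueError(f"invalid file ID: {file_id!r}")
        return os.path.join(self.base_dir, file_id)

    def save(self, content: bytes) -> str:
        file_id = f"file-{uuid.uuid4().hex}"
        with open(self._path(file_id), "wb") as f:
            f.write(content)
        return file_id

    def load(self, file_id: str) -> bytes:
        try:
            with open(self._path(file_id), "rb") as f:
                return f.read()
        except FileNotFoundError:
            raise KeyError(file_id) from None

    def delete(self, file_id: str) -> None:
        try:
            os.remove(self._path(file_id))
        except FileNotFoundError:
            raise KeyError(file_id) from None
```

With this shape, an S3 or MinIO backend would be another `FileStorage` subclass, and the router code that handles uploads would depend only on the base class.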

@ApostaC
Collaborator

ApostaC commented Jan 31, 2025

@gaocegege Thanks for leading the discussion here!

> We could start by keeping the files in the local file system, especially since our router currently doesn't support multi-instance deployment.

The proposal looks good to me! Would you like to take a stab at designing the concrete interface in the router?

@gaocegege
Collaborator Author

Yes! I'd be happy to.
