Issues: vllm-project/production-stack
#227 feature: Detect PII in HTTP request [feature request] (opened Mar 5, 2025 by rootfs)
#205 feature: Support LoRA loading for model deployments [feature request] (opened Mar 1, 2025 by ApostaC)
#204 feature: Support CRD based configuration [feature request] (opened Mar 1, 2025 by rootfs)
#201 bug: "POST /v1/chat/completions HTTP/1.1" 503 Service Unavailable [bug] (opened Feb 28, 2025 by corona10)
#193 bug: File Access Error with vllm using runai_streamer on OCP [bug] (opened Feb 27, 2025 by TamKez)
#186 feature: custom callback functionality in vllm-router [feature request, help wanted] (opened Feb 26, 2025 by pwuersch)
#184 feature: introduce pyproject.toml and use uv [feature request, good first issue, help wanted] (opened Feb 25, 2025 by bufferoverflow)
#178 feature: unify naming of production-stack, vllm-stack and vllm-router [discussion, feature request] (opened Feb 25, 2025 by bufferoverflow)
#172 feature: Terraform Quickstart Tutorials for Google GKE [feature request] (opened Feb 23, 2025 by falconlee236)
#167 feature: Terraform Quickstart Tutorials for Underlying Infrastructure [feature request] (opened Feb 21, 2025 by 0xThresh)
#166 Discussion - QPS routing when there are multiple router replicas [discussion, question] (opened Feb 21, 2025 by aishwaryaraimule21)
#152 bug: flaky test case Functionality test for helm chart / Multiple-Models [bug] (opened Feb 19, 2025 by gaocegege)
#150 bug: Model not found when enable vllm api key [bug] (opened Feb 18, 2025 by JustinDuy)
#80 Discussion: Unifying versions for helm and router [question] (opened Feb 7, 2025 by gaocegege)
#78 Feat: Router observability (Current QPS, router-side queueing delay, etc) [feature request] (opened Feb 7, 2025 by sitloboi2012)
#77 feat: Distributed tracing for router [feature request, help wanted] (opened Feb 7, 2025 by gaocegege)
#75 feat: Allow remote backend configuration [feature request] (opened Feb 7, 2025 by askulkarni2)
#67 Why Hugging Face Token? [question] (opened Feb 6, 2025 by nitin302)
#60 Create an Example Building Ingress for Router Service [documentation] (opened Feb 4, 2025 by 0xThresh)
#50 Helm Chart Lacks Clear Support for Multi-Node vLLM Deployment [help wanted] (opened Jan 31, 2025 by shohamyamin)
#47 feat: Offline batched inference based on OpenAI offline batching API [feature request] (opened Jan 31, 2025 by gaocegege)