vLLM Ascend Roadmap Q1 2025 #71

Status: Open · 7 of 37 tasks complete
Yikun opened this issue Feb 17, 2025 · 2 comments
Comments

Yikun commented Feb 17, 2025

This is a living document!

Note that vLLM Ascend 0.7.3 (matching vLLM v0.7.3) is the main release for 2025 Q1; see more in link.

Hardware Plugin

Basic support

Initial vLLM Ascend support will begin with basic hardware compatibility.

Feature support

Model support

Performance

  • Add a vllm-ascend perf website, similar to vLLM's https://perf.vllm.ai/
  • Focus on improving the performance of Llama 3, Qwen2.5, Qwen2-VL, and DeepSeek V3/R1

Quality

  • Full unit test (UT) coverage
  • Model e2e tests
  • Multi-card/node e2e tests

Docs

  • README
  • vllm-ascend website: https://vllm-ascend.readthedocs.org/
  • Quick start / Installation / Tutorial
  • User guide: supported feature / models
  • Developer guide: Contributing / Versioning policy

CI and Developer Productivity


wuhuikx commented Feb 21, 2025

Chunked prefill will also be supported in Q1 for functionality.


Yikun commented Feb 22, 2025

> Chunked prefill will also be supported in Q1 for functionality.

I updated Chunked Prefill as P0 priority.
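For readers unfamiliar with the feature: chunked prefill splits a long prompt's prefill pass into smaller token chunks so the scheduler can interleave them with decode steps of other requests, reducing latency spikes from very long prompts. A toy sketch of just the chunking step (the function name and shapes are illustrative, not the vLLM implementation):

```python
# Toy illustration of the chunked-prefill idea (hypothetical helper,
# not the vLLM API): split a long prompt's token list into fixed-size
# chunks so each chunk can be scheduled as a separate prefill step.
def chunk_prefill(prompt_tokens, chunk_size):
    """Split prompt tokens into chunks of at most chunk_size tokens."""
    return [
        prompt_tokens[i:i + chunk_size]
        for i in range(0, len(prompt_tokens), chunk_size)
    ]

# A 10-token prompt with chunk_size=4 becomes three prefill steps.
chunks = chunk_prefill(list(range(10)), 4)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

In the real engine the chunk size is governed by the scheduler's per-step token budget (e.g. vLLM's `max_num_batched_tokens`), and partial prefill state is carried in the KV cache between steps.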

@Yikun Yikun changed the title [main] vllm-ascend Roadmap Q1 2025 vLLM Ascend Roadmap Q1 2025 Mar 5, 2025