Motivation.
Currently:
- vLLM supports a variety of custom ops, registered in csrc/torch_bindings.cpp:
https://github.com/vllm-project/vllm/blob/cdc1fa12eb1ba4795d24e97dcffa2018668a9267/csrc/torch_bindings.cpp#L480
- vLLM Ascend (currently v0.7.1rc1) supports torch native ops through torch_npu. The whole workflow looks like:
vllm --> torch --> torch_npu --> atb --> cann
With this workflow, making a new op available takes four steps:
1. Developers first have to implement the op in atb.
2. The op is then exposed through torch_npu.
3. The torch_npu dependency is upgraded to the latest version.
4. Finally, users can use the op.
This lengthy version-matching and upgrade process discourages developers from implementing Ascend operators.
Proposed Change.
This RFC aims to simplify the complicated op development process and make each step clear. It should also help Ascend developers collaborate better when creating ops.
The RFC will start by exploring custom op support in two ways:
- AscendCL (aclnn)
- AscendC
We propose to support custom ops via torch bindings to achieve this goal.
Work items:
- A custom ops framework for vLLM Ascend
- A real op implementation that passes CI
- A tutorial to help users understand how to develop custom ops
Feedback Period.
now - 2025.03.06
CC List.
cc @wangxiyuan
cc @ganyi1996ppo
Any Other Things.
Ready in 2025 Q1 (vLLM Ascend first release, v0.7.3)