[New Model]: DeepSeek V3 / R1 #72
Comments
Update (2025.02.19): #88 was merged into v0.7.1-dev, and the DeepSeek test passed (via DeepSeek-V2-Lite). Since V3 shares the V2 architecture, it should also work. This will be backported to main soon. Deployment notes for DeepSeek-V2-Lite: https://vllm-ascend.readthedocs.io/en/latest/tutorials.html#online-serving-on-multi-machine
Update (2025.02.22): DeepSeek V3 / R1 support will be ready in the next RC release of vLLM Ascend (v0.7.3rc1) in early March 2025. Known issues will be fixed in vllm-ascend v0.7.3rc1 (March 2025) with CANN 8.1.RC1.alpha001 (March 2025):
Update (2025.03.05): We are still waiting for the CANN 8.1.RC1.alpha001 release: https://www.hiascend.com/zh/developer/download/community/result?module=cann |
Very good. |
It seems like
|
@staugust Sorry, the torch-npu used by 0.7.1rc1 is a private version whose source code has not been merged into the main branch. vllm-ascend will rely on an official, open-source version of torch-npu in the next release. |
Is https://github.com/Ascend/pytorch the official, open-source repository of torch-npu? By the way, which branch is the development branch for the next release? I'm working on on-demand profiling, and found that calling |
@staugust It's here: https://gitee.com/ascend/pytorch I don't know its branch policy, so please ask about release plans there. I assume profiling works. @Potabk Please take a look at the profiling problem mentioned by @staugust |
This issue tracks initial support for the DeepSeek V3 model with vllm-ascend:
https://huggingface.co/deepseek-ai/DeepSeek-R1
https://huggingface.co/deepseek-ai/DeepSeek-V3
cc @wangxiyuan, feel free to update this issue with any investigation results