
[New Model]: DeepSeek V3 / R1 #72

Open
Yikun opened this issue Feb 17, 2025 · 6 comments

Yikun (Collaborator) commented Feb 17, 2025

This issue tracks initial support for the DeepSeek V3 / R1 models with vllm-ascend:

https://huggingface.co/deepseek-ai/DeepSeek-R1
https://huggingface.co/deepseek-ai/DeepSeek-V3

cc @wangxiyuan, feel free to add any investigation updates

Yikun (Collaborator, Author) commented Feb 18, 2025

For v0.7.1-dev: #68 #88

update (2025.02.19): #88 is merged to v0.7.1-dev and the DeepSeek test passed (via DeepSeek-V2-Lite). Since the V3 architecture is the same as V2's, V3 should also work; this will be backported to main soon.

Here are the deployment notes for DeepSeek-V2-Lite: https://vllm-ascend.readthedocs.io/en/latest/tutorials.html#online-serving-on-multi-machine
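For reference, a minimal single-node launch sketch. All flags below are illustrative assumptions, not the exact configuration from the tutorial; the linked page covers the authoritative multi-machine setup.

```shell
# Hedged sketch: build a vLLM online-serving command for DeepSeek-V2-Lite.
# The flags here are illustrative; consult the tutorial above for the
# real multi-machine configuration.
MODEL="deepseek-ai/DeepSeek-V2-Lite"
SERVE_CMD="vllm serve $MODEL --trust-remote-code --port 8000"
# Print the command instead of running it, so this sketch is side-effect free.
echo "$SERVE_CMD"
```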

update (2025.02.22): DeepSeek V3 / R1 support will be ready in the next RC release of vLLM Ascend (v0.7.3rc1), in early 2025.03.

Known issues will be fixed in vllm-ascend v0.7.3rc1 (March 2025) with CANN 8.1.RC1.alpha001 (March 2025).

update (2025.03.05): we are still waiting for the CANN 8.1.RC1.alpha001 release: https://www.hiascend.com/zh/developer/download/community/result?module=cann

caolicaoli commented:
Very good (非常好)

staugust commented Mar 6, 2025

It seems that the torch_npu in the docker image quay.io/ascend/vllm-ascend:v0.7.1rc1 is a dev version; its commit id is not found in the torch_npu repo. I'm wondering whether I missed something. Could you please help me find the source code of the torch_npu in the docker image? Thanks in advance. @Yikun

>>> torch_npu.version.git_version
'0e8c5249aacfcf94f3d61c6ff0938fadada1cc6a'
>>> torch_npu.version.__version__
'2.5.1.dev20250218'

wangxiyuan (Collaborator) commented:

@staugust sorry, the torch-npu used by 0.7.1rc1 is a private version whose source code is not merged to the main branch. vllm-ascend will rely on an official, open-source version of torch-npu in the next release.

staugust commented Mar 6, 2025

> @staugust sorry, this torch-npu used by 0.7.1rc1 is a private version which source code is not merged to main branch. vllm-ascend will rely on a official/open source version of torch-npu in the next release.

Is https://github.com/Ascend/pytorch the official, open-source repository of torch-npu? By the way, which branch is the development branch for the next release? I'm working on on-demand profiling, and I found that calling localhost:8000/start_profile and localhost:8000/stop_profile blocks vllm inference. Is there any plan to make profiling usable in production environments?
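For context, the on-demand profiling mentioned above is driven by plain HTTP POSTs to the server. A minimal sketch, assuming vLLM's /start_profile and /stop_profile endpoints (available when the server is launched with VLLM_TORCH_PROFILER_DIR set); the base URL and port are illustrative:

```python
# Hedged sketch: toggle vLLM's on-demand torch profiler over HTTP.
# Assumes the server exposes POST /start_profile and /stop_profile;
# the base URL below is illustrative.
import urllib.request


def profile_url(base: str, action: str) -> str:
    """Build the endpoint URL; action is 'start_profile' or 'stop_profile'."""
    return f"{base.rstrip('/')}/{action}"


def toggle_profile(base: str, action: str) -> int:
    """POST to the profiling endpoint and return the HTTP status code."""
    req = urllib.request.Request(profile_url(base, action), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Example usage (requires a running server, so it is commented out here):
# toggle_profile("http://localhost:8000", "start_profile")
# ... send some inference requests to capture ...
# toggle_profile("http://localhost:8000", "stop_profile")
```

Note that profiling adds overhead by design, so some slowdown between start and stop is expected even when it works correctly.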

wangxiyuan (Collaborator) commented:

@staugust it's here: https://gitee.com/ascend/pytorch. I have no idea about its branch policy; you can ask about release matters there.

I assume profiling works. @Potabk please take a look at the profiling problem mentioned by @staugust.
