
[Usage]: I can't start Online Serving on Multi Machine #230

Open
flying632 opened this issue Mar 3, 2025 · 3 comments

Comments

flying632 commented Mar 3, 2025

Your current environment

Using version 0.7.3, starting on multiple machines reports an error.

[error screenshot attached]

How would you like to use vLLM on Ascend

I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.

@MengqingCao (Contributor) commented:

Which model are you using? It looks like there is CUDA-specific hard-coded logic in the model.

@flying632 (Author) commented:

I tried versions 0.7.1 and 0.7.3 with the Qwen2.5-72B-Instruct model and Ray 2.43.0. I am running on NPU.

@wangxiyuan (Collaborator) commented:

This problem is fixed by #172 for 0.7.3-dev IMO. Are you using the newest code?
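For readers with the same setup: a multi-node online-serving launch with the Ray backend typically looks like the sketch below. The port, the head-node address placeholder, and the parallelism sizes are assumptions for illustration, not values from this thread; adjust them to your cluster layout and device count per node.

```shell
# On the head node (replace HEAD_IP with the head node's reachable address):
ray start --head --port=6379

# On each worker node, join the cluster:
ray start --address=HEAD_IP:6379

# Then, on the head node, launch the OpenAI-compatible server.
# Tensor/pipeline parallel sizes here are assumptions; set them so that
# tensor_parallel_size * pipeline_parallel_size matches your total devices.
vllm serve Qwen/Qwen2.5-72B-Instruct \
    --distributed-executor-backend ray \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2
```

Run `ray status` on the head node first to confirm all workers have joined before starting the server.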
