Multi-node startup fails with an error when using version 0.7.3
Your current environment

How would you like to use vllm on ascend

I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.
Which model are you using? It seems there is hard-coded CUDA code in the model.
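To illustrate the kind of pattern meant by hard-coded CUDA (a hypothetical sketch, not code from any particular model): calling `.cuda()` or `torch.device("cuda")` directly breaks on Ascend NPU, while passing the device in as a parameter keeps the code portable.

```python
import torch

# Hypothetical problem pattern: "cuda" is hard-coded, so this fails on
# backends where torch.cuda is unavailable, such as Ascend NPU.
def forward_hardcoded(x: torch.Tensor) -> torch.Tensor:
    return x.cuda() * 2

# Device-agnostic variant: the caller decides the device, so the same
# code runs on CUDA GPUs, NPUs, or CPU.
def forward_portable(x: torch.Tensor, device: torch.device) -> torch.Tensor:
    return x.to(device) * 2

if __name__ == "__main__":
    x = torch.ones(4)
    print(forward_portable(x, torch.device("cpu")))
```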
I tried versions 0.7.1 and 0.7.3 with the Qwen2.5-72B-Instruct model and ray 2.43.0. I am running on NPU.
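For context, a minimal sketch of launching this kind of multi-node run through vLLM's Python API, assuming a Ray cluster has already been started across the nodes; the parallel sizes below are placeholders for the actual topology:

```python
from vllm import LLM, SamplingParams

# Minimal multi-node inference sketch. Assumes `ray start --head` was run on
# the head node and `ray start --address=<head-ip>:6379` on each worker node
# beforehand; the parallel sizes are placeholders for the real topology.
llm = LLM(
    model="Qwen/Qwen2.5-72B-Instruct",
    tensor_parallel_size=8,        # devices per node (placeholder)
    pipeline_parallel_size=2,      # number of nodes (placeholder)
    distributed_executor_backend="ray",
)

outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

With `distributed_executor_backend="ray"`, vLLM schedules its workers across the Ray cluster, which is the stage where the multi-node startup in this report fails.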
This problem is fixed by #172 for 0.7.3-dev IMO. Are you using the newest code?