[Bug][V1]: Loading Llama3.1-8B-INT8 gets OOM when using VLLM_USE_V1=1 but is safe using V0 #14286
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
I have a Llama3.1 model with this config:
`config.json` of Llama3.1-8B-INT8 model
I load it using vllm like this:
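For reference, a minimal sketch of loading with the V1 engine (the model path here is a placeholder, not the actual checkpoint location):

```python
import os

# Opt in to the V1 engine; set this before importing vllm so it is picked up.
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

# Placeholder path to the Llama3.1-8B-INT8 checkpoint.
llm = LLM(model="/path/to/Llama3.1-8B-INT8")
```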
But I got this error:
Log error loading Llama3.1 using VLLM V1
If I load it using vllm like this, it works perfectly fine:
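A corresponding minimal sketch for the V0 engine (same placeholder path; V0 was the default at the time, so this just makes the choice explicit):

```python
import os

# Explicitly select the V0 engine (the default unless VLLM_USE_V1=1 is set).
os.environ["VLLM_USE_V1"] = "0"

from vllm import LLM

llm = LLM(model="/path/to/Llama3.1-8B-INT8")
```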
Log loading Llama3.1 using VLLM V0
What I know is that the culprit is commit da31b53: if I try the commit right before it, bb78fb3, everything works perfectly.
Log loading Llama3.1 using VLLM V1 in commit bb78fb3
Any explanation for why this happens?
cc: @JenZhao and @ywang96
EDIT:
I didn't realize that the error from da31b53 is different from the error I got on the latest commit on main. After checking further, I found that between commits da31b53 and 7f6bae5 the error is about `SamplingMetadata`, but from commit 7f6bae5 onward the error is an OOM. Here is the actual OOM error:
Log OOM error loading Llama3.1 using VLLM V1 in commit 7f6bae5
And it works perfectly fine with vLLM V0:
Log loading Llama3.1 using VLLM V0 with commit 7f6bae5
CC: @DarkLight1337