[Bug]: V1 Regression: ValueError: could not broadcast input array from shape (y,) into shape (x,) #12567
Status: Closed
Labels: bug (Something isn't working)

Comments
Do you have any reproduction instructions?

Hi @sethkimmel3, thanks for reporting the issue. As you experienced, V1 does not gracefully handle sequences longer than the model's max length. We will fix this soon.

Thanks @WoosukKwon - I have a reproducible script now, but it sounds like you're already aware and can replicate the error. Let me know if you need it.
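Until the engine handles over-long inputs gracefully, one user-side mitigation is to clip token sequences to the context window before handing them to the batched/offline interface. This is a sketch; the helper name is hypothetical and pre-truncation is not an official vLLM recommendation:

```python
def truncate_to_context(token_ids, max_model_len):
    """Clip a token-id sequence so it never exceeds the engine's window.

    Note: silently dropping tokens changes the input; prefer rejecting
    or chunking over-long requests if the truncated tail matters.
    """
    return list(token_ids)[:max_model_len]
```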
Isotr0py pushed a commit to Isotr0py/vllm that referenced this issue on Feb 2, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567. Signed-off-by: Isotr0py <[email protected]>

youngkent pushed a commit to youngkent/vllm that referenced this issue on Feb 3, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

srikanthsrnvs pushed a commit to srikanthsrnvs/vllm that referenced this issue on Feb 3, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567. Signed-off-by: Srikanth Srinivas <[email protected]>

sahelib25 pushed a commit to krai/vllm that referenced this issue on Feb 3, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

fxmarty-amd pushed a commit to fxmarty-amd/vllm that referenced this issue on Feb 7, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567. Signed-off-by: Felix Marty <[email protected]>

NickLucche pushed a commit to NickLucche/vllm that referenced this issue on Feb 7, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

ShangmingCai pushed a commit to ShangmingCai/vllm that referenced this issue on Feb 10, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

GWS0428 pushed a commit to GWS0428/VARserve that referenced this issue on Feb 12, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

panf2333 pushed a commit to yottalabsai/vllm that referenced this issue on Feb 18, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

kerthcet pushed a commit to kerthcet/vllm that referenced this issue on Feb 21, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567.

lk-chen pushed a commit to lk-chen/vllm that referenced this issue on Mar 5, 2025:
SUMMARY: avoid crashing the engine when we get an input longer than max_model_len. FIX vllm-project#12567. Signed-off-by: Linkun Chen <[email protected]>
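The fix summarized in these commits rejects an over-long request instead of letting the buffer copy crash the whole engine. A minimal sketch of that pattern, with assumed names (`RequestTooLongError` and `validate_prompt` are illustrative, not vLLM's actual symbols):

```python
# Illustrative guard, not vLLM's actual implementation: fail a single
# over-long request up front rather than crashing the engine later.
class RequestTooLongError(ValueError):
    """Raised for one request; the engine keeps serving the others."""

def validate_prompt(token_ids, max_model_len):
    if len(token_ids) > max_model_len:
        raise RequestTooLongError(
            f"prompt has {len(token_ids)} tokens, "
            f"which exceeds max_model_len={max_model_len}"
        )
    return token_ids
```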
Your current environment
The output of `python collect_env.py`
(This is just a CPU dump of the vLLM version I'm on; I'm encountering the error on a 1xH100.)
Model Input Dumps
No response
🐛 Describe the bug
When running v0.7.0 with the v1 architecture, I'm frequently encountering:
ValueError: could not broadcast input array from shape (y,) into shape (x,)
where x = max_model_len and y > x. This occurs when using the batched/offline interface, and it causes the whole run to error out. Without the v1 architecture, such cases are typically handled more gracefully and the run does not crash.
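The exception itself is NumPy's broadcast check firing when a length-y token array is written into a fixed length-x buffer. A vLLM-independent sketch of the mechanism (the buffer size 8 standing in for max_model_len is an assumption for illustration):

```python
import numpy as np

MAX_MODEL_LEN = 8                                 # stands in for max_model_len (x)
buffer = np.zeros(MAX_MODEL_LEN, dtype=np.int64)  # fixed-size token slot

tokens = np.arange(10)                            # request longer than the buffer (y > x)
try:
    buffer[:] = tokens                            # write shape (10,) into shape (8,)
except ValueError as err:
    print(err)  # could not broadcast input array from shape (10,) into shape (8,)
```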
May be related to: #9848.