ValueError: not enough values to unpack (expected 4, got 2) #29

Open
Second222None opened this issue Nov 20, 2024 · 0 comments

Second222None commented Nov 20, 2024

Describe the bug

Because we do not have a powerful GPU, we ran the End2End test with a small model, Qwen/Qwen2-1.5B. LMCache started successfully, but we get an error when sending requests to the server.

The error shown on screen:
image

And the error from /tmp/root-8000-stdout.log:

(VllmWorkerProcess pid=73509) ERROR 11-20 22:14:12 multiproc_worker_utils.py:233]     _, _, num_heads, head_size = kv_cache[0].shape
(VllmWorkerProcess pid=73509) ERROR 11-20 22:14:12 multiproc_worker_utils.py:233] ValueError: not enough values to unpack (expected 4, got 2) 
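
For anyone triaging this, the failing line unpacks `kv_cache[0].shape` into four names, so the message means the shape had only two dimensions at that point. The sketch below reproduces the same error with plain tuples and shows a defensive check that reports the shape it actually received. The shapes and the helper `unpack_kv_shape` are hypothetical and chosen for illustration; they are not taken from the LMCache or vLLM code. One unconfirmed possibility is that the attention backend selected on this GPU stores each cache block in a flattened 2-D layout rather than the 4-D layout the unpacking expects.

```python
def unpack_kv_shape(shape):
    """Return (num_heads, head_size) from a 4-D shape.

    Hypothetical helper: raises a descriptive error instead of the bare
    unpacking failure when the shape is not 4-D.
    """
    if len(shape) != 4:
        raise ValueError(
            f"expected a 4-D KV cache shape, got {len(shape)}-D: {shape}"
        )
    _, _, num_heads, head_size = shape
    return num_heads, head_size


# Illustrative shapes only (assumptions, not values read from the issue).
four_d = (128, 16, 2, 128)    # (num_blocks, block_size, num_heads, head_size)
two_d = (128, 16 * 2 * 128)   # a flattened per-block layout

print(unpack_kv_shape(four_d))

# Unpacking the 2-D shape directly reproduces the reported message.
try:
    _, _, num_heads, head_size = two_d
except ValueError as exc:
    print(exc)  # not enough values to unpack (expected 4, got 2)
```

Printing `kv_cache[0].shape` just before the failing line would confirm or rule out this explanation.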

env:

  • OS: Ubuntu 22.04.5 LTS
  • vllm: v0.6.2 (pip install vllm==0.6.2)
  • LMCache: v0.1.3-alpha (installed from source)
  • lmcache-vllm: v0.6.2.2 (installed from source)
  • GPU: Tesla T4 (16GB) x 2
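
The versions above can also be collected programmatically, which helps when comparing environments. This is a minimal sketch using only the standard library; the distribution names `lmcache` and `lmcache_vllm` are assumptions about how the source installs register themselves and may differ on your machine.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or 'not installed'."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "not installed"


# Distribution names below are assumptions; adjust them to match `pip list`.
for name in ("vllm", "lmcache", "lmcache_vllm"):
    print(f"{name}: {installed_version(name)}")
```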

To Reproduce
Steps to reproduce the behavior:

  1. Set the model to Qwen/Qwen2-1.5B in tests/tests.py by changing the signature to: def test_lmcache_local_cpu(model = "Qwen/Qwen2-1.5B") -> pd.DataFrame:
  2. python3 main.py tests/tests.py -f test_lmcache_local_cpu -o outputs/