[Installation]: Attempting to build and run vLLM for Intel Core Ultra 7 155H with ARC iGPU #14295
The build seems to complete with no issues.
The server runs with some odd stdout logging.
curl requests get a 200 from the server but do not return valid responses.
Working notes are here - https://github.com/cgruver/vllm-intel-gpu-workspace
Your current environment
python collect_env.py
How you are installing vllm
Build from source.
OS: Fedora 41
HW: Intel Core Ultra 7 155H
Installed packages:
Install XPU dependencies (see the sketch after the build command below).
Fix requirements-cpu.txt.
Fix apparent issue with args passed to torch.xpu.varlen_fwd (#11173 (comment)).
Build vLLM:
VLLM_TARGET_DEVICE=xpu python -m pip install .
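For context, the XPU dependency install and build roughly follow the upstream XPU source-build instructions. A minimal sketch, assuming the oneAPI base toolkit is already installed at its default location and that this vLLM revision still ships requirements-xpu.txt at the repo root (newer revisions may move it under requirements/):

```bash
# Sketch only: the oneAPI location and requirements file name are assumptions,
# not copied from the working notes.
source /opt/intel/oneapi/setvars.sh        # bring the Intel compiler/runtime env into the shell

python -m pip install --upgrade pip
python -m pip install -v -r requirements-xpu.txt   # XPU-specific deps (IPEX, oneCCL bindings, ...)

# Then build against the XPU backend (same command as above)
VLLM_TARGET_DEVICE=xpu python -m pip install .
```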
Run vLLM Server
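The exact serve command is in the linked working notes rather than captured here; it was something along these lines, with the model name left as a placeholder (the OpenAI-compatible entrypoint is assumed from the cmpl-* request ids in the logs below):

```bash
# Hypothetical invocation: the model name and flags are placeholders, not the actual command used.
python -m vllm.entrypoints.openai.api_server \
    --model <model-name> \
    --device xpu
```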
STDOUT -
Note: The following is logged several times -
Then -
Test -
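The test request itself is not shown here, but from the SamplingParams in the log below it was a completion call along these lines (the port and model name are assumptions):

```bash
# Reconstructed from the logged request: prompt 'San Francisco is a',
# max_tokens=7, temperature=0.0. Port and model name are assumed.
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
          "model": "<model-name>",
          "prompt": "San Francisco is a",
          "max_tokens": 7,
          "temperature": 0
        }'
```

Per the summary above, this gets a 200 back, but the response body is not a valid completion.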
STDOUT -
INFO 03-05 15:11:47 [logger.py:39] Received request cmpl-12ee6303c8df488aa5bc8bcf657093c8-0: prompt: 'San Francisco is a', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: [23729, 12879, 374, 264], lora_request: None, prompt_adapter_request: None.
INFO 03-05 15:11:47 [engine.py:289] Added request cmpl-12ee6303c8df488aa5bc8bcf657093c8-0.
Note: The following is logged several times -
Then -