Feat: add shm configuration to the helm chart to support tensor parallelism #97
Labels
feature request (New feature or request)
good first issue (Good for newcomers)
help wanted (Extra attention is needed)
Describe the feature
To run vLLM with tensor parallelism (TP > 1), the pod needs sufficient shared memory; otherwise NCCL fails to allocate shared memory (as described in vllm-project/vllm#6574).
To address this, we need to allocate shared memory when starting the vLLM pod, for example by mounting a memory-backed volume at /dev/shm, as sketched below.
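For illustration, a minimal pod-level sketch of the idea (the pod name, image, and the 2Gi size limit are placeholders, not values from this chart):

```yaml
# Back /dev/shm with a memory-backed emptyDir so NCCL has enough
# shared memory when TP > 1. Names and the size limit are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: vllm
spec:
  containers:
    - name: vllm
      image: vllm/vllm-openai:latest
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      emptyDir:
        medium: Memory
        sizeLimit: 2Gi
```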
Ideally, users should not need to configure shared memory by default; whether to mount it should be decided by the helm template (e.g., mount the volume when the number of GPUs is greater than 1, or add a dedicated TP setting to vllmConfig); see the template sketch after this paragraph. It would also be great to have a new tutorial on setting up a multi-GPU vLLM instance.
Why do you need this feature?
No response
Additional context
Related issues: #44, #50, #95