
Feat: add shm configuration into helm chart to support tensor parallel #97

Closed
ApostaC opened this issue Feb 9, 2025 · 1 comment
Labels: feature request, good first issue, help wanted

Comments

ApostaC (Collaborator) commented Feb 9, 2025

Describe the feature

To run vLLM with TP > 1, vLLM requires shared memory to be available in the container to avoid NCCL shared-memory allocation issues (as described in vllm-project/vllm#6574).

To address this problem, we need to allocate shared memory when starting the vLLM pod (for example, by mounting a memory-backed volume at /dev/shm), as sketched below.
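
For reference, a minimal sketch of what the mount could look like in the pod spec. The container name, image tag, and sizeLimit here are illustrative assumptions, not values from the chart:

```yaml
# Sketch: give the vLLM container a memory-backed /dev/shm via an emptyDir volume.
# sizeLimit is an illustrative value; it should be sized for the NCCL buffers
# needed at the chosen tensor-parallel degree.
spec:
  containers:
    - name: vllm
      image: vllm/vllm-openai:latest
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      emptyDir:
        medium: Memory
        sizeLimit: 20Gi
```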

Ideally, users shouldn't need to configure shared memory by default; whether to mount it should be decided by the helm template (e.g., mount the volume when the number of GPUs is > 1, or add a dedicated configuration item about TP in vllmConfig). A sketch of the templating follows.
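
A minimal sketch of how the deployment template could gate the mount on the TP setting. The value names (.Values.vllmConfig.tensorParallelSize, .Values.vllmConfig.shmSize) are hypothetical and only for illustration; the chart's real schema may differ:

```yaml
# Sketch: only add the shm volume when the requested tensor-parallel degree is > 1.
# A matching volumeMounts entry on the vLLM container would be wrapped in the same condition.
{{- if gt (int .Values.vllmConfig.tensorParallelSize) 1 }}
      volumes:
        - name: shm
          emptyDir:
            medium: Memory
            sizeLimit: {{ .Values.vllmConfig.shmSize | default "20Gi" | quote }}
{{- end }}
```

The same idea works if the condition is driven by the GPU count in the resource limits instead of an explicit TP value; either way the default path requires no user configuration.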

It would also be great to have a new tutorial on how to set up a multi-GPU vLLM instance.

Why do you need this feature?

No response

Additional context

Related issues:
#44
#50
#95

ApostaC added the feature request, good first issue, and help wanted labels Feb 9, 2025
YuhanLiu11 self-assigned this Feb 9, 2025
ApostaC (Collaborator, Author) commented Feb 11, 2025

Closed this issue via PR #105

ApostaC closed this as completed Feb 11, 2025