Hello,
I have two nodes, each with two RTX 4090s and a Mellanox dual-port 25 GbE NIC used as a RoCE interconnect. I'd like to know whether it is possible to run llm-compressor in "distributed mode", leveraging Accelerate's ability to handle multi-node training. I may be misunderstanding the functionality, but if not, I would be grateful for an example of how to leverage the GPUs across multiple nodes.
Thank you for a great tool!
Hi @nicklausbrown, you likely will not need to train on multiple nodes for the compression algorithms we are providing here. We run calibration rather than full training: this usually involves caching the activations from a single batch of data and performing the compression based on those results. Are you looking to do post-training afterward? That would certainly benefit from multi-node, but it is not needed for most of the pipelines we currently support.
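For reference, here is a minimal single-node calibration sketch in the spirit of the llm-compressor examples (the `oneshot` entry point and `GPTQModifier` are assumptions about the current API; exact import paths, argument names, and the placeholder model/dataset may differ by version). With `device_map="auto"`, Accelerate shards the model across the GPUs visible on one node during calibration, so no multi-node launch is required:

```python
# Minimal single-node calibration sketch -- import paths and argument names
# follow the llm-compressor examples and may vary across versions.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder model

# device_map="auto" lets Accelerate spread layers across the GPUs on this
# node (e.g. both 4090s); no cross-node communication is involved.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# One-shot calibration: cache activations from a small calibration set and
# apply GPTQ-style weight quantization based on them -- no training loop.
oneshot(
    model=model,
    tokenizer=tokenizer,
    dataset="open_platypus",  # example calibration dataset name
    recipe=GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
    max_seq_length=512,
    num_calibration_samples=256,
    output_dir="./tinyllama-w4a16",
)
```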