In diffusers.models.embeddings, functions like get_1d_rotary_pos_embed do not allow users to specify the device of the returned tensors.
Impact
Device Mismatch in Stable Audio Open Pipeline
• In the diffusers.pipelines.stable_audio.pipeline_stable_audio pipeline, this results in some inputs not being on the same device, even after calling pipe.to("cuda").
• While this does not prevent the pipeline from running, it introduces unnecessary overhead and slightly slows down inference when calling the __call__ method.
TensorRT Incompatibility
• The device inconsistency prevents the transformer model of the stable audio pipeline from being compiled with TensorRT, limiting optimization opportunities.
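To see which inputs end up on the wrong device, a small check along these lines may help. This is only a sketch: the helper name `mismatched_devices` and the example tensor names are hypothetical and not part of diffusers.

```python
import torch


def mismatched_devices(named_tensors, device):
    """Return the names of tensors whose device type differs from `device`.

    Hypothetical debugging helper, not a diffusers API.
    """
    target = torch.device(device).type
    return [
        name
        for name, tensor in named_tensors.items()
        if tensor.device.type != target
    ]


# Example: two CPU tensors checked against a "cuda" target.
inputs = {
    "latents": torch.zeros(2, 4),
    "rotary_cos": torch.ones(4, 2),
}
print(mismatched_devices(inputs, "cuda"))
```

Only device types are compared, so the check itself runs on a machine without a GPU.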
Suggested Improvement
Allow users to specify the device for tensors returned by the rotary embedding functions. Then modify the __call__ method of the Stable Audio pipeline to pass the target device through, so that all inputs are consistently placed on that device. This would make the pipeline fully compatible with .to("cuda") and with TensorRT compilation.
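As a rough illustration of the idea, here is a minimal sketch of a 1D rotary embedding helper that accepts a device argument. It is not the diffusers implementation: the function name `rotary_pos_embed_sketch`, its signature, and the (cos, sin) return layout are assumptions made for this example.

```python
import torch


def rotary_pos_embed_sketch(dim, pos, theta=10000.0, device=None):
    """Build cos/sin rotary tables of shape [len(pos), dim // 2] on `device`.

    Hypothetical sketch, not the diffusers API. `pos` may be an int
    (number of positions) or a 1D tensor of positions.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    if isinstance(pos, int):
        pos = torch.arange(pos, device=device)
    # Move positions to the requested device so every intermediate tensor lives there.
    pos = pos.to(device=device, dtype=torch.float32)
    # Inverse frequencies theta^(-2i/dim), created directly on the target device.
    exponents = torch.arange(0, dim, 2, dtype=torch.float32, device=device) / dim
    inv_freq = 1.0 / (theta ** exponents)
    # Outer product gives one rotation angle per (position, frequency) pair.
    angles = torch.outer(pos, inv_freq)
    return angles.cos(), angles.sin()


# Usage: in a pipeline, `device` would be the device the model was moved to.
cos, sin = rotary_pos_embed_sketch(dim=8, pos=4, device="cpu")
print(cos.shape, cos.device)
```

Because the tensors are created on the requested device, the caller has no later transfer to perform.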
Would love to hear thoughts on this!