Describe the feature
Large-scale LoRA deployments are becoming common in production, so it would be great for production-stack to support dynamic LoRA loading. This would let users apply LoRA adapters without reloading the full model, improving both resource utilization and deployment agility.
More specifically, we want:
- Dynamic loading and unloading of LoRA adapters on deployed vLLM instances.
- The ability to specify LoRA adapters at runtime via API or configuration updates.
- Documentation and examples for configuring LoRA adapters in the deployment.
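To make the first two items concrete, here is a minimal sketch of what a runtime load/unload call against a single vLLM instance could look like. It assumes the instance exposes vLLM's `/v1/load_lora_adapter` and `/v1/unload_lora_adapter` endpoints (which, as far as I know, require starting the server with `VLLM_ALLOW_RUNTIME_LORA_UPDATING=True`). The base URL, adapter name, and adapter path are hypothetical placeholders, and how production-stack would route or wrap these calls is exactly what this issue leaves open.

```python
import json
import urllib.request


def build_lora_request(base_url, action, lora_name, lora_path=None):
    """Build (but do not send) a POST request that loads or unloads an adapter.

    Assumption: the target is a vLLM OpenAI-compatible server with runtime
    LoRA updating enabled; the endpoint paths below follow vLLM's docs.
    """
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")

    payload = {"lora_name": lora_name}
    if action == "load":
        if lora_path is None:
            raise ValueError("lora_path is required when loading an adapter")
        payload["lora_path"] = lora_path

    url = f"{base_url.rstrip('/')}/v1/{action}_lora_adapter"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical endpoint and adapter; replace with real values before use.
    load = build_lora_request(
        "http://localhost:8000", "load", "sql-adapter", "/adapters/sql-adapter"
    )
    unload = build_lora_request("http://localhost:8000", "unload", "sql-adapter")
    print(load.get_method(), load.full_url, load.data.decode("utf-8"))
    print(unload.get_method(), unload.full_url, unload.data.decode("utf-8"))
    # send(load)  # uncomment when a vLLM server is actually running
```

Building the request separately from sending it keeps the payload easy to inspect and test without a live server. In a multi-replica deployment the same call would need to reach every vLLM pod that should serve the adapter.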
Why do you need this feature?
No response
Additional context
No response