
feature: Support LoRA loading for model deployments #205

Open
ApostaC opened this issue Mar 1, 2025 · 0 comments

Labels
feature request New feature or request

Comments

ApostaC (Collaborator) commented Mar 1, 2025

Describe the feature

Large-scale LoRA deployment is already an established pattern in production, so it would be great for production-stack to support dynamic LoRA loading. This would let users apply LoRA adapters without reloading the full model, improving both resource utilization and deployment agility.

More specifically, we want:

  • Enable dynamic loading and unloading of LoRA adapters on deployed vLLM instances.
  • Support specifying LoRA adapters at runtime via API or configuration updates (see the sketch after this list).
  • Documentation and examples for configuring LoRA adapters in the deployment.
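
As a rough illustration of the runtime API above, here is a minimal sketch built on vLLM's dynamic LoRA endpoints (`/v1/load_lora_adapter` and `/v1/unload_lora_adapter`, available when the serving engine is started with `--enable-lora` and `VLLM_ALLOW_RUNTIME_LORA_UPDATING=True`). The base URL, adapter name, and adapter path below are hypothetical placeholders, not part of any existing production-stack API:

```python
import requests

# Hypothetical endpoint for a deployed vLLM instance (placeholder).
BASE_URL = "http://localhost:8000"


def load_lora_adapter(name: str, path: str) -> None:
    """Ask a running vLLM instance to load a LoRA adapter at runtime.

    Uses vLLM's dynamic LoRA endpoint, which requires the server to be
    started with --enable-lora and VLLM_ALLOW_RUNTIME_LORA_UPDATING=True.
    """
    resp = requests.post(
        f"{BASE_URL}/v1/load_lora_adapter",
        json={"lora_name": name, "lora_path": path},
    )
    resp.raise_for_status()


def unload_lora_adapter(name: str) -> None:
    """Unload a previously loaded adapter without touching the base model."""
    resp = requests.post(
        f"{BASE_URL}/v1/unload_lora_adapter",
        json={"lora_name": name},
    )
    resp.raise_for_status()


if __name__ == "__main__":
    # Adapter name and path are illustrative placeholders.
    load_lora_adapter("sql-lora", "/models/adapters/sql-lora")
    # ... serve requests that set "model": "sql-lora" ...
    unload_lora_adapter("sql-lora")
```

Once an adapter is loaded this way, completion requests can target it by setting the `model` field to the adapter name instead of the base model, and unloading frees the adapter without disturbing the base model weights.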

Why do you need this feature?

No response

Additional context

No response
