Support model registration flow using aibrix runtime api #580

Merged
Jeffwan merged 6 commits into main from jiaxin/switch-to-runtime-api
Jan 21, 2025

Jeffwan (Collaborator) commented Jan 20, 2025

Pull Request Description

  1. If the user specifies --enable-runtime-sidecar in the controller manager, the controller talks to the runtime sidecar instead of talking to the engine directly. This promotes our runtime model management API work and also helps build an abstraction over different engines.

  2. Add a /v1/models listing API to the runtime. It was missing before, and the LoRA adapter flow relies on it to fetch the existing adapters and the base model.
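
To illustrate how a caller might use the /v1/models listing to tell the base model apart from loaded adapters, here is a minimal sketch. It assumes an OpenAI-style list body in which an adapter entry carries a non-empty `parent` field naming its base model; the response shape of the aibrix runtime is not shown in this PR, and `split_models` and the sample payload are hypothetical names used only for illustration.

```python
from typing import Any


def split_models(payload: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Split a /v1/models response into (base_models, adapters).

    Assumption: entries live under "data", and an adapter entry has a
    non-empty "parent" field. Entries without it count as base models.
    """
    base_models: list[str] = []
    adapters: list[str] = []
    for entry in payload.get("data", []):
        model_id = entry["id"]
        if entry.get("parent"):
            adapters.append(model_id)
        else:
            base_models.append(model_id)
    return base_models, adapters


# Hypothetical response body, for illustration only.
sample = {
    "object": "list",
    "data": [
        {"id": "base-model", "object": "model", "parent": None},
        {"id": "adapter-a", "object": "model", "parent": "base-model"},
    ],
}
base_models, adapters = split_models(sample)
```

A controller could compare `adapters` against the adapter it is about to load and skip the load call when the name is already present.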

Related Issues

Resolves: #567 #521 (partial) #49 (partial)

brosoul (Collaborator) commented Jan 21, 2025

overall lgtm

1. huggingface protocol shadow assignment bug
2. wrong runtime port
3. wrong host used in buildurls
4. cannot forward entire headers due to content length mismatch

Signed-off-by: Jiaxin Shan <[email protected]>
Signed-off-by: Jiaxin Shan <[email protected]>
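
The fourth fix in the list above concerns header forwarding. The actual change is not shown in this conversation, so the following is only a sketch of one common way to avoid such a mismatch: when a proxy rebuilds the request body, it copies the caller's headers except those tied to the original body or connection, and lets the HTTP client compute a fresh Content-Length. `SKIPPED_HEADERS` and `forwardable_headers` are hypothetical names.

```python
# Headers that describe one specific body or connection. Copying them onto a
# rebuilt outbound request can make the declared length disagree with the
# bytes actually sent.
SKIPPED_HEADERS = {
    "content-length",
    "host",
    "connection",
    "keep-alive",
    "transfer-encoding",
    "upgrade",
    "te",
    "trailer",
    "proxy-authenticate",
    "proxy-authorization",
}


def forwardable_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of headers without body- or connection-specific entries.

    Header names are matched case-insensitively; the original casing of the
    kept headers is preserved.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SKIPPED_HEADERS
    }


# Hypothetical inbound headers, for illustration only.
incoming = {
    "Content-Length": "42",
    "Content-Type": "application/json",
    "Authorization": "Bearer token",
    "Host": "controller.local",
}
outgoing = forwardable_headers(incoming)
```

The filtered headers can then be passed to whatever HTTP client sends the rebuilt request, which sets Content-Length from the new body.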
Jeffwan force-pushed the jiaxin/switch-to-runtime-api branch from 5805f1e to 0e8542b on January 21, 2025 18:58
Jeffwan merged commit d6319bb into main on Jan 21, 2025
13 checks passed
Jeffwan deleted the jiaxin/switch-to-runtime-api branch on January 21, 2025 19:41
gangmuk pushed a commit that referenced this pull request Jan 25, 2025
* Introduce RuntimeConfig to all controllers

* Refactor the logic to construct URLs based on different envs

* Leverage runtime api to manage lora load & unload

* Fix several bugs

1. huggingface protocol shadow assignment bug
2. wrong runtime port
3. wrong host used in buildurls
4. cannot forward entire headers due to content length mismatch

* Format files

* Address code review feedback

---------

Signed-off-by: Jiaxin Shan <[email protected]>
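
The commit message above mentions a RuntimeConfig and URL construction that depends on the environment. The code itself is not part of this page, so the sketch below only illustrates the idea in Python: one config object decides both the host and the port, so the "wrong runtime port" and "wrong host" classes of bug have a single place to be fixed. The field names, the `debug_mode` switch, port 8080, and the sidecar paths are all assumptions; the engine paths follow vLLM's dynamic LoRA endpoints.

```python
from dataclasses import dataclass

ENGINE_PORT = 8000   # assumed engine port (vLLM's default)
RUNTIME_PORT = 8080  # assumed sidecar port, not confirmed by this PR


@dataclass(frozen=True)
class RuntimeConfig:
    # Mirrors the --enable-runtime-sidecar flag from the PR description.
    enable_runtime_sidecar: bool = False
    # Hypothetical: controller runs outside the cluster via port-forwarding.
    debug_mode: bool = False


def build_urls(pod_ip: str, config: RuntimeConfig) -> dict[str, str]:
    """Build list/load/unload URLs for one pod under the given config."""
    host = "localhost" if config.debug_mode else pod_ip
    if config.enable_runtime_sidecar:
        base = f"http://{host}:{RUNTIME_PORT}"
        # Hypothetical sidecar paths, for illustration only.
        load, unload = "/v1/lora_adapter/load", "/v1/lora_adapter/unload"
    else:
        base = f"http://{host}:{ENGINE_PORT}"
        # vLLM-style engine paths.
        load, unload = "/v1/load_lora_adapter", "/v1/unload_lora_adapter"
    return {
        "list": base + "/v1/models",
        "load": base + load,
        "unload": base + unload,
    }


sidecar_urls = build_urls("10.0.0.5", RuntimeConfig(enable_runtime_sidecar=True))
engine_urls = build_urls("10.0.0.5", RuntimeConfig())
```

Keeping the host and port decision in one function means every controller that receives the same config builds the same URLs.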
Development

Successfully merging this pull request may close these issues.

integrate the model registration flow with runtime
3 participants