Improve model adapter reconcile workflow stability #260

Merged
merged 3 commits into from
Oct 3, 2024

Conversation

Jeffwan
Collaborator

@Jeffwan Jeffwan commented Oct 1, 2024

Pull Request Description

  1. Polish the workflow
I1001 10:27:10.704733   11452 modeladapter_controller.go:266] "Adding finalizer for ModelAdapter" ModelAdapter="aibrix-system/lora-1"
I1001 10:27:18.798096   11452 modeladapter_controller.go:452] "model adapter reconcile" Update CR status="lora-1" status={"phase":"Pending"}
I1001 10:27:33.971201   11452 leastadapters.go:53] "pod selected with least model adapters" pod="aibrix-system/llama2-70b-8fd6c849b-vd9rl"
I1001 10:27:39.941767   11452 modeladapter_controller.go:452] "model adapter reconcile" Update CR status="lora-1" status={"phase":"Scheduled","conditions":[{"type":"Initialized","status":"Unknown","lastTransitionTime":"2024-10-01T17:27:18Z","reason":"ModelAdapterPending","message":"Starting reconciliation"}],"instances":["llama2-70b-8fd6c849b-vd9rl"]}
I1001 10:28:01.208669   11452 modeladapter_controller.go:715] "Creating a new service" service="aibrix-system/lora-1"
I1001 10:28:13.041924   11452 modeladapter_controller.go:807] "Creating a new EndpointSlice" endpointslice="aibrix-system/lora-1"
I1001 10:28:44.503105   11452 modeladapter_controller.go:439] "model adapter reconcile" Update CR status="lora-1" status={"phase":"Running","conditions":[{"type":"Initialized","status":"Unknown","lastTransitionTime":"2024-10-01T17:27:18Z","reason":"ModelAdapterPending","message":"Starting reconciliation"},{"type":"Scheduled","status":"True","lastTransitionTime":"2024-10-01T17:27:39Z","reason":"Scheduled","message":"ModelAdapter aibrix-system/lora-1 has been allocated to pod aibrix-system/llama2-70b-8fd6c849b-vd9rl"}],"instances":["llama2-70b-8fd6c849b-vd9rl"]}
  2. Update the conditions. The major change is that conditions are no longer used as a state machine; they only report necessary failures.

Related Issues

Resolves: #136

@Jeffwan Jeffwan force-pushed the jiaxin/improve-model-adapter-flow branch from 53958ce to 8e4d4f8 Compare October 3, 2024 17:52
@Jeffwan Jeffwan merged commit cc0e024 into main Oct 3, 2024
10 checks passed
@Jeffwan Jeffwan deleted the jiaxin/improve-model-adapter-flow branch October 3, 2024 18:02
gangmuk pushed a commit that referenced this pull request Jan 25, 2025
* Remove self maintained pointer utils

* Refactor status update methods

* Improve model adapter stability
Development

Successfully merging this pull request may close these issues.

Enhance the model adapter stability to alpha grade