Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: experimental Free() changes [formerly shutdown branch] #928

Closed
wants to merge 8 commits into from

Conversation

dave-gray101
Copy link
Collaborator

@dave-gray101 dave-gray101 commented Aug 19, 2023

Description
This PR and associated branch have been banish back to experimental use. The Unload features will stay here for more experimentation. RWKV went to #937, and Process-Based backend shutdown went to #938

@dave-gray101
Copy link
Collaborator Author

dave-gray101 commented Aug 19, 2023

Comment mostly to remind myself: at this point in history, models can be unloaded, but sending new requests to that model fail since currently model loading happens only up front.

I now need to decide between making backends disposable or reusable....

EDIT: THIS IS OUTDATED. WE NOW TERMINATE PROCESSES

@mudler
Copy link
Owner

mudler commented Aug 19, 2023

liking it ! 👍 I'm all in for a way to manage this from API.

Just a note on implementation side - rather believe this should just manage the process rather then directly interacting with the backends.

Also apparently includes a quick fix for RWKV prompt token usage, since that turned out to be unexpectedly easy.

awesome! 👍

@dave-gray101 dave-gray101 marked this pull request as ready for review August 20, 2023 18:56
@dave-gray101 dave-gray101 changed the title WIP: Backend Model Unloading feat: Backend Shutdown Endpoint Aug 20, 2023
@@ -60,3 +60,12 @@ func (llm *LLM) PredictStream(opts *pb.PredictOptions, results chan string) erro

return nil
}

func (llm *LLM) Unload() error {
Copy link
Owner

@mudler mudler Aug 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add unused code - I know it's tempting, and I appreciate your efforts, but then it makes it not obvious what the code actually do for first time readers - is it used? where?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can drop it if you'd prefer - it is currently unused, after I switched back to process based shutdown. I'm currently doing some prototyping around "multi-model-file" backends - and I wanted to have the ability to call the Free function during that testing. It might be cleanest to migrate that to the other branch until it's used - I had initially decided it made sense to expose it since otherwise the "Free()" functions down in the backends are themselves unused code [but hey, at least that's much more hidden from first time readers!]

@dave-gray101 dave-gray101 marked this pull request as draft August 21, 2023 17:35
@dave-gray101 dave-gray101 changed the title feat: Backend Shutdown Endpoint feat: experimental Free() changes [formerly shutdown branch] Aug 21, 2023
@dave-gray101 dave-gray101 deleted the feat-unload branch February 21, 2024 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants