-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: experimental Free() changes [formerly shutdown branch] #928
Conversation
Comment mostly to remind myself: at this point in history, models can be unloaded, but sending new requests to that model fail since currently model loading happens only up front. I now need to decide between making backends disposable or reusable.... EDIT: THIS IS OUTDATED. WE NOW TERMINATE PROCESSES |
liking it ! 👍 I'm all in for a way to manage this from API. Just a note on implementation side - rather believe this should just manage the process rather then directly interacting with the backends.
awesome! 👍 |
@@ -60,3 +60,12 @@ func (llm *LLM) PredictStream(opts *pb.PredictOptions, results chan string) erro | |||
|
|||
return nil | |||
} | |||
|
|||
func (llm *LLM) Unload() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not add unused code - I know it's tempting, and I appreciate your efforts, but then it makes it not obvious what the code actually do for first time readers - is it used? where?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can drop it if you'd prefer - it is currently unused, after I switched back to process based shutdown. I'm currently doing some prototyping around "multi-model-file" backends - and I wanted to have the ability to call the Free function during that testing. It might be cleanest to migrate that to the other branch until it's used - I had initially decided it made sense to expose it since otherwise the "Free()" functions down in the backends are themselves unused code [but hey, at least that's much more hidden from first time readers!]
Description
This PR and associated branch have been banish back to experimental use. The Unload features will stay here for more experimentation. RWKV went to #937, and Process-Based backend shutdown went to #938