-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle terminating pod for request routing #549
Conversation
@@ -144,9 +144,12 @@ func CountReadyPods(podList *v1.PodList) (int64, error) { | |||
func FilterReadyPods(pods map[string]*v1.Pod) []*v1.Pod { | |||
var readyPods []*v1.Pod | |||
for _, pod := range pods { | |||
if pod.Status.PodIP != "" && IsPodReady(pod) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this method being used by other files? like some other controllers other than cache.go? I am a little bit concerned on the impact if we have other scenarios.
the key difference of this change is terminated
pod?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used by all routing methods, updatePodMetrics and early rejection for gateway. No impact with the change.
I also thought about it, and not including terminating pods from ready pods list makes sense, hence I updated this method rather than introducing another helper method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean other controllers other than routing. I remember some ray controllers use it so I create the util file earlier. If so, other logics would be affected. we may consider to create another method if there's references
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. then we are safe to modify it.
@@ -243,6 +243,15 @@ func (s *Server) HandleRequestBody(ctx context.Context, requestID string, req *e | |||
fmt.Sprintf("model %s does not exist", model)), targetPodIP, stream | |||
} | |||
|
|||
// early reject if no pods are ready to accept request for a model | |||
pods, err := s.cache.GetPodsForModel(model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it filters the terminated pods here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, it returns all the pods irrespective of the pod state.
* Handle terminating pod for request routing * nit log updates * add early check for ready pods for request with or without routing strategy
#541