Feat: Router observability (Current QPS, router-side queueing delay, etc) - WIP #90

Closed

Conversation

sitloboi2012
Contributor

@sitloboi2012 sitloboi2012 commented Feb 8, 2025

  • Update the Grafana dashboard to include these new metrics:

    • Current QPS
    • Router-side queueing delay
    • Number of pending / prefilling / decoding requests
    • Average prefill / decoding length
  • Format the layout of the Grafana visualization

  • Update the router to include instrumentation for tracking and logging the new metrics

  • Minor updates:

    • Include the openai and vllm libraries in the test req.txt
    • Update the install utils to check for nvidia-smi and nvidia-ctk before running minikube

Please refer to Issue 78 for more info
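
To make the metrics listed above concrete, here is a minimal sketch of how current QPS, the pending-request count, and router-side queueing delay could be tracked in process. It is not the code in this PR: the class `RouterStats`, its method names, and the 10-second sliding window are all hypothetical, and it uses only the Python standard library rather than any real metrics exporter.

```python
import time
from collections import deque


class RouterStats:
    """Hypothetical helper (not from this PR) for router-side request stats."""

    def __init__(self, window_seconds=10.0, clock=time.monotonic):
        self.window_seconds = window_seconds  # assumed sliding-window length
        self.clock = clock                    # injectable so tests can fake time
        self._arrivals = deque()              # arrival timestamps inside the window
        self._pending = {}                    # request_id -> arrival timestamp
        self._delay_sum = 0.0
        self._delay_count = 0

    def on_arrival(self, request_id):
        """Record that a request reached the router and is waiting for dispatch."""
        now = self.clock()
        self._arrivals.append(now)
        self._pending[request_id] = now
        self._evict(now)

    def on_dispatch(self, request_id):
        """Record that a request left the router; return its queueing delay."""
        delay = self.clock() - self._pending.pop(request_id)
        self._delay_sum += delay
        self._delay_count += 1
        return delay

    def _evict(self, now):
        # Drop arrivals that have fallen out of the sliding window.
        cutoff = now - self.window_seconds
        while self._arrivals and self._arrivals[0] <= cutoff:
            self._arrivals.popleft()

    def current_qps(self):
        """Arrivals per second, averaged over the sliding window."""
        self._evict(self.clock())
        return len(self._arrivals) / self.window_seconds

    def pending_requests(self):
        """Requests that have arrived but have not been dispatched yet."""
        return len(self._pending)

    def average_queueing_delay(self):
        """Mean router-side queueing delay over all dispatched requests."""
        if self._delay_count == 0:
            return 0.0
        return self._delay_sum / self._delay_count
```

In a real router these values would presumably be exported to Prometheus so that the Grafana dashboard can plot them; that wiring is left out here because it depends on the router's actual request lifecycle hooks.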

…side queueing delay, number of pending / prefilling / decoding requests, average prefill / decoding length
… to check nvidia-smi and nvidia-ctk when setup
@sitloboi2012 sitloboi2012 marked this pull request as draft February 9, 2025 03:46
@ApostaC ApostaC requested review from ApostaC and YuhanLiu11 February 9, 2025 05:13
@YuhanLiu11
Collaborator

YuhanLiu11 commented Feb 9, 2025

@sitloboi2012 Thanks a lot for your contribution! Just making sure, are you planning to update the metrics based on our latest discussions in #78?

@sitloboi2012
Contributor Author

@YuhanLiu11 yep, I will update the metrics again based on the discussion in #78 👍

sitloboi2012 and others added 12 commits February 9, 2025 16:41
…update router to include instrumentation logging for vllm num of requests, update install minikube cluster to include checking and updating docker, minikube context and path tracking for nvidia-smi and nvidia-ctk
* feat(router): generate req id with uuid.

Signed-off-by: Electronic-Waste <[email protected]>

* fix: fix lint error.

Signed-off-by: Electronic-Waste <[email protected]>

---------

Signed-off-by: Electronic-Waste <[email protected]>
* Add helm update to helm func test pipeline

* Add runtimeClassName to multi model test

---------

Signed-off-by: Shaoting Feng <[email protected]>
Signed-off-by: Shaoting Feng <[email protected]>
* Adding TP>1 support and adding shm size as a configurable parameter

Signed-off-by: YuhanLiu11 <[email protected]>
…side queueing delay, number of pending / prefilling / decoding requests, average prefill / decoding length
6 participants