Feat: Router observability (Current QPS, router-side queueing delay, etc) - WIP #90

sitloboi2012 · 2025-02-08T18:31:27Z

Update the Grafana dashboard to include these new metrics:
- Current QPS
- Router-side queueing delay
- Number of pending / prefilling / decoding requests
- Average prefill / decoding length
Format the layout of the Grafana visualization.
Update the router to add instrumentation for tracking and logging the new metrics.
Minor updates:
- Add the openai and vllm libraries to the test folder's req.txt
- Update the install utils to check for nvidia-smi and nvidia-ctk before running minikube
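
The check for nvidia-smi and nvidia-ctk lives in the install scripts, which are not shown in this conversation. As a rough illustration of the idea only, here is a minimal Python sketch that uses the standard library's `shutil.which` to look for both tools on `PATH`. The names `REQUIRED_TOOLS`, `find_tools`, and `missing_tools` are hypothetical and are not taken from the PR.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Tools named in the PR description; the constant name itself is hypothetical.
REQUIRED_TOOLS = ("nvidia-smi", "nvidia-ctk")


def find_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path on PATH, or None if absent."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of tools that could not be located."""
    return [tool for tool, path in found.items() if path is None]
```

A setup step could call `missing_tools(find_tools())` and stop with a clear message before starting minikube if the returned list is not empty.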

Please refer to issue #78 for more info.
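
To make the router-side metrics above more concrete, here is a minimal sketch of how current QPS and router-side queueing delay could be tracked in memory. It assumes a sliding time window for QPS and measures queueing delay as the time between a request arriving at the router and being dispatched to a backend. The class `RouterStats` and all of its method names are hypothetical; the actual definitions of these metrics are the ones discussed in #78, and this is not the code in this PR.

```python
import time
from collections import deque
from typing import Deque, Dict, Optional


class RouterStats:
    """Hypothetical in-memory tracker for QPS and router-side queueing delay."""

    def __init__(self, window_seconds: float = 10.0, max_delay_samples: int = 1000) -> None:
        self.window_seconds = window_seconds
        # Arrival timestamps that fall inside the sliding window.
        self._arrivals: Deque[float] = deque()
        # Arrival time of requests still waiting in the router.
        self._waiting: Dict[str, float] = {}
        # Most recent queueing delays, bounded to avoid unbounded growth.
        self._delays: Deque[float] = deque(maxlen=max_delay_samples)

    def on_request_arrive(self, request_id: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._arrivals.append(now)
        self._waiting[request_id] = now

    def on_request_dispatch(self, request_id: str, now: Optional[float] = None) -> None:
        """Record that a request left the router queue for a backend."""
        now = time.monotonic() if now is None else now
        arrived = self._waiting.pop(request_id, None)
        if arrived is not None:
            self._delays.append(now - arrived)

    def qps(self, now: Optional[float] = None) -> float:
        """Arrivals per second averaged over the sliding window."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_seconds
        while self._arrivals and self._arrivals[0] <= cutoff:
            self._arrivals.popleft()
        return len(self._arrivals) / self.window_seconds

    def in_router_queue(self) -> int:
        """Number of requests that arrived but were not yet dispatched."""
        return len(self._waiting)

    def avg_queueing_delay(self) -> float:
        """Mean of the retained queueing delays, or 0.0 if none were recorded."""
        return sum(self._delays) / len(self._delays) if self._delays else 0.0
```

Values like these could then be exposed as gauges for the Grafana dashboard to scrape. The explicit `now` parameter is only there so the tracker can be exercised with fixed timestamps.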


YuhanLiu11 · 2025-02-09T06:00:01Z

@sitloboi2012 Thanks a lot for your contribution! Just making sure: are you planning to update the metrics based on our latest discussions in #78?

sitloboi2012 · 2025-02-09T06:02:08Z

@YuhanLiu11 yep, I will update the metrics again based on the discussion in #78 👍


sitloboi2012 added 3 commits February 8, 2025 20:23

update vllm dashboard to include placeholder for Current QPS, router-…

e8fb619

…side queueing delay, number of pending / prefilling / decoding requests, average prefill / decoding length

update vllm dashboard with layout and update install minikube cluster…

1ed19df

… to check nvidia-smi and nvidia-ctk when setup

update gitignore + req.txt to include openai and vllm for test folder

0f60f67

sitloboi2012 marked this pull request as draft February 9, 2025 03:46

Merge branch 'vllm-project:main' into feat/router-observe

6fc1e20

ApostaC requested review from ApostaC and YuhanLiu11 February 9, 2025 05:13

sitloboi2012 and others added 12 commits February 9, 2025 16:41

update req.txt to include aiofiles, and python multipart for router, …

ee45e1e

…update router to include instrumentation log for vllm num of request, update install minikube cluster to include checking and updating docker, minikube context and path tracking for nvidia-smi and nvidia-ctk

Merge branch 'vllm-project:main' into feat/router-observe

e033ab9

feat(router): generate req id with uuid. (#89)

0b82c77

* feat(router): generate req id with uuid. Signed-off-by: Electronic-Waste <[email protected]> * fix: fix lint error. Signed-off-by: Electronic-Waste <[email protected]> --------- Signed-off-by: Electronic-Waste <[email protected]>

feat: Add support for disabling router (#96)

d4e8256

Signed-off-by: 0xThresh.eth <[email protected]>

Update yaml file for the tutorials (#98)

8f54898

Signed-off-by: junchenj <[email protected]>

rebase with main vllm

6cd23a9

[CI/Build] Add helm update to helm func test pipeline (#99)

b228309

* Add helm update to helm func test pipeline * Add runtimeClassName to multi model test --------- Signed-off-by: Shaoting Feng <[email protected]>

Avoid using helm repo (#100)

8997717

Signed-off-by: Shaoting Feng <[email protected]>

Enable multi-GPU inference in vLLM with tensor parallelism (#105)

f1f8b52

* Adding TP>1 support and adding shm size as a configurable parameter Signed-off-by: YuhanLiu11 <[email protected]>

update vllm dashboard to include placeholder for Current QPS, router-…

2dcf9bf

…side queueing delay, number of pending / prefilling / decoding requests, average prefill / decoding length

update gitignore + req.txt to include openai and vllm for test folder

e6f71b4

merge latest vllm

11b352f

sitloboi2012 closed this Feb 11, 2025
