update the run router for case with static service discovery and k8s service discovery

Signed-off-by: sitloboi2012 <[email protected]>
sitloboi2012 committed Feb 12, 2025
1 parent 694f804 commit 7e1566b
Showing 3 changed files with 27 additions and 9 deletions.
8 changes: 7 additions & 1 deletion observability/README.md
@@ -21,14 +21,20 @@ sudo bash install.sh

After installing, the dashboard can be accessed through the service `service/kube-prom-stack-grafana` in the `monitoring` namespace.

-## Access the Grafana dashboard
+## Access the Grafana & Prometheus dashboard

Forward the Grafana dashboard port to the local node-port

```bash
sudo kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80 --address 0.0.0.0
```

+Forward the Prometheus dashboard
+
+```bash
+sudo kubectl --namespace monitoring port-forward prometheus-kube-prom-stack-kube-prome-prometheus-0 9090:9090
+```
+
Open the webpage at `http://<IP of your node>:3000` to access the Grafana web page. The default user name is `admin` and the password can be configured in `values.yaml` (default is `prom-operator`).

Import the dashboard using the `vllm-dashboard.json` in this folder.
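
To confirm that both port-forwards in this README change are live, a small health probe can help. The sketch below is illustrative only: the helper names `health_urls` and `check` are hypothetical, and it assumes the stock health endpoints (`/api/health` for Grafana, `/-/healthy` for Prometheus). Because the Prometheus command above omits `--address`, that forward is assumed to be reachable on `localhost` only.

```python
from urllib.request import urlopen

# Local ports taken from the two port-forward commands in the README.
GRAFANA_PORT = 3000
PROMETHEUS_PORT = 9090


def health_urls(node_ip: str) -> dict[str, str]:
    """Build health-check URLs for the forwarded dashboards (hypothetical helper)."""
    return {
        # Grafana is forwarded with --address 0.0.0.0, so the node IP works.
        "grafana": f"http://{node_ip}:{GRAFANA_PORT}/api/health",
        # Assumption: without --address, kubectl binds to localhost only.
        "prometheus": f"http://localhost:{PROMETHEUS_PORT}/-/healthy",
    }


def check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False on any network error."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

With both forwards running, `{name: check(url) for name, url in health_urls("<IP of your node>").items()}` should report `True` for each service.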
2 changes: 1 addition & 1 deletion src/vllm_router/router.py
@@ -451,7 +451,7 @@ def log_stats(interval: int = 10):
es = engine_stats[url]
logstr += (
f" Engine Stats (Dashboard): Running Requests: {es.num_running_requests}, "
-f"Queueing Delay (requests): {es.num_queing_requests}, "
+f"Queueing Delay (requests): {es.num_queuing_requests}, "
f"GPU Cache Hit Rate: {es.gpu_cache_hit_rate:.2f}\n"
)
else:
26 changes: 19 additions & 7 deletions src/vllm_router/run-router.sh
100644 → 100755
@@ -4,15 +4,27 @@ if [[ $# -ne 1 ]]; then
exit 1
fi

-python3 vllm_router/router.py --port "$1" \
-    --service-discovery k8s \
-    --k8s-label-selector release=test \
-    --k8s-namespace default \
-    --routing-logic session \
-    --session-key "x-user-id" \
+# Use this command when testing with k8s service discovery
+# python3 -m vllm_router.router --port "$1" \
+# --service-discovery k8s \
+# --k8s-label-selector release=test \
+# --k8s-namespace default \
+# --routing-logic session \
+# --session-key "x-user-id" \
+# --engine-stats-interval 10 \
+# --log-stats
+
+# Use this command when testing with static service discovery
+python3 -m vllm_router.router --port "$1" \
+    --service-discovery static \
+    --static-backends "http://localhost:9000" \
+    --static-models "fake_model_name" \
     --engine-stats-interval 10 \
-    --log-stats
+    --log-stats \
+    --routing-logic session \
+    --session-key "x-user-id"

+# Use this command when testing with roundrobin routing logic
#python3 router.py --port "$1" \
# --service-discovery k8s \
# --k8s-label-selector release=test \
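
The script now switches between discovery modes by commenting one command in and the other out. As a rough alternative sketch, the same two invocations could be assembled from a single function. Everything here is illustrative: `router_args` is a hypothetical helper that is not part of this commit, and the flags and values are copied verbatim from the static and k8s commands in `run-router.sh`.

```python
import shlex


def router_args(port: int, discovery: str) -> list[str]:
    """Assemble the router command line for one discovery mode (hypothetical helper)."""
    # Flags shared by both commands in run-router.sh.
    common = [
        "python3", "-m", "vllm_router.router",
        "--port", str(port),
        "--engine-stats-interval", "10",
        "--log-stats",
        "--routing-logic", "session",
        "--session-key", "x-user-id",
    ]
    if discovery == "static":
        extra = [
            "--service-discovery", "static",
            "--static-backends", "http://localhost:9000",
            "--static-models", "fake_model_name",
        ]
    elif discovery == "k8s":
        extra = [
            "--service-discovery", "k8s",
            "--k8s-label-selector", "release=test",
            "--k8s-namespace", "default",
        ]
    else:
        raise ValueError(f"unknown discovery mode: {discovery!r}")
    return common + extra


def as_shell(port: int, discovery: str) -> str:
    """Render the argument list as a single, safely quoted shell command."""
    return shlex.join(router_args(port, discovery))
```

Calling `as_shell(8000, "static")` yields a one-line command that can be pasted into a terminal or passed to `subprocess.run(router_args(8000, "static"))`.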