Enhance Mocked vLLM App with Dynamic Metrics for Autoscaling Efficiency #117

kr11 · 2024-09-02T12:18:17Z

🚀 Feature Description and Motivation

I propose adding a 'metrics' endpoint to docs/development/app/app.py that returns Prometheus-style metrics.

The mock will simulate the effect of replica changes on metrics: app.py detects the current replica count and returns metric values inversely proportional to it. This makes it easier to test and develop autoscaling policies.
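A minimal sketch of the idea, not the code from the related PR: a fixed total load is divided by the replica count, and the result is rendered in Prometheus text exposition format. The baseline numbers, the choice of metrics, the `model_name` label value, and the helper names `mock_metric_value` and `render_metrics` are all assumptions made for illustration. How app.py actually detects the replica count is not shown here, so the count is passed in as a parameter.

```python
# Hypothetical baseline totals; the real app.py may expose different
# metrics and use different numbers.
BASE_METRICS = {
    "vllm:num_requests_running": 100.0,
    "vllm:num_requests_waiting": 50.0,
}


def mock_metric_value(base_total: float, replicas: int) -> float:
    """Spread a fixed total evenly across replicas (inverse proportionality)."""
    # Treat zero or negative replica counts as one to avoid dividing by zero.
    return base_total / max(replicas, 1)


def render_metrics(replicas: int, model_name: str = "mock-model") -> str:
    """Render the mocked gauges in Prometheus text exposition format."""
    lines = []
    for name, total in BASE_METRICS.items():
        value = mock_metric_value(total, replicas)
        lines.append(f"# TYPE {name} gauge")
        lines.append(f'{name}{{model_name="{model_name}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Doubling the replica count halves every reported value.
    print(render_metrics(replicas=2))
    print(render_metrics(replicas=4))
```

With this shape, an autoscaler under test sees the per-pod load drop as it scales out, so a scaling policy can be checked for convergence without running a real model server.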

We will also add a unit test and expand the README.

Use Case

No response

Proposed Solution

related PR: #116


kr11 self-assigned this Sep 2, 2024

kr11 added the area/autoscaling and kind/feature labels Sep 2, 2024

kr11 mentioned this issue Sep 4, 2024

Autoscaling Workflow Enhancement - Part 4: Integrating MetricClient into Autoscaling Workflow #116

Merged

Jeffwan closed this as completed in #116 Sep 6, 2024
