[Docs] heterogenous gpu docs added #545

nwangfw · 2024-12-31T17:45:43Z

Pull Request Description

Adds a blog post introducing heterogeneous GPU inference, along with a deployment example.

Related Issues

Partially resolves: #390

zhangjyr

I think we should further document:

How the podautoscaler should be configured to read metricsSources.
The new label I added to the deployment, "model.aibrix.ai/min_replicas", as in:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: simulator-llama2-7b-a100
      labels:
        model.aibrix.ai/name: "llama2-7b"
        model.aibrix.ai/min_replicas: "1" # min replicas for the GPU optimizer when there is no workload.

zhangjyr · 2024-12-31T18:22:06Z

docs/source/features/heterogeneous-gpu.rst

+    kubectl -n aibrix-system port-forward svc/aibrix-redis-master 6379:6379 1>/dev/null 2>&1 &
+
+    cd $AIBRIX_HOME/python/aibrix/aibrix/gpu_optimizer
+    make DP=simulator-llama2-7b-a100 COST=0.1gen-profile


There's a typo here: a space is missing before gen-profile. Also, it would be better to set the cost of the A100 higher than that of the A40, e.g., cost=1.

@zhangjyr can you take another look?

> There's a typo here: a space is missing before gen-profile. Also, it would be better to set the cost of the A100 higher than that of the A40, e.g., cost=1.

Thanks for finding this typo. I intended to type 1.0 rather than 0.1.
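
For reference, a minimal sketch of the corrected invocation discussed in this thread, assuming the two fixes mentioned above (a space before the gen-profile target and a cost of 1.0 for the A100). The snippet only assembles and prints the command rather than running make; the shell variable names DP_NAME, A100_COST, and CMD are illustrative and not part of the Makefile.

```shell
# Deployment name taken from the diff under review.
DP_NAME="simulator-llama2-7b-a100"
# Cost the author said was intended (1.0 rather than 0.1).
A100_COST="1.0"

# Note the space between the COST value and the gen-profile target.
CMD="make DP=${DP_NAME} COST=${A100_COST} gen-profile"

# Print the command instead of executing it (dry run).
echo "$CMD"
```

To actually generate the profile, the printed command would be run from the gpu_optimizer directory shown in the diff, with the Redis port-forward already active.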

nwangfw · 2025-01-02T19:48:36Z

> I think we should further document:
>
> How the podautoscaler should be configured to read metricsSources.
> The new label I added to the deployment, "model.aibrix.ai/min_replicas", as in:
>
>     apiVersion: apps/v1
>     kind: Deployment
>     metadata:
>       name: simulator-llama2-7b-a100
>       labels:
>         model.aibrix.ai/name: "llama2-7b"
>         model.aibrix.ai/min_replicas: "1" # min replicas for the GPU optimizer when there is no workload.

Thanks for your suggestions. I have added two new parts accordingly.

heterogenous gpu docs added

0200ff1

nwangfw requested review from zhangjyr and Jeffwan December 31, 2024 17:45

nwangfw added kind/documentation Improvements or additions to documentation area/heterogeneous area/website labels Dec 31, 2024

nwangfw added this to the v0.2.0 milestone Dec 31, 2024

nwangfw changed the title ~~heterogenous gpu docs added~~ [Docs] heterogenous gpu docs added Dec 31, 2024

zhangjyr reviewed Dec 31, 2024

added podAutoscaler and minReplicas config

fade25a

nwangfw force-pushed the ning/issues-390-improve-docs branch from 4473827 to fade25a Compare January 2, 2025 20:07

zhangjyr approved these changes Jan 3, 2025

Jeffwan merged commit ba78d2d into main Jan 3, 2025
2 checks passed

Jeffwan deleted the ning/issues-390-improve-docs branch January 3, 2025 19:04

gangmuk pushed a commit that referenced this pull request Jan 25, 2025

[Docs] heterogenous gpu docs added (#545)

1c22d11

* heterogenous gpu docs added * added podAutoscaler and minReplicas config
