[Docs] heterogenous gpu docs added #545

Merged
merged 2 commits into from
Jan 3, 2025

nwangfw (Collaborator) commented Dec 31, 2024

Pull Request Description

Adds a blog post introducing heterogeneous GPU inference, along with a deployment example.

Related Issues

Partially resolves: #390

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

@nwangfw nwangfw requested review from zhangjyr and Jeffwan December 31, 2024 17:45
@nwangfw nwangfw added kind/documentation Improvements or additions to documentation area/heterogeneous area/website labels Dec 31, 2024
@nwangfw nwangfw added this to the v0.2.0 milestone Dec 31, 2024
@nwangfw nwangfw changed the title heterogenous gpu docs added [Docs] heterogenous gpu docs added Dec 31, 2024
Collaborator

@zhangjyr zhangjyr left a comment

I think we should further document:

  1. How the PodAutoscaler should be configured to read metricsSources.
  2. The new label I added to the deployment, "model.aibrix.ai/min_replicas", as in:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: simulator-llama2-7b-a100
      labels:
        model.aibrix.ai/name: "llama2-7b"
        model.aibrix.ai/min_replicas: "1" # minimum replicas for the GPU optimizer when there is no workload.

kubectl -n aibrix-system port-forward svc/aibrix-redis-master 6379:6379 1>/dev/null 2>&1 &

cd $AIBRIX_HOME/python/aibrix/aibrix/gpu_optimizer
make DP=simulator-llama2-7b-a100 COST=0.1gen-profile
Collaborator

There's a typo here: a space is missing before gen-profile. Also, it would be better to set the cost of the A100 higher than that of the A40, e.g., cost=1.

Collaborator

@zhangjyr can you take another look?

Collaborator Author

There's a typo here: a space is missing before gen-profile. Also, it would be better to set the cost of the A100 higher than that of the A40, e.g., cost=1.

Thanks for finding this typo. I intended to type 1.0 rather than 0.1.
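
For readers scripting this step, one way to rule out the missing-space typo is to build the command as an argument list rather than a single string. The sketch below is illustrative only: `build_gen_profile_cmd` is a hypothetical helper, not part of aibrix, and it assumes that `DP` and `COST` are the only variables the `gen-profile` target needs, as in the command under review.

```python
from __future__ import annotations

import shlex


def build_gen_profile_cmd(deployment: str, cost: float) -> list[str]:
    """Build the argv for the profile-generation make target.

    Hypothetical helper: it assumes DP and COST are the only make
    variables required, as in the command quoted in this review.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    # Each list element is passed as its own argument, so COST and the
    # gen-profile target cannot run together as "COST=0.1gen-profile".
    return ["make", f"DP={deployment}", f"COST={cost}", "gen-profile"]


if __name__ == "__main__":
    cmd = build_gen_profile_cmd("simulator-llama2-7b-a100", 1.0)
    # Prints: make DP=simulator-llama2-7b-a100 COST=1.0 gen-profile
    print(shlex.join(cmd))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` from the gpu_optimizer directory; that part is left out here because it depends on the local checkout.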

Collaborator Author

nwangfw commented Jan 2, 2025

I think we should further document:

  1. How the PodAutoscaler should be configured to read metricsSources.
  2. The new label I added to the deployment, "model.aibrix.ai/min_replicas", as in:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: simulator-llama2-7b-a100
      labels:
        model.aibrix.ai/name: "llama2-7b"
        model.aibrix.ai/min_replicas: "1" # minimum replicas for the GPU optimizer when there is no workload.

Thanks for your suggestions. I have added two new parts accordingly.
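
To make the intent of the "model.aibrix.ai/min_replicas" label concrete, here is a minimal sketch of how a replica floor read from that label might be applied to a suggested replica count. It is not the GPU optimizer's actual code: the function names are hypothetical, and treating a missing label as a floor of 0 is an assumption made for illustration.

```python
from __future__ import annotations

MIN_REPLICAS_LABEL = "model.aibrix.ai/min_replicas"


def min_replicas(deployment: dict, default: int = 0) -> int:
    """Read the min-replicas label from a Deployment manifest dict.

    Assumption: a missing label means a floor of `default` (0 here).
    """
    labels = deployment.get("metadata", {}).get("labels", {})
    # Kubernetes label values are strings, hence the int() conversion.
    return int(labels.get(MIN_REPLICAS_LABEL, default))


def clamp_replicas(suggested: int, deployment: dict) -> int:
    """Keep the suggested replica count at or above the labeled floor."""
    return max(suggested, min_replicas(deployment))


# The manifest fragment quoted in the review, as a Python dict.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "simulator-llama2-7b-a100",
        "labels": {
            "model.aibrix.ai/name": "llama2-7b",
            "model.aibrix.ai/min_replicas": "1",
        },
    },
}

# With no workload (a suggestion of 0), the floor of 1 still applies.
print(clamp_replicas(0, deployment))
```

A suggestion above the floor passes through unchanged, so the label only matters when the workload drops to zero or close to it.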

@nwangfw nwangfw force-pushed the ning/issues-390-improve-docs branch from 4473827 to fade25a Compare January 2, 2025 20:07
@Jeffwan Jeffwan merged commit ba78d2d into main Jan 3, 2025
2 checks passed
@Jeffwan Jeffwan deleted the ning/issues-390-improve-docs branch January 3, 2025 19:04
gangmuk pushed a commit that referenced this pull request Jan 25, 2025
* heterogenous gpu docs added

* added podAutoscaler and minReplicas config
Labels
area/heterogeneous area/website kind/documentation Improvements or additions to documentation
Development

Successfully merging this pull request may close these issues.

Improves the docs site with more guidances
3 participants