Model orchestration with heterogeneous hardware #13
Labels
area/heterogeneous
kind/enhancement
New feature or request
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
We have hit a few cases where a single deployment needs to span different chips because of quota limits or resource shortages. In Kubernetes, however, we usually use a `Deployment` to manage a group of pods that all run on one type of GPU. If we remove the GPU type constraint, it becomes hard to control the ratio of pods across GPU types. Technically, we can work around the problem with multiple Deployments, but then rolling upgrades need additional control across them, and so does HPA. The RoleSet CRD cannot manage such cases either.
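
To make the ratio problem concrete, here is a minimal sketch of what the multiple-Deployment workaround has to compute by hand: splitting one desired replica count across per-GPU-type Deployments so that the totals still add up. The function name `split_replicas`, the GPU type names, and the weights are all hypothetical and only illustrate the arithmetic; this is not an existing controller or API.

```python
from math import floor


def split_replicas(total: int, weights: dict[str, float]) -> dict[str, int]:
    """Split `total` replicas across GPU types in proportion to `weights`.

    Uses largest-remainder rounding so the per-type counts always sum to
    `total`. Names and weights are illustrative assumptions.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")

    # Ideal (fractional) share for each GPU type.
    exact = {gpu: total * w / weight_sum for gpu, w in weights.items()}
    # Start from the rounded-down share.
    counts = {gpu: floor(share) for gpu, share in exact.items()}

    # Hand the leftover replicas to the types with the largest remainders.
    leftover = total - sum(counts.values())
    by_remainder = sorted(weights, key=lambda gpu: exact[gpu] - counts[gpu], reverse=True)
    for gpu in by_remainder[:leftover]:
        counts[gpu] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical 7:3 split between two GPU types.
    print(split_replicas(10, {"gpu-type-a": 7, "gpu-type-b": 3}))
```

Every scale event from HPA and every step of a rolling upgrade would have to rerun a split like this and apply it to each Deployment separately, which is the extra control the workaround requires.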