[Bugfix] value file based accessMode (#108)
* fix: value file based accessMode

* fix: add config to examples

---------
Signed-off-by: BrianPark314 <[email protected]>
BrianPark314 authored Feb 11, 2025
1 parent 8d4b05a commit b470098
Showing 9 changed files with 22 additions and 2 deletions.
4 changes: 4 additions & 0 deletions .github/multiple-models.yaml
@@ -10,6 +10,8 @@ servingEngineSpec:
requestMemory: "16Gi"
requestGPU: 1
pvcStorage: "10Gi"
pvcAccessMode:
- ReadWriteOnce

- name: "smol135m"
repository: "vllm/vllm-openai"
@@ -20,3 +22,5 @@ servingEngineSpec:
requestMemory: "16Gi"
requestGPU: 1
pvcStorage: "10Gi"
pvcAccessMode:
- ReadWriteOnce
3 changes: 1 addition & 2 deletions helm/templates/pvc.yaml
@@ -6,8 +6,7 @@ metadata:
name: "{{ .Release.Name }}-{{$modelSpec.name}}-storage-claim"
namespace: {{ .Release.Namespace }}
spec:
- accessModes:
-   - ReadWriteOnce
+ accessModes: {{ toYaml $modelSpec.pvcAccessMode | nindent 4 }}
resources:
requests:
storage: {{ $modelSpec.pvcStorage }}
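
For illustration, here is roughly what the templated `accessModes` line renders to. This is a sketch under stated assumptions: a release named `vllm` in the `default` namespace, a model named `opt125m`, `pvcStorage: "10Gi"`, and `pvcAccessMode: [ReadWriteMany]` are all hypothetical values chosen for the example. The `apiVersion` and `kind` lines are the standard PVC header and are not shown in the hunk above, and the indentation of the surrounding template is assumed.

```yaml
# Rendered PVC (sketch). `toYaml` turns the list into "- ReadWriteMany",
# and `nindent 4` places it on a new line indented by four spaces.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "vllm-opt125m-storage-claim"
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

Since `ReadWriteOnce` is no longer hard-coded, a model entry that omits `pvcAccessMode` has nothing to render here, and `toYaml` of an unset value yields `null`. That is presumably why the commit adds the key to every example file. A chart that wanted to keep the old default could, as one option, pipe the value through Sprig's `default` function.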
3 changes: 3 additions & 0 deletions helm/values.yaml
@@ -22,6 +22,7 @@ servingEngineSpec:
# - requestGPU: (int) The number of GPUs requested for the model, e.g., 1
#
# - pvcStorage: (string) The amount of storage requested for the model, e.g., "50Gi"
# - pvcAccessMode: (list) The access mode policy for the mounted volume, e.g., ["ReadWriteOnce"]
# - pvcMatchLabels: (optional, map) The labels to match the PVC, e.g., {model: "opt125m"}
#
# - vllmConfig: (optional, map) The configuration for the VLLM model, supported options are:
@@ -57,6 +58,8 @@ servingEngineSpec:
# requestGPU: 1
#
# pvcStorage: "50Gi"
# pvcAccessMode:
# - ReadWriteOnce
# pvcMatchLabels:
# model: "mistral"
#
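
As a concrete illustration of the new field, a minimal values fragment might look like the following. The `modelSpec` key and the model name are assumptions made for the example (the hunks above only show the entries beneath it). The access mode names are the standard Kubernetes ones.

```yaml
servingEngineSpec:
  modelSpec:            # assumed key name; not visible in the hunks above
    - name: "example"   # hypothetical model name
      pvcStorage: "50Gi"
      # One or more Kubernetes access modes:
      # ReadWriteOnce, ReadOnlyMany, ReadWriteMany, ReadWriteOncePod
      pvcAccessMode:
        - ReadWriteOnce
```

Whether a given mode actually binds depends on the storage class backing the claim. `ReadWriteMany`, for instance, needs a volume type that supports shared mounts.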
2 changes: 2 additions & 0 deletions tutorials/assets/values-01-2pods-minimal-example.yaml
@@ -13,6 +13,8 @@ servingEngineSpec:
requestGPU: 0.5

pvcStorage: "10Gi"
pvcAccessMode:
- ReadWriteMany

vllmConfig:
maxModelLen: 1024
2 changes: 2 additions & 0 deletions tutorials/assets/values-01-minimal-example.yaml
@@ -13,3 +13,5 @@ servingEngineSpec:
requestGPU: 1

pvcStorage: "10Gi"
pvcAccessMode:
- ReadWriteOnce
2 changes: 2 additions & 0 deletions tutorials/assets/values-02-basic-config.yaml
@@ -12,6 +12,8 @@ servingEngineSpec:
requestGPU: 1

pvcStorage: "50Gi"
pvcAccessMode:
- ReadWriteOnce

vllmConfig:
enableChunkedPrefill: false
2 changes: 2 additions & 0 deletions tutorials/assets/values-03-match-pv.yaml
@@ -12,6 +12,8 @@ servingEngineSpec:
requestGPU: 1

pvcStorage: "50Gi"
pvcAccessMode:
- ReadWriteOnce
pvcMatchLabels:
model: "llama3-pv"

4 changes: 4 additions & 0 deletions tutorials/assets/values-04-multiple-models.yaml
@@ -10,6 +10,8 @@ servingEngineSpec:
requestMemory: "16Gi"
requestGPU: 1
pvcStorage: "50Gi"
pvcAccessMode:
- ReadWriteOnce
vllmConfig:
maxModelLen: 4096
hf_token: <YOUR HF TOKEN FOR LLAMA3.1>
@@ -23,6 +25,8 @@ servingEngineSpec:
requestMemory: "16Gi"
requestGPU: 1
pvcStorage: "50Gi"
pvcAccessMode:
- ReadWriteOnce
vllmConfig:
maxModelLen: 4096
hf_token: <YOUR HF TOKEN FOR MISTRAL>
2 changes: 2 additions & 0 deletions tutorials/assets/values-05-cpu-offloading.yaml
@@ -10,6 +10,8 @@ servingEngineSpec:
requestMemory: "40Gi"
requestGPU: 1
pvcStorage: "50Gi"
pvcAccessMode:
- ReadWriteOnce
vllmConfig:
enableChunkedPrefill: false
enablePrefixCaching: false