Add pre-commit based linting and formatting (vllm-project#35)
* Add pre-commit workflow

* Add actionlint

* Add generic hooks

* Add black, isort, shellcheck

* Add requirements and markdown linting

* Add toml

* Add Dockerfile

* Add codespell

* Use Node.js version of `markdownlint`

* Add `requirements-lint.txt`

* Use CLI version of Node.js `markdownlint`

* Add `pre-commit` instructions to `Contributing`

* `pre-commit run -a` automatic fixes

* Exclude helm templates from `check-yaml`

* Comment hooks that require installed tools

* Make `codespell` happy

* Make `actionlint` happy

* Disable `shellcheck` until it can be installed properly

* Make `markdownlint` happy

* Add note about running pre-commit

---------

Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: 0xThresh.eth <[email protected]>
hmellor authored and 0xThresh committed Jan 30, 2025
1 parent aca24d6 commit 3246edd
Showing 54 changed files with 993 additions and 671 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -28,7 +28,7 @@ jobs:
uses: actions/checkout@v4

- name: Set up Python
- uses: actions/setup-python@v2
+ uses: actions/setup-python@v5
with:
python-version: '3.8'

1 change: 0 additions & 1 deletion .github/workflows/helm-lint.yml
@@ -23,4 +23,3 @@ jobs:
- name: Lint open-webui Helm Chart
run: |
helm lint ./helm
5 changes: 2 additions & 3 deletions .github/workflows/helm-release.yml
@@ -24,7 +24,7 @@ jobs:
git config user.name "$GITHUB_ACTOR"
git config user.email "[email protected]"
- # Could add Prometheus as a dependent chart here if desired
+ # Could add Prometheus as a dependent chart here if desired
# - name: Add Dependency Repos
# run: |
# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
@@ -52,6 +52,5 @@ jobs:
break
fi
REPO=$(echo '${{ github.repository }}' | tr '[:upper:]' '[:lower:]')
- helm push "${pkg}" oci://ghcr.io/$REPO
+ helm push "${pkg}" "oci://ghcr.io/$REPO"
done
17 changes: 17 additions & 0 deletions .github/workflows/matchers/actionlint.json
@@ -0,0 +1,17 @@
{
"problemMatcher": [
{
"owner": "actionlint",
"pattern": [
{
"regexp": "^(?:\\x1b\\[\\d+m)?(.+?)(?:\\x1b\\[\\d+m)*:(?:\\x1b\\[\\d+m)*(\\d+)(?:\\x1b\\[\\d+m)*:(?:\\x1b\\[\\d+m)*(\\d+)(?:\\x1b\\[\\d+m)*: (?:\\x1b\\[\\d+m)*(.+?)(?:\\x1b\\[\\d+m)* \\[(.+?)\\]$",
"file": 1,
"line": 2,
"column": 3,
"message": 4,
"code": 5
}
]
}
]
}
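The matcher's regexp above is dense mostly because it tolerates ANSI color codes around each field of actionlint's `file:line:col: message [rule]` output. As a quick sanity check (not part of this commit; the sample line and rule name are made up), the same pattern can be exercised in Python:

```python
import re

# Same regexp as the problem matcher above, with the JSON string escaping
# (\\x1b, \\d, ...) undone so Python's re module can compile it directly.
PATTERN = re.compile(
    r"^(?:\x1b\[\d+m)?(.+?)(?:\x1b\[\d+m)*:(?:\x1b\[\d+m)*(\d+)"
    r"(?:\x1b\[\d+m)*:(?:\x1b\[\d+m)*(\d+)(?:\x1b\[\d+m)*: "
    r"(?:\x1b\[\d+m)*(.+?)(?:\x1b\[\d+m)* \[(.+?)\]$"
)

# Hypothetical uncolored actionlint output line in file:line:col: message [rule] shape
line = ".github/workflows/ci.yml:28:9: shell command looks broken [shellcheck]"
m = PATTERN.match(line)

# Groups map to the matcher's file/line/column/message/code fields
print(m.group(1), m.group(2), m.group(3), m.group(5))
# → .github/workflows/ci.yml 28 9 shellcheck
```

The `(?:\x1b\[\d+m)*` runs are what let the same pattern also match actionlint's colorized output, where escape sequences wrap each separator.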
17 changes: 17 additions & 0 deletions .github/workflows/pre-commit.yml
@@ -0,0 +1,17 @@
name: pre-commit

on:
pull_request:
push:
branches: [main]

jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0
with:
python-version: "3.12"
- run: echo "::add-matcher::.github/workflows/matchers/actionlint.json"
- uses: pre-commit/action@2c7b3805fd2a0fd8c1884dcaebf91fc102a13ecd # v3.0.1
5 changes: 5 additions & 0 deletions .markdownlint.yaml
@@ -0,0 +1,5 @@
MD013: false # line-length
MD028: false # no-blanks-blockquote
MD029: # ol-prefix
style: ordered
MD033: false # no-inline-html
45 changes: 45 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,45 @@
repos:
- repo: https://github.com/rhysd/actionlint
rev: v1.7.7
hooks:
- id: actionlint
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: check-json
- id: check-toml
- id: check-yaml
exclude: ^helm/templates/
- id: end-of-file-fixer
- id: requirements-txt-fixer
- id: trailing-whitespace
# TODO: Enable these hooks when environment issues are resolved
# - repo: https://github.com/hadolint/hadolint
# rev: v2.12.0
# hooks:
# - id: hadolint
# - repo: https://github.com/gruntwork-io/pre-commit
# rev: v0.1.25
# hooks:
# - id: helmlint
- repo: https://github.com/psf/black
rev: '25.1.0'
hooks:
- id: black
- repo: https://github.com/pycqa/isort
rev: '6.0.0'
hooks:
- id: isort
# TODO: Enable this hook when environment issues are resolved
# - repo: https://github.com/koalaman/shellcheck-precommit
# rev: v0.10.0
# hooks:
# - id: shellcheck
- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.44.0
hooks:
- id: markdownlint
- repo: https://github.com/codespell-project/codespell
rev: v2.4.1
hooks:
- id: codespell
27 changes: 18 additions & 9 deletions README.md
@@ -1,13 +1,12 @@
- # vLLM Production Stack: reference stack for production vLLM deployment

# vLLM Production Stack: reference stack for production vLLM deployment

The **vLLM Production Stack** project provides a reference implementation of how to build an inference stack on top of vLLM, which allows you to:

- 🚀 Scale from a single vLLM instance to a distributed vLLM deployment without changing any application code
- 💻 Monitor the stack through a web dashboard
- 😄 Enjoy the performance benefits brought by request routing and KV cache offloading

- ## Latest News:
+ ## Latest News

- 🔥 vLLM Production Stack is released! Check out our [release blogs](https://blog.lmcache.ai/2025-01-21-stack-release) [01-22-2025]
- ✨Join us at #production-stack channel of vLLM [slack](https://slack.vllm.ai/), LMCache [slack](https://join.slack.com/t/lmcacheworkspace/shared_invite/zt-2viziwhue-5Amprc9k5hcIdXT7XevTaQ), or fill out this [interest form](https://forms.gle/wSoeNpncmPVdXppg8) for a chat!
@@ -20,7 +19,6 @@ The stack is set up using [Helm](https://helm.sh/docs/), and contains the follow
- **Request router**: Directs requests to appropriate backends based on routing keys or session IDs to maximize KV cache reuse.
- **Observability stack**: monitors the metrics of the backends through [Prometheus](https://github.com/prometheus/prometheus) + [Grafana](https://grafana.com/)


<img src="https://github.com/user-attachments/assets/8f05e7b9-0513-40a9-9ba9-2d3acca77c0c" alt="Architecture of the stack" width="800"/>

## Roadmap
@@ -42,6 +40,7 @@ We are actively working on this project and will release the following features
### Deployment

vLLM Production Stack can be deployed via Helm charts. Clone the repo locally and execute the following commands for a minimal deployment:

```bash
git clone https://github.com/vllm-project/production-stack.git
cd production-stack/
@@ -55,21 +54,18 @@ To validate the installation and send a query to the stack, refer to [this tut

For more information about customizing the helm chart, please refer to [values.yaml](https://github.com/vllm-project/production-stack/blob/main/helm/values.yaml) and our other [tutorials](https://github.com/vllm-project/production-stack/tree/main/tutorials).


### Uninstall

```bash
sudo helm uninstall vllm
```


## Grafana Dashboard

### Features

The Grafana dashboard provides the following insights:


1. **Available vLLM Instances**: Displays the number of healthy instances.
2. **Request Latency Distribution**: Visualizes end-to-end request latency.
3. **Time-to-First-Token (TTFT) Distribution**: Monitors response times for token generation.
@@ -98,7 +94,6 @@ The router ensures efficient request distribution among backends. It supports:
- Session-ID based routing
- (WIP) prefix-aware routing


## Contributing

Contributions are welcome! Please follow the standard GitHub flow:
@@ -107,11 +102,25 @@ Contributions are welcome! Please follow the standard GitHub flow:
2. Create a feature branch.
3. Submit a pull request with detailed descriptions.

We use `pre-commit` for formatting; it is installed as follows:

```bash
pip install -r requirements-lint.txt
pre-commit install
```

It will run automatically before every commit. You can also run it manually on all files with:

```bash
pre-commit run --all-files
```

> You can read more about `pre-commit` at <https://pre-commit.com>.

## License

This project is licensed under the MIT License. See the `LICENSE` file for details.

---

For any issues or questions, feel free to open an issue or contact the maintainers.

6 changes: 3 additions & 3 deletions helm/README.md
@@ -2,14 +2,14 @@

This helm chart lets users deploy multiple serving engines and a router into the Kubernetes cluster.

- ## Key features:
+ ## Key features

- Support running multiple serving engines with multiple different models
- - Load the model weights directly from the existing PersistentVolumes
+ - Load the model weights directly from the existing PersistentVolumes

## Prerequisites

- 1. A running Kubernetes cluster with GPU. (You can set it up through `minikube`: https://minikube.sigs.k8s.io/docs/tutorials/nvidia/)
+ 1. A running Kubernetes cluster with GPU. (You can set it up through `minikube`: <https://minikube.sigs.k8s.io/docs/tutorials/nvidia/>)
2. [Helm](https://helm.sh/docs/intro/install/)

## Install the helm chart
2 changes: 1 addition & 1 deletion helm/ct.yaml
@@ -1,3 +1,3 @@
chart-dirs:
- charts
- validate-maintainers: false
+ validate-maintainers: false
2 changes: 1 addition & 1 deletion helm/lintconf.yaml
@@ -39,4 +39,4 @@ rules:
type: unix
trailing-spaces: enable
truthy:
- level: warning
+ level: warning
10 changes: 5 additions & 5 deletions helm/templates/deployment-vllm-multi.yaml
@@ -67,7 +67,7 @@ spec:
value: /data
{{- if $modelSpec.hf_token }}
- name: HF_TOKEN
- valueFrom:
+ valueFrom:
secretKeyRef:
name: {{ .Release.Name }}-secrets
key: hf_token_{{ $modelSpec.name }}
@@ -89,7 +89,7 @@ spec:
value: "{{ $modelSpec.lmcacheConfig.cpuOffloadingBufferSize }}"
{{- end }}
{{- if $modelSpec.lmcacheConfig.diskOffloadingBufferSize }}
- - name: LMCACHE_LOCAL_DISK
+ - name: LMCACHE_LOCAL_DISK
value: "True"
- name: LMCACHE_MAX_LOCAL_DISK_SIZE
value: "{{ $modelSpec.lmcacheConfig.diskOffloadingBufferSize }}"
@@ -99,7 +99,7 @@ spec:
envFrom:
- configMapRef:
name: "{{ .Release.Name }}-configs"
- {{- end }}
+ {{- end }}
ports:
- name: {{ include "chart.container-port-name" . }}
containerPort: {{ include "chart.container-port" . }}
@@ -123,7 +123,7 @@ spec:

{{- if .Values.servingEngineSpec.runtimeClassName }}
runtimeClassName: nvidia
- {{- end }}
+ {{- end }}
{{- if $modelSpec.nodeSelectorTerms}}
affinity:
nodeAffinity:
@@ -132,7 +132,7 @@ spec:
{{- with $modelSpec.nodeSelectorTerms }}
{{- toYaml . | nindent 12 }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
---
{{- end }}
1 change: 0 additions & 1 deletion helm/templates/role.yaml
@@ -7,4 +7,3 @@ rules:
- apiGroups: [""] # "" indicates the core API group
resources: ["pods"]
verbs: ["get", "watch", "list"]

1 change: 0 additions & 1 deletion helm/templates/serviceaccount.yaml
@@ -3,4 +3,3 @@ kind: ServiceAccount
metadata:
name: "{{ .Release.Name }}-router-service-account"
namespace: {{ .Release.Namespace }}

2 changes: 1 addition & 1 deletion helm/test.sh
@@ -1,2 +1,2 @@
- #helm upgrade --install --create-namespace --namespace=ns-vllm test-vllm . -f values-yihua.yaml
+ #helm upgrade --install --create-namespace --namespace=ns-vllm test-vllm . -f values-yihua.yaml
helm upgrade --install test-vllm . -f values-additional.yaml #--create-namespace --namespace=vllm
3 changes: 1 addition & 2 deletions helm/values.schema.json
@@ -140,7 +140,7 @@
}
}
},
- "runtimeClassName": {
+ "runtimeClassName": {
"type": "string"
}
}
@@ -170,4 +170,3 @@
}
}
}

20 changes: 10 additions & 10 deletions helm/values.yaml
@@ -51,13 +51,13 @@ servingEngineSpec:
#
# requestCPU: 10
# requestMemory: "64Gi"
- # requestGPU: 1
+ # requestGPU: 1
#
# pvcStorage: "50Gi"
- # pvcMatchLabels:
+ # pvcMatchLabels:
# model: "mistral"
#
- # vllmConfig:
+ # vllmConfig:
# enableChunkedPrefill: false
# enablePrefixCaching: false
# maxModelLen: 16384
@@ -80,14 +80,14 @@ servingEngineSpec:
# - "NVIDIA-RTX-A6000"
modelSpec: []

- # -- Container port
+ # -- Container port
containerPort: 8000
- # -- Service port
+ # -- Service port
servicePort: 80

# -- Set other environment variables from config map
configs: {}

# -- Readiness probe configuration
startupProbe:
# -- Number of seconds after the container has started before startup probe is initiated
@@ -102,7 +102,7 @@ servingEngineSpec:
path: /health
# -- Name or number of the port to access on the container, on which the server is listening
port: 8000

# -- Liveness probe configuration
livenessProbe:
# -- Number of seconds after the container has started before liveness probe is initiated
@@ -117,7 +117,7 @@ servingEngineSpec:
path: /health
# -- Name or number of the port to access on the container, on which the server is listening
port: 8000

# -- Disruption Budget Configuration
maxUnavailablePodDisruptionBudget: ""

@@ -135,7 +135,7 @@ servingEngineSpec:
routerSpec:
# -- Number of replicas
replicaCount: 1

# -- Container port
containerPort: 8000

2 changes: 1 addition & 1 deletion observability/README.md
@@ -4,7 +4,7 @@

## Deploy the observability stack

- The observability stack is based on [kube-prom-stack](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md).
+ The observability stack is based on [kube-prom-stack](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/README.md).

To launch the observability stack:

1 change: 0 additions & 1 deletion observability/upgrade.sh
@@ -1,4 +1,3 @@
helm upgrade kube-prom-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
-f "values.yaml"
