[CI] Add dispatch job to leverage jobs to dynamic devices #251

Draft: wants to merge 1 commit into base: main
31 changes: 28 additions & 3 deletions .github/workflows/vllm_ascend_test.yaml
@@ -47,9 +47,30 @@ defaults:
     shell: bash -el {0}

 jobs:
+  dispatch:
+    name: vLLM Ascend test (dispatch)
+    runs-on: ascend-03-arm64
+    outputs:
+      number: ${{ steps.dispatch-device.outputs.number }}
+    steps:
+      - name: vLLM Ascend test (dispatch)
+        id: dispatch-device
+        run: |
+          # Try to acquire lock to dispatch a device
+          lockfile /tmp/dispatch.lock
+
+          # Print npu info
+          npu-list /dev/null 2>&1
+
+          # Select first available device (exclude davinci1 and davinci0)
+          NUMBER=$(npu-list /dev/null 2>&1 | grep None | grep -v davinci1 | grep -v davinci0 |head -1 | cut -b 15)
+          echo "Dispatch to /dev/davinci$NUMBER"
+          echo "number=$NUMBER" >> $GITHUB_OUTPUT
+
   test:
+    needs: [dispatch]
     name: vLLM Ascend test (self-host)
-    runs-on: ascend-arm64  # actionlint-ignore: runner-label
+    runs-on: ascend-03-arm64  # actionlint-ignore: runner-label

     container:
       image: quay.io/ascend/cann:8.0.0-910b-ubuntu22.04-py3.10
@@ -58,9 +79,11 @@ jobs:
         - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
         - /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/
         # Use self-host cache speed up pip and model download
-        - /home/action/actions-runner/_work/cache:/github/home/.cache/
+        - /home/action/cache:/github/home/.cache/
+        # for dispatch lock
+        - /tmp/:/tmp/
       options: >-
-        --device /dev/davinci6
+        --device /dev/davinci${{ needs.dispatch.outputs.number }}
         --device /dev/davinci_manager
         --device /dev/devmm_svm
         --device /dev/hisi_hdc
@@ -71,6 +94,8 @@ jobs:
         run: |
           npu-smi info
           cat /usr/local/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
+          # unlock
+          rm -rf /tmp/dispatch.lock

       - name: Config mirrors
         run: |
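
Note on the dispatch logic: the workflow changes above take a lock file, pick the first idle device that is not reserved, and release the lock once the test container has started. The Python sketch below restates that flow so the selection rule is easier to check in isolation. It is not code from this PR. `acquire_lock`, `release_lock`, `pick_free_device`, and the `SAMPLE` listing are hypothetical names, and the listing format (one `/dev/davinciN` entry per line, with `None` marking an idle device) is an assumption, since the real `npu-list` output is not shown here.

```python
import os
import re
import time


def acquire_lock(path, retries=5, delay=0.1):
    """Try to create `path` atomically; return True if this caller owns it."""
    for _ in range(retries):
        try:
            # O_CREAT | O_EXCL fails when the file already exists,
            # so at most one caller can succeed at a time.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(delay)
            continue
        os.close(fd)
        return True
    return False


def release_lock(path):
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def pick_free_device(listing, reserved=(0, 1)):
    """Return the index of the first idle, non-reserved device, or None."""
    for line in listing.splitlines():
        if "None" not in line:
            continue  # assumed convention: "None" marks a device with no owner
        match = re.search(r"davinci(\d+)", line)
        if match and int(match.group(1)) not in reserved:
            return int(match.group(1))
    return None


# Hypothetical listing, used only to exercise the selection rule.
SAMPLE = """\
/dev/davinci0  None
/dev/davinci1  None
/dev/davinci2  1234
/dev/davinci3  None
"""

if __name__ == "__main__":
    print(pick_free_device(SAMPLE))
```

Two differences from the shell pipeline are deliberate. The regular expression reads the whole device index, whereas `cut -b 15` keeps a single byte and therefore only a one-digit index. The function also returns `None` when nothing is free, a case the workflow would have to handle before building the `--device` option.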
1 change: 1 addition & 0 deletions tests/test_offline_inference.py
@@ -32,6 +32,7 @@
     "Qwen/Qwen2.5-0.5B-Instruct",
 ]
 os.environ["VLLM_USE_MODELSCOPE"] = "True"
+os.environ["PYTORCH_NPU_ALLOC_CONF"] = "max_split_size_mb:256"

 TARGET_TEST_SUITE = os.environ.get("TARGET_TEST_SUITE", "L4")
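
Note on the allocator setting: the value added above is a single `option:value` pair. The sketch below shows how such a string breaks down, assuming `PYTORCH_NPU_ALLOC_CONF` uses the same comma-separated `option:value` layout as PyTorch's `PYTORCH_CUDA_ALLOC_CONF`. `parse_alloc_conf` is a hypothetical helper written for illustration; it is not part of `torch_npu` or of this PR.

```python
def parse_alloc_conf(value):
    """Split an 'option:value,option:value' string into a dict of strings."""
    options = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, val = item.partition(":")
        if not sep:
            raise ValueError(f"expected 'option:value', got {item!r}")
        options[key.strip()] = val.strip()
    return options


if __name__ == "__main__":
    print(parse_alloc_conf("max_split_size_mb:256"))
```

A helper like this could let a test assert on the configured split size rather than compare raw strings, if that check is ever needed.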
