[V1][Core] Generic mechanism for handling engine utility methods #13060

njhill · 2025-02-10T23:54:47Z

We now have a number of utility / "control" operations that need to be called on the engine (add_lora, profile, sleep, wakeup, ...). It should be possible to call these synchronously, so the caller knows when the operation is complete and whether it succeeded. Some operations may require results to be returned in the future.

These changes centralize the mechanism for doing this in V1, and add the result-returning part.
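The request/response pattern described above can be illustrated with a minimal sketch. It is not the code from this PR: in-process queues stand in for the sockets between client and engine, and ToyEngine, ToyClient, engine_loop and call_utility are hypothetical names chosen for illustration. It only shows the idea of tagging each call with an id, running the named method on the engine side, and returning either the result or the failure to the caller.

```python
import itertools
import queue
import threading


class ToyEngine:
    """Hypothetical engine exposing two utility methods."""

    def add(self, a, b):
        return a + b

    def fail(self):
        raise RuntimeError("boom")


def engine_loop(engine, requests, results):
    """Run utility calls by name and report the result or the error."""
    while True:
        msg = requests.get()
        if msg is None:  # sentinel: shut down
            return
        call_id, method, args = msg
        try:
            results.put((call_id, getattr(engine, method)(*args), None))
        except Exception as exc:
            results.put((call_id, None, str(exc)))


class ToyClient:
    def __init__(self):
        # Queues stand in for the client/engine sockets (an assumption).
        self.requests = queue.Queue()
        self.results = queue.Queue()
        self._ids = itertools.count()
        threading.Thread(target=engine_loop,
                         args=(ToyEngine(), self.requests, self.results),
                         daemon=True).start()

    def call_utility(self, method, *args):
        """Block until the engine has run `method`, then return its result."""
        call_id = next(self._ids)
        self.requests.put((call_id, method, args))
        reply_id, result, error = self.results.get()
        assert reply_id == call_id
        if error is not None:
            raise RuntimeError(error)
        return result


if __name__ == "__main__":
    client = ToyClient()
    print(client.call_utility("add", 2, 3))  # prints 5
```

Because every utility call goes through one generic entry point, adding a new control operation in this sketch only requires a new method on the engine side.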

github-actions · 2025-02-10T23:54:59Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

mergify · 2025-02-12T08:43:18Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @njhill.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-02-14T06:34:35Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @njhill.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

njhill · 2025-02-15T00:13:07Z

@youkaichao this should be ready to go 🤞

youkaichao · 2025-02-15T07:58:27Z

@robertgshaw2-redhat can you help review this? I'm not familiar with this part of the code, but I will need it in #12987 🥺

mergify · 2025-02-15T12:00:50Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @njhill.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

afeldman-nm · 2025-02-18T00:58:36Z

vllm/v1/engine/core.py

+            msgspec.convert(v, type=p.annotation) if isclass(p.annotation)
+            and issubclass(p.annotation, msgspec.Struct)
+            and not isinstance(v, p.annotation) else v


Aesthetically, a helper function might clean up this code, e.g.:

return tuple(msgspec.convert(v, type=p.annotation) if needs_conversion(v, p) else v for v, p in zip(args, arg_types))

However, this is the engine core, so perhaps multiple helper-function calls would be too costly.
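For illustration, the suggested helper could look roughly like the sketch below. To keep it runnable without third-party packages, a small Struct base class and a convert function stand in for msgspec.Struct and msgspec.convert; they, along with needs_conversion, convert_args, DemoRequest and add_lora_demo, are assumptions made for this example and not code from the PR.

```python
from dataclasses import dataclass
from inspect import Parameter, isclass, signature
from typing import Any


class Struct:
    """Stand-in for msgspec.Struct (assumption, to avoid the dependency)."""


def convert(value: dict, type: type) -> Any:
    """Stand-in for msgspec.convert: build `type` from a plain dict."""
    return type(**value)


def needs_conversion(value: Any, param: Parameter) -> bool:
    """True if the parameter expects a Struct subclass but got something else."""
    ann = param.annotation
    return (isclass(ann) and issubclass(ann, Struct)
            and not isinstance(value, ann))


def convert_args(method, args: tuple) -> tuple:
    """Convert decoded arguments to the types the method's signature declares."""
    params = signature(method).parameters.values()
    return tuple(
        convert(v, type=p.annotation) if needs_conversion(v, p) else v
        for v, p in zip(args, params))


@dataclass
class DemoRequest(Struct):
    """Hypothetical request type used only in this example."""
    name: str
    rank: int


def add_lora_demo(request: DemoRequest, verbose: bool) -> None:
    """Hypothetical utility method; only its signature matters here."""


if __name__ == "__main__":
    raw_args = ({"name": "adapter", "rank": 8}, True)
    print(convert_args(add_lora_demo, raw_args))
```

The helper adds one extra function call per argument, which is the overhead the comment above is weighing against readability.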

afeldman-nm · 2025-02-18T01:02:57Z

vllm/v1/engine/core_client.py

+        # Ensure that the outputs socket processing thread does not have
+        # a ref to the client which prevents gc.
+        output_socket = self.output_socket
+        decoder = self.decoder
+        utility_results = self.utility_results
+        outputs_queue = self.outputs_queue

-        (frame, ) = self.output_socket.recv_multipart(copy=False)
-        return self.decoder.decode(frame.buffer)
+        def process_outputs_socket():
+            while True:
+                (frame, ) = output_socket.recv_multipart(copy=False)
+                outputs = decoder.decode(frame.buffer)
+                if outputs.utility_output:
+                    _process_utility_output(outputs.utility_output,
+                                            utility_results)
+                else:
+                    outputs_queue.put_nowait(outputs)
+
+        # Process outputs from engine in separate thread.
+        Thread(target=process_outputs_socket, daemon=True).start()


Not a strong opinion, but this code to launch the engine output processing thread could go in a separate helper function.
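A sketch of what such a helper might look like, under simplifying assumptions: a queue.Queue stands in for the output socket, items are (call_id, payload) tuples instead of decoded engine outputs, and start_output_thread and DemoClient are hypothetical names. The point it illustrates is the one in the diff comment: the thread's target closes over only the objects it needs, never the client itself, so the client can still be garbage collected.

```python
import queue
import threading


def start_output_thread(source, outputs_queue, utility_results):
    """Start a daemon thread that routes items from `source`.

    The closure references only the three arguments, not the client,
    so the running thread does not keep the client alive.
    """

    def process_outputs():
        while True:
            item = source.get()
            if item is None:  # sentinel: stop the thread
                return
            call_id, payload = item
            if call_id is not None:
                # Result of a utility call: store it under its call id.
                utility_results[call_id] = payload
            else:
                # Regular engine output: hand it to the consumer queue.
                outputs_queue.put_nowait(payload)

    thread = threading.Thread(target=process_outputs, daemon=True)
    thread.start()
    return thread


class DemoClient:
    """Hypothetical client that owns the queues and the thread."""

    def __init__(self):
        self.source = queue.Queue()  # stand-in for the output socket
        self.outputs_queue = queue.Queue()
        self.utility_results = {}
        self.thread = start_output_thread(
            self.source, self.outputs_queue, self.utility_results)
```

Passing the dependencies as arguments has the same effect as copying them into local variables before defining the closure, as the diff does inline.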

afeldman-nm · 2025-02-18T01:04:24Z

vllm/v1/engine/core_client.py

@@ -236,6 +275,17 @@ def _send_input(self, request_type: EngineCoreRequestType,
        msg = (request_type.value, self.encoder.encode(request))
        self.input_socket.send_multipart(msg, copy=False)

+    def _call_utility(self, method: str, *args, unary: bool = False) -> Any:
+        call_id = uuid.uuid1().int >> 64


I think you use call_id = uuid.uuid1().int >> 64 in at least two places; perhaps it makes sense to have a new_call_id() helper function?
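A minimal sketch of the suggested helper, using the name proposed in the comment. Whether the PR ended up adding such a function is not stated in this thread.

```python
import uuid


def new_call_id() -> int:
    """Return a 64-bit id for matching a utility call to its result.

    uuid1() yields a 128-bit value; shifting right by 64 keeps the upper
    half (the timestamp and version fields) and drops the clock sequence
    and node fields.
    """
    return uuid.uuid1().int >> 64
```

Both call sites shown in the diff could then use call_id = new_call_id().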

afeldman-nm · 2025-02-18T01:06:21Z

vllm/v1/engine/core_client.py

+                                  method: str,
+                                  *args,
+                                  unary: bool = False) -> Any:
+        call_id = uuid.uuid1().int >> 64


Second occurrence which could be replaced with a helper function.

afeldman-nm

Hi Nick, I left a little bit of feedback. Thanks for the PR!

njhill · 2025-02-18T04:22:17Z

Thanks for the great comments @afeldman-nm. I've addressed some of them. Re helper functions it's a bit subjective but I personally lean towards being more selective; the indirection can make the code less readable, especially when the logic itself is trivial.

youkaichao

Thanks for the great work!

njhill requested a review from youkaichao February 10, 2025 23:54

mergify bot added the v1 label Feb 10, 2025

This was referenced Feb 11, 2025

[V1] LoRA - Enable Serving Usecase #12883

Merged

WIP: [Core] Expose sleep and wake_up api to the Client for Model State Management #13016

Closed

mergify bot added the needs-rebase label Feb 12, 2025

njhill added 2 commits February 12, 2025 17:29

add tests and fixes

d018da2

Signed-off-by: Nick Hill <[email protected]>

Merge remote-tracking branch 'origin/main' into v1-utility-funcs

f59db2e

Signed-off-by: Nick Hill <[email protected]> # Conflicts: # vllm/v1/engine/__init__.py # vllm/v1/engine/async_llm.py

njhill marked this pull request as ready for review February 13, 2025 01:40

njhill requested review from WoosukKwon, robertgshaw2-redhat, ywang96, comaniac and alexm-redhat as code owners February 13, 2025 01:40

mergify bot removed the needs-rebase label Feb 13, 2025

fix GC issue

3bba546

Signed-off-by: Nick Hill <[email protected]>

njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 13, 2025

njhill added 3 commits February 13, 2025 08:47

Merge remote-tracking branch 'origin/main' into v1-utility-funcs

fe6f1a3

# Conflicts: # vllm/v1/engine/core.py

fixes

370a880

Signed-off-by: Nick Hill <[email protected]>

Merge remote-tracking branch 'origin/main' into v1-utility-funcs

81c9cca

mergify bot added the needs-rebase label Feb 14, 2025

njhill added 2 commits February 13, 2025 23:19

Merge remote-tracking branch 'refs/remotes/origin/main' into v1-utili…

8fef5a4

…ty-funcs # Conflicts: # vllm/v1/engine/__init__.py # vllm/v1/engine/core.py # vllm/v1/engine/core_client.py

more fixes

bb5a3f5

Signed-off-by: Nick Hill <[email protected]>

mergify bot removed the needs-rebase label Feb 14, 2025

clean up comment

cf33034

Signed-off-by: Nick Hill <[email protected]>

mergify bot added the needs-rebase label Feb 15, 2025

Merge remote-tracking branch 'origin/main' into v1-utility-funcs

e1180a2

# Conflicts: # vllm/v1/engine/core.py

mergify bot removed the needs-rebase label Feb 17, 2025

afeldman-nm suggested changes Feb 18, 2025

njhill added 2 commits February 17, 2025 20:15

Address @afeldman-nm's comments

8b0dbe9

Signed-off-by: Nick Hill <[email protected]>

Merge remote-tracking branch 'origin/main' into v1-utility-funcs

ebd8d0c

youkaichao approved these changes Feb 19, 2025

youkaichao merged commit caf7ff4 into vllm-project:main Feb 19, 2025
47 checks passed

njhill deleted the v1-utility-funcs branch February 19, 2025 15:40

njhill mentioned this pull request Feb 19, 2025

[BugFix] Avoid error traceback in logs when V1 LLM terminates #13565

Merged

xjpang pushed a commit to xjpang/vllm that referenced this pull request Feb 20, 2025

[V1][Core] Generic mechanism for handling engine utility (vllm-projec…

a778a41

…t#13060) Signed-off-by: Nick Hill <[email protected]>

kerthcet pushed a commit to kerthcet/vllm that referenced this pull request Feb 21, 2025

[V1][Core] Generic mechanism for handling engine utility (vllm-projec…

1f01fee

…t#13060) Signed-off-by: Nick Hill <[email protected]>

varun-sundar-rabindranath mentioned this pull request Feb 22, 2025

[Core] LoRA V1 - Add add/pin/list/remove_lora functions #13705

Merged

Akshat-Tripathi pushed a commit to krai/vllm that referenced this pull request Mar 3, 2025

[V1][Core] Generic mechanism for handling engine utility (vllm-projec…

9288c58

…t#13060) Signed-off-by: Nick Hill <[email protected]>

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Mar 5, 2025

[V1][Core] Generic mechanism for handling engine utility (vllm-projec…

f43febf

…t#13060) Signed-off-by: Nick Hill <[email protected]> Signed-off-by: Linkun Chen <[email protected]>
