Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: standard fastapi project structure for better main… #217

Merged
merged 17 commits into from
Mar 4, 2025

Conversation

BrianPark314
Copy link
Contributor

…tainability

FILL IN THE PR DESCRIPTION HERE

This is a suggestion for restructuring the project into more maintainable shape. Changes include:

apply mvc pattern
isolate the entry point
separate routers
removed duplicated code (in batch router)
fix typo
Hope this gets reviewed before 0.1.0 release!

BEFORE SUBMITTING, PLEASE READ THE CHECKLIST BELOW AND FILL IN THE DESCRIPTION ABOVE


  • Make sure the code changes pass the pre-commit checks.
  • Sign-off your commit by using -s when doing git commit
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].
Detailed Checklist (Click to Expand)

Thank you for your contribution to production-stack! Before submitting the pull request, please ensure the PR meets the following criteria. This helps us maintain the code quality and improve the efficiency of the review process.

PR Title and Classification

Please try to classify PRs for easy understanding of the type of changes. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:

  • [Bugfix] for bug fixes.
  • [CI/Build] for build or continuous integration improvements.
  • [Doc] for documentation fixes and improvements.
  • [Feat] for new features in the cluster (e.g., autoscaling, disaggregated prefill, etc.).
  • [Router] for changes to the vllm_router (e.g., routing algorithm, router observability, etc.).
  • [Misc] for PRs that do not fit the above categories. Please use this sparingly.

Note: If the PR spans more than one category, please include all relevant prefixes.

Code Quality

The PR need to meet the following code quality standards:

  • Pass all linter checks. Please use pre-commit to format your code. See README.md for installation.
  • The code need to be well-documented to ensure future contributors can easily understand the code.
  • Please include sufficient tests to ensure the change is stay correct and robust. This includes both unit tests and integration tests.

DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a Signed-off-by: header which certifies agreement with the terms of the DCO.

Using -s with git commit will automatically add this header.

What to Expect for the Reviews

We aim to address all PRs in a timely manner. If no one reviews your PR within 5 days, please @-mention one of YuhanLiu11
, Shaoting-Feng or ApostaC.

Copy link
Collaborator

@gaocegege gaocegege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this PR. Thanks for your contribution!

@BrianPark314 BrianPark314 changed the title refactor: maintain standard fastapi project structure for better main… refactor: standard fastapi project structure for better main… Mar 3, 2025
BrianPark314 added 3 commits March 3, 2025 23:11
# Conflicts:
#	src/vllm_router/router.py
Signed-off-by: BrianPark314 <[email protected]>
Signed-off-by: BrianPark314 <[email protected]>
@BrianPark314
Copy link
Contributor Author

if this is going to be merged, it should preceed any other prs at the moment.

BrianPark314 added 3 commits March 4, 2025 00:17
Signed-off-by: BrianPark314 <[email protected]>
@Shaoting-Feng
Copy link
Collaborator

I checked the router's logs in GitHub Actions, and it shows the following error. Could you check if this is related to your code changes? Thanks.

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
    raise exc
  File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 73, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 301, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/vllm_router/routers/main_router.py", line 36, in route_completion
    return await route_general_request(request, "/v1/completions")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/vllm_router/services/request_service/request.py", line 150, in route_general_request
    headers, status_code = await anext(stream_generator)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/vllm_router/services/request_service/request.py", line 58, in process_request
    async with client.stream(
               ^^^^^^^^^^^^^
AttributeError: 'HTTPXClientWrapper' object has no attribute 'stream'

@ApostaC
Copy link
Collaborator

ApostaC commented Mar 3, 2025

Hey @BrianPark314 , really appreciate the huge effort!
This PR looks too large and has too many changes. Right now there are also multiple efforts working on different router features, which might create many conflicts.
Can we first have a proposal about how are you going to refactor the components are what are the main interfaces between the components (probably with some figures)? You can first put it under the proposals/ folder so that all the folks working on the router can be aware of that.

@ApostaC
Copy link
Collaborator

ApostaC commented Mar 3, 2025

Totally understand it takes non-trivial time to create this PR. Many thanks ❤️!

Copy link
Collaborator

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I take back my comments before. This refactoring looks clean and makes router.py much better than before! I really like this!

Just please fix the issue mentioned by @Shaoting-Feng

@ApostaC
Copy link
Collaborator

ApostaC commented Mar 3, 2025

Since it introduces a lot of file changes, let's try to quickly merge it and unblock the other potential PRs?

@BrianPark314
Copy link
Contributor Author

@Shaoting-Feng @ApostaC I will try to resolve the issues asap.

BrianPark314 added 4 commits March 4, 2025 11:23
Signed-off-by: BrianPark314 <[email protected]>
# Conflicts:
#	src/vllm_router/router.py
Signed-off-by: BrianPark314 <[email protected]>
@KuntaiDu
Copy link
Collaborator

KuntaiDu commented Mar 4, 2025

Looking forward to this PR being merged! I'll propose the prefix-aware routing after this.

BrianPark314 added 6 commits March 4, 2025 12:34
Signed-off-by: BrianPark314 <[email protected]>
Signed-off-by: BrianPark314 <[email protected]>
Signed-off-by: BrianPark314 <[email protected]>
Signed-off-by: BrianPark314 <[email protected]>
Signed-off-by: BrianPark314 <[email protected]>
@Shaoting-Feng
Copy link
Collaborator

Thanks for your fix! The CI tests result LGTM.

Copy link
Collaborator

@Shaoting-Feng Shaoting-Feng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

Copy link
Collaborator

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ApostaC ApostaC merged commit 95efecc into vllm-project:main Mar 4, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants