Commit
Signed-off-by: Siddhant Ray <[email protected]>
1 parent
f29edf2
commit cb5ebb2
Showing 40 changed files with 475 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = source | ||
BUILDDIR = build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
@ECHO OFF | ||
|
||
pushd %~dp0 | ||
|
||
REM Command file for Sphinx documentation | ||
|
||
if "%SPHINXBUILD%" == "" ( | ||
set SPHINXBUILD=sphinx-build | ||
) | ||
set SOURCEDIR=source | ||
set BUILDDIR=build | ||
|
||
%SPHINXBUILD% >NUL 2>NUL | ||
if errorlevel 9009 ( | ||
echo. | ||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx | ||
echo.installed, then set the SPHINXBUILD environment variable to point | ||
echo.to the full path of the 'sphinx-build' executable. Alternatively you | ||
echo.may add the Sphinx directory to PATH. | ||
echo. | ||
echo.If you don't have Sphinx installed, grab it from | ||
echo.https://www.sphinx-doc.org/ | ||
exit /b 1 | ||
) | ||
|
||
if "%1" == "" goto help | ||
|
||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
goto end | ||
|
||
:help | ||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
|
||
:end | ||
popd |
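Both the Makefile catch-all target and `make.bat` forward the requested target to `sphinx-build -M <target> <sourcedir> <builddir>`. The sketch below shows the same dispatch in Python. It is only an illustration, not part of this commit: the function names `sphinx_make_mode_command` and `run_sphinx` are hypothetical, and the defaults mirror the `source` and `build` directories set above.

```python
import shutil
import subprocess
from typing import List, Sequence


def sphinx_make_mode_command(
    target: str,
    source_dir: str = "source",
    build_dir: str = "build",
    sphinx_build: str = "sphinx-build",
    extra_opts: Sequence[str] = (),
) -> List[str]:
    """Assemble the argv that the catch-all target runs for `target`."""
    return [sphinx_build, "-M", target, source_dir, build_dir, *extra_opts]


def run_sphinx(target: str = "help") -> int:
    """Run sphinx-build in make mode; return 1 if the executable is missing."""
    cmd = sphinx_make_mode_command(target)
    if shutil.which(cmd[0]) is None:
        # Same situation make.bat reports when sphinx-build is not on PATH.
        print("The 'sphinx-build' command was not found.")
        return 1
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_sphinx("html")` from the docs directory would correspond to `make html`; with no argument it falls back to `help`, as both scripts do.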
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
sphinx==8.0.2 | ||
sphinx-autodoc-typehints==2.4.1 | ||
sphinx-book-theme==1.1.3 | ||
sphinx-click==6.0.0 | ||
sphinx-copybutton==0.5.2 | ||
sphinx-togglebutton==0.3.2 | ||
sphinx_design==0.6.1 | ||
sphinxemoji==0.3.1 |
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _multiround-qa: | ||
|
||
Multi-round QA | ||
============== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# For the full list of built-in configuration values, see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Project information ----------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information | ||
|
||
import os | ||
import sys | ||
from dataclasses import asdict | ||
|
||
from sphinx.ext import autodoc | ||
|
||
sys.path.insert(0, os.path.abspath("../../src")) | ||
|
||
project = "production-stack" | ||
copyright = "2025, vLLM Production Stack Team" | ||
author = "vLLM Production Stack Team" | ||
|
||
extensions = [ | ||
"sphinx.ext.napoleon", | ||
"sphinx.ext.linkcode", | ||
"sphinx.ext.intersphinx", | ||
"sphinx_copybutton", | ||
"sphinx.ext.autodoc", | ||
"sphinx.ext.autosummary", | ||
"myst_parser", | ||
"sphinxarg.ext", | ||
"sphinx_design", | ||
"sphinx_togglebutton", | ||
] | ||
|
||
# -- General configuration --------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration | ||
|
||
extensions = [] | ||
|
||
templates_path = ["_templates"] | ||
exclude_patterns = [] | ||
|
||
|
||
class MockedClassDocumenter(autodoc.ClassDocumenter): | ||
    """Remove note about base class when a class is | ||
    derived from object.""" | ||
|
||
    def add_line(self, line: str, source: str, *lineno: int) -> None: | ||
        if line == " Bases: :py:class:`object`": | ||
            return | ||
        super().add_line(line, source, *lineno) | ||
|
||
|
||
autodoc.ClassDocumenter = MockedClassDocumenter | ||
|
||
# autodoc_default_options = { | ||
# "members": True, | ||
# "undoc-members": True, | ||
# "private-members": True | ||
# } | ||
|
||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output | ||
|
||
html_title = project | ||
html_theme = "sphinx_book_theme" | ||
html_static_path = ["_static"] | ||
html_logo = "./assets/prodstack_icon.png" | ||
html_favicon = "./assets/output.ico" | ||
html_permalinks_icon = "<span>#</span>" | ||
# pygments_style = "sphinx" | ||
# pygments_style_dark = "fruity" | ||
html_theme_options = { | ||
"path_to_docs": "docs/source", | ||
"repository_url": "https://github.com/vllm-project/production-stack", | ||
"use_repository_button": True, | ||
"use_edit_page_button": True, | ||
# navigation and sidebar | ||
"show_toc_level": 2, | ||
"announcement": None, | ||
"secondary_sidebar_items": [ | ||
"page-toc", | ||
], | ||
"navigation_depth": 3, | ||
"primary_sidebar_end": [], | ||
"pygments_light_style": "tango", | ||
"pygments_dark_style": "monokai", | ||
} | ||
|
||
intersphinx_mapping = { | ||
"python": ("https://docs.python.org/3", None), | ||
"typing_extensions": ("https://typing-extensions.readthedocs.io/en/latest", None), | ||
"numpy": ("https://numpy.org/doc/stable", None), | ||
"torch": ("https://pytorch.org/docs/stable", None), | ||
"psutil": ("https://psutil.readthedocs.io/en/stable", None), | ||
} |
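The first `extensions` list names `sphinx.ext.linkcode`, which expects `conf.py` to define a `linkcode_resolve(domain, info)` function; no such function appears in this file. As written, the later `extensions = []` assignment replaces that list, so this only matters once the extension is actually enabled. A minimal sketch of such a resolver follows. It assumes the `main` branch and the `src` layout implied by the `sys.path.insert` call above, and it maps every module to a `.py` file, which is a simplification (packages would need `__init__.py`). The constant names are hypothetical.

```python
from typing import Optional

# Assumptions for illustration: repository URL taken from html_theme_options,
# branch name and source root inferred from the rest of this commit.
REPOSITORY_URL = "https://github.com/vllm-project/production-stack"
BRANCH = "main"
SOURCE_ROOT = "src"


def linkcode_resolve(domain: str, info: dict) -> Optional[str]:
    """Return a source URL for a documented Python object, or None."""
    if domain != "py" or not info.get("module"):
        return None
    # Simplification: treat every module as a plain .py file.
    module_path = info["module"].replace(".", "/")
    return f"{REPOSITORY_URL}/blob/{BRANCH}/{SOURCE_ROOT}/{module_path}.py"
```

Returning `None` tells Sphinx to omit the source link for that object, which keeps the build working for non-Python domains.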
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _aws: | ||
|
||
AWS | ||
=== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _gcp: | ||
|
||
Google Cloud Platform | ||
===================== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
.. _index: | ||
|
||
|
||
Cloud Environments | ||
============================================== | ||
|
||
📈 Easily deploy the stack on AWS, GCP, or any other cloud provider | ||
|
||
.. toctree:: | ||
   :maxdepth: 1 | ||
   :caption: Deployment | ||
|
||
   aws.rst | ||
   gcp.rst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _helm_charts: | ||
|
||
Helm Charts | ||
======================================= |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _ray_deploy: | ||
|
||
Ray Deployment | ||
======================================= |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _engine-stats: | ||
|
||
Engine Stats | ||
============ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
.. _dev_api_index: | ||
|
||
Developer API | ||
================ | ||
|
||
.. toctree:: | ||
   :maxdepth: 1 | ||
   :caption: Developer Guide | ||
|
||
   router-logic.rst | ||
   engine-stats.rst | ||
   service-discovery.rst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _router-logic: | ||
|
||
Router Logic | ||
============ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _service-discovery: | ||
|
||
Service Discovery | ||
================= |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
.. _peripheral_index: | ||
|
||
Peripheral | ||
================ | ||
|
||
.. toctree:: | ||
   :maxdepth: 1 | ||
   :caption: Developer Guide | ||
|
||
   models.rst | ||
   interfaces.rst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _dev_interfaces: | ||
|
||
Interfaces | ||
================ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _models: | ||
|
||
Models | ||
====== |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
.. _examples: | ||
|
||
Minimal Example | ||
=============== | ||
|
||
Add simple tutorial here. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
.. _installation: | ||
|
||
.. role:: raw-html(raw) | ||
   :format: html | ||
|
||
Installation | ||
============ | ||
|
||
Architecture | ||
------------ | ||
|
||
.. figure:: ../assets/prodarch.png | ||
   :width: 60% | ||
   :align: center | ||
   :alt: production-arch | ||
   :class: no-scaled-link | ||
|
||
|
||
The stack is set up using Helm, and contains the following key parts: | ||
|
||
|
||
* **Serving engine**: The vLLM engines that run different LLMs. | ||
* **Request router**: Directs requests to appropriate backends based on routing keys or session IDs to maximize KV cache reuse. | ||
* **Observability stack**: Monitors backend metrics through `Prometheus <https://prometheus.io/>`_ and `Grafana <https://grafana.com/>`_. | ||
|
||
|
||
Prerequisites | ||
------------- | ||
|
||
- A running Kubernetes (K8s) environment with GPUs | ||
- Run ``cd utils && bash install-minikube-cluster.sh`` | ||
- Or follow our `tutorial <https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md>`_ | ||
|
||
|
||
Deployment | ||
---------- | ||
|
||
The vLLM Production Stack can be deployed via Helm charts. Clone the repository locally and run the following commands for a minimal deployment: | ||
|
||
.. code:: bash | ||
|
||
   git clone https://github.com/vllm-project/production-stack.git | ||
   cd production-stack/ | ||
   helm repo add vllm https://vllm-project.github.io/production-stack | ||
   helm install vllm vllm/vllm-stack -f tutorials/assets/values-01-minimal-example.yaml | ||
|
||
The deployed stack provides the same `OpenAI API interface <https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html?ref=blog.mozilla.ai#openai-compatible-server>`_ as vLLM and can be accessed through a Kubernetes service. | ||
|
||
To validate the installation and send a query to the stack, refer to this `example <https://github.com/vllm-project/production-stack/blob/main/tutorials/01-minimal-helm-installation.md>`_. | ||
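Because the stack exposes the OpenAI-compatible API, a query is an ordinary HTTP POST to `/v1/completions`. The following is a minimal sketch using only the Python standard library. It assumes the router service has already been made reachable on a local port (for example with `kubectl port-forward`); the base URL `http://localhost:30080`, the model name, and the helper names are illustrative assumptions, not values defined on this page.

```python
import json
import urllib.request


def build_completion_request(
    base_url: str, model: str, prompt: str, max_tokens: int = 16
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (hypothetical URL and model; requires a reachable deployment):
# req = build_completion_request("http://localhost:30080", "facebook/opt-125m", "Once upon a time,")
# print(send(req))
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any traffic reaches the cluster.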
|
||
Uninstallation | ||
-------------- | ||
|
||
To uninstall the stack, run: | ||
|
||
.. code:: bash | ||
|
||
   sudo helm uninstall vllm |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.. _troubleshooting: | ||
|
||
Troubleshooting | ||
=========================== |