From 27d82096fda25506d9306642a79e274434b922af Mon Sep 17 00:00:00 2001
From: Shaoting
Date: Wed, 5 Mar 2025 14:34:18 -0600
Subject: [PATCH] Update endpoint in 01 tutorial

Signed-off-by: Shaoting
---
 tutorials/01-minimal-helm-installation.md | 30 +++++++++++++----------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/tutorials/01-minimal-helm-installation.md b/tutorials/01-minimal-helm-installation.md
index f2b93300..457c673c 100644
--- a/tutorials/01-minimal-helm-installation.md
+++ b/tutorials/01-minimal-helm-installation.md
@@ -6,17 +6,21 @@ This tutorial guides you through a minimal setup of the vLLM Production Stack us
 
 ## Table of Contents
 
-- [Introduction](#introduction)
-- [Table of Contents](#table-of-contents)
-- [Prerequisites](#prerequisites)
-- [Steps](#steps)
-  - [1. Deploy vLLM Instance](#1-deploy-vllm-instance)
-  - [2. Validate Installation](#2-validate-installation)
-  - [3. Send a Query to the Stack](#3-send-a-query-to-the-stack)
-    - [3.1. Forward the Service Port](#31-forward-the-service-port)
-    - [3.2. Query the OpenAI-Compatible API to list the available models](#32-query-the-openai-compatible-api-to-list-the-available-models)
-    - [3.3. Query the OpenAI Completion Endpoint](#33-query-the-openai-completion-endpoint)
-  - [4. Uninstall](#4-uninstall)
+- [Tutorial: Minimal Setup of the vLLM Production Stack](#tutorial-minimal-setup-of-the-vllm-production-stack)
+  - [Introduction](#introduction)
+  - [Table of Contents](#table-of-contents)
+  - [Prerequisites](#prerequisites)
+  - [Steps](#steps)
+    - [1. Deploy vLLM Instance](#1-deploy-vllm-instance)
+      - [1.1: Use Predefined Configuration](#11-use-predefined-configuration)
+      - [1.2: Deploy the Helm Chart](#12-deploy-the-helm-chart)
+    - [2. Validate Installation](#2-validate-installation)
+      - [2.1: Monitor Deployment Status](#21-monitor-deployment-status)
+    - [3. Send a Query to the Stack](#3-send-a-query-to-the-stack)
+      - [3.1: Forward the Service Port](#31-forward-the-service-port)
+      - [3.2: Query the OpenAI-Compatible API to list the available models](#32-query-the-openai-compatible-api-to-list-the-available-models)
+      - [3.3: Query the OpenAI Completion Endpoint](#33-query-the-openai-completion-endpoint)
+    - [4. Uninstall](#4-uninstall)
 
 ## Prerequisites
 
@@ -116,7 +120,7 @@ sudo kubectl port-forward svc/vllm-router-service 30080:80
 Test the stack's OpenAI-compatible API by querying the available models:
 
 ```bash
-curl -o- http://localhost:30080/models
+curl -o- http://localhost:30080/v1/models
 ```
 
 Expected output:
@@ -141,7 +145,7 @@ Expected output:
 Send a query to the OpenAI `/completion` endpoint to generate a completion for a prompt:
 
 ```bash
-curl -X POST http://localhost:30080/completions \
+curl -X POST http://localhost:30080/v1/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "facebook/opt-125m",
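For anyone applying this patch by hand, the two corrected requests can be smoke-tested together once the router service is port-forwarded as described in step 3.1 of the tutorial. This is a sketch, not part of the patch: the `BASE_URL` variable is introduced here for convenience, and the `prompt`/`max_tokens` values in the request body are illustrative.

```shell
# After this patch, both endpoints live under the /v1 prefix of the router,
# matching the OpenAI-compatible API surface.
# Assumes `kubectl port-forward svc/vllm-router-service 30080:80` is active.
BASE_URL="http://localhost:30080/v1"

# Step 3.2: list the models served by the stack.
curl -o- "$BASE_URL/models" \
  || echo "router not reachable; is the port-forward running?"

# Step 3.3: request a completion (prompt/max_tokens are illustrative).
curl -X POST "$BASE_URL/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "facebook/opt-125m",
    "prompt": "Once upon a time,",
    "max_tokens": 16
  }' || echo "router not reachable; is the port-forward running?"
```

Requests to the old un-prefixed paths (`/models`, `/completions`) are what this patch removes from the tutorial, so a 404 on those paths with a working `/v1` prefix indicates the documented behavior.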