Making max-tokens configurable in the benchmark client. #797

gangmuk · 2025-03-05T06:44:55Z

🚀 Feature Description and Motivation

Making max-tokens configurable in the benchmark client
max-tokens is the argument that limits the output token lengths. It should be exposed to the user explicitly in case they want to set.

response = await client.chat.completions.create(
            model=model,
            messages=prompt,
            temperature=0,
            max_tokens=max_tokens, // this config is not here before.
            stream=True,
            stream_options={"include_usage": True},
        )

Use Case

configuring output token length

Proposed Solution

No response

The text was updated successfully, but these errors were encountered:

gangmuk added the area/benchmark label Mar 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Making max-tokens configurable in the benchmark client. #797

Making max-tokens configurable in the benchmark client. #797

gangmuk commented Mar 5, 2025

Making max-tokens configurable in the benchmark client. #797

Making max-tokens configurable in the benchmark client. #797

Comments

gangmuk commented Mar 5, 2025

🚀 Feature Description and Motivation

Use Case

Proposed Solution