vllm.benchmarks

Modules:

Name          Description
datasets
latency       Benchmark the latency of processing a single batch of requests.
lib           Benchmark library utilities.
mm_processor  Benchmark multimodal processor latency.
plot          Generate plots for benchmark results.
serve         Benchmark online serving throughput.
startup       Benchmark the cold and warm startup time of vLLM models.
sweep
throughput    Benchmark offline inference throughput.
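
To check which of the modules listed above can be found in a given environment, a small probe along the following lines may help. This is a minimal sketch, not part of vLLM: the helper names `qualified_names` and `is_available` are hypothetical, and the only assumption about vLLM is that the listed names are submodules of the `vllm.benchmarks` package. It relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

# Submodule names copied from the module listing above.
SUBMODULES = [
    "datasets", "latency", "lib", "mm_processor", "plot",
    "serve", "startup", "sweep", "throughput",
]


def qualified_names(package="vllm.benchmarks"):
    """Build fully qualified module names (hypothetical helper)."""
    return [f"{package}.{name}" for name in SUBMODULES]


def is_available(qualified_name):
    """Report whether a module can be located (hypothetical helper).

    find_spec imports parent packages but not the target module itself.
    A missing parent package raises ModuleNotFoundError, which is a
    subclass of ImportError, so it is treated as "not available".
    """
    try:
        return importlib.util.find_spec(qualified_name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    for name in qualified_names():
        status = "found" if is_available(name) else "missing"
        print(f"{name}: {status}")
```

Locating a module this way avoids executing the benchmark module itself, although importing the parent `vllm` package can still be slow or fail for other reasons on machines without a supported accelerator.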