Skip to main content

sglang

uv pip install "sglang[all]>=0.4.9.post6" --prerelease=allow

uv run python3 -m sglang.launch_server \
--model-path Qwen/Qwen2.5-VL-7B-Instruct \
--host 0.0.0.0 \
--port 30000

Server

FAQ

RuntimeError: SGLang only supports sm75 and above.

FlashInfer

Python: 3.8, 3.9, 3.10, 3.11
PyTorch: 2.2/2.3/2.4 with CUDA 11.8/12.1/12.4 (only for torch 2.4)
Supported GPU architectures: sm75, sm80, sm86, sm89, sm90