vllm.v1.attention.ops.triton_decode_attention

Memory-efficient attention for the decoding phase. The paged KV cache may use any page size >= 1.
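To make the description concrete, here is a minimal pure-Python sketch of what decode attention over a paged KV cache computes for one query and one head. It is not the Triton kernel from this module and does not use its API: the function name `paged_decode_attention`, its parameters, and the list-of-pages cache layout are all hypothetical, chosen only for illustration. The sketch assumes that "memory-efficient" can be illustrated with a running (online) softmax, which avoids materializing the full score vector.

```python
import math


def paged_decode_attention(q, k_cache, v_cache, page_table, seq_len, page_size):
    """Illustrative single-query, single-head attention over a paged KV cache.

    Hypothetical layout (an assumption, not vLLM's):
      q          : list of d floats, the query for the new token
      k_cache    : k_cache[physical_page][slot] is a key vector of d floats
      v_cache    : v_cache[physical_page][slot] is a value vector of d floats
      page_table : maps logical page index -> physical page index
      seq_len    : number of valid cached tokens
      page_size  : tokens per page (any value >= 1)
    """
    d = len(q)
    scale = 1.0 / math.sqrt(d)
    running_max = -math.inf  # largest score seen so far
    running_sum = 0.0        # softmax denominator, relative to running_max
    acc = [0.0] * d          # unnormalized weighted sum of values

    for pos in range(seq_len):
        # Translate the logical token position into (physical page, slot).
        page = page_table[pos // page_size]
        slot = pos % page_size
        k = k_cache[page][slot]
        v = v_cache[page][slot]

        score = scale * sum(qi * ki for qi, ki in zip(q, k))

        # Online softmax: rescale earlier partial results to the new maximum.
        new_max = max(running_max, score)
        correction = math.exp(running_max - new_max)  # 0.0 on the first token
        weight = math.exp(score - new_max)
        running_sum = running_sum * correction + weight
        acc = [a * correction + weight * vi for a, vi in zip(acc, v)]
        running_max = new_max

    return [a / running_sum for a in acc]


# Three cached tokens stored in pages of size 2, in non-contiguous physical pages.
q = [1.0, 0.0]
pad = [0.0, 0.0]  # unused slot
k_cache = [[[1.0, 1.0], pad], [pad, pad], [[1.0, 0.0], [0.0, 1.0]]]
v_cache = [[[5.0, 6.0], pad], [pad, pad], [[1.0, 2.0], [3.0, 4.0]]]
page_table = [2, 0]  # logical page 0 -> physical 2, logical page 1 -> physical 0

out = paged_decode_attention(q, k_cache, v_cache, page_table, seq_len=3, page_size=2)
print(out)
```

With `page_size=1`, every token lives in its own page and `page_table` becomes a per-token index, which is the smallest page size the description allows. A real kernel would process many heads and sequences in parallel on the GPU; this loop only shows the arithmetic.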