vllm.config.mamba ¶
MambaBackendEnum ¶
Bases: Enum
Enumeration of supported Mamba SSU (selective state update) backends.
Source code in vllm/config/mamba.py
MambaConfig ¶
Configuration for Mamba SSM backends.
backend class-attribute instance-attribute ¶
backend: MambaBackendEnum = TRITON
Mamba SSU backend to use.
enable_stochastic_rounding class-attribute instance-attribute ¶
enable_stochastic_rounding: bool = False
Enable stochastic rounding when writing the SSM state to the fp16 cache. Random bits choose the rounding direction so that the rounding error is unbiased in expectation, which can improve numerical stability for long sequences.
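To illustrate why unbiased rounding can matter when a state is updated many times, here is a minimal pure-Python sketch. It is not the vLLM kernel: a coarse fixed-step grid stands in for fp16 precision, Python's `random` module stands in for the Philox generator, and all function names are hypothetical.

```python
import math
import random


def round_nearest(x: float, step: float) -> float:
    """Deterministic round-to-nearest onto a grid of spacing `step`."""
    return round(x / step) * step


def round_stochastic(x: float, step: float, rng: random.Random) -> float:
    """Round down or up at random; P(round up) equals the fractional part.

    The expected value of the result equals x, so the error is unbiased.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step


def accumulate(rounder, n_steps: int, increment: float) -> float:
    """Repeatedly add a small increment, rounding the state after each add."""
    state = 0.0
    for _ in range(n_steps):
        state = rounder(state + increment)
    return state


# Assumed toy setup: grid spacing 1.0, increment 0.1 (well below the spacing).
STEP = 1.0
INCREMENT = 0.1
N_STEPS = 10_000
rng = random.Random(0)

# Round-to-nearest discards every small update, so the state never moves.
nearest_total = accumulate(lambda x: round_nearest(x, STEP), N_STEPS, INCREMENT)

# Stochastic rounding keeps the updates on average (exact sum is about 1000).
stochastic_total = accumulate(
    lambda x: round_stochastic(x, STEP, rng), N_STEPS, INCREMENT
)
```

With round-to-nearest, each increment is smaller than half the grid spacing and is lost every time. With stochastic rounding, individual steps are noisier but the accumulated value tracks the true sum on average.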
stochastic_rounding_philox_rounds class-attribute instance-attribute ¶
stochastic_rounding_philox_rounds: int = 0
Number of Philox PRNG rounds used to generate the random bits for stochastic rounding. A value of 0 uses the Triton default. Higher values improve randomness quality at the cost of extra compute.
validate_backend_before classmethod ¶
Parse the backend enum from a string, so that backend can be given as a string name instead of a MambaBackendEnum member.
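The idea of coercing a string into an enum member before the field is stored can be sketched with the standard library alone. This is an assumption-laden stand-in, not the vLLM code: it uses a plain dataclass with `__post_init__` instead of a classmethod validator, and `BackendSketch`, `ConfigSketch`, and the `REFERENCE` member are hypothetical names (only `TRITON` appears in this page).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class BackendSketch(Enum):
    """Hypothetical stand-in for MambaBackendEnum."""

    TRITON = "triton"
    REFERENCE = "reference"  # hypothetical second member, for illustration only


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for MambaConfig with a single field."""

    backend: Union[BackendSketch, str] = BackendSketch.TRITON

    def __post_init__(self) -> None:
        # Accept either an enum member or its name as a string
        # (case-insensitive); always store the enum member.
        if isinstance(self.backend, str):
            self.backend = BackendSketch[self.backend.upper()]


default_cfg = ConfigSketch()
from_string_cfg = ConfigSketch(backend="triton")
from_member_cfg = ConfigSketch(backend=BackendSketch.REFERENCE)
```

Doing the conversion before validation means the rest of the code only ever sees enum members, while users and CLI parsers can still pass plain strings.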
_MambaBackendEnumMeta ¶
Bases: EnumMeta
Metaclass for MambaBackendEnum to provide better error messages.
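One common way for an enum metaclass to give better error messages is to override name lookup so that an unknown name reports the valid choices. The sketch below shows that pattern under stated assumptions: the class names, the `REFERENCE` member, and the message wording are hypothetical, and the actual vLLM metaclass may differ.

```python
from enum import Enum, EnumMeta


class _BackendMetaSketch(EnumMeta):
    """Hypothetical metaclass that reports valid names on a failed lookup."""

    def __getitem__(cls, name):
        try:
            return super().__getitem__(name)
        except KeyError:
            valid = ", ".join(cls.__members__)
            # Replace the bare KeyError with a message listing the options.
            raise ValueError(
                f"Unknown backend {name!r}. Valid options: {valid}"
            ) from None


class CheckedBackendSketch(Enum, metaclass=_BackendMetaSketch):
    """Hypothetical stand-in for MambaBackendEnum."""

    TRITON = "triton"
    REFERENCE = "reference"  # hypothetical member, for illustration only


found = CheckedBackendSketch["TRITON"]

try:
    CheckedBackendSketch["CUDA"]
except ValueError as exc:
    error_message = str(exc)
```

Without the metaclass, `CheckedBackendSketch["CUDA"]` would raise a bare `KeyError: 'CUDA'`, which gives the user no hint about which names are accepted.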