vllm.model_executor.models.kimi_vl

KimiVLImagePixelInputs

Bases: TensorSchema

Dimensions
  • nc: Number of channels
  • np: Number of patches
  • ps: Patch size
  • ni: Number of images
Source code in vllm/model_executor/models/kimi_vl.py
class KimiVLImagePixelInputs(TensorSchema):
    """
    Dimensions:
        - nc: Number of channels
        - np: Number of patches
        - ps: Patch size
        - ni: Number of images
    """

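    # Discriminator tag identifying this input variant.
    type: Literal["pixel_values"] = "pixel_values"

    # Shape (np, 3, ps, ps): one entry per patch. The channel dimension
    # ("nc" in the docstring) is fixed at 3 here rather than left symbolic.
    # May be a single tensor or a list of tensors.
    pixel_values: Annotated[
        torch.Tensor | list[torch.Tensor],
        TensorShape("np", 3, "ps", "ps"),
    ]

    # Shape (ni, 2): one row per image. Judging by the field name, each row
    # is presumably the (height, width) of that image's patch grid.
    image_grid_hws: Annotated[torch.Tensor, TensorShape("ni", 2)]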
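
To make the shape annotations above concrete, here is a minimal, dependency-free sketch of how symbolic shapes like `("np", 3, "ps", "ps")` and `("ni", 2)` can be checked against concrete shapes. It is not vLLM's `TensorSchema` implementation: `bind_shape` and `check_inputs` are hypothetical helpers written for illustration, they operate on plain tuples instead of tensors, and the numbers in the usage line are arbitrary. The sketch assumes that an integer in the spec must match exactly and that a named dimension must take the same value everywhere it appears.

```python
# Illustrative only: hypothetical helpers, not part of vLLM.
PIXEL_VALUES_SPEC = ("np", 3, "ps", "ps")
IMAGE_GRID_HWS_SPEC = ("ni", 2)


def bind_shape(spec, shape, bindings):
    """Match a concrete shape against a spec of ints and dimension names."""
    if len(spec) != len(shape):
        raise ValueError(f"expected {len(spec)} dims, got {len(shape)}")
    for expected, actual in zip(spec, shape):
        if isinstance(expected, int):
            # Literal dimensions (such as the 3 channels) must match exactly.
            if actual != expected:
                raise ValueError(f"expected size {expected}, got {actual}")
        else:
            # Named dimensions are bound on first use and must stay consistent.
            bound = bindings.setdefault(expected, actual)
            if bound != actual:
                raise ValueError(
                    f"dimension {expected!r} bound to {bound}, got {actual}"
                )
    return bindings


def check_inputs(pixel_shape, grid_shape):
    """Check both fields and return the resolved dimension sizes."""
    bindings = {}
    bind_shape(PIXEL_VALUES_SPEC, pixel_shape, bindings)
    bind_shape(IMAGE_GRID_HWS_SPEC, grid_shape, bindings)
    return bindings


# 1024 patches of 14x14 pixels, spread over 2 images (arbitrary values).
print(check_inputs((1024, 3, 14, 14), (2, 2)))
```

Under these assumptions, a non-square patch such as `(1024, 3, 14, 16)` is rejected because `ps` would be bound to two different values, and a channel count other than 3 is rejected by the literal check.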