vllm.model_executor.models.fuyu

PyTorch Fuyu model.

FuyuImagePatchInputs

Bases: TensorSchema

Dimensions
  • bn: Batch size * number of images
  • bnp: Batch size * number of images * number of patches
  • fn: patch_size_x * patch_size_y * num_channels
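For example, with the 30×30 RGB patches used by adept/fuyu-8b (values taken from its released image-processor config), fn = 30 * 30 * 3 = 2700.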
Source code in vllm/model_executor/models/fuyu.py
class FuyuImagePatchInputs(TensorSchema):
    """
    Dimensions:
        - bn: Batch size * number of images
        - bnp: Batch size * number of images * number of patches
        - fn: patch_size_x * patch_size_y * num_channels
    """

    type: Literal["image_patches"] = "image_patches"

    image_patches_flat: Annotated[torch.Tensor, TensorShape("bnp", "fn")]
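    """
    Flattened image patches, with the batch and per-image patch dimensions
    collapsed into `bnp` as described under `Dimensions` above.
    """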

    patches_per_image: Annotated[list[int], TensorShape("bn")]
    """
    The total number of patches for each image in the batch.

    This is used to split the embeddings, which have the first two
    dimensions flattened just like `image_patches_flat`.
    """

patches_per_image instance-attribute

patches_per_image: Annotated[list[int], TensorShape("bn")]

The total number of patches for each image in the batch.

This is used to split the embeddings, which have the first two dimensions flattened just like `image_patches_flat`.
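
A minimal sketch of how `patches_per_image` can drive that split (illustrative sizes and variable names, not vLLM's actual implementation):

import torch

# Illustrative values: bn = 2 images, each contributing some patches.
patches_per_image = [4, 6]
hidden_size = 8  # embedding width, illustrative

# Embeddings with the first two dimensions flattened, analogous to
# `image_patches_flat`: one row per patch across all images (bnp rows).
embeddings_flat = torch.randn(sum(patches_per_image), hidden_size)

# Recover one (num_patches, hidden_size) tensor per image.
per_image_embeds = embeddings_flat.split(patches_per_image, dim=0)

for i, embeds in enumerate(per_image_embeds):
    print(f"image {i}: {tuple(embeds.shape)}")  # (4, 8) then (6, 8)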