vllm.model_executor.models.fuyu

PyTorch Fuyu model.

FuyuImagePatchInputs

Bases: TensorSchema

Dimensions
  • bn: Batch size * number of images
  • bnp: Batch size * number of images * number of patches
  • fn: patch_size_x * patch_size_y * num_channels
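For example, with the 30×30 RGB patches used by adept/fuyu-8b (values taken from its released image-processor config), fn = 30 * 30 * 3 = 2700.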
Source code in vllm/model_executor/models/fuyu.py
class FuyuImagePatchInputs(TensorSchema):
    """
    Dimensions:
        - bn: Batch size * number of images
        - bnp: Batch size * number of images * number of patches
        - fn: patch_size_x * patch_size_y * num_channels
    """

    type: Literal["image_patches"] = "image_patches"

    image_patches_flat: Annotated[torch.Tensor, TensorShape("bnp", "fn")]
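    """
    Flattened image patches, with the batch and per-image patch dimensions
    collapsed into `bnp` as described under `Dimensions` above.
    """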

    patches_per_image: Annotated[list[int], TensorShape("bn")]
    """
    The total number of patches for each image in the batch.

    This is used to split the embeddings, which have the first two
    dimensions flattened just like `image_patches_flat`.
    """

patches_per_image instance-attribute

patches_per_image: Annotated[list[int], TensorShape("bn")]

The total number of patches for each image in the batch.

This is used to split the embeddings, which have the first two dimensions flattened just like `image_patches_flat`.
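
A minimal sketch of how `patches_per_image` can drive that split (illustrative sizes and variable names, not vLLM's actual implementation):

import torch

# Illustrative values: bn = 2 images, each contributing some patches.
patches_per_image = [4, 6]
hidden_size = 8  # embedding width, illustrative

# Embeddings with the first two dimensions flattened, analogous to
# `image_patches_flat`: one row per patch across all images (bnp rows).
embeddings_flat = torch.randn(sum(patches_per_image), hidden_size)

# Recover one (num_patches, hidden_size) tensor per image.
per_image_embeds = embeddings_flat.split(patches_per_image, dim=0)

for i, embeds in enumerate(per_image_embeds):
    print(f"image {i}: {tuple(embeds.shape)}")  # (4, 8) then (6, 8)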