vllm.model_executor.models.aya_vision ¶
AyaVisionImagePixelInputs ¶
Bases: TensorSchema
Dimensions
- np: The total number of patches across all images and prompts in the batch
- c: Number of channels
- h: Height of each image patch
- w: Width of each image patch
- bn: Batch size * number of images
Source code in vllm/model_executor/models/aya_vision.py
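The dimension symbols above can be read as the shapes of two related tensors: a flattened stack of image patches and a per-image patch count. The snippet below is a minimal illustration of that relationship using plain torch tensors; the field names (`pixel_values`, `num_patches`) and the exact shape layout are assumptions for illustration, not the class definition itself.

```python
import torch

# Toy sizes for illustration only.
n_patches_total, c, h, w = 5, 3, 364, 364   # "np", "c", "h", "w"
bn = 2                                       # batch size * number of images

# pixel_values: all patches from every image in the batch, stacked flat.
pixel_values = torch.randn(n_patches_total, c, h, w)

# num_patches: how many of those stacked patches belong to each image,
# so pixel_values can be split back per image.
num_patches = torch.tensor([2, 3])           # shape ("bn",), sums to 5

assert num_patches.shape[0] == bn
assert int(num_patches.sum()) == pixel_values.shape[0]
```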
AyaVisionProcessingInfo ¶
Bases: BaseProcessingInfo
Source code in vllm/model_executor/models/aya_vision.py
get_num_patches ¶
get_num_patches(
*,
image_width: int,
image_height: int,
size: dict,
min_patches: int,
max_patches: int,
) -> int
Calculate the number of patches needed for a given image based on size constraints. This method replicates and adjusts the logic from transformers/models/got_ocr2/image_processing_got_ocr2.
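As a rough sketch of what such a computation can look like, the standalone function below picks a tiling grid whose aspect ratio best matches the image, counts the tiles, and adds a global thumbnail tile when the image is split. This is an assumption-laden approximation of GOT-OCR2-style tiling, not the vLLM implementation; the names `best_grid` and `estimate_num_patches` are hypothetical.

```python
def best_grid(image_width: int, image_height: int, size: dict,
              min_patches: int, max_patches: int) -> tuple[int, int]:
    """Pick the (rows, cols) grid whose aspect ratio best matches the image."""
    candidates = [
        (rows, cols)
        for rows in range(1, max_patches + 1)
        for cols in range(1, max_patches + 1)
        if min_patches <= rows * cols <= max_patches
    ]
    image_ratio = image_width / image_height
    patch_area = size["height"] * size["width"]

    def score(grid: tuple[int, int]) -> tuple[float, float]:
        rows, cols = grid
        ratio_diff = abs(image_ratio - cols / rows)
        # Tie-break: prefer grids whose covered area is closest to the image.
        area_diff = abs(rows * cols * patch_area - image_width * image_height)
        return (ratio_diff, area_diff)

    return min(candidates, key=score)


def estimate_num_patches(*, image_width: int, image_height: int, size: dict,
                         min_patches: int, max_patches: int) -> int:
    rows, cols = best_grid(image_width, image_height, size,
                           min_patches, max_patches)
    num_patches = rows * cols
    # GOT-OCR2-style processors typically add a global thumbnail tile
    # whenever the image is split into more than one patch (assumption).
    return num_patches if num_patches == 1 else num_patches + 1


# Example: a wide 1024x512 image tiled into 364x364 patches, at most 12.
print(estimate_num_patches(image_width=1024, image_height=512,
                           size={"height": 364, "width": 364},
                           min_patches=1, max_patches=12))
```

With these toy numbers the best grid is one row by two columns, so the function reports two patches plus one thumbnail tile.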