vllm.model_executor.model_loader.reload.meta ¶
MetaCopyCounter ¶
Bases: TorchDispatchMode
Tracks total number of elements modified with copy_.
Useful for keeping track of weight loading where underlying weights can be arbitrarily transformed (such as with narrow) before calling copy.
Note: Assumes that copy kwargs are not used.
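The counting idea can be illustrated with a small dispatch mode. This is a minimal sketch, not the vLLM implementation: the class name `CopyCounterSketch` and the attribute `copied_numel` are hypothetical, and it assumes that `copy_` reaches the dispatcher as `aten.copy_.default` with the destination as its first positional argument.

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode


class CopyCounterSketch(TorchDispatchMode):
    """Hypothetical stand-in for illustration; not vLLM's MetaCopyCounter."""

    def __init__(self):
        super().__init__()
        self.copied_numel = 0  # hypothetical attribute name

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func == torch.ops.aten.copy_.default:
            # args[0] is the destination tensor of copy_; count its elements.
            self.copied_numel += args[0].numel()
        return func(*args, **kwargs)


# Meta tensors carry shape and dtype but no storage, so nothing is really copied.
param = torch.empty(4, 8, device="meta")
loaded = torch.empty(2, 8, device="meta")

counter = CopyCounterSketch()
with counter:
    # Only a narrowed view of the parameter is written: 2 * 8 elements.
    param.narrow(0, 0, 2).copy_(loaded)

print(counter.copied_numel)
```

Because the count is taken from the destination of each `copy_`, it reflects the elements actually written, regardless of how the parameter was sliced beforehand.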
Source code in vllm/model_executor/model_loader/reload/meta.py
get_numel_loaded ¶
get_numel_loaded(
weight_loader: Callable, args: BoundArguments
) -> tuple[int, object]
Determine how many elements would be loaded by a weight loader call.
:param weight_loader: callable used to load weights
:param args: bound arguments to the weight loader
:return: the number of elements loaded by the weight loader, and the return value of the weight loader
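As a rough illustration of the `args` parameter, the sketch below builds a bound-arguments object and replays it against a loader. It assumes `BoundArguments` refers to `inspect.BoundArguments` from the standard library; `toy_weight_loader` and its parameters are hypothetical, with plain lists standing in for tensors. It does not call `get_numel_loaded` itself.

```python
import inspect


def toy_weight_loader(param, loaded_weight, shard_id=None):
    """Hypothetical loader: writes loaded_weight into one shard of param."""
    start = 0 if shard_id is None else shard_id * len(loaded_weight)
    param[start:start + len(loaded_weight)] = loaded_weight
    return len(loaded_weight)


param = [0.0] * 4

# Bind the call's arguments to the loader's signature without calling it yet.
bound = inspect.signature(toy_weight_loader).bind(param, [1.0, 2.0], shard_id=1)
bound.apply_defaults()

# The bound arguments can later be replayed against the same callable.
result = toy_weight_loader(*bound.args, **bound.kwargs)
print(param, result)
```

Binding first and calling later lets the same call be inspected or replayed, for example once against placeholder tensors to measure it and once against real ones.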
materialize_layer ¶
materialize_layer(layer: Module) -> None
Materialize all meta tensors in a layer to actual tensors.
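For readers unfamiliar with meta tensors, the effect can be approximated with PyTorch's built-in `Module.to_empty`. This is only a sketch of the general technique; whether `materialize_layer` uses `to_empty` internally is not stated here, and the target device `"cpu"` is an assumption chosen so the example runs anywhere.

```python
import torch
from torch import nn

# A layer created on the meta device has shapes and dtypes but no storage.
layer = nn.Linear(4, 2, device="meta")
was_meta = layer.weight.is_meta

# to_empty allocates uninitialized storage on the target device
# for every parameter and buffer; the values are not meaningful yet.
layer.to_empty(device="cpu")

print(was_meta, layer.weight.device.type, tuple(layer.weight.shape))
```

The materialized tensors hold arbitrary values until real weights are loaded into them.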
materialize_meta_tensor ¶
Materialize a meta tensor into an actual tensor on the current device. Should be called within the torch device context for the given rank.
restore_layer_on_meta ¶
restore_layer_on_meta(
layer: Module, info: LayerReloadingInfo
)
Restore a layer to model format, with its tensors on the meta device.
to_meta_tensor ¶
Convert a tensor to a meta tensor while preserving class and attributes.