llmcompressor.utils.pytorch
get_layer_by_name(layer_name, module)
Get the layer of a module by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
layer_name | str | Name of the layer to find. | required |
module | Module | Module in which to search for layer_name | required |
Returns:
Type | Description |
---|---|
Module | The layer with name layer_name |
Source code in llmcompressor/utils/pytorch/module.py
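A dotted-name lookup like this can be sketched as a simple attribute walk. This is an illustrative re-implementation, not the library's actual source; it uses `types.SimpleNamespace` in place of `torch.nn.Module` so the example is self-contained (with real PyTorch modules, `Module.get_submodule("a.b.c")` performs an equivalent traversal).

```python
from types import SimpleNamespace

def lookup_by_name(layer_name, module):
    """Walk a dotted path such as "decoder.q_proj" attribute by attribute."""
    current = module
    for part in layer_name.split("."):
        current = getattr(current, part)
    return current

# A toy module tree standing in for a model (names are hypothetical)
model = SimpleNamespace(
    decoder=SimpleNamespace(input_layernorm="LN", q_proj="QP")
)

print(lookup_by_name("decoder.q_proj", model))  # -> QP
```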
get_layers(targets, module, exclude_internal_modules=False)
Get the layers (also known as submodules) of a module that match targets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
targets | Union[str, List[str]] | Names or regexes to search for. Can be a regex, e.g. "re:.*input_layernorm$" to find all layers in module whose names end in "input_layernorm" | required |
module | Module | Parent module in which to search for targets | required |
exclude_internal_modules | bool | If True, don't include internal modules added by llm-compressor, e.g. Observers and Transforms. Defaults to False to maintain backward compatibility | False |
Returns:
Type | Description |
---|---|
Dict[str, Module] | dict of {layer name -> module} of all layers in module that match targets |
Source code in llmcompressor/utils/pytorch/module.py
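The target-matching behavior described above can be sketched with plain regex filtering over module names. This is an assumption-laden sketch, not the library's implementation: it treats a `"re:"` prefix as marking a regex (as the `"re:.*input_layernorm$"` example suggests) and anything else as an exact name, and it operates on a plain list of names standing in for `module.named_modules()`.

```python
import re

def match_targets(names, targets):
    """Return the subset of names matching any target.

    A "re:" prefix marks the target as a regex; otherwise the target must
    equal the name exactly (matching rules here are an assumption).
    """
    matched = []
    for name in names:
        for target in targets:
            if target.startswith("re:"):
                if re.match(target[len("re:"):], name):
                    matched.append(name)
            elif name == target:
                matched.append(name)
    return matched

# Hypothetical layer names, as named_modules() might yield them
names = [
    "model.layers.0.input_layernorm",
    "model.layers.0.self_attn.q_proj",
    "model.layers.1.input_layernorm",
]

hits = match_targets(names, ["re:.*input_layernorm$"])
print(hits)  # -> both input_layernorm entries
```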
get_matching_layer(target, name_to_match, module)
Given a target regex, find the layer name in the module that most closely matches the name_to_match string. This is used to match submodules in the same layer, for instance matching "re:.*k_proj" to "model.decoder.layer.0.q_proj" to find the k_proj that exists in layer 0.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | str | regex to search for | required |
name_to_match | str | full layer name to match to, should exist in module | required |
module | Module | module to search for target in | required |
Returns:
Type | Description |
---|---|
Optional[Tuple[str, Module]] | Tuple containing the layer name and module that fits the target regex and best matches name_to_match, or None if no match can be found |
Source code in llmcompressor/utils/pytorch/module.py
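"Most closely matches" needs a similarity metric; one plausible choice is `difflib.SequenceMatcher`, used in the sketch below. Both the metric and the plain-list interface (standing in for searching a module) are assumptions for illustration, not the library's actual logic.

```python
import re
from difflib import SequenceMatcher

def closest_matching_name(target, name_to_match, names):
    """Among names matching the regex target (a "re:" prefix is stripped),
    return the one most similar to name_to_match, or None if nothing matches.
    The similarity metric (SequenceMatcher ratio) is an assumption."""
    pattern = target[len("re:"):] if target.startswith("re:") else target
    candidates = [n for n in names if re.match(pattern, n)]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: SequenceMatcher(None, n, name_to_match).ratio(),
    )

# Hypothetical layer names: the layer-0 k_proj should win over layer 1's
names = [
    "model.decoder.layer.0.k_proj",
    "model.decoder.layer.1.k_proj",
]

best = closest_matching_name("re:.*k_proj", "model.decoder.layer.0.q_proj", names)
print(best)  # -> model.decoder.layer.0.k_proj
```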
get_no_split_params(model)
Get list of module classes that shouldn't be split when sharding. For Hugging Face Transformer models, this is the decoder layer type. For other types of models, this just returns all module names.
Returns:
Type | Description |
---|---|
Union[str, List[str]] | list of class names that shouldn't be split |
Source code in llmcompressor/utils/pytorch/module.py
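For Hugging Face Transformers models, the decoder layer type is conventionally exposed on the model as the `_no_split_modules` attribute, which is one plausible way the lookup above could work. The sketch below assumes that convention and uses a stub class in place of a real model; the fallback of returning all module names is simplified here to just the model's class name.

```python
class FakeHFModel:
    # Real HF architectures define this attribute, e.g. ["LlamaDecoderLayer"];
    # this stub is purely illustrative.
    _no_split_modules = ["LlamaDecoderLayer"]

def no_split_params(model):
    """Return class names that shouldn't be split when sharding (a sketch)."""
    no_split = getattr(model, "_no_split_modules", None)
    if no_split:
        return no_split
    # Fallback for non-HF models, simplified to the model's own class name
    return [type(model).__name__]

print(no_split_params(FakeHFModel()))  # -> ['LlamaDecoderLayer']
```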
qat_active(module)
Determines whether any layer in the model has quantization enabled, by checking for weight_fake_quant attributes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | PyTorch model to check for quantization | required |
Returns:
Type | Description |
---|---|
bool | True if quantization is active anywhere in the model, False otherwise |
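The attribute check described above can be sketched in a few lines. Plain objects stand in for torch modules here, and the iterable stands in for a `named_modules()` traversal; this mirrors the documented behavior, not the actual source.

```python
class Stub:
    """Stand-in for a torch.nn.Module in this sketch."""
    pass

def has_qat(modules):
    """True if any module in the iterable carries a weight_fake_quant
    attribute, i.e. quantization-aware training is active somewhere."""
    return any(hasattr(m, "weight_fake_quant") for m in modules)

plain, quantized = Stub(), Stub()
quantized.weight_fake_quant = object()  # simulate a QAT-wrapped layer

print(has_qat([plain]))             # -> False
print(has_qat([plain, quantized]))  # -> True
```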