llmcompressor.modifiers.utils.pytorch_helpers
apply_pad_mask_to_batch(batch)
Apply a mask to the input ids of a batch. This is used to zero out padding tokens so they do not contribute to the Hessian calculation in the GPTQ and SparseGPT algorithms.
Assumes that attention_mask only contains zeros and ones.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch | Dict[str, Tensor] | batch to apply the pad mask to, if padding exists | required |
Returns:
Type | Description |
---|---|
Dict[str, Tensor] | batch with padding zeroed out in the input_ids |
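A minimal sketch of the masking step, assuming the batch carries same-shaped `input_ids` and `attention_mask` tensors; the installed source in `llmcompressor/modifiers/utils/pytorch_helpers.py` is authoritative:

```python
from typing import Dict

import torch


def apply_pad_mask_to_batch(batch: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    # Element-wise multiplication zeroes input_ids wherever attention_mask
    # is 0 (padding), so padded positions contribute nothing downstream.
    batch["input_ids"] = batch["input_ids"] * batch["attention_mask"]
    return batch


# Example: the two trailing padding tokens are zeroed out.
batch = {
    "input_ids": torch.tensor([[101, 2023, 999, 42]]),
    "attention_mask": torch.tensor([[1, 1, 0, 0]]),
}
print(apply_pad_mask_to_batch(batch)["input_ids"])  # tensor([[101, 2023, 0, 0]])
```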
is_moe_model(model)
Check whether the model is a mixture-of-experts (MoE) model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | the model to check | required |
Returns:
Type | Description |
---|---|
bool | True if the model is a mixture-of-experts model, False otherwise |
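One plausible way to implement such a check, shown as a hedged sketch rather than the library's actual logic: look for an expert count on the model config, then fall back to scanning submodule names. The `num_local_experts` attribute is borrowed from Mixtral-style configs and is an assumption here, not a confirmed detail of this function.

```python
from torch.nn import Module


def is_moe_model(model: Module) -> bool:
    # Heuristic sketch: many MoE configs expose an expert count, e.g.
    # Mixtral's `num_local_experts` (an assumed attribute name, not
    # necessarily what the library inspects).
    config = getattr(model, "config", None)
    if config is not None and getattr(config, "num_local_experts", 0) > 1:
        return True
    # Fall back to scanning submodule names for an "expert" container,
    # as MoE layers are commonly registered under names like "experts.0".
    return any("expert" in name.lower() for name, _ in model.named_modules())
```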