Extend SmoothQuant quantizer to support per-module applier control

## Motivation

Currently, [SmoothQuant](https://github.com/Samsung/TICO/blob/16f603cb99b1dfa5c4d3fbce6b3eab7720ed8ab3/tico/experimental/quantization/algorithm/smoothquant/smooth_quant.py#L284) smoothing is applied by trying a fixed list of appliers (LLaMA, fairseq, …) in order, stopping at the first success. This design is too rigid in practice:

- Sometimes we only want to apply smoothing to **specific modules** (e.g., decoder layers but not embeddings).
- Sometimes we want to apply **multiple appliers sequentially** to the same module (e.g., first LayerNorm–QKV smoothing, then ReLU bridge fusion).
- Sometimes we want to override the **alpha factor** or **applier set** for individual modules without affecting the rest of the model.

Without these knobs, users must either modify the applier list globally or patch code manually, which is error-prone and inflexible.

## Proposed Design

Introduce a flexible interface for controlling SmoothQuant application:

### 1. Global controls

- `only_appliers`, `skip_appliers` → restrict the global set of appliers
- `include`, `exclude` → restrict which module names are considered
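
As a minimal sketch of how these global filters might behave, the helpers below use glob matching from the standard library. The function names `is_selected` and `select_appliers` are hypothetical, and the precedence rules (an empty `include` means "everything", `exclude` wins over `include`, the original applier order is preserved) are assumptions rather than settled design.

```python
from fnmatch import fnmatchcase


def is_selected(name, include=None, exclude=None):
    """Return True if module `name` passes the include/exclude glob filters."""
    # Assumption: an empty or missing `include` means "consider every module".
    included = not include or any(fnmatchcase(name, pat) for pat in include)
    # Assumption: `exclude` takes precedence over `include`.
    excluded = any(fnmatchcase(name, pat) for pat in exclude or [])
    return included and not excluded


def select_appliers(available, only_appliers=None, skip_appliers=None):
    """Narrow the global applier list, preserving its original order."""
    kept = [a for a in available if not only_appliers or a in only_appliers]
    return [a for a in kept if a not in (skip_appliers or [])]
```

With these assumptions, `is_selected("model.embed_tokens", include=["model.layers.*"])` is false, so embeddings would be left untouched while decoder layers are still considered.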

### 2. Per-module overrides (`per_module` dict)

- Key: module name or glob pattern (e.g., `"model.layers.7.*"`)
- Values:
  - `"alpha"` → custom alpha for that module
  - `"appliers"` → explicit list of appliers to apply sequentially
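
One way the override lookup could work is sketched below. The helper `resolve_module_settings` is hypothetical, and the "first matching pattern wins, in dict insertion order" rule is an assumption; the proposal does not yet say how overlapping patterns should be resolved.

```python
from fnmatch import fnmatchcase


def resolve_module_settings(name, per_module, default_alpha, default_appliers):
    """Return (alpha, appliers) for `name`, using the first matching override."""
    for pattern, override in per_module.items():
        # Assumption: patterns are tried in insertion order; first match wins.
        if name == pattern or fnmatchcase(name, pattern):
            # Keys missing from the override fall back to the global values.
            alpha = override.get("alpha", default_alpha)
            appliers = list(override.get("appliers", default_appliers))
            return alpha, appliers
    # No override matched: use the global settings unchanged.
    return default_alpha, list(default_appliers)
```

An override that sets only `"alpha"` would then inherit the global applier list, and one that sets only `"appliers"` would inherit the global alpha.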


### 3. Observer filtering

- `observe_include`, `observe_exclude` to limit which modules are hooked for activation statistics
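
Taken together, the options in this section could be gathered into a single config object. The sketch below is illustrative only: the field names follow the proposal, but the types, the defaults (including a global `alpha` of 0.5), and the helpers `should_observe` and `modules_to_hook` are assumptions.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Optional


@dataclass
class SmoothQuantConfig:
    # Field names follow the proposal; types and defaults are assumptions.
    alpha: float = 0.5
    only_appliers: Optional[List[str]] = None
    skip_appliers: Optional[List[str]] = None
    include: Optional[List[str]] = None
    exclude: Optional[List[str]] = None
    per_module: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    observe_include: Optional[List[str]] = None
    observe_exclude: Optional[List[str]] = None

    def should_observe(self, name: str) -> bool:
        """Decide whether module `name` gets an activation-statistics hook."""
        # Assumption: no `observe_include` means every module is a candidate.
        wanted = not self.observe_include or any(
            fnmatchcase(name, pat) for pat in self.observe_include
        )
        # Assumption: `observe_exclude` wins over `observe_include`.
        blocked = any(fnmatchcase(name, pat) for pat in self.observe_exclude or [])
        return wanted and not blocked


def modules_to_hook(cfg: SmoothQuantConfig, module_names: List[str]) -> List[str]:
    """Filter module names (e.g. those yielded by `model.named_modules()`)."""
    return [name for name in module_names if cfg.should_observe(name)]
```

In a PyTorch setting, the names returned by `modules_to_hook` would be the modules on which `register_forward_hook` is called, so modules outside the filter incur no observation cost.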

## Usage Examples

1. Apply multiple appliers to each decoder layer, in order

```python
cfg = SmoothQuantConfig(
    include=["model.layers.*"],
    per_module={
        "model.layers.*": {
            "appliers": ["ln_to_qkv", "relu_bridge"],
            "alpha": 0.6,
        },
    },
)
```

2. Apply a single applier globally, but add a second one for decoder layers

```python
cfg = SmoothQuantConfig(
    only_appliers=["ln_to_qkv"],   # global default
    per_module={
        "decoder.layers.*": {
            "appliers": ["ln_to_qkv", "relu_bridge"],
        },
    },
)
```

3. Skip the first few layers, and change alpha and appliers for later layers

```python
cfg = SmoothQuantConfig(
    include=["model.layers.*"],
    exclude=["model.layers.0.*", "model.layers.1.*", "model.layers.2.*"],
    per_module={
        "model.layers.20*": {
            "appliers": ["ln_to_qkv", "relu_bridge"],
            "alpha": 0.55,
        },
    },
)
```



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend SmoothQuant quantizer to support per-module applier control #320

Motivation

Proposed Design

1. Global controls

2. Per-module overrides (`per_module` dict)

3. Observer filtering

Usage Examples

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Extend SmoothQuant quantizer to support per-module applier control #320

Description

Motivation

Proposed Design

1. Global controls

2. Per-module overrides (per_module dict)

3. Observer filtering

Usage Examples

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

2. Per-module overrides (`per_module` dict)