BackboneFinetuning: train_bn only applied during unfreezing phase #21531

@patrontheo

Outline & Motivation

The BackboneFinetuning callback exposes a train_bn parameter intended to control whether BatchNorm layers are trainable during backbone finetuning. However, the current implementation only applies this parameter during the unfreezing phase, not during the initial frozen phase.

In freeze_before_training(), the callback calls:

self.freeze(pl_module.backbone)

which uses the default train_bn=True. As a result, BatchNorm layers remain trainable during the frozen stage, regardless of the train_bn value passed to the callback.
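To make the mechanism concrete, here is a minimal, dependency-free sketch of the pattern described above. `Layer`, `freeze`, and `ToyBackboneFinetuning` are toy stand-ins written for illustration, not Lightning's actual code; the only thing taken from the callback is the shape of the call, where `train_bn` is not forwarded to the freeze helper.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy stand-in for a module; not a real torch.nn.Module."""
    name: str
    is_batchnorm: bool = False
    trainable: bool = True


def freeze(layers, train_bn=True):
    """Simplified model of a freeze helper whose train_bn defaults to True."""
    for layer in layers:
        # Only BatchNorm layers stay trainable, and only when train_bn is True.
        layer.trainable = layer.is_batchnorm and train_bn


class ToyBackboneFinetuning:
    """Hypothetical miniature of the callback, for illustration only."""

    def __init__(self, train_bn=True):
        self.train_bn = train_bn

    def freeze_before_training(self, backbone):
        # Mirrors the call quoted above: self.train_bn is not forwarded,
        # so the helper falls back to its default (train_bn=True).
        freeze(backbone)


def make_backbone():
    return [Layer("conv"), Layer("bn", is_batchnorm=True)]


backbone = make_backbone()
ToyBackboneFinetuning(train_bn=False).freeze_before_training(backbone)
states = {layer.name: layer.trainable for layer in backbone}
print(states)  # {'conv': False, 'bn': True}
```

Even though the callback was built with `train_bn=False`, the toy BN layer is still trainable after the frozen-phase call, which is the behavior this issue is about. Calling `freeze(backbone, train_bn=self.train_bn)` instead would freeze it.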

This leads to somewhat counter-intuitive behavior when train_bn=False:

  • It does not freeze BN during the frozen phase.
  • It freezes BN once the backbone is unfrozen.

So the meaning of the parameter becomes:
“Train BN while the backbone is frozen, and optionally freeze it once the backbone is unfrozen.”
This is not what the parameter name suggests, and is rarely the intended finetuning strategy.

| Phase | train_bn=True | train_bn=False |
|---|---|---|
| Frozen phase | Backbone: frozen; BN: trainable | Backbone: frozen; BN: trainable |
| After unfreeze | Backbone: trainable; BN: trainable | Backbone: trainable; BN: frozen |

Pitch

To keep current behavior available while making BN handling explicit and predictable:

  • Deprecate the existing train_bn parameter.
  • Introduce two new parameters:
    • train_bn_frozen_phase: controls whether BatchNorm layers are trainable while the backbone is frozen.
    • train_bn_unfrozen_phase: controls whether BatchNorm layers are trainable after the backbone is unfrozen.
  • Set the default values to match the current behavior:
    • train_bn_frozen_phase=True
    • train_bn_unfrozen_phase=True
  • Keep the old train_bn parameter for one deprecation cycle, mapping it internally to train_bn_unfrozen_phase=train_bn.
  • Emit a deprecation warning when train_bn is used, directing users to the new parameters.
  • Remove train_bn in a future major release once the transition period is over.
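As a rough sketch of the mapping proposed above, the argument handling could look something like the following. `resolve_bn_flags` is a hypothetical helper name, and using `None` as the "not passed" sentinel for `train_bn` is an assumption on my part (today the parameter defaults to `True`); the real change would live in the callback's `__init__` and would likely use Lightning's own deprecation utilities rather than the plain `warnings` module.

```python
import warnings


def resolve_bn_flags(
    train_bn=None,
    train_bn_frozen_phase=True,
    train_bn_unfrozen_phase=True,
):
    """Return (frozen_phase, unfrozen_phase) BN flags. Hypothetical helper."""
    if train_bn is not None:
        warnings.warn(
            "`train_bn` is deprecated; use `train_bn_frozen_phase` and "
            "`train_bn_unfrozen_phase` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # The legacy flag only ever affected the unfrozen phase,
        # so it maps onto train_bn_unfrozen_phase and nothing else.
        train_bn_unfrozen_phase = train_bn
    return train_bn_frozen_phase, train_bn_unfrozen_phase


# Legacy call: behaves as before, but warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = resolve_bn_flags(train_bn=False)

print(legacy)                       # (True, False)
print(caught[0].category.__name__)  # DeprecationWarning

# New-style call: BN frozen in both phases, no warning.
print(resolve_bn_flags(train_bn_frozen_phase=False, train_bn_unfrozen_phase=False))
# (False, False)
```

With the defaults left untouched the helper returns `(True, True)`, matching the current behavior, so existing users who never set `train_bn` would see no change.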

Happy to discuss any other directions / improvements you have in mind.

Additional context

I’m happy to open a PR if the direction makes sense.

cc @lantiga @justusschock
