AlignedNorm: Prompting Vision–Language Models via Coupled Prompt Field

Qi Ma¹,², Chen-Yang Wang¹,², Dehong Gao³, Deng-Ping Fan¹,²,⁴*

¹VCIP & CS, Nankai University ²NKIARI, Shenzhen Futian ³Northwestern Polytechnical University ⁴SLAI

News

🗓️ 2025/06/12: AlignedNorm code is released!
🗓️ 2026/05/01: AlignedNorm has been accepted to ICML 2026 🎉

Introduction

Prompt learning has become an efficient way to adapt vision-language models (VLMs) to downstream tasks. However, existing end-to-end and decoupled methods often optimize base and new classes in isolated, task-specific feature spaces, which can lead to local optima and limited generalization.

We introduce AlignedNorm, a simple prompt-learning method built upon the concept of a Coupled Prompt Field. Instead of treating base and new classes independently, the coupled field places them in a shared optimization space where their learning dynamics mutually constrain each other. AlignedNorm realizes this coupling by dynamically aligning learnable prompts with the native feature scale of the pretrained VLM.

From isolated optimization to a Coupled Prompt Field shared by base and new classes.

Highlights

A new perspective on prompt learning. We formulate base-to-new generalization through the Coupled Prompt Field, which encourages joint rather than isolated optimization.
Diagnosis of representation degradation. We reveal that uncontrolled prompt learning causes norm drift and Entanglement Collapse, weakening the pretrained representation structure.
Simple and effective alignment. AlignedNorm aligns prompt norms with the VLM's native feature scale at both intermediate and output levels, without introducing a complex architecture.
Better geometric preservation. AlignedNorm maintains a more favorable balance between feature uniformity and semantic tolerance across base and new classes.

AlignedNorm mitigates norm drift and Entanglement Collapse during prompt learning.

AlignedNorm better preserves the geometric structure of the pretrained representation space.

Method Overview

As illustrated above, end-to-end methods optimize prompts within a single task path, while decoupled methods separate the optimization of base and new classes. In contrast, AlignedNorm couples the two learning dynamics inside the vision encoder. It aligns the norms of learnable prompt tokens with the corresponding native [CLS] representations at intermediate layers and further aligns their projected features at the output level. These lightweight alignment objectives keep prompt updates on the pretrained model's feature scale, reducing representation distortion while preserving transferable knowledge for both base and new classes.
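The norm-alignment idea described above can be sketched in a few lines. This is a minimal illustration, not the released implementation: the squared-difference penalty, the function names `l2_norm` and `norm_alignment_loss`, and the toy vectors are all assumptions, since this README does not state the exact form of the objective.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a plain Python sequence of floats."""
    return math.sqrt(sum(x * x for x in vec))


def norm_alignment_loss(prompt_tokens, cls_token):
    """Mean squared gap between each prompt token's norm and the [CLS] norm.

    Assumption: a squared-difference penalty; the actual objective may differ.
    """
    target = l2_norm(cls_token)
    gaps = [(l2_norm(p) - target) ** 2 for p in prompt_tokens]
    return sum(gaps) / len(gaps)


# Toy example (hypothetical values): one layer with two prompt tokens.
cls_token = [3.0, 4.0]                # norm 5
prompts = [[3.0, 4.0], [6.0, 8.0]]    # norms 5 and 10
loss = norm_alignment_loss(prompts, cls_token)
print(loss)  # (0**2 + 5**2) / 2 = 12.5
```

In a training loop, a term like this would presumably be computed at each aligned intermediate layer and once more on the projected output features, then combined with the task loss. How the terms are weighted is not specified here.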

Running

Run all commands below from the project root directory. Before running an experiment, set DATA_ROOT in the corresponding script under scripts/alignednorm/:

DATA_ROOT="/path/to/your/datasets"

The scripts that require this setting are:

Base-to-Novel: base2new_train.sh and base2new_test.sh
Cross-Dataset: cross_datasets_train.sh and cross_datasets_test.sh
Few-Shot: few_shot.sh

By default, all scripts run three random seeds (1, 2, and 3) and summarize the results after completion.
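As a rough picture of what summarizing over seeds involves, the sketch below averages one accuracy value per seed and reports the spread. It is an assumption about the general approach, not a copy of parse_test_res.py: the function name `summarize` is hypothetical, the accuracies are placeholders, and whether the repository reports the sample or the population standard deviation is not stated in this README.

```python
import statistics


def summarize(acc_by_seed):
    """Return (mean, sample standard deviation) of accuracies keyed by seed."""
    values = list(acc_by_seed.values())
    mean = statistics.mean(values)
    # A single seed has no spread; report 0.0 instead of raising.
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


# Placeholder accuracies for seeds 1, 2, 3. These are NOT results from the paper.
acc_by_seed = {1: 80.0, 2: 82.0, 3: 84.0}
mean, std = summarize(acc_by_seed)
print(f"{mean:.2f} +- {std:.2f}")  # 82.00 +- 2.00
```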

Base-to-Novel Generalization

Train on the base classes and evaluate the trained models on the new classes of all 11 datasets:

bash base_to_novel.sh

To run only one dataset, first train on its base classes and then evaluate on its new classes:

bash scripts/alignednorm/base2new_train.sh eurosat
bash scripts/alignednorm/base2new_test.sh eurosat

Cross-Dataset Generalization

Train a 16-shot model on ImageNet and evaluate it on all target datasets:

bash cross_datasets.sh

To evaluate only one target dataset, run the training and evaluation stages separately:

bash scripts/alignednorm/cross_datasets_train.sh
bash scripts/alignednorm/cross_datasets_test.sh dtd

Few-Shot Learning

Run the 1-, 2-, 4-, 8-, and 16-shot settings on all 11 datasets:

bash few_shot.sh

To run all shot settings on a single dataset:

bash scripts/alignednorm/few_shot.sh eurosat

Running Specific Seeds

Set the SEEDS environment variable before a command to run only the selected seed or seeds:

# Run only seed 1
SEEDS=1 bash scripts/alignednorm/few_shot.sh eurosat

# Run seeds 1 and 3 for Base-to-Novel
SEEDS="1 3" bash scripts/alignednorm/base2new_train.sh eurosat
SEEDS="1 3" bash scripts/alignednorm/base2new_test.sh eurosat

# Run the complete Cross-Dataset experiment using only seed 2
SEEDS=2 bash cross_datasets.sh

For Base-to-Novel and Cross-Dataset evaluation, use the same seeds for training and testing so that each evaluation script can locate its corresponding checkpoint. Existing output directories are skipped automatically; remove or rename the corresponding directory under output/ to rerun an experiment.
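The seed selection and skip-if-exists behaviour described above can be pictured with the following sketch. It only illustrates the logic and is not the repository's shell code: the helper names `selected_seeds` and `pending_runs` and the `output_root/dataset/seed<N>` layout are hypothetical, and the real directory structure under output/ may differ.

```python
import os
import tempfile
from pathlib import Path


def selected_seeds(env=None):
    """Parse a space-separated SEEDS value, defaulting to seeds 1, 2, 3."""
    env = os.environ if env is None else env
    return [int(s) for s in env.get("SEEDS", "1 2 3").split()]


def pending_runs(output_root, dataset, seeds):
    """Return the seeds whose output directory does not exist yet.

    Assumption: a hypothetical layout output_root/dataset/seed<N>.
    """
    root = Path(output_root) / dataset
    return [s for s in seeds if not (root / f"seed{s}").exists()]


# Demo in a temporary directory so no real output is touched.
with tempfile.TemporaryDirectory() as tmp:
    # Pretend seed 1 has already been run for eurosat.
    (Path(tmp) / "eurosat" / "seed1").mkdir(parents=True)
    seeds = selected_seeds({"SEEDS": "1 3"})
    todo = pending_runs(tmp, "eurosat", seeds)
    print(seeds, todo)  # [1, 3] [3]
```

Under this reading, an existing directory for a seed means that run is skipped, which is why a stale directory has to be removed or renamed before rerunning.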

📅 TODO

Release code.
Release model weights and corresponding log files.

Contact

If you have any questions, please open an issue on GitHub or contact me by email (nkucsmq[at]gmail.com).

Acknowledgements

This codebase builds on MMRL/MMRL++ and clip_text_span.

Citation

If you find our paper or repo helpful for your research, please consider citing our paper and giving this repo a star⭐. Thank you!

@inproceedings{ma2026alignednorm,
  title={AlignedNorm: Prompting Vision–Language Models via Coupled Prompt Field},
  author={Ma, Qi and Wang, Chen-Yang and Gao, Dehong and Fan, Deng-Ping},
  booktitle={ICML},
  year={2026}
}
