Claude/finetune flood detection zq leg #3874

Open
VIncentmuyi wants to merge 141 commits into open-mmlab:main from VIncentmuyi:claude/finetune-flood-detection-ZqLeg

Conversation

@VIncentmuyi

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

VIncentmuyi and others added 30 commits December 3, 2025 12:50
Modified the following config files, switching from 20000-iter training to 100-epoch training:
- Deeplabv3+UAVflood.py
- segformer_mit-b0_8xb1-160k_UAVflood-256x256.py
- Unet-Uavflood.py
- vit-Uavflood.py
- convnext-base-uavflood.py
- Swin-uavflood-256x256.py

Main changes:
1. param_scheduler: changed by_epoch from False to True; warmup now covers the first 5 epochs, with the main schedule running from epoch 5 to 100
2. train_cfg: switched from IterBasedTrainLoop to EpochBasedTrainLoop with max_epochs=100
3. default_hooks: made checkpoint and logger epoch-based, validating and saving a checkpoint every 10 epochs

https://claude.ai/code/session_01HTaghbFUmt7u1CcEGHvmpJ
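The iter-to-epoch switch described above can be sketched roughly as follows. Field names follow MMEngine conventions; the scheduler types and hyperparameter values here are illustrative assumptions, not copied from the PR's configs.

```python
# Hedged sketch of the iter->epoch training switch (MMEngine-style config).
# Exact scheduler types and values in the PR may differ.
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=100, val_interval=10)

param_scheduler = [
    # Linear warmup over the first 5 epochs.
    dict(type='LinearLR', start_factor=1e-6, by_epoch=True, begin=0, end=5),
    # Main schedule from epoch 5 to 100 (PolyLR chosen for illustration).
    dict(type='PolyLR', power=0.9, eta_min=0.0, by_epoch=True, begin=5, end=100),
]

default_hooks = dict(
    # Save a checkpoint every 10 epochs instead of every N iterations.
    checkpoint=dict(type='CheckpointHook', by_epoch=True, interval=10),
    logger=dict(type='LoggerHook', log_metric_by_epoch=True),
)
```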
Changes:
1. Removed the schedule_20k.py inheritance from _base_ in all 6 config files to avoid conflicts with the epoch-based training setup
2. Added the missing optim_wrapper config (SGD optimizer) to Deeplabv3+UAVflood.py
3. The other 5 files already define their own optim_wrapper, so no changes were needed

This keeps the config files fully self-contained and free of interference from the iter-based training settings.

https://claude.ai/code/session_01HTaghbFUmt7u1CcEGHvmpJ
…OSq'

# Conflicts:
#	configs/deeplabv3plus/Deeplabv3+UAVflood.py
…ampler

The training was not stopping at max_epochs=100 because InfiniteSampler
causes the dataset to loop infinitely. Changed to DefaultSampler to ensure
training stops correctly after 100 epochs and validation is triggered
at the configured intervals.

https://claude.ai/code/session_01BjJg5WLcsLWaZp3Vx2LV5f
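The sampler change described above amounts to a one-line config swap; a hedged sketch (assumed MMEngine field names, illustrative batch size):

```python
# InfiniteSampler never exhausts the dataset, so EpochBasedTrainLoop's
# epoch boundary is never reached. DefaultSampler yields exactly one
# pass over the dataset per epoch, so max_epochs=100 actually stops
# training and validation fires at the configured interval.
train_dataloader = dict(
    batch_size=8,  # illustrative value, not taken from the PR
    sampler=dict(type='DefaultSampler', shuffle=True),
)
```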
claude and others added 29 commits April 2, 2026 14:15
Same fix as the benchmark script: the model expects a list of 3-D tensors,
not a 4-D tensor with a batch dimension.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
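A minimal illustration of the shape fix above: unbind the 4-D batch into a list of 3-D (C, H, W) tensors before handing it to the model. The shapes are illustrative.

```python
import torch

# What the script passed: a single 4-D (N, C, H, W) batch tensor.
batch_4d = torch.randn(2, 3, 256, 256)

# What the model expects: a list of 3-D (C, H, W) tensors.
inputs = [img for img in batch_4d]
assert all(t.dim() == 3 for t in inputs)
```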
…ents-NHeaL

Fix visualize_expert_routing.py: dummy input should be 3D (C,H,W)
- Update MODAL_DISPLAY to new dataset names (UrbanSARFlood, FloodNet, GF-FloodNet)
- Only show y-axis dataset labels on the leftmost subplot in Fig 4a and 4b
  to prevent long names from overlapping with adjacent subplots

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
…ents-NHeaL

Fix overlapping y-axis labels in expert routing figures
For new flood event data that only contains one modality (e.g. RGB 3-band),
overrides FixedRatioModalSampler with DefaultSampler and sets filter_modality.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
…ents-NHeaL

Add single-modal fine-tuning config for generalization experiments
MMEngine config inheritance merges dicts recursively, so the old
FixedRatioModalSampler fields (modal_ratios, modal_order, etc.)
were leaking into the DefaultSampler. Using _delete_=True forces
a full replacement of the sampler dict.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
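The `_delete_=True` fix described above looks roughly like this in the child config (a sketch using MMEngine's documented merge semantics; field names assumed):

```python
# Without _delete_=True, MMEngine merges this dict into the inherited
# FixedRatioModalSampler dict, so stale keys (modal_ratios, modal_order,
# ...) leak into the DefaultSampler. _delete_=True tells the config
# loader to replace the inherited sampler dict wholesale.
train_dataloader = dict(
    sampler=dict(_delete_=True, type='DefaultSampler', shuffle=True),
)
```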
…ents-NHeaL

Fix sampler override with _delete_=True to prevent merge leak
Reads a large GeoTIFF, tiles it into patches with overlap, runs
multi-modal model inference, and stitches results into a full-size
GeoTIFF. Overlapping regions are averaged. Supports rasterio and GDAL.
Output: flood=red(255,0,0), non-flood=black(0,0,0), preserves CRS/transform.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
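The tile-and-stitch loop with overlap averaging can be sketched as below. `predict` is a hypothetical stand-in for running the model on one patch; the real script additionally reads/writes GeoTIFFs via rasterio or GDAL and preserves CRS/transform.

```python
import numpy as np

def stitch_tiles(h, w, tile, stride, predict):
    """Average-blend overlapping tile predictions into a full-size map.

    predict(y0, y1, x0, x1) -> (y1-y0, x1-x0) score array; a stand-in
    for model inference on one patch.
    """
    acc = np.zeros((h, w), dtype=np.float64)  # summed scores
    cnt = np.zeros((h, w), dtype=np.float64)  # coverage count
    for y0 in range(0, max(h - tile, 0) + 1, stride):
        for x0 in range(0, max(w - tile, 0) + 1, stride):
            y1, x1 = min(y0 + tile, h), min(x0 + tile, w)
            acc[y0:y1, x0:x1] += predict(y0, y1, x0, x1)
            cnt[y0:y1, x0:x1] += 1.0
    # Overlapping regions are averaged, matching the commit description.
    return acc / np.maximum(cnt, 1.0)
```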
…ents-NHeaL

Add large TIF tile-based inference script with stitching
Was using per-tile min-max normalization, but training uses ImageNet-style
mean/std normalization (MultiModalNormalize). This mismatch caused the
model to see completely different input distributions, producing mostly
non-flood predictions.

Fix: use the exact same mean/std values from MultiModalNormalize for
each modality (rgb, sar, GF). Also added normalization debug output.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
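The mismatch above is the difference between these two transforms. The mean/std below are the standard ImageNet values, used purely for illustration; the authoritative per-modality values live in MultiModalNormalize.NORM_CONFIGS.

```python
import numpy as np

# Illustrative ImageNet-style stats; the real values come from
# MultiModalNormalize.NORM_CONFIGS per modality (rgb, sar, GF).
mean = np.array([123.675, 116.28, 103.53])
std = np.array([58.395, 57.12, 57.375])

def normalize_like_training(img):
    # Matches the training pipeline: fixed mean/std, stable across tiles.
    return (img - mean) / std

def minmax_per_tile(img):
    # The buggy inference-time scaling: the output distribution depends
    # on each tile's own min/max, so the model sees unfamiliar inputs.
    return (img - img.min()) / (img.max() - img.min() + 1e-8)
```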
…ents-NHeaL

Fix critical normalization bug in large TIF inference
Fig 4a/4c/4d now load actual test images through the dataset pipeline
(with proper normalization), capturing true data-driven routing patterns.
Random noise only reflects modal_bias; real images show the combined
effect of input features + modal_bias on expert routing decisions.
Falls back to random noise if no images found for a modality.

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
…ents-NHeaL

Use real test images instead of random noise for routing analysis
- Skip LoadAnnotations (labels not needed for routing analysis)
- Use minimal pipeline: LoadImage -> Resize(256) -> Normalize -> Pack
- Set test_cfg to 'whole' mode (avoid slide_inference overhead)
- Remove seg_map_path dependency
- Print debug info on first pipeline failure

https://claude.ai/code/session_0135dmPP4TXG96XoRSSNvu5v
…ents-NHeaL

Fix routing visualization to properly load real test images
Introduce a Sen1Floods11Dataset that loads the S1Hand (2-band SAR) or
S2Hand (13-band Sentinel-2 MSI) subdirectories paired with LabelHand
masks. A companion LoadSen1Floods11Annotation transform decodes the
signed-int label TIFFs via tifffile and remaps the -1 nodata value to
the standard 255 ignore index so CrossEntropyLoss skips those pixels.

Wire the new 's1' (2ch) and 's2' (13ch) modalities into
MultiModalNormalize.NORM_CONFIGS with sensible defaults, and ship a
tools/compute_sen1floods11_stats.py helper that can recompute mean/std
from any split (nodata-masked) for users who want dataset-specific
statistics.

Two new finetune configs (finetune_sen1floods11_s1.py and
finetune_sen1floods11_s2.py) inherit the existing freeze-backbone /
retrain-stem-and-decoder recipe but override modal_configs,
training_modals, and dataset_names so the pretrained Swin body is
reused while the stem conv and decode head retrain from scratch for
the new sensor. Shape-mismatched modal-specific weights from the
pretrained ckpt are dropped by mmengine's strict=False loader.
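The nodata remap in the annotation transform described above is simple to sketch (hypothetical helper name; the real transform also handles TIFF decoding via tifffile):

```python
import numpy as np

def decode_label(label):
    """Remap Sen1Floods11's signed-int nodata value (-1) to the standard
    255 ignore index so CrossEntropyLoss skips those pixels."""
    label = label.astype(np.int64)
    label[label == -1] = 255
    return label.astype(np.uint8)
```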
…ion-training-SFpKO

Add Sen1Floods11 S1/S2 fine-tune configs and dataset
SparseDispatcher used torch.nonzero(gates) (which treats NaN as nonzero
since NaN != 0) to build _batch_index, but counted (gates > 0) (which
excludes NaN) for _part_sizes. When any gate became NaN, torch.split
crashed because the split sizes no longer summed to the dispatched
tensor. This surfaced while fine-tuning finetune_sen1floods11_s1.py:
the freshly-initialized 2-channel s1 patch embed can emit zero-norm
pooled features, and F.normalize(0) -> NaN propagated through the
softmax/scatter into gates.

Two fixes:
1. CosineTopKGate: pass eps=1e-6 to F.normalize so zero-norm vectors
   no longer produce NaN logits in the first place.
2. SparseDispatcher: sanitize gates with nan_to_num as a safety net,
   and use a single positive_mask for both _batch_index and
   _part_sizes so the two can never disagree again.
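Both fixes can be sketched in a few lines (illustrative tensors; the real code lives in CosineTopKGate and SparseDispatcher):

```python
import torch
import torch.nn.functional as F

# Fix 1: eps keeps zero-norm vectors from producing NaN logits.
x = torch.zeros(2, 8)                       # zero-norm pooled features
logits = F.normalize(x, dim=-1, eps=1e-6)   # 0 / max(0, eps) -> all zeros
assert torch.isfinite(logits).all()

# Fix 2: sanitize gates, then derive BOTH the batch index and the split
# sizes from one positive_mask so they can never disagree.
gates = torch.tensor([[0.7, 0.0], [float('nan'), 0.3]])
gates = torch.nan_to_num(gates)             # safety net: NaN -> 0
positive_mask = gates > 0
batch_index = positive_mask.nonzero()[:, 0]
part_sizes = positive_mask.sum(dim=0).tolist()
# torch.split(part_sizes) now always matches the dispatched tensor.
assert sum(part_sizes) == batch_index.numel()
```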
Sen1Floods11 S1Hand TIFFs (and many other SAR / MSI products) encode
nodata pixels as NaN or +/-Inf inside the raster. The pipeline loaded
them with tifffile and fed them straight into

    img = (img - mean) / std

which propagated NaN through the first conv, the Swin body, and the
CosineTopKGate, turning every loss term (CE head + MoE balance + aux
head) into NaN from the very first training step. During
finetune_sen1floods11_s1 the user saw loss: nan from epoch 1 and
Flood IoU pinned at 0 (the model only predicted Background).

tools/compute_sen1floods11_stats.py already masks out non-finite
S1Hand pixels when computing the stats, confirming this is a known
property of the source data; the runtime pipeline just wasn't doing
the same filtering.

Fix: in MultiModalNormalize.transform, replace non-finite pixels with
the per-channel mean (so they normalize to 0) before doing the actual
(img - mean) / std, and then clip the result to +/-10 sigma as a
safety net for products that use a finite sentinel like -9999 instead
of NaN. +/-10 sigma is well outside any legitimate value for the
rgb / sar / GF / s1 / s2 / multispectral configs, so real data is
untouched.
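The sanitize-then-normalize fix can be sketched as follows (hypothetical helper name; mean/std are per-channel (C,) arrays for an (H, W, C) image):

```python
import numpy as np

def sanitize_and_normalize(img, mean, std, clip_sigma=10.0):
    """Replace non-finite pixels with the per-channel mean (so they
    normalize to exactly 0), then clip to +/-clip_sigma as a safety net
    for finite sentinels like -9999. Sketch of the fix described above."""
    img = img.astype(np.float64).copy()
    bad = ~np.isfinite(img)                  # NaN / +Inf / -Inf pixels
    img = np.where(bad, np.broadcast_to(mean, img.shape), img)
    out = (img - mean) / std
    return np.clip(out, -clip_sigma, clip_sigma)
```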
Adds the missing end-to-end setup for fine-tuning the Swin+MoE backbone
on Sen1Floods11 with either S1Hand (2-band SAR) or S2Hand (13-band MSI)
against the shared LabelHand masks.

tools/setup_sen1floods11.py (new):
  One-shot setup that reads data/Sen1Floods11/LabelHand/, writes
  deterministic 70/15/15 train/val/test splits to
  data/Sen1Floods11/splits/{train,val,test}.txt (MD5-hashed basenames
  so re-running is idempotent), and then computes per-channel mean/std
  for s1 and s2 using only the training split - with NaN/Inf pixels
  and label == -1 pixels masked out, matching how MultiModalNormalize
  now sanitizes inputs at runtime. Prints a copy-pasteable
  NORM_CONFIGS snippet so the shipped defaults can be refreshed for
  the actual on-disk data.
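The MD5-hashed, idempotent split assignment can be sketched like this (hypothetical function name; the real script's bucketing may differ in detail):

```python
import hashlib

def assign_split(basename, ratios=(0.70, 0.15, 0.15)):
    """Deterministic 70/15/15 split from an MD5 hash of the tile
    basename: the same file always lands in the same split, so
    re-running the setup script is idempotent."""
    h = int(hashlib.md5(basename.encode()).hexdigest(), 16) % 10_000
    frac = h / 10_000
    if frac < ratios[0]:
        return 'train'
    if frac < ratios[0] + ratios[1]:
        return 'val'
    return 'test'
```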

configs/floodnet/finetune_sen1floods11_s1.py &
configs/floodnet/finetune_sen1floods11_s2.py:
  * Wire ann_file='splits/train.txt' / 'splits/val.txt' /
    'splits/test.txt' into the three dataloaders. Previously every
    dataloader scanned the full S1Hand / S2Hand directory, which meant
    train / val / test all saw the same tiles and any reported val
    mIoU was a training-set metric. BaseSegDataset._join_prefix
    resolves the ann_file relative to data_root, so no absolute paths
    in the config.
  * Drop RandomResize from the train pipeline. All Sen1Floods11 tiles
    are already 512x512; the inherited (2048, 512) + 0.5..2.0 range
    came from a FloodNet config and added no value for this dataset.
    New pipeline is Load -> LoadAnn -> RandomCrop(256) ->
    RandomFlip -> MultiModalNormalize -> MultiModalPad -> Pack, with
    LoadAnn before RandomCrop so cat_max_ratio=0.75 can reject
    all-background crops.
  * Val / test batch_size dropped to 1 (sliding-window inference on
    512x512 tiles spawns 9 crops per sample; batch 16 pushed 144
    crops through the network at once for no speed benefit).
  * Docstrings now explain the required setup order: run
    tools/setup_sen1floods11.py once to produce the splits (and
    optionally the stats), then launch tools/train.py.

Both configs share the same splits so S1 vs S2 results can be
compared on identical tile sets.
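A hedged fragment of the config changes above (transform names and crop sizes follow the commit description; other field values are assumptions, not copied from the PR):

```python
# Sketch of the fine-tune config wiring. ann_file is resolved relative
# to data_root by BaseSegDataset._join_prefix, so no absolute paths.
train_dataloader = dict(
    dataset=dict(
        ann_file='splits/train.txt',
        pipeline=[
            dict(type='LoadImage'),
            # LoadAnn BEFORE RandomCrop so cat_max_ratio can reject
            # all-background crops.
            dict(type='LoadSen1Floods11Annotation'),
            dict(type='RandomCrop', crop_size=(256, 256), cat_max_ratio=0.75),
            dict(type='RandomFlip', prob=0.5),
            dict(type='MultiModalNormalize'),
            dict(type='MultiModalPad', size=(256, 256)),
            dict(type='PackSegInputs'),
        ],
    ))
# Sliding-window inference on 512x512 tiles spawns 9 crops per sample,
# so batch_size=1 keeps only 9 crops in flight at once.
val_dataloader = dict(batch_size=1, dataset=dict(ann_file='splits/val.txt'))
```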
Remaps the two-color palette saved by SegVisualizationHook:
  black [0,0,0]     (Background) -> #7c7c7c [124,124,124]
  red   [255,0,0]   (Flood)      -> #000bc5 [0,11,197]

Supports in-place overwrite or --dst output directory,
single files, and recursive directory processing.

https://claude.ai/code/session_01GgpRbsrDW4KRenqK8x8pQf
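The palette remap above (RGB triples taken from the commit message) reduces to a per-color mask-and-assign; a minimal sketch with a hypothetical function name:

```python
import numpy as np

# Remap described in the commit: Background black -> #7c7c7c,
# Flood red -> #000bc5.
COLOR_MAP = {
    (0, 0, 0): (124, 124, 124),
    (255, 0, 0): (0, 11, 197),
}

def remap_colors(img):
    """Remap an (H, W, 3) uint8 prediction image. Masks are built
    against the original image, so remaps never chain."""
    out = img.copy()
    for src, dst in COLOR_MAP.items():
        mask = np.all(img == np.array(src, dtype=img.dtype), axis=-1)
        out[mask] = dst
    return out
```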
Problem: model predicts class 0/1 for ALL pixels including nodata
regions (label=-1). Nodata pixels falsely shown as Flood in output.

Training loss and IoUMetric already correctly ignore label=255 via
their respective ignore_index defaults, so metrics are unaffected.

Fix:
- SegVisualizationHook._mask_nodata(): sets pred=255 where gt==255
  before visualization. Called after evaluator.process() so metrics
  stay correct. _draw_sem_seg filters class>=num_classes, so nodata
  pixels are left uncolored (black).
- remap_pred_colors.py: add --label-dir to load GT TIFFs and paint
  nodata pixels a distinct color (default: white #ffffff), so they
  are visually separable from Background.

https://claude.ai/code/session_01GgpRbsrDW4KRenqK8x8pQf
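The core of the `_mask_nodata` fix is a single masked assignment; a sketch with a hypothetical standalone function:

```python
import numpy as np

def mask_nodata(pred, gt, ignore_index=255):
    """Set pred to the ignore index wherever the GT is nodata, so
    _draw_sem_seg (which skips class >= num_classes) leaves those
    pixels uncolored. Must run AFTER evaluator.process() so metrics
    are unaffected."""
    pred = pred.copy()
    pred[gt == ignore_index] = ignore_index
    return pred
```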
Per user preference, nodata and background share the same color.
Both render as black [0,0,0] after SegVisualizationHook._mask_nodata,
and both get remapped to #7c7c7c by the existing COLOR_MAP entry.
No GT-label loading needed.

https://claude.ai/code/session_01GgpRbsrDW4KRenqK8x8pQf
@CLAassistant

CLAassistant commented Apr 17, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ claude
❌ VIncentmuyi