Commit 9d4b561

Release coco-stuff-10k model
1 parent c570a7e commit 9d4b561

7 files changed: +67 -54 lines changed
segmentation/README.md

Lines changed: 18 additions & 18 deletions
@@ -114,18 +114,18 @@ Prepare datasets according to the [guidelines](https://github.com/open-mmlab/mms
114114
<br>
115115
<div>
116116

117-
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
118-
| :---------: | :------------: | :--------: | :-----------: | :-----: | :---: | :-----------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
119-
| UperNet | InternImage-T | 512x1024 | 82.58 / 83.40 | 59M | 1889G | [config](./configs/cityscapes/upernet_internimage_t_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_t_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_t_512x1024_160k_cityscapes.log.json) |
120-
| UperNet | InternImage-S | 512x1024 | 82.74 / 83.45 | 80M | 2035G | [config](./configs/cityscapes/upernet_internimage_s_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_s_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_s_512x1024_160k_cityscapes.log.json) |
121-
| UperNet | InternImage-B | 512x1024 | 83.18 / 83.97 | 128M | 2369G | [config](./configs/cityscapes/upernet_internimage_b_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_b_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_b_512x1024_160k_cityscapes.log.json) |
122-
| UperNet | InternImage-L | 512x1024 | 83.68 / 84.41 | 256M | 3234G | [config](./configs/cityscapes/upernet_internimage_l_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_l_512x1024_160k_cityscapes.log.json) |
123-
| UperNet\* | InternImage-L | 512x1024 | 85.94 / 86.22 | 256M | 3234G | [config](./configs/cityscapes/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.log.json) |
124-
| UperNet | InternImage-XL | 512x1024 | 83.62 / 84.28 | 368M | 4022G | [config](./configs/cityscapes/upernet_internimage_xl_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_xl_512x1024_160k_cityscapes.log.json) |
125-
| UperNet\* | InternImage-XL | 512x1024 | 86.20 / 86.42 | 368M | 4022G | [config](./configs/cityscapes/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.log.json) |
126-
| SegFormer\* | InternImage-L | 512x1024 | 85.16 / 85.67 | 220M | 1580G | [config](./configs/cityscapes/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.log.json) |
127-
| SegFormer\* | InternImage-XL | 512x1024 | 85.41 / 85.93 | 330M | 2364G | [config](./configs/cityscapes/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.log.json) |
128-
| Mask2Former | InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | [config](./configs/cityscapes/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.log.json) |
117+
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
118+
| :-----------: | :------------: | :--------: | :-----------: | :-----: | :---: | :--------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
119+
| UperNet | InternImage-T | 512x1024 | 82.58 / 83.40 | 59M | 1889G | [config](./configs/cityscapes/upernet_internimage_t_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_t_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_t_512x1024_160k_cityscapes.log.json) |
120+
| UperNet | InternImage-S | 512x1024 | 82.74 / 83.45 | 80M | 2035G | [config](./configs/cityscapes/upernet_internimage_s_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_s_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_s_512x1024_160k_cityscapes.log.json) |
121+
| UperNet | InternImage-B | 512x1024 | 83.18 / 83.97 | 128M | 2369G | [config](./configs/cityscapes/upernet_internimage_b_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_b_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_b_512x1024_160k_cityscapes.log.json) |
122+
| UperNet | InternImage-L | 512x1024 | 83.68 / 84.41 | 256M | 3234G | [config](./configs/cityscapes/upernet_internimage_l_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_l_512x1024_160k_cityscapes.log.json) |
123+
| UperNet\* | InternImage-L | 512x1024 | 85.94 / 86.22 | 256M | 3234G | [config](./configs/cityscapes/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_l_512x1024_160k_mapillary2cityscapes.log.json) |
124+
| UperNet | InternImage-XL | 512x1024 | 83.62 / 84.28 | 368M | 4022G | [config](./configs/cityscapes/upernet_internimage_xl_512x1024_160k_cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_160k_cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_xl_512x1024_160k_cityscapes.log.json) |
125+
| UperNet\* | InternImage-XL | 512x1024 | 86.20 / 86.42 | 368M | 4022G | [config](./configs/cityscapes/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/upernet_internimage_xl_512x1024_160k_mapillary2cityscapes.log.json) |
126+
| SegFormer\* | InternImage-L | 512x1024 | 85.16 / 85.67 | 220M | 1580G | [config](./configs/cityscapes/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/segformer_internimage_l_512x1024_160k_mapillary2cityscapes.log.json) |
127+
| SegFormer\* | InternImage-XL | 512x1024 | 85.41 / 85.93 | 330M | 2364G | [config](./configs/cityscapes/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/segformer_internimage_xl_512x1024_160k_mapillary2cityscapes.log.json) |
128+
| Mask2Former\* | InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | [config](./configs/cityscapes/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.log.json) |
129129

130130
\* denotes the model is trained using extra Mapillary dataset.
131131

@@ -138,9 +138,9 @@ Prepare datasets according to the [guidelines](https://github.com/open-mmlab/mms
138138
<br>
139139
<div>
140140

141-
| method | backbone | resolution | mIoU (ss) | #params | FLOPs | Config | Download |
142-
| :---------: | :-----------: | :--------: | :-------: | :-----: | :---: | :--------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
143-
| Mask2Former | InternImage-H | 896x896 | 52.6 | 1.31B | 4635G | [config](./configs/coco_stuff164k/mask2former_internimage_h_896_80k_cocostuff164k_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_896_80k_cocostuff164k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_896_80k_cocostuff164k.log.json) |
141+
| method | backbone | resolution | mIoU (ss) | #params | FLOPs | Config | Download |
142+
| :---------: | :-----------: | :--------: | :-------: | :-----: | :---: | :-----------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
143+
| Mask2Former | InternImage-H | 896x896 | 52.6 | 1.31B | 4635G | [config](./configs/coco_stuff164k/mask2former_internimage_h_896_80k_cocostuff164k.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_896_80k_cocostuff164k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_896_80k_cocostuff164k.log.json) |
144144

145145
</div>
146146

@@ -151,9 +151,9 @@ Prepare datasets according to the [guidelines](https://github.com/open-mmlab/mms
151151
<br>
152152
<div>
153153

154-
| method | backbone | resolution | mIoU (ss) | #params | FLOPs | Config | Download |
155-
| :---------: | :-----------: | :--------: | :-------: | :-----: | :---: | :------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
156-
| Mask2Former | InternImage-H | 896x896 | 52.6 | 1.31B | 4635G | [config](./configs/coco_stuff10k/mask2former_internimage_h_896_80k_cocostuff10k_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_896_80k_cocostuff10k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_896_80k_cocostuff10k.log.json) |
154+
| method | backbone | resolution | mIoU (ss) | #params | FLOPs | Config | Download |
155+
| :---------: | :-----------: | :--------: | :---------: | :-----: | :---: | :------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
156+
| Mask2Former | InternImage-H | 512x512 | 59.2 / 59.6 | 1.28B | 1528G | [config](./configs/coco_stuff10k/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.log.json) |
157157

158158
</div>
159159

segmentation/configs/cityscapes/README.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,6 @@ Mapillary 80k + Cityscapes (w/ coarse data) 160k
4141

4242
Mapillary 80k + Cityscapes (w/ coarse data) 80k
4343

44-
| backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
45-
| :-----------: | :--------: | :-----------: | :-----: | :---: | :----------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
46-
| InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | [config](./mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes_ss.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.log.json) |
44+
| backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
45+
| :-----------: | :--------: | :-----------: | :-----: | :---: | :-------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
46+
| InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | [config](./mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_1024x1024_80k_mapillary2cityscapes.log.json) |
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
# COCO-Stuff-10K
2+
3+
<!-- [ALGORITHM] -->
4+
5+
## Introduction
6+
7+
COCO-Stuff-10K is a dataset designed to enhance scene understanding tasks in computer vision by providing pixel-level annotations for both "things" (discrete objects with well-defined shapes, like cars and people) and "stuff" (amorphous background regions, such as grass and sky). This dataset augments 10,000 images from the original COCO dataset, offering detailed labels across 182 classes—91 "thing" classes and 91 "stuff" classes.
8+
9+
## Model Zoo
10+
11+
### Mask2Former + InternImage
12+
13+
| backbone | resolution | mIoU (ss/ms) | #param | FLOPs | Config | Download |
14+
| :-----------: | :--------: | :----------: | :----: | :---: | :-------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
15+
| InternImage-H | 512x512 | 59.2 / 59.6 | 1.28B | 1528G | [config](./mask2former_internimage_h_512_40k_cocostuff164k_to_10k.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/raw/main/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.log.json) |
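
After this commit, every Config link in the tables above shares its file stem with the matching `ckpt` and `log` links on Hugging Face. The sketch below relies on that observed pattern to derive both URLs from a config path. The helper name `artifact_urls` is hypothetical, and the naming rule is an assumption read off these tables rather than a documented guarantee of the repository.

```python
from pathlib import PurePosixPath

# Base URLs copied from the Download column of the tables in this commit.
HF_RESOLVE = "https://huggingface.co/OpenGVLab/InternImage/resolve/main"
HF_RAW = "https://huggingface.co/OpenGVLab/InternImage/raw/main"


def artifact_urls(config_path: str) -> dict:
    """Hypothetical helper: map a config path to its checkpoint and log URLs.

    Assumes the config file stem equals the checkpoint/log file stem, which
    holds for the rows shown in this commit but is not a documented rule.
    """
    stem = PurePosixPath(config_path).stem  # drop directories and the ".py" suffix
    return {
        "ckpt": f"{HF_RESOLVE}/{stem}.pth",
        "log": f"{HF_RAW}/{stem}.log.json",
    }


if __name__ == "__main__":
    urls = artifact_urls(
        "./configs/coco_stuff10k/mask2former_internimage_h_512_40k_cocostuff164k_to_10k.py"
    )
    print(urls["ckpt"])
    print(urls["log"])
```

The pre-rename configs ending in `_ss.py` would not have matched their checkpoint names under this rule, which is one practical effect of dropping that suffix here.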
