Commit 36e2bb1

add video for seg blob

Author: weilong.cwl
1 parent e6843ef commit 36e2bb1

2 files changed: 15 additions and 13 deletions

content/blog/ming-lite-omni-1_5-seg/index.md

Lines changed: 10 additions & 7 deletions
@@ -63,7 +63,7 @@ Given an instruction like “*segment the banana in the upper-right corner*”,

 The results were painful.

-![Struggling with Segmentation](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*acrPSp-7qM8AAAAAgCAAAAgAevzJAQ/original)
+![Struggling with Segmentation](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*2BAkRZ9WGTcAAAAAgCAAAAgAevzJAQ/original)

 On RefCOCO-val, our cIoU plateaued at **~16%**.

@@ -124,10 +124,10 @@ Against Qwen-Image and Nano Banana, our model:
 - Located small or occluded targets more reliably.
 - Produced boundaries that were visually and semantically aligned with instructions.

-![Segmentation Comparison 1](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*koynTZD5vO8AAAAAgDAAAAgAevzJAQ/original)
+![Segmentation Comparison 1](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*DwJpSZyoW-YAAAAAgJAAAAgAevzJAQ/original)
 *Our model (right) accurately locates and segments the target subject. Qwen-Image (second from left) fails to locate the correct target, while Nano-banana (third from left) fails to accurately segment the man's head and has loose boundary lines.*

-![Segmentation Comparison 2](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*5C7KTbk2WZ0AAAAAgBAAAAgAevzJAQ/original)
+![Segmentation Comparison 2](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*yL2MR7vLQdEAAAAAgEAAAAgAevzJAQ/original)
 *For the prompt "please segment the girl with red mask," our model (right) is precise. Qwen-Image (second from left) misses the feet, and Nano-banana (third from left) alters the subject's proportions.*

 During evaluation, thanks to the high consistency of non-edited regions in our model, we can directly derive the segmentation mask by calculating the difference between the edited result and the original image. The results show that our model's performance on segmentation is now on par with specialized vision models.
@@ -150,18 +150,18 @@ The beauty of this method is that it not only fixed the segmentation weakness bu

 Because the model has learned an unprecedented "respect for boundaries" through thousands of "precise coloring" exercises, this "muscle memory" for fine-grained control has transferred to all editing tasks. Our edit controllability score saw a significant jump from **7.69 to 8.12** across sub-tasks like background, color, and material changes.

-![Editing Controllability Comparison](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*7PgiRpiJyScAAAAAgCAAAAgAevzJAQ/original)
+![Editing Controllability Comparison](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*szjcQqQkC80AAAAAgIAAAAgAevzJAQ/original)
 *Prompt: "remove the bow tie of the man on the far right." Our model (right) precisely removes only the target bow tie while maintaining background consistency. Qwen (second from left) incorrectly removes multiple bow ties and introduces inconsistencies. Nano-banana (third from left) also struggles with consistency.*

 #### 3. Stronger ID Consistency

 A core challenge in portrait editing is maintaining identity. Our model excels here as well. Whether changing a hairstyle or adjusting an expression, the model skillfully preserves the person's core features.

-![ID Consistency Comparison](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*19ULQZrBWIAAAAAAd5AAAAgAevzJAQ/original)
+![ID Consistency Comparison](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*Tc2-RoAHys8AAAAAd9AAAAgAevzJAQ/original)
 *Top Row (Turn head): Our model (right) maintains ID and background consistency, unlike competitors. Middle Row (Smile): Our model (right) correctly follows the prompt while preserving ID, avoiding distortions seen in others. Bottom Row (Change background): Our model (right) excels at preserving the subject's ID and appearance during a background swap.*

-<!-- **See More Editing Consistency in Action:**
-![More Consistency Examples](placeholder: replace with your figure link here) -->
+**See More Editing Consistency in Action:**
+<video src="https://gw.alipayobjects.com/v/huamei_wp0xz6/afts/video/A*CcqdTbafkt8AAAAAgEAAAAgAevzJAQ" width="704px" height="740px" controls></video>

 ---

@@ -190,6 +190,9 @@ We suspect 3D understanding, video generation, and other domains have their own

 **And this is only the overture.**

+Try out our open-source model **Ming-lite-omni 1.5** on our [**GitHub Page / Demo Page**](https://github.com/inclusionAI/Ming/blob/main/cookbook.ipynb). Please star our repo if you like it!
+
+
 <!-- ---

 Try out our open-source model **Ming-lite-omni 1.5** on our [**GitHub Page / Demo Page**](placeholder: your GitHub/Demo link). Please star our repo if you like it!
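
Note on the evaluation step quoted in the hunk at line 124 above: the post says the segmentation mask is derived by differencing the edited result against the original image, and it reports cIoU on RefCOCO-val. The following is a minimal Python sketch of that idea, not the authors' evaluation code. The function names `mask_from_edit` and `ciou`, the per-channel threshold of 30, and the use of NumPy arrays are all assumptions made for illustration. cIoU is computed here in its usual cumulative form, total intersection over total union across samples.

```python
import numpy as np


def mask_from_edit(original, edited, threshold=30):
    """Return a boolean H x W mask of pixels that changed between two H x W x 3 images.

    `threshold` is a hypothetical tolerance for small colour drift in
    regions the model was not asked to edit.
    """
    # Cast to a signed type so the subtraction cannot wrap around on uint8 input.
    diff = np.abs(original.astype(np.int16) - edited.astype(np.int16))
    # A pixel counts as edited if any colour channel moved by more than `threshold`.
    return diff.max(axis=-1) > threshold


def ciou(pred_masks, gt_masks):
    """Cumulative IoU: summed intersection divided by summed union over all samples."""
    inter = 0
    union = 0
    for pred, gt in zip(pred_masks, gt_masks):
        inter += int(np.logical_and(pred, gt).sum())
        union += int(np.logical_or(pred, gt).sum())
    return inter / union if union else 0.0


# Toy example: the "edit" recolours a 2x2 block; the ground truth is a 2x3 block.
original = np.zeros((4, 4, 3), dtype=np.uint8)
edited = original.copy()
edited[1:3, 1:3] = (255, 0, 0)
pred = mask_from_edit(original, edited)
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, 1:4] = True
print(int(pred.sum()), round(ciou([pred], [gt]), 3))  # prints: 4 0.667
```

The sketch only works when unedited regions stay nearly pixel-identical, which is the property the post attributes to the model. A real evaluation would likely need a tuned threshold and some mask cleanup.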

content/blog/ming-lite-omni-1_5-seg/index.zh.md

Lines changed: 5 additions & 6 deletions
@@ -122,11 +122,8 @@ show_word_count: true

 ![ID Consistency Comparison](https://mdn.alipayobjects.com/huamei_wp0xz6/afts/img/A*19ULQZrBWIAAAAAAd5AAAAgAevzJAQ/original)

-<!-- **More consistency cases:**
-
-<video src="https://gw.alipayobjects.com/v/huamei_drbxn1/afts/video/TptZRJDixVUAAAAAhqAAAAgADkliAQFr" width="540px" height="800px" controls></video> -->
-
-<!-- ![More Consistency Examples](placeholder: replace with your figure link here) -->
+**More consistency cases:**
+<video src="https://gw.alipayobjects.com/v/huamei_wp0xz6/afts/video/A*CcqdTbafkt8AAAAAgEAAAAgAevzJAQ" width="704px" height="740px" controls></video>

 ---

@@ -148,4 +145,6 @@ show_word_count: true
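
 "Segmentation as editing" is only the first successful attempt. We believe that in broader domains such as 3D understanding and video generation, more such "catalysts" are still hidden, waiting for us to discover them.

-**AI's "left hand" and "right hand" have finally learned to high-five gracefully. And this is only the overture to the symphony.**
+**AI's "left hand" and "right hand" have finally learned to high-five gracefully. And this is only the overture to the symphony.**
+
+Try out the open-source **Ming-lite-omni 1.5**: [**GitHub Page / Demo Page**](https://github.com/inclusionAI/Ming/blob/main/cookbook.ipynb)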
