timescope #2961
Conversation
note: the HF Space renders in local md, but not on https://huggingface.co/new-blog. Is that normal? @andimarafioti
did an initial pass, super nice!
timescope.md
Outdated
- **Dataset**: [Apollo-LMMs/TimeScope](https://huggingface.co/datasets/Apollo-LMMs/TimeScope)
- **Leaderboard**: [Apollo-LMMs/TimeScope](https://huggingface.co/spaces/Apollo-LMMs/TimeScope)
- **Evaluation Framework**: [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval)
would be nice to end with a call to action below
This is great! I left some comments where I think we could still improve it but overall super happy with the blog :)
timescope-video-lmm-benchmark.md
Outdated
To kick things off, we ran TimeScope on a suite of leading vision-language models, from open-source favorites to juggernauts like Gemini 2.5 Pro. The results underscore the benchmark's value: even models with advertised long-context prowess struggle with authentic temporal tasks at scale. These findings reveal clear patterns, such as performance cliffs around certain durations and strengths in static retrieval versus weaknesses in motion analysis, and they pave the way for targeted improvements in model training. For detailed results and visualizations, check out our Hugging Face Space embedded above.
I would discuss the results in a bit more detail here. Something like: Gemini is pretty good, but on the temporal task no other model gets above 54, even at 20 minutes.
The current thumbnail is 1024×1024, so it will likely be resized and may appear cropped. The recommended dimensions are 1300×650.
https://github.com/huggingface/blog?tab=readme-ov-file#how-to-get-a-nice-responsive-thumbnail
niceee!
Co-authored-by: Andrés Marafioti <[email protected]>
Co-authored-by: Andrés Marafioti <[email protected]>
Co-authored-by: Sergio Paniego Blanco <[email protected]>
Co-authored-by: Andrés Marafioti <[email protected]>
Co-authored-by: Sergio Paniego Blanco <[email protected]>
Co-authored-by: Andrés Marafioti <[email protected]>
thanks a lot! 💗
- **Dataset**: [Apollo-LMMs/TimeScope](https://huggingface.co/datasets/Apollo-LMMs/TimeScope)
- **Leaderboard**: [Apollo-LMMs/TimeScope](https://huggingface.co/spaces/Apollo-LMMs/TimeScope)
- **Evaluation Framework**: [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval)
we need a conclusion and a call to action here overall
Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Merve Noyan <[email protected]>
Co-authored-by: Merve Noyan <[email protected]>
Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.
Preparing the Article
You're not quite done yet, though. Please make sure to follow this process (as documented here):
- Add metadata (such as authors) to your `md` file. You can also specify `guest` or `org` for the authors (see the sketch below).

Here is an example of a complete PR: #2382
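As a rough illustration of the author metadata mentioned above, the front matter of the article's `md` file could look something like this. The title, thumbnail path, and usernames are placeholders, and the exact field names should be checked against the repo README:

```yaml
---
title: "TimeScope"                           # placeholder title
thumbnail: /blog/assets/timescope/thumbnail.png  # placeholder asset path
authors:
- user: your-hf-username                     # Hugging Face team author
- user: coauthor-username
  guest: true                                # mark external co-authors as guests
  org: some-org                              # optionally attribute an organization
---
```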
Getting a Review
Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.
Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews (e.g., checking for proper metadata) rather than content reviews unless explicitly asked.