Update resource generation script to python #828

Dan-Flores · 2025-08-20T19:44:51Z

This PR rewrites the resource generation script in Python and integrates the convert_image_to_tensor.py script, as discussed in #820.

Additionally, the MP3 audio file generation command is moved to test/utils.py, similar to the other TestVideo/Audio definitions.

Note:
In testing, I observed that the generated reference resources differed slightly from the currently checked-in files, but the resulting tensors were the same between the shell and Python scripts, so it's likely a difference in FFmpeg.

NicolasHug

Thanks @Dan-Flores, I left a few comments below that should be easy to address, so I'll approve to unblock.

QQ regarding

In testing, I observed that the generated reference resources differed slightly from the currently checked-in files, but the resulting tensors were the same between the shell and python script, so it's likely a difference in FFmpeg.

Can you clarify how you observed that the generated reference resources were different? I'm a little confused because I would assume that in order to compare the reference resources, we'd be looking at the resulting tensors (which, as you noted, appear to be equal).

NicolasHug · 2025-08-26T08:55:53Z

test/generate_reference_resources.py

+
+
+def convert_image_to_tensor(image_path):
+    if not os.path.exists(image_path):


Here and everywhere else in this file, let's try to modernize the codebase and rely on pathlib and pathlib.Path functionalities, instead of os.*.

The comment above about using pathlib still stands, but I wonder whether the "if file exists" check makes sense? Shouldn't we expect the file to exist, and error if not?
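
A minimal sketch of how both suggestions could fit together, assuming hypothetical helper names `require_file` and `tensor_path_for` (neither is from the PR): the path is handled as a `pathlib.Path`, and a missing input raises instead of being skipped.

```python
from pathlib import Path


def require_file(path):
    """Return `path` as a Path, raising if it is not an existing file.

    Hypothetical helper for illustration; not part of the PR.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected input file does not exist: {path}")
    return path


def tensor_path_for(image_path):
    """Derive the .pt path next to the image.

    The naming rule (swap the final suffix for ".pt") is an assumption.
    """
    return require_file(image_path).with_suffix(".pt")
```

Raising early keeps a missing frame from silently turning into a missing reference tensor later on.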

Dan-Flores · Aug 26, 2025

There was one issue I encountered in the for loop over these arrays:

STREAMS = [0, 3]
FRAMES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 20, 25, 30, 35, 386, 387, 388, 389]

One of the stream indices does not have as many frames, which is why stream 3's checked-in frames go up to nasa_13013.mp4.stream3.frame000389.pt, but stream 0's checked-in frames end at nasa_13013.mp4.stream0.frame000035.pt.
In these cases, FFmpeg errors quietly, so the script neither creates nor deletes the frames:

[vost#0:0/bmp @ 0x145f23cb0] No filtered frames for output stream, trying to initialize anyway.
...
[out#0/image2 @ 0x145f23150] Output file is empty, nothing was encoded(check -ss / -t / -frames parameters if used)
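
One way to surface this kind of quiet failure is to check the output file after the command returns. The sketch below assumes FFmpeg can exit with code zero in this situation, and `run_and_require_output` is a hypothetical helper, not something in the PR.

```python
import subprocess
from pathlib import Path


def run_and_require_output(cmd, output_path):
    """Run `cmd` and fail loudly unless it produced a non-empty `output_path`.

    Hypothetical helper for illustration; not part of the PR.
    """
    output_path = Path(output_path)
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(cmd, check=True, capture_output=True)
    # A zero exit code alone is not enough here, so also inspect the file.
    if not output_path.is_file() or output_path.stat().st_size == 0:
        raise RuntimeError(f"Command produced no output: {output_path}")
    return output_path
```

Whether a missing frame should raise or simply be skipped for streams with fewer frames is a design choice left to the script.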

NicolasHug · 2025-08-26T08:57:27Z

test/generate_reference_resources.py

+    print(img_tensor.shape)
+    print(img_tensor.dtype)


I noticed they were present in the previous util, but they were probably debug leftovers that we can now remove?

NicolasHug · 2025-08-26T09:02:22Z

test/generate_reference_resources.py

+            "-q:v",
+            "2",
+            output_bmp,
+        ]


I haven't looked at the cmd definition in detail, but maybe there are opportunities to factor out some of the calls?
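
As a rough illustration of that kind of factoring, the argument list could be built in one place. `build_frame_extraction_cmd` is a hypothetical name, and every flag other than `-q:v 2` (which appears in the diff) is an assumption about what the script passes.

```python
def build_frame_extraction_cmd(video_path, stream_index, frame_index, output_bmp):
    """Build the ffmpeg argument list for extracting one frame as a BMP.

    Hypothetical helper; flags other than "-q:v 2" are assumptions.
    """
    return [
        "ffmpeg",
        "-y",  # overwrite the output file if it exists
        "-i", str(video_path),
        "-map", f"0:{stream_index}",  # pick one stream of the first input
        # Keep only the frame whose index is `frame_index`; the backslash
        # escapes the comma for ffmpeg's filter parser.
        "-vf", f"select=eq(n\\,{frame_index})",
        "-frames:v", "1",
        "-q:v", "2",
        str(output_bmp),
    ]
```

Returning a list rather than a shell string keeps the command compatible with `subprocess.run` without `shell=True`.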

scotts · 2025-08-26T13:22:31Z

test/generate_reference_resources.py

+# All rights reserved.
+#
+# This source code is licensed under the BSD-style license found in the
+# LICENSE file in the root directory of this source tree.


I think this copyright notice should be at the very top - although that might conflict with the #! line, so we should make sure it can still be invoked the intended way.

Dan-Flores · 2025-08-26T14:37:13Z

Can you clarify how you observed that the generated reference resources were different? I'm a little confused because I would assume that in order to compare the reference resources, we'd be looking at the resulting tensors (which, as you noted, appear to be equal)

I apologize for the confusing language, @NicolasHug. When I mentioned generated reference resources, I was referring to the tensors.
The tensors output by the existing shell script and the new Python script on my machine are identical to each other, but are slightly different from the tensors currently checked in. To confirm the difference was small, I ran a Python script with torch.load and torch.allclose, and a tolerance of 3 was enough to pass.
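
For readers unfamiliar with this kind of check, here is a plain-Python illustration of an elementwise absolute-tolerance comparison. It is not the script used above (which relied on torch.load and torch.allclose), and it assumes "a tolerance of 3" means an absolute tolerance on pixel values; `within_abs_tolerance` is a hypothetical name.

```python
def within_abs_tolerance(values_a, values_b, atol):
    """Return True if every pair of elements differs by at most `atol`.

    Illustration only; the thread used torch.allclose on loaded tensors.
    """
    if len(values_a) != len(values_b):
        raise ValueError("inputs must have the same number of elements")
    return all(abs(a - b) <= atol for a, b in zip(values_a, values_b))


# Flattened pixel values from two hypothetical frames.
checked_in = [10, 200, 255]
regenerated = [12, 197, 255]
print(within_abs_tolerance(checked_in, regenerated, atol=3))  # True
print(within_abs_tolerance(checked_in, regenerated, atol=2))  # False
```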

Daniel Flores added 4 commits August 20, 2025 11:16

move resources gen script to python

bbcae10

move mp3 generation to comment

e11273d

delete shell script

f234505

restore av1_video

80e1980

meta-cla bot added the CLA Signed label Aug 20, 2025

Daniel Flores added 3 commits August 20, 2025 15:51

update resource workflow to use py

aef1360

Use arg list, rename timestamp variables

8a22597

Delete convert_image_to_tensor script

6c96b2e

Dan-Flores marked this pull request as ready for review August 22, 2025 20:27

NicolasHug approved these changes Aug 26, 2025

scotts reviewed Aug 26, 2025

Daniel Flores added 3 commits August 26, 2025 16:01

reflect comments

9148c52

Factor out shared ffmpeg functions

390da5a

Update os to Path

de74ae7

Dan-Flores merged commit 6dc8d12 into pytorch:main Aug 27, 2025
57 of 58 checks passed

Dan-Flores deleted the generate_resources_in_python branch August 27, 2025 16:27
