Skip to content

Conversation

Dan-Flores
Copy link
Contributor

@Dan-Flores Dan-Flores commented Aug 20, 2025

This PR updates the resource generation script to be in python and integrates the convert_image_to_tensor.py script, as discussed in #820.

  • Additionally, the MP3 audio file generation command is moved to test/utils.py, similar to the other TestVideo/Audio definitions.

Note:
In testing, I observed that the generated reference resources differed slightly from the currently checked-in files, but the resulting tensors were the same between the shell and python script, so its likely a difference in FFmpeg.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 20, 2025
@Dan-Flores Dan-Flores marked this pull request as ready for review August 22, 2025 20:27
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Dan-Flores , I left a few comments below that should be easy to address, so I'll approve to unblock.

QQ regarding

In testing, I observed that the generated reference resources differed slightly from the currently checked-in files, but the resulting tensors were the same between the shell and python script, so its likely a difference in FFmpeg.

Can you clarify how you observed that the generated references resources were different? I'm a little confused because I would assume that in order to compare the reference resources, we'd be looking at the resulting tensors (which, as you noted, appear to be equal)



def convert_image_to_tensor(image_path):
if not os.path.exists(image_path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and everywhere else in this file, let's try to modernize the codebase and rely on pathlib and pathlib.Path functionalities, instead of os.*.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above about using pathlib still stands, but I wonder whether the "if file exists" check makes sense? Shouldn't we expect the file to exist, and error if not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was one issue I encountered in the for loop over these arrays:

STREAMS = [0, 3]
FRAMES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 15, 20, 25, 30, 35, 386, 387, 388, 389]

One stream index does not have as many frames, hence why stream 3's checked in frames go up to nasa_13013.mp4.stream3.frame000389.pt, but stream 0's checked in frames end at nasa_13013.mp4.stream0.frame000035.pt.
In these cases, FFmpeg errors quietly so the script does not create or delete the frames:

[vost#0:0/bmp @ 0x145f23cb0] No filtered frames for output stream, trying to initialize anyway.
...
[out#0/image2 @ 0x145f23150] Output file is empty, nothing was encoded(check -ss / -t / -frames parameters if used)

Comment on lines 27 to 28
print(img_tensor.shape)
print(img_tensor.dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed they were present in the previous util, but they were probably debug leftovers that we can now remove?

"-q:v",
"2",
output_bmp,
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the cmd definition in details, but maybe there are opportunities to factorize some of the calls?

# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this copyright notice should be at the very top - although that might conflict with the #! line, so we should make sure it can still be invoked the intended way.

@Dan-Flores
Copy link
Contributor Author

Can you clarify how you observed that the generated references resources were different? I'm a little confused because I would assume that in order to compare the reference resources, we'd be looking at the resulting tensors (which, as you noted, appear to be equal)

I apologize for the confusing language @NicolasHug, when I mentioned generated references resources, I was referring to the tensors.
The tensors output by the existing shell script and the new python script on my machine are identical with each other, but are slightly different from the tensors currently checked in. To confirm the difference was small, I ran a python script with torch.load and torch.allclose and a tolerance of 3 was enough to pass.

@Dan-Flores Dan-Flores merged commit 6dc8d12 into pytorch:main Aug 27, 2025
57 of 58 checks passed
@Dan-Flores Dan-Flores deleted the generate_resources_in_python branch August 27, 2025 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants