Question about merge_input function - Does it really include different resolutions? #231

@pldlgb

Hi, I have a question about the merge_input function in your code. Its docstring says:

def merge_input(self, sample, encoder_hidden_length, encoder_attention_mask):
    """
        Merge the input video with different resolutions into one sequence
        Sample: From low resolution to high resolution
    """

However, looking at the implementation, this function does not seem to handle different resolutions; it appears instead to incorporate historical frame information. Could you please clarify whether it really processes inputs of varying resolutions, or whether it only handles historical conditions from past frames?

Thank you for your time and for providing this project!
