
[FEATURE] Is there a mapping between filenames and data categories for the 'cyto' dataset? #1386

@Zhiyuan-Weng

Description


Before you fill out this form:

Did you review the FAQ?

Did you look through previous (open AND closed) issues posted on GH?

Now fill this form out completely:

Is your feature request related to a problem? Please describe.

Yes, I am unable to distinguish the specific data categories (e.g., neurons vs. non-microscopy images) within the cyto dataset because the filenames are sequential (e.g., 000_img.png to 539_img.png) and no metadata file is provided.

I attempted to infer the categories from the file index, assuming the files were grouped contiguously by type in the order described in the paper (e.g., 100 neurons, 216 fluorescent cells, etc.). However, the files do not appear to be strictly grouped this way. For instance, 439_img.png in the training set is a dual-channel fluorescent image, yet it appears late in the sequence among what seem to be single-channel or non-microscopy images.

Describe the solution you'd like

I would like to request a metadata file, a mapping list, or a clear description of the file ordering that links the filenames (e.g., 000_img.png) to the specific data sources/categories mentioned in the paper (e.g., "fluorescent images of cultured neurons", "non-microscopy images", etc.).
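To make the request concrete, here is a minimal sketch of what such a mapping file could look like and how we would consume it. The CSV layout, the column names (`filename`, `split`, `category`), the helper `load_category_map`, and the category labels are all my own placeholders, not anything that exists in the dataset today; the rows do not reflect the real category of any file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping file. The columns and the category labels are
# placeholders; the rows are NOT real assignments for these files.
EXAMPLE_CSV = """filename,split,category
000_img.png,train,placeholder_category_a
001_img.png,train,placeholder_category_a
439_img.png,train,placeholder_category_b
"""


def load_category_map(handle):
    """Group filenames by category from a CSV with a header row."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row["category"]].append(row["filename"])
    return dict(groups)


# In practice `handle` would be an open file; StringIO keeps the sketch self-contained.
groups = load_category_map(io.StringIO(EXAMPLE_CSV))
for category, filenames in groups.items():
    print(category, len(filenames))
```

Any equivalent format (JSON, a plain text list per category, or documented index ranges) would serve the same purpose.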

Describe alternatives you've considered

I have tried to manually inspect the images and classify them based on visual appearance and channel count (single vs. dual channel). However, this method is imprecise and inefficient for our research pipeline, which requires accurate categorization of the data types.
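For reference, the channel-count part of that manual check can be approximated with a heuristic like the one below. It is only a sketch: the function name `count_active_channels` and the `min_fraction` threshold are my own, and it assumes that unused channels are stored as (almost) all-zero planes, which I have not confirmed for every image. A grayscale image saved as three identical RGB planes would be reported as three channels, which is one reason this approach is imprecise.

```python
import numpy as np


def count_active_channels(image, min_fraction=0.01):
    """Count channels in which at least `min_fraction` of pixels are non-zero.

    Assumes an (H, W) or (H, W, C) array; only the first three channels
    are inspected so that an alpha plane, if present, is ignored.
    """
    arr = np.asarray(image)
    if arr.ndim == 2:
        arr = arr[..., np.newaxis]  # treat grayscale as one channel
    arr = arr[..., :3]

    active = 0
    for c in range(arr.shape[-1]):
        channel = arr[..., c]
        if np.count_nonzero(channel) >= min_fraction * channel.size:
            active += 1
    return active


# Synthetic example: signal in one channel only, the other two left empty.
demo = np.zeros((8, 8, 3), dtype=np.uint8)
demo[..., 1] = 50
print(count_active_channels(demo))
```

Even when this works, it only separates single-channel from dual-channel images; it cannot tell which source category an image came from, hence this request.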

Additional context

We are conducting further research based on the cyto dataset and need to analyze model performance across different image domains (e.g., biological vs. non-biological, fluorescent vs. brightfield). Access to the ground truth category mapping is essential for the integrity and accuracy of our subsequent experiments and analysis.

Labels: enhancement (New feature or request)