Collaborator

@mfs4rd mfs4rd commented Nov 3, 2025

Done

  • Updated the inputs (and documentation) of `inside-outside_classification.py` to default to the example movie used in the paper and to write output to the default figure folder
  • Updated the inside-outside code to use the new z-thresholding approach

ToDo

  • Need help updating the `pdm.lock` file

@mfs4rd mfs4rd requested review from pgarrison and smishra3 November 3, 2025 23:10
]]

df_io = io.load_inside_outside_classification()
df_io = df_io[df_io['Z']<27]
Collaborator

I tried to run this script and got an error on this line:

Traceback (most recent call last):
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Z'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/analysis_scripts/Analysis_tools.py", line 1558, in <module>
    run_all_analyses()
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/analysis_scripts/Analysis_tools.py", line 41, in run_all_analyses
    plot_inside_outside_migration_timing(df, FIGS_DIR, OUT_TYPE)
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/analysis_scripts/Analysis_tools.py", line 952, in plot_inside_outside_migration_timing
    dfio_merge = load_io_data(df)
                 ^^^^^^^^^^^^^^^^
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/analysis_scripts/Analysis_tools.py", line 91, in load_io_data
    df_io = df_io[df_io['Z']<27]
                  ~~~~~^^^^^
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'Z'

Collaborator Author

This is because the current manifest on quilt doesn't include the coordinates of the nuclei centroids. For testing, I temporarily changed the manifest to point to the copy currently on VAST, but that copy still needs to be uploaded:
/allen/aics/users/filip.sluzewski/Public_Repos/emt-data-analysis/resubmission_scripts/nuclei_localization/mesh_features-resegmentation.csv
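
Until the manifest is updated, one way to make this failure easier to diagnose is to check for the column before filtering. The following is only a minimal sketch, not the code in this PR: `filter_below_z` is a hypothetical helper name, the toy data frames stand in for the real manifest, and the threshold of 27 is copied from the diff line above.

```python
import pandas as pd

Z_MAX = 27  # threshold taken from the diff above (df['Z'] < 27)


def filter_below_z(df: pd.DataFrame, z_max: float = Z_MAX) -> pd.DataFrame:
    """Keep rows with Z below z_max, failing early if the column is missing.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    if "Z" not in df.columns:
        raise KeyError(
            "Column 'Z' is missing from the inside-outside manifest; "
            "the manifest may predate the nuclei centroid coordinates."
        )
    return df[df["Z"] < z_max]


# Toy frames standing in for the real manifest (values are made up).
new_manifest = pd.DataFrame({"Data ID": ["a", "a", "b"], "Z": [10, 30, 26]})
old_manifest = pd.DataFrame({"Data ID": ["a", "b"]})

filtered = filter_below_z(new_manifest)
print(len(filtered))  # number of rows with Z < 27
```

With a check like this, running against the old manifest would still raise a `KeyError`, but the message would point at the manifest rather than at pandas internals.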

This will generate the plots in the manuscript and store them in `results/figures` folder. The manifests used as inputs in this workflow are automatically downloaded from [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/) by default. The user can opt to also use local version of these manifests if they produced locally by running the scripts `Feature_extraction.py`, `Metric_computation.py` and `Nuclei_localization.py`. To use local version of the manifests, please set `load_from_aws=False` everywhere in the script `Analysis_plots.py`.
This will generate the plots in the manuscript and store them in `results/figures` folder. The manifests used as inputs in this workflow are automatically downloaded from [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/) by default.

## 5 - [Optional] 3D Example Rendering
Collaborator

Suggested change
## 5 - [Optional] 3D Example Rendering
## 5 - 3D Example Rendering

I was under the impression that all the steps are optional? Do the other steps depend on the results of previous steps?

Collaborator Author

Technically no, since all of the code pulls from the quilt dataset. But if users were to process their own data, each step would depend on the previous one.

Collaborator

Gotcha. If you want to make those relationships between steps explicit, I'd recommend writing more of an introduction at the top of the README.

Our goal for reproducibility for this repo is just that people can run our code on our data and produce the figures in the paper: if we want to try to support users running on their own data, there's a lot more work we have to do.


## 5 - [Optional] 3D Example Rendering

The functions in `EMT_data_analysis/figure_generation` can be used to generate 3D renderings shown in the paper. Functions have only been tested on Ubuntu 18.04/22.04
Collaborator

Suggested change
The functions in `EMT_data_analysis/figure_generation` can be used to generate 3D renderings shown in the paper. Functions have only been tested on Ubuntu 18.04/22.04
The functions in `EMT_data_analysis/figure_generation` can be used to generate 3D renderings shown in the paper.

At the top of the readme we already said our code was tested on 18.04. If some of the code doesn't work on 18.04 and needs 22.04, that's a different thing. The top of the readme specifies 18.04.2 though, and our machines have upgraded to 18.04.4 and 18.04.6, so we should update that to be accurate to how we are testing.

Collaborator Author

The code will work on both 18.04 and 22.04, since the resubmission data was processed on the A100 machines. I don't think we need to go as far as listing a specific sub-version of Ubuntu, though.

Collaborator

Unless we test all the code in this repo on 22.04, I think it's easier for users to understand "all the EMT_data_analysis code was tested on 18.04" than "all the EMT_data_analysis code was tested on 18.04, and also some of the code was tested on 22.04, too."

df_meta = df_meta[df_meta['Data ID'] == data_id]
df = io.load_inside_outside_classification()
df = df[df['Data ID'] == data_id]
df = df[df['Z']<27]
Collaborator

When I try to run this script it hits an error here.

$ pdm run EMT_data_analysis/figure_generation/inside-outside_classification.py 
Total number of movies in the dataset: 3491
Traceback (most recent call last):
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Z'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/figure_generation/inside-outside_classification.py", line 165, in <module>
    main(args.data_id, args.output)
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/figure_generation/inside-outside_classification.py", line 50, in main
    df = df[df['Z']<27]
            ~~^^^^^
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'Z'

Collaborator Author

Same cause as the earlier error: the quilt manifest needs to be updated to include the `Z` column.

```bash
python colony_mask.py --data_id [Optional] --output_directory [Optional]
```
If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/3D_all_cells_mask`.
Collaborator

Suggested change
If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/3D_all_cells_mask`.
If no input arguments are provided (i.e., `python EMT_data_analysis/figure_generation/colony_mask.py`), the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/3D_all_cells_mask`.

Collaborator Author

"Input argument" here means `--data_id` or `--output_directory`.

Collaborator

@pgarrison pgarrison Nov 7, 2025

Yes, my suggestion was meant to make that clear for users who might not understand it. For example, I could imagine someone leaving out the "[Optional]" pieces and running `python colony_mask.py --data_id --output_directory`.
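
To illustrate the concern, here is a minimal `argparse` sketch of two optional flags with defaults. It assumes the flag names from the README text (`--data_id`, `--output_directory`); the default values are placeholders chosen for illustration, and the real script's parser may be defined differently.

```python
import argparse

# Placeholder defaults for illustration only; the real script defines its own.
DEFAULT_DATA_ID = "example_movie"
DEFAULT_OUTPUT_DIR = "EMT_data_analysis/results/3D_all_cells_mask"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where both flags are optional and fall back to defaults."""
    parser = argparse.ArgumentParser(description="Render the all-cells mask.")
    parser.add_argument(
        "--data_id",
        default=DEFAULT_DATA_ID,
        help="ID of the movie to render (defaults to the example movie).",
    )
    parser.add_argument(
        "--output_directory",
        default=DEFAULT_OUTPUT_DIR,
        help="Folder where the renderings are written.",
    )
    return parser


parser = build_parser()

# No arguments at all: both defaults apply.
no_args = parser.parse_args([])
print(no_args.data_id, no_args.output_directory)

# Only one flag given, with a value: the other keeps its default.
with_id = parser.parse_args(["--data_id", "my_movie"])
print(with_id.data_id, with_id.output_directory)
```

In a parser like this, passing a flag without a value (for example `--data_id --output_directory`) makes `argparse` exit with an "expected one argument" error, which is the confusion that spelling out the bare command in the README would avoid.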

```bash
python inside-outside_classification.py --data_id [Optional] --output_directory [Optional]
```
If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/Inside-Outside/mesh-figures`.
Collaborator

Suggested change
If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/Inside-Outside/mesh-figures`.
If no input arguments are provided (i.e., `python EMT_data_analysis/figure_generation/inside-outside_classification.py`), the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/Inside-Outside/mesh-figures`.

Collaborator Author

"Input argument" here means `--data_id` or `--output_directory`.

### Inside-Outside Classification
run
```bash
python inside-outside_classification.py --data_id [Optional] --output_directory [Optional]
```
Collaborator

Suggested change
python inside-outside_classification.py --data_id [Optional] --output_directory [Optional]
python EMT_data_analysis/figure_generation/inside-outside_classification.py --data_id [Optional] --output_directory [Optional]

Collaborator Author

Users should already be in `EMT_data_analysis/figure_generation/`.

### All Cells Mask
run
```bash
python colony_mask.py --data_id [Optional] --output_directory [Optional]
```
Collaborator

Suggested change
python colony_mask.py --data_id [Optional] --output_directory [Optional]
python EMT_data_analysis/figure_generation/colony_mask.py --data_id [Optional] --output_directory [Optional]

Collaborator Author

Users should already be in `EMT_data_analysis/figure_generation/`.

Collaborator

@pgarrison pgarrison Nov 7, 2025

> users should be in EMT_data_analysis/figure_generation/ already

In that case, the instructions should specify that. However, I think it's simpler (fewer steps for the user) if we make all the instructions work from the top level of the repo. (This is also an issue with the instructions for the previous steps; I can make a PR for that.)
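
Independently of which directory the README tells users to start from, a script can compute its default output folder from its own location, so the result does not depend on the current working directory. The sketch below is an assumption about how that could look, not the repository's code: `default_output_dir` is a hypothetical helper, and the folder layout is taken from the paths quoted in this thread.

```python
from pathlib import Path


def default_output_dir(script_path: str, subfolder: str) -> Path:
    """Return <package>/results/<subfolder> for a script in <package>/<subdir>/.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    # parents[0] is the script's folder (e.g. figure_generation),
    # parents[1] is the package folder (e.g. EMT_data_analysis).
    package_root = Path(script_path).resolve().parents[1]
    return package_root / "results" / subfolder


# Inside a real script this would be called as default_output_dir(__file__, ...).
out = default_output_dir(
    "EMT_data_analysis/figure_generation/colony_mask.py", "3D_all_cells_mask"
)
print(out)
```

Because `__file__` points at the script itself, a default built this way would be the same whether the script is launched from the top level of the repo or from inside `EMT_data_analysis/figure_generation/`.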

mfs4rd and others added 2 commits November 7, 2025 12:56
Co-authored-by: Philip Garrison <[email protected]>
Co-authored-by: Philip Garrison <[email protected]>
```bash
python colony_mask.py --data_id [Optional] --output_directory [Optional]
```
If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/3D_all_cells_mask`.
Collaborator

I ran `colony_mask.py` but did not get a `3D_all_cells_mask` folder:

$ pdm run EMT_data_analysis/figure_generation/colony_mask.py                                  
/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/tools/io.py:20: DtypeWarning: Columns (0,2,4,6,7,13,18,19,20,21,22,24,25,26,28,32,40,41,46,47,48,54,55,58,68,71,72,79,80,83,85,86,88,90,93) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv(path)
Total number of movies in the dataset: 3491
/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/bioio_ome_zarr/reader.py:87: UserWarning: Warning: reading from S3 without fs_kwargs. Consider providing fs_kwargs (e.g., {'anon': True} for public S3) to ensure accurate reading.
  warnings.warn(
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [06:56<00:00, 104.02s/it]
/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/tools/io.py:20: DtypeWarning: Columns (0,2,4,6,7,13,18,19,20,21,22,24,25,26,28,32,40,41,46,47,48,54,55,58,68,71,72,79,80,83,85,86,88,90,93) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv(path)
Total number of movies in the dataset: 3491
/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/bioio_ome_zarr/reader.py:87: UserWarning: Warning: reading from S3 without fs_kwargs. Consider providing fs_kwargs (e.g., {'anon': True} for public S3) to ensure accurate reading.
  warnings.warn(
100%|████████████████████████████████████████████████████████████████| 4/4 [06:58<00:00, 104.65s/it]
$ ls EMT_data_analysis/results/                                   
feature_extraction  figures  metric_computation  nuclei_localization

Collaborator Author

We need to correct either the code or the README. Right now the outputs are saved in `figures/3D Renders`.


3 participants