Re sub/august updatefig #30
base: reSub-v1
Changes from all commits
a4c3b4b
cbe3b98
2ffc720
fa90639
c341338
07ae553
73345f3
d637859
affd91f
44e74e7
40d5e90
7a5eae7
9702ca4
af2f2d2
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -171,4 +171,7 @@ cython_debug/ | |
| pdm.toml | ||
|
|
||
| #csv | ||
| *.csv | ||
| *.csv | ||
|
|
||
| #mesh temporary directory | ||
| emt_tmp/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,38 +11,43 @@ | |
| import pandas as pd | ||
| import argparse | ||
| import quilt3 as q3 | ||
| from typing import Optional | ||
|
|
||
| from EMT_data_analysis.tools import alignment, io | ||
| from EMT_data_analysis.tools import alignment, io, const | ||
|
|
||
|
|
||
| def main( | ||
| data_id: str, | ||
| output: str | ||
| data_id: Optional[str]=None, | ||
| output: Optional[str]=None | ||
| ): | ||
| ''' | ||
| Generate three figures for the inside-outside classification of nuclei | ||
| at 0, 16, and 32 hours. | ||
| Parameters | ||
| ---------- | ||
| mesh_fn: str | ||
| Path to the .vtm file for the whole colony timelapse. | ||
| mid: str | ||
| data_id: str | ||
| Data ID of the movie. | ||
| data_csv: str | ||
| Path to the CSV file containing the inside-outside classification data. | ||
| output: str | ||
| Path to the output directory where the figures will be saved. | ||
| ''' | ||
| # ensure output directory exists | ||
| output = Path(output) | ||
| output.mkdir(exist_ok=True, parents=True) | ||
|
|
||
| if data_id is None: | ||
| data_id = const.EXAMPLE_IO_ID | ||
|
|
||
| if output is None: | ||
| output = io.setup_base_directory_name("figures/Inside-Outside/mesh-figures") | ||
| else: | ||
| output = Path(output) | ||
| output.mkdir(exist_ok=True, parents=True) | ||
|
|
||
| # load data | ||
| df_meta = io.load_imaging_and_segmentation_dataset() | ||
| df_meta = df_meta[df_meta['Data ID'] == data_id] | ||
| df = io.load_inside_outside_classification() | ||
| df = df[df['Data ID'] == data_id] | ||
| df = df[df['Z']<27] | ||
|
Collaborator
When I try to run this script it hits an error here.
$ pdm run EMT_data_analysis/figure_generation/inside-outside_classification.py
Total number of movies in the dataset: 3491
Traceback (most recent call last):
File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
return self._engine.get_loc(casted_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Z'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/figure_generation/inside-outside_classification.py", line 165, in <module>
main(args.data_id, args.output)
File "/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/figure_generation/inside-outside_classification.py", line 50, in main
df = df[df['Z']<27]
~~^^^^^
File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/frame.py", line 4102, in __getitem__
indexer = self.columns.get_loc(key)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
raise KeyError(key) from err
KeyError: 'Z'
Collaborator
Author
Same as the previous similar error: the quilt manifest needs to be updated to include this column. |
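The `KeyError: 'Z'` above only surfaces deep inside pandas. One possible way to fail earlier with a clearer message is to check for the expected columns before filtering. This is a minimal sketch, not code from the repository: `require_columns` is a hypothetical helper, and only the column names `Data ID` and `Z` come from the script under review.

```python
from typing import Iterable, List


def require_columns(available: Iterable[str], required: Iterable[str]) -> None:
    """Raise a descriptive error if any required column is missing.

    Hypothetical helper for illustration; not part of EMT_data_analysis.
    """
    available_set = set(available)
    missing: List[str] = [name for name in required if name not in available_set]
    if missing:
        raise KeyError(
            f"Manifest is missing required column(s): {missing}. "
            "The manifest may need to be regenerated or re-uploaded."
        )


# Intended use with a pandas DataFrame (assumption: df is the loaded manifest):
#     require_columns(df.columns, ["Data ID", "Z"])
#     df = df[df["Z"] < 27]
```

With a check like this, an out-of-date manifest would be reported by name rather than as a bare pandas lookup failure.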
||
|
|
||
| tmp_dir = Path("./emt_tmp/nuclei_localization/") | ||
| tmp_dir.mkdir(exist_ok=True, parents=True) | ||
|
|
@@ -146,14 +151,12 @@ def create_nucleus_mesh(df_nucleus: pd.DataFrame): | |
| parser = argparse.ArgumentParser(description='Generate figures for inside-outside classification of nuclei.') | ||
| parser.add_argument( | ||
| '--data_id', | ||
| type=str, | ||
| default='3500005828_45', | ||
| help='FMS ID of the movie.' | ||
| type=str, | ||
| help='Data ID of the movie.' | ||
| ) | ||
| parser.add_argument( | ||
| '--output', | ||
| type=str, | ||
| required=True, | ||
| help='Path to the output directory where the figures will be saved.' | ||
| ) | ||
|
|
||
|
|
||
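The change above makes both command-line arguments optional and resolves the defaults inside `main`. A self-contained sketch of that pattern follows. The two default values are placeholders chosen for illustration; in the repository they come from `const.EXAMPLE_IO_ID` and `io.setup_base_directory_name`, whose actual values and behavior are not shown in this diff. `resolve_args` and `parse` are hypothetical names.

```python
import argparse
from pathlib import Path
from typing import List, Optional, Tuple

# Placeholder defaults for illustration only; the real ones live in
# EMT_data_analysis (const.EXAMPLE_IO_ID, io.setup_base_directory_name).
DEFAULT_DATA_ID = "example-data-id"
DEFAULT_OUTPUT = Path("results") / "mesh-figures"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Generate figures for inside-outside classification of nuclei."
    )
    # default=None lets the caller tell "not provided" from an explicit value.
    parser.add_argument("--data_id", type=str, default=None,
                        help="Data ID of the movie.")
    parser.add_argument("--output", type=str, default=None,
                        help="Path to the output directory for the figures.")
    return parser


def resolve_args(data_id: Optional[str], output: Optional[str]) -> Tuple[str, Path]:
    """Apply fallbacks for arguments that were not provided."""
    resolved_id = DEFAULT_DATA_ID if data_id is None else data_id
    resolved_output = DEFAULT_OUTPUT if output is None else Path(output)
    return resolved_id, resolved_output


def parse(argv: Optional[List[str]] = None) -> Tuple[str, Path]:
    """Parse argv (or sys.argv when argv is None) and resolve defaults."""
    args = build_parser().parse_args(argv)
    return resolve_args(args.data_id, args.output)
```

The sketch deliberately stops before touching the filesystem; a real script would follow with `output.mkdir(exist_ok=True, parents=True)` as the diff does.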
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -44,7 +44,35 @@ This will generate CSV for individual nuclei classified as inside the basement m | |||||
|
|
||||||
| Run: `python Analysis_tools.py` | ||||||
|
|
||||||
| This will generate the plots in the manuscript and store them in `results/figures` folder. The manifests used as inputs in this workflow are automatically downloaded from [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/) by default. The user can opt to also use local version of these manifests if they produced locally by running the scripts `Feature_extraction.py`, `Metric_computation.py` and `Nuclei_localization.py`. To use local version of the manifests, please set `load_from_aws=False` everywhere in the script `Analysis_plots.py`. | ||||||
| This will generate the plots in the manuscript and store them in the `results/figures` folder. The manifests used as inputs in this workflow are automatically downloaded from [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/) by default. | ||||||
|
|
||||||
| ## 5 - [Optional] 3D Example Rendering | ||||||
|
Collaborator
Suggested change
I was under the impression that all the steps are optional? Do the other steps depend on the results of previous steps?
Collaborator
Author
Technically no, as all of the code pulls from the quilt dataset, but if the user were to process their own data, each step would depend on the previous one.
Collaborator
Gotcha. If you want to make those relationships between steps explicit, I'd recommend writing more of an introduction at the top of the README. Our goal for reproducibility for this repo is just that people can run our code on our data and produce the figures in the paper: if we want to try to support users running on their own data, there's a lot more work we have to do. |
||||||
|
|
||||||
| The functions in `EMT_data_analysis/figure_generation` can be used to generate 3D renderings shown in the paper. Functions have only been tested on Ubuntu 18.04/22.04 | ||||||
|
Collaborator
Suggested change
At the top of the readme we already said our code was tested on 18.04. If some of the code doesn't work on 18.04 and needs 22.04, that's a different thing. The top of the readme specifies 18.04.2 though, and our machines have upgraded to 18.04.4 and 18.04.6, so we should update that to be accurate to how we are testing.
Collaborator
Author
The code will work on both
Collaborator
Unless we test all the code in this repo on 22.04, I think it's easier for users to understand "all the EMT_data_analysis code was tested on 18.04" than "all the EMT_data_analysis code was tested on 18.04, and also some of the code was tested on 22.04, too." |
||||||
|
|
||||||
| On Ubuntu or Debian: | ||||||
| ```bash | ||||||
| sudo apt-get install xvfb libgl1-mesa-glx | ||||||
| ``` | ||||||
| On Windows: | ||||||
| Comment out any instance of `pv.start_xvfb()` in the code before running. | ||||||
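Instead of asking Windows users to comment out `pv.start_xvfb()` by hand, the call could be guarded by a platform check. The following is only a sketch under stated assumptions: `should_start_xvfb` is a hypothetical helper, it treats every Linux system as one that needs Xvfb (which may not hold on a desktop with a display, or where Xvfb is not installed), and `pv` is assumed to be `pyvista`, as in the existing scripts.

```python
import platform
from typing import Optional


def should_start_xvfb(system: Optional[str] = None) -> bool:
    """Return True only when running on Linux.

    Hypothetical helper; `system` can be passed explicitly to make the
    decision testable, otherwise platform.system() is used.
    """
    name = platform.system() if system is None else system
    return name == "Linux"


# Intended use in a rendering script (assumes `import pyvista as pv`):
#     if should_start_xvfb():
#         pv.start_xvfb()
```

This keeps a single code path for all platforms, so the README would no longer need a Windows-specific editing step.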
|
|
||||||
| ### All Cells Mask | ||||||
| Run | ||||||
| ```bash | ||||||
| python colony_mask.py --data_id [Optional] --output_directory [Optional] | ||||||
|
Collaborator
Suggested change
Collaborator
Author
users should be in
Collaborator
In that case, the instructions should specify that. However, I think it's simpler (fewer steps for the user) if we make all the instructions work from the top level of the repo. (This is also an issue with the instructions for the previous steps; I can make a PR for that.) |
||||||
| ``` | ||||||
| If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/3D_all_cells_mask`. | ||||||
|
Collaborator
Suggested change
Collaborator
Author
input argument means
Collaborator
Yes, my suggestion was to provide clarity for users who might not understand that. For example, I could imagine someone leaving out the "[Optional]" pieces and running
Collaborator
I ran
$ pdm run EMT_data_analysis/figure_generation/colony_mask.py
/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/tools/io.py:20: DtypeWarning: Columns (0,2,4,6,7,13,18,19,20,21,22,24,25,26,28,32,40,41,46,47,48,54,55,58,68,71,72,79,80,83,85,86,88,90,93) have mixed types. Specify dtype option on import or set low_memory=False.
df = pd.read_csv(path)
Total number of movies in the dataset: 3491
/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/bioio_ome_zarr/reader.py:87: UserWarning: Warning: reading from S3 without fs_kwargs. Consider providing fs_kwargs (e.g., {'anon': True} for public S3) to ensure accurate reading.
warnings.warn(
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [06:56<00:00, 104.02s/it]
/home/philip.garrison/workspace/aics/EMT_data_analysis/EMT_data_analysis/tools/io.py:20: DtypeWarning: Columns (0,2,4,6,7,13,18,19,20,21,22,24,25,26,28,32,40,41,46,47,48,54,55,58,68,71,72,79,80,83,85,86,88,90,93) have mixed types. Specify dtype option on import or set low_memory=False.
df = pd.read_csv(path)
Total number of movies in the dataset: 3491
/home/philip.garrison/workspace/aics/EMT_data_analysis/.venv/lib/python3.11/site-packages/bioio_ome_zarr/reader.py:87: UserWarning: Warning: reading from S3 without fs_kwargs. Consider providing fs_kwargs (e.g., {'anon': True} for public S3) to ensure accurate reading.
warnings.warn(
100%|████████████████████████████████████████████████████████████████| 4/4 [06:58<00:00, 104.65s/it]
$ ls EMT_data_analysis/results/
feature_extraction figures metric_computation nuclei_localization
Collaborator
Author
need to correct the code or readme. Right now they would be saved in |
||||||
| Data ID values are only valid inputs if they have a non-empty value for `All Cells Mask File Download` in the `image_and_segmentation_data.csv` manifest on [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/). | ||||||
|
|
||||||
| ### Inside-Outside Classification | ||||||
| Run | ||||||
| ```bash | ||||||
| python inside-outside_classification.py --data_id [Optional] --output [Optional] | ||||||
|
Collaborator
Suggested change
Collaborator
Author
users should be in |
||||||
| ``` | ||||||
| If no input arguments are provided, the code will default to the data shown in the paper and output results to `EMT_data_analysis/results/Inside-Outside/mesh-figures`. | ||||||
|
Collaborator
Suggested change
Collaborator
Author
input argument means |
||||||
| Data ID values are only valid inputs if they have a non-empty value for `CollagenIV Segmentation Mesh Folder` in the `image_and_segmentation_data.csv` manifest on [AWS](https://open.quiltdata.com/b/allencell/tree/aics/emt_timelapse_dataset/manifests/). | ||||||
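The validity rule stated for both scripts (a Data ID is usable only if a particular manifest column is non-empty) can be checked before running anything. Below is a minimal sketch using only the standard library. `valid_data_ids` is a hypothetical helper, and the sample rows are made up for illustration; only the column names `Data ID`, `All Cells Mask File Download`, and `CollagenIV Segmentation Mesh Folder` come from the README text.

```python
import csv
import io
from typing import List, TextIO


def valid_data_ids(manifest: TextIO, required_column: str) -> List[str]:
    """Return the Data IDs whose `required_column` entry is non-empty.

    Hypothetical helper for illustration; not part of EMT_data_analysis.
    """
    reader = csv.DictReader(manifest)
    ids: List[str] = []
    for row in reader:
        # Treat a missing column, an empty cell, or whitespace as "empty".
        value = (row.get(required_column) or "").strip()
        if value:
            ids.append(row["Data ID"])
    return ids


# Tiny made-up manifest for illustration only; not real dataset values.
SAMPLE = io.StringIO(
    "Data ID,All Cells Mask File Download\n"
    "movie_a,https://example.org/a\n"
    "movie_b,\n"
)
print(valid_data_ids(SAMPLE, "All Cells Mask File Download"))  # ['movie_a']
```

Passing `"CollagenIV Segmentation Mesh Folder"` as `required_column` would give the corresponding list for the inside-outside classification script.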
|
|
||||||
|
|
||||||
| # Contact | ||||||
| If you have questions about this code, please reach out to us at [email protected]. | ||||||
|
|
||||||
I tried to run this script and got an error on this line?
This is because the current manifest on quilt doesn't have the coordinates of nuclei centroids. For testing I temporarily changed the manifest to point to the copy currently on VAST but it needs to be uploaded
/allen/aics/users/filip.sluzewski/Public_Repos/emt-data-analysis/resubmission_scripts/nuclei_localization/mesh_features-resegmentation.csv