Loading and alignment of P1-P4 proteins: p1_p4.ipynb
Extracting segments: translocation_finder.py 
Segmentation of PASTORs into VRs and YY dips, and featurization of VRs: YetAnotherYYSegmenter.ipynb 
Sequence -> signal model: squiggler.py 
Bayesian-based segmentation algorithm: chunkation.py 
ClpX stepping analysis: clpx_stepping_analysis.ipynb 
Random Forest training, as used in rereads_simulation.ipynb: humpsClassifier.py 
CNN training: cnn_training.py 
Reread evaluation: rereads_simulation.ipynb 
Barcode space decoding accuracy evaluation: encoding_decoding.ipynb 
Deamidation analysis: n_to_d_analysis.ipynb
Example raw .fast5 file (for PASTOR-AVLIM): DESKTOP_CHF4GRO_20221220_FAV72770_MN40387_sequencing_run_12_20_22_run04_a.fast5 
Segmented raw and processed P1-P4 signals: yy_mutants.json 
Ordering of channels in the DTW-distance features, as created in YetAnotherYYSegmenter.ipynb and used in humpsClassifier.py: channels_arr.npy
Segmented raw and processed PASTOR signals: pretty_segments_df.json 
Segmented raw and processed PASTOR-VGDNY signals in deamidation-catalyzing conditions: n_to_d_segments_df.json
Manually labeled YY dips for ClpX stepping analysis: pretty_df.json 
Reread simulation results, as created in rereads_simulation.ipynb and used in encoding_decoding.ipynb: rereads_acc.npy 
Barcode accuracy evaluation results, as created in encoding_decoding.ipynb: all contents in barcode_results 
Segmented raw folded domain signals:
- Amyloid Beta 15: segments_df_beta_15.json
- Amyloid Beta 42: segments_df_beta_42.json
- Titin: segments_df_titin_vp15.json
- dTitin: segments_df_titin_vp15ee.json
Segmented raw folded domain signals with the second (N-terminal) half of the PASTOR context:
- Amyloid Beta 15: second_beta_15_segs_df.json
- Amyloid Beta 42: second_beta_42_segs_df.json
- Titin: second_titin_segs_df.json
- dTitin: second_titin_ee_segs_df.json
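
The layout of the .json files listed above is not documented here, so a schema-agnostic first look can be useful before writing any analysis code. The sketch below uses only the Python standard library and assumes nothing beyond each file being valid JSON. The helper name summarize_json is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return the top-level type of a JSON file plus its size and a key preview."""
    with Path(path).open("r", encoding="utf-8") as handle:
        data = json.load(handle)

    summary = {"type": type(data).__name__, "size": None, "keys": None}
    if isinstance(data, dict):
        summary["size"] = len(data)
        # Preview at most ten keys so large files stay readable.
        summary["keys"] = list(data)[:10]
    elif isinstance(data, list):
        summary["size"] = len(data)
    return summary


if __name__ == "__main__":
    # Assumes the script is run from the directory that holds the data files.
    target = Path("pretty_segments_df.json")
    if target.exists():
        print(summarize_json(target))
    else:
        print(f"{target} not found; adjust the path")
```

If the files turn out to be serialized pandas DataFrames (an assumption suggested only by the _df suffix), pandas.read_json may be a more convenient way to load them.
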
- Install miniconda: https://docs.anaconda.com/free/miniconda/miniconda-install/
- Run conda env create -f environment.yml (should take ~20 minutes)
Code has only been tested on the versions specified in the yml file and on macOS.
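
Because the code has only been tested with the versions pinned in environment.yml, it can help to confirm what is actually installed in the active environment. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8+). The package name and version string in EXPECTED are placeholders, not the repository's real pins, and should be copied from environment.yml. The function names are likewise hypothetical.

```python
from importlib import metadata

# Placeholder pins: replace with the entries from environment.yml.
EXPECTED = {
    "numpy": "0.0.0",
}


def check_versions(expected):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


def mismatches(report):
    """Keep only the packages whose installed version differs from the expected one."""
    return {name: pair for name, pair in report.items() if pair[0] != pair[1]}


if __name__ == "__main__":
    for name, (wanted, installed) in mismatches(check_versions(EXPECTED)).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

Note that importlib.metadata looks packages up by their Python distribution name, which can differ from the conda package name for some entries.
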
Results produced by running the code should match (barring variability from randomness) the data in rereads_acc.npy, channels_arr.npy, and pretty_segments_df.json, the images created and saved within the .ipynb files, and the results reported in the manuscript. All code should take <5 min to run unless otherwise specified in the comments (e.g. the pairwise DTW comparison in YetAnotherYYSegmenter.ipynb and the reread simulation).
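
Since regenerated results are only expected to match the reference files up to random variability, an exact equality check is too strict. One way to compare them is with an absolute tolerance, sketched below using only the standard library. The function name compare_results and the default abs_tol are illustrative assumptions, not values taken from the code or the manuscript, and the numbers in the demo are made up.

```python
def compare_results(reference, regenerated, abs_tol=0.02):
    """Compare two equal-length sequences of numbers element by element.

    Returns (all_close, max_abs_diff). abs_tol is an illustrative threshold
    and should be chosen to reflect the expected run-to-run variability.
    """
    if len(reference) != len(regenerated):
        raise ValueError("sequences differ in length")
    diffs = [abs(a - b) for a, b in zip(reference, regenerated)]
    max_diff = max(diffs, default=0.0)
    return max_diff <= abs_tol, max_diff


if __name__ == "__main__":
    # Made-up accuracy values, for illustration only.
    reference = [0.90, 0.95, 0.99]
    regenerated = [0.91, 0.95, 0.98]
    ok, worst = compare_results(reference, regenerated)
    print(f"within tolerance: {ok}, largest difference: {worst:.3f}")
```

Arrays read with numpy.load (for example from rereads_acc.npy) could be flattened with .ravel().tolist() before being passed in; the shape and contents of that file are not assumed here.
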