Pipeline improvements to make in the future, if time allows:
- Update the Snakefile to run git submodule add so the river-dl code is pulled into the 03b_model/src directory
- Use dask to distribute the work in fetch_coawst_model.py
- Build a Docker image to run our pipeline in and push it to Docker Hub
- Get Docker container running on Tallgrass
- Get Snakemake/S3 connection working so inputs/outputs can come directly from S3
- Use Snakemake inputs/outputs/params in Python scripts instead of hardcoding the values in each script
- Review Snakefiles for other improvements, based on what we learned while writing the Snakemake tutorial
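
As a rough illustration of the item about Snakemake inputs/outputs/params: when a rule runs a Python file through the script directive, Snakemake makes an object named snakemake available in that script, with input, output, and params attributes. The sketch below shows one way a script could read its paths and settings from that object instead of hardcoding them. The function name get_run_config, the file paths, and the start_date parameter are all hypothetical, and a SimpleNamespace stands in for the real object so the snippet runs outside Snakemake.

```python
from types import SimpleNamespace


def get_run_config(smk):
    """Collect paths and settings from a Snakemake-style object.

    `smk` is expected to expose `input`, `output`, and `params`,
    as the `snakemake` object does inside a script run by a rule.
    """
    return {
        "in_file": str(smk.input[0]),        # first declared input of the rule
        "out_file": str(smk.output[0]),      # first declared output of the rule
        "start_date": smk.params.start_date, # hypothetical named param
    }


# Stand-in for the object Snakemake provides (paths and values are made up).
# Inside a real script run by Snakemake, call get_run_config(snakemake) instead.
fake_snakemake = SimpleNamespace(
    input=["01_fetch/out/example_input.nc"],
    output=["02_munge/out/example_output.csv"],
    params=SimpleNamespace(start_date="2019-01-01"),
)

config = get_run_config(fake_snakemake)
print(config["in_file"], "->", config["out_file"])
```

Keeping the lookup in one small function means the rest of the script never touches file paths directly, so changing a path only requires editing the Snakefile rule.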