This Python package provides methods for accessing and analysing data from MalariaGEN.
The malariagen_data Python package is available from the Python
Package Index (PyPI) and can be installed via pip, e.g.:
pip install malariagen-data
Documentation of classes and methods in the public API is available from the following locations:
See GitHub releases for release notes.
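After installing with pip, one way to confirm that the package is present is to query the installed distribution's version from Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the one used in the pip command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="malariagen-data"):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of the package API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if the package is installed, otherwise None.
    print(installed_version())
```

Comparing the printed version against the latest GitHub release shows whether an upgrade is needed.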
To get set up for development, see this video if you prefer VS Code, or this older video if you prefer PyCharm, and the instructions below.
Fork and clone this repo:
git clone git@github.com:[username]/malariagen-data-python.git
Install Python, e.g.:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10 python3.10-venv
Install pipx, e.g.:
python3.10 -m pip install --user pipx
python3.10 -m pipx ensurepath
Install poetry, e.g.:
pipx install poetry
Create development environment:
cd malariagen-data-python
poetry env use 3.10
poetry install
Activate development environment:
poetry shell
Install pre-commit and pre-commit hooks:
pipx install pre-commit
pre-commit install
Run pre-commit checks (isort, black, blackdoc, flake8, ...) manually:
pre-commit run --all-files
Run fast unit tests using simulated data:
poetry run pytest -v tests/anoph
To run legacy tests which read data from GCS, you'll need to request access to MalariaGEN data on GCS.
Once access has been granted, install the Google Cloud CLI. E.g., if on Linux:
./install_gcloud.sh
You'll then need to obtain application-default credentials, e.g.:
./google-cloud-sdk/bin/gcloud auth application-default login
Once this is done, you can run legacy tests:
poetry run pytest --ignore=tests/anoph -v tests
Tests will run slowly the first time, as data required for testing will be read from GCS. Subsequent runs will be faster as data will be cached locally in the "gcs_cache" folder.
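To judge whether the next legacy test run is likely to be slow, it can help to check whether the local cache has been populated yet. The following is a rough sketch, assuming the "gcs_cache" folder sits in the directory the tests are run from; the helper name `cache_size_bytes` is hypothetical.

```python
from pathlib import Path


def cache_size_bytes(cache_dir="gcs_cache"):
    """Return the total size in bytes of all files under cache_dir.

    Returns 0 if the folder does not exist, i.e. nothing has been cached yet.
    The default path is an assumption about where the tests are run from.
    """
    root = Path(cache_dir)
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    size = cache_size_bytes()
    print(f"gcs_cache holds {size / 1e6:.1f} MB")
```

A result of zero suggests the first, slower run has not happened yet in this working copy.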
Create a new GitHub release. That's it. This will automatically trigger publishing of a new release to PyPI and a new version of the documentation via GitHub Actions.
The version switcher for the documentation can then be updated by
modifying the docs/source/_static/switcher.json file accordingly.
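The layout of switcher.json is not described here. Assuming it follows the common version-switcher convention of a JSON list of objects with "name", "version" and "url" keys, the update could be scripted along these lines. The helper names, the newest-first ordering and the example URL are all assumptions for illustration; check the existing file before relying on this.

```python
import json
from pathlib import Path


def add_version(entries, version, url):
    """Return a new list with an entry for `version` placed first.

    Any existing entry for the same version is replaced. The key names
    ("name", "version", "url") are an assumed file layout.
    """
    kept = [e for e in entries if e.get("version") != version]
    return [{"name": version, "version": version, "url": url}] + kept


def update_switcher(path, version, url):
    """Read the switcher file at `path`, add the version, and write it back."""
    path = Path(path)
    entries = json.loads(path.read_text())
    updated = add_version(entries, version, url)
    path.write_text(json.dumps(updated, indent=4) + "\n")


# Example call (hypothetical version and URL):
# update_switcher(
#     "docs/source/_static/switcher.json",
#     "1.2.3",
#     "https://example.org/docs/1.2.3/",
# )
```

Keeping the list manipulation in a separate pure function makes it straightforward to check the result before anything is written to disk.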
If you use the malariagen_data package in a publication
or include any of its functions or code in other materials (e.g. training resources),
please cite: doi.org/10.5281/zenodo.11173411
Some functions may require additional citations to acknowledge specific contributions. These are indicated in the description for each relevant function.
For any questions, please feel free to contact us at: [email protected]
This project is currently supported by the following grants:
This project was previously supported by the following grants: