Common Python functions and classes used by FlyBase developers at Harvard.
pip install -e git+https://github.com/FlyBase/harvdev-utils.git@master#egg=harvdev_utils
... and don't forget to install the requirements for this module before use.
pip install -r requirements.txt
- Detailed information for some functions can be found in the Read the Docs documentation. This documentation does not include information regarding SQLAlchemy classes and functions (see below).
harvdev_utils contains two sets of SQLAlchemy classes for use with FlyBase Harvard's production and reporting databases. The class names correspond to tables within the Chado database and serve as an integral part of writing SQLAlchemy code.
- To use these classes, include the appropriate imports at the top of your Python module:
  - When using production or reporting individually (the classes share overlapping names, so only use this approach if production/reporting queries are not written together in the same module):
    from harvdev_utils.production import *
    from harvdev_utils.reporting import *
  - When using both production and reporting within the same module:
    from harvdev_utils import production as prod
    from harvdev_utils import reporting as rep
  - Code can then be written by prefixing the classes as appropriate when calling them, e.g. prod.Feature, rep.Feature, rep.Pub, prod.Cvterm, etc.
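The reason for the prefixed form can be seen without a database. The sketch below does not import harvdev_utils; it uses two hypothetical stand-in namespaces, each defining a class named Feature, to show that two star imports in a row leave only the last Feature reachable, while aliased imports keep both.

```python
from types import SimpleNamespace


# Stand-ins for harvdev_utils.production and harvdev_utils.reporting.
# These are hypothetical placeholders, not the real SQLAlchemy classes.
class ProductionFeature:
    source = "production"


class ReportingFeature:
    source = "reporting"


production = SimpleNamespace(Feature=ProductionFeature)
reporting = SimpleNamespace(Feature=ReportingFeature)

# Emulate two star imports in a row: both bind the bare name "Feature",
# so the second binding silently replaces the first.
star_imported = {}
star_imported.update(vars(production))
star_imported.update(vars(reporting))
shadowed = star_imported["Feature"].source

# Emulate the aliased imports: each class stays reachable via its prefix.
prod, rep = production, reporting
prefixed = (prod.Feature.source, rep.Feature.source)

print(shadowed)   # reporting
print(prefixed)   # ('production', 'reporting')
```

With the real modules the same effect applies to every table name the two databases share, which is why the aliased imports are the safer choice in mixed modules.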
- harvdev_utils contains a set of commonly used Chado-SQLAlchemy functions:
  - harvdev_utils.chado_functions.get_or_create
    - This function allows for values to be inserted into a specific Chado table. If the values already exist in the table, nothing is inserted. If the table uses rank, the rank value is automatically incremented and the values are always inserted.
    - Example import:
      from harvdev_utils.chado_functions import get_or_create
    - The function as defined in the module:
      def get_or_create(session, model, **kwargs)
    - Required fields:
      - session: Your SQLAlchemy session.
      - model: The model (aka table) where you'd like to insert data.
      - **kwargs: Values used to look up the appropriate row of a table to insert the data. Please see the example below.
    - The function always returns two variables: the first is a SQLAlchemy object (equivalent to a row in a table) and the second is True (if a new entry was created) or False (if an existing entry was retrieved).
    - Column values in the returned SQLAlchemy object can be accessed as such:
      uniquename = my_returned_object.uniquename
    - The debugging level is set to INFO by default and can be changed to DEBUG by using the following line in your script where appropriate:
      logging.getLogger('harvdev_utils.chado_functions.get_or_create').setLevel(logging.DEBUG)
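The calling pattern described above can be tried out with a minimal in-memory sketch. This is not the harvdev_utils implementation: FakeSession, the Feature dataclass, get_or_create_sketch, and the values passed in are all hypothetical stand-ins, and the rank behaviour is deliberately not modelled. Only the logger name and the (object, created) return shape are taken from the description.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
# Logger name taken from the text above; raising it to DEBUG shows the
# per-call messages emitted by this sketch.
log = logging.getLogger('harvdev_utils.chado_functions.get_or_create')
log.setLevel(logging.DEBUG)


@dataclass
class Feature:
    """Hypothetical stand-in for a mapped Chado table row."""
    uniquename: str
    name: str


class FakeSession:
    """Hypothetical in-memory stand-in for a SQLAlchemy session."""

    def __init__(self):
        self.rows = []


def get_or_create_sketch(session, model, **kwargs):
    """Return (row, created), mimicking the contract described above."""
    for row in session.rows:
        if isinstance(row, model) and all(
            getattr(row, key) == value for key, value in kwargs.items()
        ):
            log.debug("Retrieved existing %s: %s", model.__name__, kwargs)
            return row, False
    row = model(**kwargs)
    session.rows.append(row)
    log.debug("Created new %s: %s", model.__name__, kwargs)
    return row, True


session = FakeSession()
# Placeholder values, chosen only for illustration.
first, created_first = get_or_create_sketch(
    session, Feature, uniquename="FBgn9999999", name="example_gene"
)
second, created_second = get_or_create_sketch(
    session, Feature, uniquename="FBgn9999999", name="example_gene"
)

# Column values are read as plain attributes on the returned object.
uniquename = first.uniquename
print(created_first, created_second, uniquename)
```

The second call finds the row added by the first, so it returns the same object with the created flag set to False. With the real function, session would be a live SQLAlchemy session and model one of the production or reporting classes.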
- The dev_readme.md file contains instructions for regenerating SQLAlchemy classes.
- Please use PEP8 whenever possible.
- Docstrings should follow Google's style guides (Sphinx guide, additional example 1, additional example 2) and are used to generate Read the Docs documentation.
- Tests should be written for each non-trivial function. Please see the tests folder for examples. We're using pytest via Travis CI for testing. Tests can be run locally with the command python -m pytest -v from the root directory of the repository (the -v flag is optional).
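As an illustration of the docstring and testing conventions above, here is a small sketch of a function with a Google-style docstring and a matching pytest-style test. The function collapse_spaces is hypothetical and is not part of harvdev_utils; in the repository the test would live in its own test_ file under the tests folder.

```python
def collapse_spaces(text):
    """Collapse runs of whitespace into single spaces.

    Args:
        text (str): The string to clean up.

    Returns:
        str: The input with leading and trailing whitespace removed and
        every internal run of whitespace replaced by one space.

    Raises:
        TypeError: If ``text`` is not a string.
    """
    if not isinstance(text, str):
        raise TypeError("text must be a str")
    return " ".join(text.split())


def test_collapse_spaces():
    """pytest collects functions whose names start with test_."""
    assert collapse_spaces("  wg   and  dpp ") == "wg and dpp"
    assert collapse_spaces("") == ""


# pytest would discover and run the test automatically; calling it
# directly here keeps the sketch runnable as a plain script.
test_collapse_spaces()
```

The Args, Returns, and Raises sections are the parts of the Google style that documentation generators can render as structured fields.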
- Please branch from develop and merge back via pull request.
- Merges from develop into master should coincide with a new release tag on master and a version increment.
- The file docs/index.rst should be updated after a new module is added. The automodule directive will automatically pull in information for specified modules once the code is pushed to GitHub. Please see the automodule documentation for help.
- Clone the repository and branch off develop.
- Navigate to the directory harvdev_utils and use an existing folder (e.g. char_conversions) or create a new folder based on the goal of your module.
- Create a single Python file containing a function to be used. Feel free to add multiple functions to a single Python file if you feel it's appropriate.
- Be sure to add an entry to the __init__.py file in the folder where you're working.
  - e.g. from .unicode_to_plain_text import unicode_to_plain_text
- Update the file __init__.py in harvdev_utils and add your function to the list of default loaded functions. If the folder you are using does not exist at the top of the file, be sure to import it.
  - e.g. from .char_conversions import *
- Navigate to the tests folder and create a new sub-folder if you're not using a currently deployed folder (e.g. if you're using char_conversions, the folder already exists).
- Create your test Python file with the prefix test_.
  - e.g. test_sgml_to_plain_text.py
- Tests can be run locally with python -m pytest from the root directory of the repository.
- Edit the file docs/index.rst and be sure the folder that you're using is listed as an automodule.
  - e.g.
    .. automodule:: harvdev_utils
       :members:
- Additional text can be added to docs/index.rst as necessary. We can restructure this file if it becomes too long / complex.
- Push your branch to GitHub and open a PR to develop when ready.
- A subsequent merge to master and tagged release can be coordinated with other devs when appropriate.
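The two __init__.py steps in the workflow above can be sketched end to end with a throwaway package. The package name demo_utils and the function collapse_spaces are hypothetical and exist only for this illustration; the folder name char_conversions and the shape of the import lines mirror the examples given in the steps.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temporary directory that mirrors the
# layout described above. 'demo_utils' and 'collapse_spaces' are
# hypothetical names, not part of harvdev_utils.
root = Path(tempfile.mkdtemp())
folder = root / "demo_utils" / "char_conversions"
folder.mkdir(parents=True)

# A single Python file containing the function.
(folder / "collapse_spaces.py").write_text(
    "def collapse_spaces(text):\n"
    "    return ' '.join(text.split())\n"
)

# The entry in the folder's __init__.py.
(folder / "__init__.py").write_text(
    "from .collapse_spaces import collapse_spaces\n"
)

# The entry in the top-level __init__.py.
(root / "demo_utils" / "__init__.py").write_text(
    "from .char_conversions import *\n"
)

# Make the temporary directory importable and load the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_utils = importlib.import_module("demo_utils")

# Thanks to the two __init__.py entries, the function is available
# directly on the top-level package.
result = demo_utils.collapse_spaces("one   two  three")
print(result)
```

Without the entry in the folder's __init__.py, the star import at the top level would have nothing to re-export, and the function would only be reachable through its full module path.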