homepy

What does homepy do for you?

homepy aims to make it easy to model large, hierarchically structured longitudinal datasets that are ragged and whose effects can occur at different temporal resolutions (e.g. within a day or across multiple weeks).

Time series modeling

Modeling how the passage of time affects observations is key to extracting valuable insights from longitudinal studies. homepy helps with this by providing a set of tools for building time series models of observations. Its key feature in this respect is the ability to flexibly model multiple time scales simultaneously. To illustrate this, consider the cartoon figure below:

multi time scale image

Here, a longitudinal study was carried out across several weeks (the long time scale). In each weekly recording, activity was measured throughout the day (the short time scale). The cartoon shows a short time scale modulation of the activity (it starts out at zero, peaks in the middle and then dies back down) as well as a long time scale modulation (the absolute value of activity drops and the amplitude of the short time scale modulation becomes smaller). A model that focuses only on the long time scale ignores the modulation on the short time scale. Conversely, a model that focuses only on the short time scale loses the long time scale effect. homepy aims to build models that can cope with both time scales simultaneously.
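The two time scales described above can be mimicked with a few lines of NumPy. This is only an illustrative sketch of synthetic data, not part of homepy: the study design (4 weekly recordings, hourly samples), the sine-shaped daily profile, and all coefficients are assumptions chosen to resemble the cartoon.

```python
import numpy as np

# Hypothetical study design (not taken from homepy): 4 weekly recordings,
# each sampled hourly over one day.
n_weeks = 4
hours = np.linspace(0.0, 24.0, 25)  # short time scale
weeks = np.arange(n_weeks)          # long time scale

# Short time scale: the modulation starts at zero, peaks mid-day, dies down.
daily_shape = np.sin(np.pi * hours / 24.0)

# Long time scale: baseline level and modulation amplitude both shrink per week.
baseline = 1.0 - 0.2 * weeks
amplitude = 2.0 - 0.4 * weeks

# Rows index weeks, columns index hours within the recorded day.
activity = baseline[:, None] + amplitude[:, None] * daily_shape[None, :]

# Peak-to-trough range of each weekly recording.
daily_range = activity.max(axis=1) - activity.min(axis=1)
```

Averaging `activity` over hours keeps only the weekly decline, while looking at a single row keeps only the daily shape, which is the information loss the paragraph above describes.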

To handle both time scales, homepy provides relatively flexible Gaussian Process models for short time scales (e.g. the recordings of multiple days) and simple linear or quadratic models for long time scales (e.g. the effect of the passage of weeks across recordings). homepy is not limited to these models, however: you can build custom models to suit your specific needs.
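To make the idea of a Gaussian Process on the short time scale combined with a linear model on the long time scale more concrete, here is a minimal NumPy sketch. It does not use homepy's API: the exponentiated-quadratic kernel, the multiplicative way of combining the two components, and every name and hyperparameter value are assumptions made purely for illustration.

```python
import numpy as np

def exp_quad_kernel(x, lengthscale, variance):
    """Exponentiated-quadratic covariance between all pairs of points in x."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)

hours = np.linspace(0.0, 24.0, 49)  # short time scale (within a day)
weeks = np.arange(6)                # long time scale (across recordings)

# Short time scale: one draw from a zero-mean GP prior over the time of day.
K = exp_quad_kernel(hours, lengthscale=4.0, variance=1.0)
K_jittered = K + 1e-6 * np.eye(len(hours))  # jitter for numerical stability
chol = np.linalg.cholesky(K_jittered)
daily_profile = chol @ rng.standard_normal(len(hours))

# Long time scale: a linear trend in the week index (assumed coefficients).
intercept, slope = 1.0, -0.1
weekly_scale = intercept + slope * weeks

# Assumed combination: the weekly trend rescales the daily profile.
activity = weekly_scale[:, None] * daily_profile[None, :]
```

A quadratic long time scale model would only change `weekly_scale`, for example by adding a term proportional to `weeks ** 2`.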

Modeling complex nested and ragged hierarchical datasets

Sometimes the data that you collect or have to work with has a hierarchical structure. For example, you might collect data from multiple groups, where each group is made up of many individuals that are gathered into small collectives (e.g. in animal experiments, they could inhabit the same cage). This kind of data shows a clear hierarchical pattern that conditions the final observations:

groups -> cages -> individuals

To learn meaningful insights from the data, you have to take into account the nested hierarchical nature of the data collection process. homepy is designed to make it easy to build models that account for the potentially complex hierarchical structure in your data.

There are many packages available to model hierarchical data, but homepy is specifically designed to tackle ragged hierarchies. The hierarchical data structure can be viewed as a tree with many branches.

Ragged hierarchy

As one moves down the levels of the tree, each node can branch out into multiple descendants. In a ragged hierarchy, the nodes at a given depth of the tree do not all branch out into the same number of descendants. homepy can efficiently traverse complex ragged hierarchies, producing models that exploit the nested hierarchical structure.
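One common way to handle a ragged tree in a model is to flatten it into parent-index arrays, one per level. The sketch below shows that idea in plain Python. It is not homepy's internal representation: the nested-dict layout, the group, cage and individual names, and the `flatten_hierarchy` helper are all hypothetical.

```python
# Hypothetical ragged hierarchy: groups -> cages -> individuals.
# Cages hold different numbers of individuals, groups different numbers of cages.
hierarchy = {
    "control": {"cage_1": ["a1", "a2", "a3"], "cage_2": ["a4"]},
    "treated": {"cage_3": ["a5", "a6"]},
}

def flatten_hierarchy(tree):
    """Flatten a groups -> cages -> individuals tree into parent-index lists."""
    groups, cages, individuals = [], [], []
    cage_to_group, individual_to_cage = [], []
    for group_name, group_cages in tree.items():
        group_idx = len(groups)
        groups.append(group_name)
        for cage_name, members in group_cages.items():
            cage_idx = len(cages)
            cages.append(cage_name)
            cage_to_group.append(group_idx)
            for member in members:
                individuals.append(member)
                individual_to_cage.append(cage_idx)
    return groups, cages, individuals, cage_to_group, individual_to_cage

groups, cages, individuals, cage_to_group, individual_to_cage = flatten_hierarchy(
    hierarchy
)

# Composing the two maps gives each individual's group directly.
individual_to_group = [cage_to_group[c] for c in individual_to_cage]

print(cage_to_group)        # [0, 0, 1]
print(individual_to_cage)   # [0, 0, 0, 1, 2, 2]
print(individual_to_group)  # [0, 0, 0, 0, 1, 1]
```

Because every level is stored as a flat list plus an index into its parent level, no padding is needed when branches have different sizes.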

Installation

You can do a basic install of homepy using pip:

pip install git+https://github.com/pymc-labs/homepy.git

However, to get the most out of the package, we strongly recommend that you also install jax and dask. The GitHub repository provides a conda environment file that installs all of the optional dependencies and gets the package ready to sample using GPUs.

Installation using conda environment.yml

  1. Clone the homepy repository (skip this step if you have already cloned it)

    git clone git@github.com:pymc-labs/homepy.git

  2. Go to the cloned homepy repository directory

    cd homepy

    To update the repository if it was cloned some time ago:

    git pull

  3. Create the new conda environment (or update it if you already created it)

    conda env create --file environment.yml

    To update the environment instead of recreating it, run:

    conda env update --file environment.yml --prune

  4. Activate the new conda environment

    conda activate homepy

  5. Install homepy

    python -m pip install --editable .
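After installing, a quick sanity check is to confirm that homepy and the optional dependencies mentioned above can be found in the active environment. The snippet below is a small sketch using only the Python standard library. The `check_packages` helper is hypothetical and not part of homepy.

```python
from importlib.util import find_spec

def check_packages(names=("homepy", "jax", "dask")):
    """Map each top-level package name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}

if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If any entry is reported as missing, check that the homepy conda environment is activated before rerunning the installation steps.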

Testing GPU support

We can run inference on the GPU using jax. To check whether your GPU can be used for sampling, run the following:

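import jax

# List the devices that jax can see; a GPU device should appear here.
print(jax.devices())
# jax.default_backend() is the public way to query the default platform.
assert jax.default_backend() == "gpu"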

If jax.devices() does not list a GPU device or the assertion fails, you might need to install a specific jax and jaxlib version with support for your GPU. For example, if you have CUDA version 11.0 and cuDNN 8.2, then you will need to run:

pip install "jax[cuda11_cudnn82]" -f \
https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

Please check the jax documentation for more information.

Once the check above passes, you can test that homepy can access the GPU by running:

from homepy.tests.test_gpu_sampling import test_gpu_sampling
test_gpu_sampling()

If the above runs without errors, then you are ready to run inference on your GPU!

About

Package for analysing home cage monitoring data
