We want to make it easy to model large, hierarchically structured longitudinal datasets that are ragged and may contain effects that occur at different temporal resolutions (e.g. within a day or across multiple weeks).
Modeling how the passage of time affects observations is key to extracting valuable insights from longitudinal studies. homepy helps in this endeavor by providing a set of tools for building time series models of observations. Its key feature in this respect is the ability to flexibly model multiple time scales simultaneously. To illustrate this, consider the cartoon figure of a longitudinal study carried out across several weeks (long time scale).
In each weekly recording, activity was measured throughout the day (short time scale). The cartoon shows a short time scale modulation of the activity (it starts out at zero, peaks in the middle, and then dies back down) and a long time scale modulation of the activity (the absolute value of activity drops and the amplitude of the short time scale modulation also becomes smaller). If a model focuses only on the long time scale, the modulation information of the short time scale is ignored. Conversely, if a model focuses only on the short time scale, the long time scale effect is lost. homepy aims to build models that cope with both time scales simultaneously.
To do so, homepy provides a way to build relatively flexible Gaussian Process models for short time scales (e.g. the recordings of multiple days) and simple linear or quadratic models for long time scales (e.g. the effect of the passage of weeks across recordings). However, homepy is not limited to these models, and you can build custom models to adapt it to your specific needs.
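As a rough illustration of the two time scales, the NumPy sketch below simulates noise-free, cartoon-like data: a within-day bump whose amplitude shrinks linearly from week to week. It does not use homepy's API; the function `simulate_activity` and every parameter value are made up for this example.

```python
import numpy as np


def simulate_activity(n_weeks=4, n_times=25, amp0=10.0, amp_slope=-2.0):
    """Return an (n_weeks, n_times) array of noise-free cartoon activity.

    Hypothetical illustration only, not part of homepy.
    """
    weeks = np.arange(n_weeks)                     # long time scale
    time_of_day = np.linspace(0.0, 1.0, n_times)   # short time scale (fraction of a day)
    # Short time scale: zero at the start and end of the day, peak in the middle.
    daily_shape = np.sin(np.pi * time_of_day)
    # Long time scale: the amplitude changes linearly with the week index.
    amplitude = amp0 + amp_slope * weeks
    # Outer product: one row per week, one column per time of day.
    return amplitude[:, None] * daily_shape[None, :]


activity = simulate_activity()
# Averaging over the day keeps only the long time scale trend.
weekly_mean = activity.mean(axis=1)
# Averaging over weeks keeps only the within-day shape.
daily_mean = activity.mean(axis=0)
print(weekly_mean.round(2))
print(daily_mean.round(2))
```

Averaging over the day leaves only the weekly decline, and averaging over weeks leaves only the within-day shape, which is why a model restricted to a single time scale misses the other effect.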
Sometimes the data that you collect or have to work with can be parsed into a hierarchical structure. For example, you may collect data from multiple groups, where each group is made up of many individuals that are gathered into small collectives (e.g. in animal experiments, they could inhabit the same cage). This kind of data shows a clear hierarchical pattern that will condition the final observations.
To learn meaningful insights from the data, you have to take into account the nested hierarchical nature of the data collection process. homepy is designed to make it easy for you to build models that take into account the potentially complex hierarchical structure in your data.
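To make the nesting concrete, here is a minimal NumPy sketch of data generated through a group, cage, individual hierarchy, using integer index arrays to link each level to its parent. The layout, the means, and the standard deviations are invented for illustration, and this is not how homepy represents hierarchies internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: 2 groups, 4 cages, 8 individuals.
cage_to_group = np.array([0, 0, 1, 1])                    # cage -> group
individual_to_cage = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # individual -> cage

group_mean = np.array([5.0, 8.0])
# Each cage deviates from the mean of the group it belongs to.
cage_mean = group_mean[cage_to_group] + rng.normal(0.0, 0.5, size=4)
# Each individual deviates from the mean of the cage it lives in.
individual_mean = cage_mean[individual_to_cage] + rng.normal(0.0, 0.2, size=8)

# The group of each individual follows by composing the two index maps.
individual_to_group = cage_to_group[individual_to_cage]
print(individual_to_group)  # [0 0 0 0 1 1 1 1]
```

Because every individual inherits its cage's deviation, which in turn inherits its group's mean, observations from the same cage are correlated, and a model that ignores the nesting would treat them as independent.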
There are many packages available to model hierarchical data, but homepy is specifically designed to tackle ragged hierarchies. The hierarchical data structure can be viewed as a tree with many branches.
As one moves down the levels in the tree, each node can branch out, producing multiple descendants. In a ragged hierarchy, the nodes at a given depth in the tree don't all branch out into the same number of descendants. homepy can traverse complex ragged hierarchies and produce models that exploit the nested hierarchical structure efficiently.
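The following pure-Python sketch shows what a ragged hierarchy looks like in practice: a small tree in which cages hold different numbers of individuals and groups hold different numbers of cages. The names (`hierarchy`, `flatten`, `children_per_node`) and the data are hypothetical and are not taken from homepy.

```python
# Hypothetical ragged hierarchy: group -> cage -> individuals.
hierarchy = {
    "control": {"cage_1": ["a", "b", "c"], "cage_2": ["d"]},
    "treated": {"cage_3": ["e", "f"]},
}


def flatten(hierarchy):
    """Walk the tree and return one (group, cage, individual) row per leaf."""
    rows = []
    for group, cages in hierarchy.items():
        for cage, individuals in cages.items():
            for individual in individuals:
                rows.append((group, cage, individual))
    return rows


def children_per_node(hierarchy):
    """Count the direct descendants of every group and every cage."""
    counts = {group: len(cages) for group, cages in hierarchy.items()}
    for cages in hierarchy.values():
        counts.update({cage: len(members) for cage, members in cages.items()})
    return counts


rows = flatten(hierarchy)
counts = children_per_node(hierarchy)
print(len(rows))
print(counts)
```

The unequal counts are what make the tree ragged: the data cannot be stored as a regular groups by cages by individuals array without padding, so a flat table with one row per leaf is a common alternative.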
You can do a basic install of homepy using pip:
pip install git+https://github.com/pymc-labs/homepy.git
However, to get the most out of the package, we strongly recommend that you also install jax and dask. In our GitHub repository, we provide a conda environment file to help you install all of the optional dependencies and get the package ready to sample using GPUs.
- Clone homepy repository (skip this step if it was already cloned)
  git clone git@github.com:pymc-labs/homepy.git
- Go to cloned homepy repository directory
  cd homepy
  To update repository if it was cloned some time ago:
  git pull
- Install new conda environment (or update it if you already created it)
  conda env create --file environment.yml
  To update, instead of recreating the same environment, you can run the command:
  conda env update --file environment.yml --prune
- Activate new conda environment
  conda activate homepy
- Install homepy
  python -m pip install --editable .
We can run inference on the GPU using jax. To check if you can use your GPU for sampling you need to do the following:
import jax
from jax.lib import xla_bridge
print(jax.devices())
assert xla_bridge.get_backend().platform == "gpu"
If jax.devices() does not show GpuDevice or the assertion fails, you might need to install a specific jax and jaxlib version with support for your GPU. For example, if you have CUDA version 11.0 and CuDNN 8.2, then you will need to run
pip install "jax[cuda11_cudnn82]" -f \
https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
Please check the jax documentation for more information.
Once the checks above pass, you can test that homepy can access the GPU by running
from homepy.tests.test_gpu_sampling import test_gpu_sampling
test_gpu_sampling()
If the above runs without errors, then you are ready to run inference on your GPU!
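The backend check earlier in this section imports `jax.lib.xla_bridge`, which may not be available in every jax release. As an alternative, the sketch below relies only on the public `jax.default_backend()` function and degrades gracefully when jax is not installed. The helper name `gpu_available` is our own and is not part of homepy or jax.

```python
def gpu_available():
    """Return True if jax is installed and its default backend is a GPU.

    Hypothetical helper for illustration; not part of homepy or jax.
    """
    try:
        import jax
    except ImportError:
        # jax is an optional dependency, so treat its absence as "no GPU".
        return False
    # jax.default_backend() returns the platform name of the default
    # backend as a string, for example "cpu", "gpu" or "tpu".
    return jax.default_backend() == "gpu"


if gpu_available():
    print("jax will use the GPU by default")
else:
    print("jax is missing or will not use a GPU by default")
```

This check only reports what jax sees. It does not replace running `test_gpu_sampling`, which exercises homepy's own sampling code.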



