NMRlipids Databank

This is the NMRlipids databank, a community-driven catalogue of atomistic MD simulations of biologically relevant lipid membranes emerging from the NMRlipids open collaboration.

Documentation

The NMRlipids databank documentation is available here. More information and example applications can be found in the NMRlipids databank manuscript.

API

The DatabankLib Python module provides programmatic access to all simulation data in the NMRlipids Databank. This enables a wide range of novel data-driven applications, from the construction of machine learning models that predict membrane properties to the automatic analysis of virtually any property across all simulations in the Databank.

The NMRlipids Databank-API documentation is available here.
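To illustrate what an analysis "across all simulations" can look like, here is a minimal sketch that does not use DatabankLib at all: it only collects the per-simulation README.yaml metadata files below a data directory and parses them with PyYAML. The function names and the assumption that every simulation has a README.yaml somewhere under the data directory are ours; for real work, prefer the functions described in the API documentation.

```python
from pathlib import Path
from typing import Iterator


def iter_readme_files(data_root: Path) -> Iterator[Path]:
    """Yield every README.yaml below data_root in a stable, sorted order."""
    yield from sorted(Path(data_root).rglob("README.yaml"))


def load_readme(path: Path) -> dict:
    """Parse one README.yaml into a dict (requires the PyYAML package)."""
    import yaml  # imported lazily so the file search works without PyYAML

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle)


if __name__ == "__main__":
    # "BilayerData" is the default data storage named in the installation
    # notes; its location relative to the working directory is an assumption.
    for readme in iter_readme_files(Path("BilayerData")):
        print(readme)
```

Keeping the file search separate from the parsing makes it easy to filter the list of simulations before any file is opened.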

How to use

A template is available to get started with analyses that use the NMRlipids databank. Example codes that analyze area per lipid, C-H bond order parameters, X-ray scattering form factors, and principal component equilibration are also available.

Mapping files connect the universal molecule and atom naming conventions with the simulation-specific names, which makes it possible to run automatic analyses over large sets of simulations. The results of large analyses can be stored using the same structure as the README.yaml files, as done, for example, for water permeation and lipid flip-flop rates in the repository related to the NMRlipids databank manuscript.
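The idea behind the mapping can be sketched in a few lines. The dictionary layout and the atom names below are illustrative assumptions and do not reproduce the actual mapping file format; the point is only that one list of universal names can be translated into the names used by each individual simulation.

```python
from typing import Dict, List

# Hypothetical mapping for one simulation: universal atom name -> name used
# in that simulation's topology. Real mapping files may be structured differently.
EXAMPLE_MAPPING: Dict[str, str] = {
    "M_G1_M": "C1",
    "M_G2_M": "C2",
    "M_G3_M": "C3",
}


def to_simulation_names(universal_names: List[str], mapping: Dict[str, str]) -> List[str]:
    """Translate universal atom names to simulation-specific names.

    Raises KeyError listing the missing atoms, so an incomplete mapping
    is noticed before any analysis starts.
    """
    missing = [name for name in universal_names if name not in mapping]
    if missing:
        raise KeyError(f"No mapping for universal atom name(s): {missing}")
    return [mapping[name] for name in universal_names]


if __name__ == "__main__":
    print(to_simulation_names(["M_G1_M", "M_G3_M"], EXAMPLE_MAPPING))  # ['C1', 'C3']
```

An analysis written against the universal names can then be reused unchanged for every simulation that provides a mapping.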

GUI

NMRlipids Databank-GUI provides easy access to the NMRlipids Databank content through a graphical user interface (GUI). Simulations can be searched based on their molecular composition, force field, temperature, membrane properties, and quality; the search results are ranked by simulation quality, evaluated against experimental data when available. Membranes can be visualized, and properties can be compared between different simulations and experiments.

Installation

The code has been tested on Linux and macOS with Python 3.9 or newer. A recent GROMACS version should be available on the system. All dependencies are listed in requirements.txt.
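A quick way to check these prerequisites from Python is sketched below. It assumes that the GROMACS executable is called gmx, which is the usual name for standard builds; MPI builds often install gmx_mpi instead, so pass that name if needed. The function names are ours.

```python
import shutil
import sys
from typing import Optional, Tuple

MIN_PYTHON: Tuple[int, int] = (3, 9)  # minimum version stated above


def python_is_supported(version: Tuple[int, ...] = tuple(sys.version_info[:2])) -> bool:
    """Return True if the given (major, minor) version is 3.9 or newer."""
    return tuple(version[:2]) >= MIN_PYTHON


def find_gromacs(executable: str = "gmx") -> Optional[str]:
    """Return the full path of the GROMACS executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("GROMACS:", find_gromacs() or "not found on PATH")
```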

Note that the data is stored in a submodule repository and must be loaded after cloning. The default data storage is BilayerData, which is loaded automatically by running

$ git submodule update --init --remote

We recommend installing the Python libraries into a dedicated environment, for example, using conda:

 $ conda create --name databank python==3.10 'numpy<2.0' MDAnalysis periodictable -c conda-forge
 $ conda activate databank
 $ (databank) conda install tqdm pyyaml -c conda-forge

You should also install the DatabankLib package in development mode:

 $ (databank) cd Databank
 $ (databank) pip install -e .

You can also install the package in non-development mode, without -e (this is required in the Colab runtime environment). In this case, however, the package is installed into the folder with the other pip packages and does not know the path to the Data folder, so you must provide the path to the repository root by setting the environment variable NMLDB_ROOT_PATH.
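The variable can be set in the shell or from Python. The sketch below sets NMLDB_ROOT_PATH from Python and checks that it points to an existing directory. The helper name is hypothetical, and we assume (without confirmation from the package documentation) that the variable should be set before DatabankLib is imported.

```python
import os
from pathlib import Path


def set_databank_root(repo_root: str) -> Path:
    """Point NMLDB_ROOT_PATH at a local clone of the repository.

    Hypothetical helper: it only validates the path and exports the variable
    for the current process and its children.
    """
    root = Path(repo_root).expanduser().resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"Repository root not found: {root}")
    os.environ["NMLDB_ROOT_PATH"] = str(root)
    return root


if __name__ == "__main__":
    # Replace "." with the path to your clone of the repository.
    print(set_databank_root("."))
```

Failing early on a wrong path tends to give a clearer error than a missing-file message later in an analysis.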

Contribution

The project is open for contributions! For code development, please install the extended requirements listed in requirements-dev.txt:

 $ (databank) pip install -e . -r Scripts/DatabankLib/requirements-dev.txt

This installs pytest for unit tests and flake8 for syntax checks.
