This repository is the fixed snapshot used to reproduce the figures and analysis from our publication in the Journal of Chemical Theory and Computation (JCTC).
Paper (JCTC):
- Guiding Peptide Conformational Kinetics via Collective-Variable Control of Free-Energy Barriers
https://pubs.acs.org/doi/10.1021/acs.jctc.6c00418
Preprint:
We approach peptide kinetic engineering using HLDA-based collective variables within the CV-FEST framework, constructed only from short simulations confined to folded and unfolded states.
This provides a data-efficient way to model and control free-energy surfaces and barrier heights, enabling prediction of mutation-dependent kinetics and guiding rational peptide design from local fluctuations alone.
- Create and activate the environment:

  ```bash
  conda env create -f environment.yml
  conda activate protein-fes
  pip install -e .
  ```

- Unpack archived data:

  ```bash
  ./scripts/unpack_data.sh
  ```

- Run notebooks for paper figures:
  - Open notebooks in `src/paper_plots/`
  - Execute the required notebooks to regenerate plots
Data is stored as split archives in `data_archives/`.

- `data_core.zip` contains shared analysis assets, including the preprocessed MFPT files `data/mfpt_threshold_summaries_ref.pkl` and `data/mfpt_samples_pace25000_ref.pkl`
- `hlda_trajectories_*.zip` contains per-mutant trajectory cache data under `data/hlda_trajectories/`
Short description of the MFPT files used in the paper workflow:

- `mfpt_samples_pace25000_ref.pkl`: dictionary keyed by mutant, then threshold, containing per-run MFPT samples from the `PACE=25000` setup (typically about 200 runs per mutant/threshold; a few entries have slightly fewer due to missing/failed runs).
- `mfpt_threshold_summaries_ref.pkl`: dictionary keyed by MFPT threshold (`lim`), each value a per-mutant summary DataFrame used by the notebooks (columns include, for example, `mfpt`, `lambda`, `tF`, `tU`, `residue_idx`, `property_grp`, `Tm`, `dTm`, `nF`, `nU`). This summary includes HLDA-derived quantities (for example `lambda`, `tF`, `tU`) via `hlda_lambda_grid.pkl`, which is computed from `data/hlda_trajectories/` by `src/common/hlda_utils.py`.
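Based on the nesting described above (mutant → threshold → per-run MFPT samples), accessing the samples file looks roughly like the sketch below. The mutant names, threshold values, and MFPT numbers here are invented for illustration, not taken from the repository data:

```python
import statistics

# Illustrative stand-in for the structure of mfpt_samples_pace25000_ref.pkl:
# mutant -> MFPT threshold -> list of per-run MFPT samples (all values made up).
mfpt_samples = {
    "A12G": {0.1: [4.2e5, 3.9e5, 4.5e5], 0.2: [7.1e5, 6.8e5]},
    "WT":   {0.1: [2.0e5, 2.3e5, 1.9e5], 0.2: [3.6e5, 3.4e5]},
}

# Mean MFPT per mutant at a fixed threshold.
threshold = 0.1
mean_mfpt = {
    mutant: statistics.mean(runs[threshold])
    for mutant, runs in mfpt_samples.items()
}
print(mean_mfpt)
```

The real file holds roughly 200 samples per mutant/threshold instead of the handful shown here, but the traversal pattern is the same.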
If you need to rebuild archives from an unpacked `data/` tree:

```bash
./scripts/pack_data.sh
```

You can reproduce MFPT-based results in two ways:
- Generate MFPT samples from FPT simulations using `src/fpt_plumed/templates` (for example through `src/fpt_single_run.sh`).
- Use the preprocessed MFPT files from `data_core.zip` (recommended for paper reproduction): `data/mfpt_threshold_summaries_ref.pkl` and `data/mfpt_samples_pace25000_ref.pkl`
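For the second route, the preprocessed files are plain pickles and can be loaded with the standard library. A minimal sketch; the existence guard (returning `None` when the archives have not been unpacked yet) is our addition, not repository code:

```python
import pickle
from pathlib import Path

def load_pickle(path: Path):
    """Return the unpickled object, or None if data/ has not been unpacked yet."""
    if not path.exists():
        return None
    with path.open("rb") as f:
        return pickle.load(f)

# File names from the repository layout described above.
mfpt_samples = load_pickle(Path("data/mfpt_samples_pace25000_ref.pkl"))
mfpt_summaries = load_pickle(Path("data/mfpt_threshold_summaries_ref.pkl"))
```

Both objects are dictionaries once loaded: the samples keyed by mutant, the summaries keyed by MFPT threshold.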
HLDA grid generation is implemented in `src/common/hlda_utils.py`.

- `compute_lambda_grid(...)` loads folded/unfolded COLVAR data for each mutant from `data/hlda_trajectories/`, sweeps `(tF, tU)` RMSD thresholds, prunes highly correlated descriptors (Spearman), and computes HLDA weights/eigenvalue (`lambda`) per grid point.
- `load_lambda_grid(...)` is the notebook-facing entrypoint: it loads cached results from `data/hlda_lambda_grid.pkl` if present; otherwise it computes and caches them.
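The Spearman-based descriptor pruning step can be sketched as below. This is a simplified stand-alone version, not the repository's implementation; the correlation threshold, column names, and greedy keep-first strategy are assumptions for illustration:

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def prune_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Greedily drop descriptor columns whose |Spearman rho| with an
    already-kept column exceeds the threshold."""
    rho = np.abs(spearmanr(df.values).correlation)  # pairwise |rho| matrix
    keep = []
    for j in range(len(df.columns)):
        if all(rho[j, k] <= threshold for k in keep):
            keep.append(j)
    return df.iloc[:, keep]

# Toy descriptors: d2 is a monotone transform of d1 (Spearman rho = 1),
# so it is pruned; d3 is independent noise and survives.
rng = np.random.default_rng(0)
d1 = rng.normal(size=200)
df = pd.DataFrame({"d1": d1, "d2": np.exp(d1), "d3": rng.normal(size=200)})
pruned = prune_correlated(df)
print(list(pruned.columns))
```

Spearman correlation is rank-based, so perfectly monotone (even nonlinear) relationships like `d2 = exp(d1)` are detected as fully redundant.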
Minimal usage pattern (the same flow used by the paper notebooks):

```python
from pathlib import Path

from common.hlda_utils import load_lambda_grid

data_dir = Path("data")
lambda_grid = load_lambda_grid(
    cache_path=data_dir / "hlda_lambda_grid.pkl",
    base_dir=data_dir / "hlda_trajectories",
    force=False,
)
```

Set `force=True` to recompute the HLDA grid from raw trajectory-derived data.
- `src/paper_plots/`: notebooks that generate paper plots
- `src/fpt_plumed/`: PLUMED templates for FPT workflows
- `scripts/unpack_data.sh`: restore `data/` from `*.zip` archives
- `scripts/pack_data.sh`: rebuild split data archives
- `data_archives/`: committed paper snapshot data archives
See `CITATION.cff` for software and paper citation metadata.
- Code: MIT (`LICENSE`)
- Paper/manuscript materials: CC BY-NC-ND 4.0 (`LICENSE-paper`)