
X-lab-3D/pmhc-diffusion-model


pMHC diffusion model

This diffusion model predicts the structure of peptides (p) bound within the pocket of a major histocompatibility complex (MHC).

  1. It predicts C-alpha positions, starting from Gaussian noise.
  2. It predicts the orientation of the N, C and C-beta atoms around each C-alpha, starting from a random rotation quaternion.
  3. It predicts the torsion angles of the side chain atoms, starting from random angles.
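The three random starting states listed above can be sketched with NumPy as follows. This is only an illustration, not code from this repository: the function name `sample_start_state`, the unit-variance Gaussian for the C-alpha positions, and the choice of 7 torsion angles per residue (taken from the `torsion_angles_sin_cos` shape in the input format) are all assumptions.

```python
import numpy as np


def sample_start_state(peptide_length, n_torsions=7, seed=None):
    """Draw a random starting state for a peptide (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # 1. C-alpha positions: one 3D point per residue, standard normal noise.
    #    (Assumption: unit variance; the model's actual scale is not stated.)
    ca_positions = rng.standard_normal((peptide_length, 3))

    # 2. Backbone orientation: normalizing a 4D Gaussian sample yields a
    #    unit quaternion that is uniformly distributed over rotations.
    quaternions = rng.standard_normal((peptide_length, 4))
    quaternions /= np.linalg.norm(quaternions, axis=-1, keepdims=True)

    # 3. Torsion angles: uniform in [-pi, pi), stored as (sin, cos) pairs.
    torsions = rng.uniform(-np.pi, np.pi, (peptide_length, n_torsions))
    torsion_sin_cos = np.stack([np.sin(torsions), np.cos(torsions)], axis=-1)

    return ca_positions, quaternions, torsion_sin_cos
```

For a 9-residue peptide, `sample_start_state(9)` returns arrays of shape (9, 3), (9, 4) and (9, 7, 2). The network would then denoise each of these step by step.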

dependencies

  • PyTorch 2.0.0
  • h5py 3.11.0
  • NumPy 2.0.0
  • Biopython 1.84
  • OpenFold 0.0.1

input data

The input data is in HDF5 format, as in SwiftMHC (https://github.com/x-lab-3d/swiftmhc). The output of SwiftMHC preprocessing can be used directly as input data.

Format:

HDF5 file:
 |
 + -- complex 1:
 |     + -- name (str)
 |     |
 |     + -- peptide
 |     |     |
 |     |     + -- backbone_rigid_tensor (P x 4 x 4)
 |     |     + -- aatype (P)
 |     |     + -- sequence_onehot (P x 22)
 |     |     + -- torsion_angles_sin_cos (P x 7 x 2)
 |     |     + -- torsion_angles_mask (P x 7)
 |     |
 |     + -- protein
 |           |
 |           + -- backbone_rigid_tensor (M x 4 x 4)
 |           + -- aatype (M)
 |           + -- sequence_onehot (M x 22)
 |           + -- atom14_gt_positions (M x 14 x 3)
 |           + -- atom14_gt_exists (M x 14)
 |           + -- cross_residues_mask (M)
 + -- complex 2:
       + ....

Where M is the number of amino acids in the MHC and P is the number of amino acids in the peptide.
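To check that a file follows the layout above before training, a shape check along these lines may help. It is a sketch, not part of this repository: `EXPECTED`, `check_complex` and `check_file` are hypothetical names, and the check assumes that every top-level key in the file is one complex. Only standard h5py calls are used (`h5py.File`, group indexing, `Dataset.shape`).

```python
# Expected datasets per group. "N" stands for the residue count of that group
# (P for the peptide, M for the MHC protein).
EXPECTED = {
    "peptide": {
        "backbone_rigid_tensor": ("N", 4, 4),
        "aatype": ("N",),
        "sequence_onehot": ("N", 22),
        "torsion_angles_sin_cos": ("N", 7, 2),
        "torsion_angles_mask": ("N", 7),
    },
    "protein": {
        "backbone_rigid_tensor": ("N", 4, 4),
        "aatype": ("N",),
        "sequence_onehot": ("N", 22),
        "atom14_gt_positions": ("N", 14, 3),
        "atom14_gt_exists": ("N", 14),
        "cross_residues_mask": ("N",),
    },
}


def check_complex(entry):
    """Return (P, M) if the entry matches the layout, else raise ValueError."""
    lengths = {}
    for group_name, datasets in EXPECTED.items():
        group = entry[group_name]
        # Take the residue count from aatype and compare everything to it.
        n = group["aatype"].shape[0]
        for name, template in datasets.items():
            expected = tuple(n if dim == "N" else dim for dim in template)
            actual = tuple(group[name].shape)
            if actual != expected:
                raise ValueError(
                    f"{group_name}/{name}: expected {expected}, got {actual}"
                )
        lengths[group_name] = n
    return lengths["peptide"], lengths["protein"]


def check_file(path):
    """Check every complex in an HDF5 file; returns {complex key: (P, M)}."""
    import h5py  # imported here so check_complex works without h5py

    with h5py.File(path, "r") as hdf5_file:
        return {key: check_complex(hdf5_file[key]) for key in hdf5_file}
```

Calling `check_file("train_set.hdf5")` would return the peptide and MHC lengths per complex, or raise on the first dataset whose shape deviates from the layout.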

training

To train for 100 epochs with 1000 noise steps (default):

$ python optimize.py train_set.hdf5 100 model.pth

A pretrained model is already included with this repository. It's named model.pth.
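To illustrate what 1000 noise steps mean for the C-alpha positions, here is a minimal sketch of a standard DDPM-style forward process. The schedule actually used by `optimize.py` is not documented here, so the linear beta schedule, its end points, and the function names are assumptions made only for illustration.

```python
import numpy as np


def linear_beta_schedule(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed; common DDPM default values)."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    # alpha_bar_t is the cumulative product of (1 - beta) up to step t.
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Noise clean positions x0 to step t.

    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps


# Usage: noise the C-alpha positions of a 9-residue peptide to the last step.
rng = np.random.default_rng(0)
betas, alpha_bars = linear_beta_schedule()
x0 = np.zeros((9, 3))  # placeholder coordinates
x_t, eps = add_noise(x0, len(betas) - 1, alpha_bars, rng)
```

In this kind of setup the network is typically trained to recover `eps` (or `x0`) from `x_t` and `t`. At the final step almost none of the original signal remains, which is why sampling can start from pure Gaussian noise.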

testing

To sample structures from a pretrained model with 1000 noise steps (default) for an unseen test set:

$ python test.py model.pth test_set.hdf5

This will automatically create a directory named test_set-sampled. The structures will be stored in this directory.
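When scripting around `test.py`, the output directory name can be derived from the test set file name. The helper below is a hypothetical sketch: it assumes the name is simply the file name without its extension plus `-sampled`, as in the example above. Where the directory is created and which file format the structures use are not stated here, so the listing step makes no assumption about either.

```python
from pathlib import Path


def sampled_dir_name(test_set_path):
    """Directory name assumed to be created for a given test set file."""
    return Path(test_set_path).stem + "-sampled"


def list_sampled_files(test_set_path, parent="."):
    """List whatever files exist in the sampled directory under `parent`."""
    directory = Path(parent) / sampled_dir_name(test_set_path)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())
```

For example, `sampled_dir_name("test_set.hdf5")` gives `"test_set-sampled"`.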
