bviggiano commented Dec 14, 2025

Hello! This branch adds a programmatic interface that I built for a project I'm working on. I've found it really helpful for calling the model from code, and thought it might be a nice addition to the codebase. I tried to keep changes to the core implementation minimal. Happy to clean this up or make changes based on feedback!

Additions

  1. Wrapper Class
    The primary addition is a wrapper class (in protein_mpnn.py) which provides a stateful API that separates model initialization from inference:
  • Model weights are loaded once during init, then reused across multiple calls
  • Flexible input: accepts either a PDB file path or PDB content as a string (no disk I/O required)
  • Same functionality as the CLI

Example usage:

from protein_mpnn import ProteinMPNN

model = ProteinMPNN(model_name='v_48_020', device='cuda')

# From file path
results = model.sample(pdb_path_or_str='protein.pdb', num_seq_per_target=10)

# From PDB string (no file I/O)
pdb_content = fetch_pdb_from_database()  # or any string source
results = model.sample(pdb_path_or_str=pdb_content, num_seq_per_target=10)
  2. Pip install
    I also made the package pip-installable. Installation automatically downloads the weights to the correct folders, so users can import the wrapper class without cloning the repo.

Modifications

The only change to the original source code is in protein_mpnn_utils.py, where I modified parse_PDB() to accept PDB content strings in addition to file paths. The change is backward compatible: existing calls that pass a file path behave as before.
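
For reviewers, here is a minimal sketch of how a single argument can be dispatched between a file path and raw PDB text. It is not the code in this branch: the helper name read_pdb_lines and the newline heuristic are assumptions made for illustration only, and parse_PDB() may distinguish the two cases differently.

```python
import os
from typing import List


def read_pdb_lines(pdb_path_or_str: str) -> List[str]:
    """Return PDB lines from either a file path or raw PDB text.

    Hypothetical helper for illustration; not part of protein_mpnn_utils.py.
    """
    # Assumed heuristic: PDB content spans multiple lines, while a usable
    # file path does not contain a newline.
    if "\n" in pdb_path_or_str:
        return pdb_path_or_str.splitlines()

    # Single-line input: treat it as a path and read the file from disk.
    if os.path.isfile(pdb_path_or_str):
        with open(pdb_path_or_str, "r") as handle:
            return handle.read().splitlines()

    raise FileNotFoundError(f"No such PDB file: {pdb_path_or_str!r}")
```

With a dispatch like this, callers that already pass a path see no change, which is what keeps the modification backward compatible.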

Tests

Added tests to verify that the pip installation works, and a CI workflow that checks that changes pushed to the repo do not break this behavior.
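
As a rough illustration of the kind of check an install smoke test can make (not the actual tests in this branch), the sketch below asks Python whether a top-level module can be found without importing it. The helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current path.

    Hypothetical helper for illustration; it does not import the module,
    so it cannot detect errors raised at import time.
    """
    return importlib.util.find_spec(module_name) is not None
```

After a pip install in CI, a test could assert is_importable("protein_mpnn") and then construct the wrapper to confirm that the weights were found.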
