SugarSwitch🪄

SugarSwitch is a pipeline for protein N-glycosylation modification.

📄 Paper: This is the title for the article
🌐 Webserver: This is the name for the SugarSwitch webserver

Quick start

Installation

Clone the repository

git clone https://github.com/EricXY0706/sugarswitch.git
cd sugarswitch

Run the automated setup procedure
```
chmod +x setup.sh
./setup.sh
```

⚠️ Notice: Foldseek binary need to be downloaded from here and place it in the ./SaProt/foldseek folder.

✅ This procedure will automatically:

Create and initiate sugarswitch conda environment with Python 3.10

Install all the required dependencies

Download SaProt and SPIRED weights

Download and install PyRosetta and DSSP

Usage

Update the configrations in config.py

Must-ToDos:
➡️ Manually specify the protein chain ID to be modified with protein_chain_id and the important sites which you think the modification pipeline should ignore with functional_hotspots in user_configs

# An example

user_configs = {
 "protein_chain_id": "A", # Specify the chain ID to be modified and glycosylated.
 "functional_hotspots": [1,2,3], # e.g. [1,2,3,"4-10"], if an interval is input, both the lower and the upper bond are included.
}

Optional-ToDos:

pipeline configurations:
⚠️ We recommand not adjusting the EVCouplings related configs (EVC, unless you are familiar with the parameters) and empirically pre-defined glycan chain topological parameters (bond length, angles, and dihedrals).
✅ Conservation_threshold, evc_coupling_threshold, and sasa_cutoff can be appropriately adjusted.

# An example

pipeline_configs = {
 "Cb_interaction_threshold": 6.0,
 "num_neighbors_to_shield": 3,
 "evc_min_sequence_distance": 6,
 "evc_theta": 0.8,
 "evc_num_iterations": 100,
 "evc_lambda_h": 0.01,
 "evc_lambda_J": 0.01,
 "evc_num_cpu": 10,
 "conservation_threshold": {"loop": 0.5, "ss": 0.5}, # The residues on loop region/other secondary structures with conservation value above the given threshold will be discarded from the prefilter. The higher the thresholds, the less strict the filtering.
 "evc_coupling_threshold": 0.5, # The residues with co-evolving strength above the given threshold will be discarded from the prefilter. The higher the threshold, the less strict the filtering.
 "sasa_cutoff": 0.5, # The residues with SASA values falling in the last n% will be discarded from the prefilter. The higher the threshold, the more strict the filtering.
 "bond_length_C1_ND2": 1.43,
 "angle_C1_ND2_CG": 120.0,
 "angle_C2_C1_ND2": 109.5,
 "angle_O5_C1_ND2": 109.5,
 "dihedral_C1_ND2_CG_CB": {"loop": -120.0, "helix": -180.0, "sheet": 180.0},
 "dihedral_C2_C1_ND2_CG": {"loop": 120.0, "helix": 140.0, "sheet": 100.0},
 "dihedral_O5_C1_ND2_CG": {"loop": -120.0, "helix": -100.0, "sheet": -140.0},
}

ranker configurations:
➡️ Manually set the weights for all the factors for glycosylation sites ranking. The higher the value, the more important the factor is.

# An example

ranker_configs = {
  "conservation_weight": 1.0, # Conservation level. The next 2 values are computed by EVCouplings.
  "coupling_weight": 0.5, # Co-evolving strength.
  "sasa_weight": 1.0, # SASA of the current site. The next 5 values are computed by PyRosetta.
  "sasa_next1_weight": 0.7, # SASA of the next site to the current site.
  "sasa_next2_weight": 0.5, # SASA of the next 2nd site to the current site.
  "sasa_around_weight": 1.0, # Mean SASA value of the triad sequon around the current site.
  "sasa_next_weight": 0.6, # Mean SASA value of the triad sequon following the current site.
  "ddG_weight": 0.3, # ΔΔG of the protein mutant (NXX) and WT. The next 6 values are computed by SPIRED.
  "dTm_weight": 0.3, # ΔTm of the protein mutant (NXX) and WT
  "ddG_S_weight": 0.2, # ΔΔG of the protein mutant (NXS) and WT
  "dTm_S_weight": 0.2, # ΔTm of the protein mutant (NXS) and WT
  "ddG_T_weight": 0.2, # ΔΔG of the protein mutant (NXT) and WT
  "dTm_T_weight": 0.2, # ΔTm of the protein mutant (NXT) and WT
  "mut_score_weight": 0.3, # Mutation score of the protein mutant (NXX). The next 3 values are computed by SaProt.
  "mut_score_S_weight": 0.2, # Mutation score of the protein mutant (NXS).
  "mut_score_T_weight": 0.2, # Mutation score of the protein mutant (NXT).
 }

Run the pipeline
Run the pipeline with the following command:

python run_sugarswitch.py prefilter --input [input_fasta_file] --out_dir [output_dir]

⚠️ The input file in FASTA format should contain at least one sequence. Please note that the chain ID specified in protein_chain_id in the config.py should correspond to the number and order of the chains

# An example
# In this case, test_seq_1 will be chain A-C, and test_seq_2 will be chain D

>test_seq_1 copies: 3
MSSQIRQNYSTDVEAAVNSLVNLYLQASYTYLSLGFYFDR...
>test_seq_2 copies: 1
WNPPTFSPALLVVTEGDNATFTCSFSNTSESFVLNWYRMS...

Acknowledgements

Hopf, Thomas A., et al. "The EVcouplings Python framework for coevolutionary sequence analysis." Bioinformatics 35.9 (2019): 1582-1584.
Chaudhury, Sidhartha, Sergey Lyskov, and Jeffrey J. Gray. "PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta." Bioinformatics 26.5 (2010): 689-691.
Chen, Yinghui, et al. "An end-to-end framework for the prediction of protein structure and fitness from single sequence." Nature Communications 15.1 (2024): 7400.
Su, Jin, et al. "Saprot: Protein language modeling with structure-aware vocabulary." BioRxiv (2023): 2023-10.

Licence & Citation

License: MIT License - See LICENSE file for details
Citation: If you use SugarSwitch in your research, please cite:

@article{xxx,
  title={title},
  author={authors},
  journal={journal},
  pages={pages},
  year={2025},
  publisher={publisher}
}

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
SaProt		SaProt
Spired		Spired
boltz_ckpt		boltz_ckpt
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.py		config.py
run_sugarswitch.py		run_sugarswitch.py
setup.sh		setup.sh
util.py		util.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SugarSwitch🪄

Quick start

Installation

Usage

Acknowledgements

Licence & Citation

About

Uh oh!

Releases

Packages

Languages

License

EricXY0706/sugarswitch

Folders and files

Latest commit

History

Repository files navigation

SugarSwitch🪄

Quick start

Installation

Usage

Acknowledgements

Licence & Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages