miRBind 2.0

Deep-learning models for predicting miRNA–mRNA interactions.

This repository ships two models:

Pairwise binding-site model — a CNN that predicts whether a given miRNA binds a given target site (≈50 nt window). Use this to score candidate binding sites.
Gene-level repression model — predicts the gene-level fold change a miRNA induces from a full 3'UTR sequence. Built on top of the binding-site model via transfer learning.

Installation

Clone the repo and install the dependencies (Python ≥ 3.9, PyTorch ≥ 1.9):

git clone https://github.com/BioGeMT/miRBind_2.0.git
cd miRBind_2.0
pip install -r code/pairwise_binding_site_model/requirements.txt

A GPU is recommended but not required — the models will fall back to CPU automatically.
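
Before installing, it can help to confirm that the interpreter and any already-installed PyTorch meet the minimums stated above. The following is a minimal sketch using only the standard library; the helper names `parse_major_minor` and `check_environment` are hypothetical and are not part of this repository.

```python
import sys
from importlib import metadata


def parse_major_minor(version: str) -> tuple:
    """Return (major, minor) from a version string such as '2.1.0+cu118'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


def check_environment(min_python=(3, 9), min_torch=(1, 9)) -> dict:
    """Report whether Python and the installed PyTorch meet the stated minimums."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        # PyTorch is not installed yet; requirements.txt should pull it in.
        report["torch_installed"] = False
        report["torch_ok"] = False
    else:
        report["torch_installed"] = True
        report["torch_ok"] = parse_major_minor(torch_version) >= min_torch
    return report


if __name__ == "__main__":
    print(check_environment())
```

The check reads package metadata rather than importing torch, so it runs quickly and works even when PyTorch is absent.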

Quick start: predicting miRNA binding sites

The trained binding-site model is included in models/pairwise_onehot_model_20260105_200141.pt.

1. Prepare your input

Provide a TSV file with at least these columns:

column 0 (target/mRNA)	column 1 (miRNA)	label
`TTTTTTTT...GACAGTGG`	`TGTGCAAATCTATGCAAAACTGA`	0

The label column is required by the data loader but is ignored at inference. Set it to 0 if you don't have ground truth. A small example is provided in data/chimeric_datasets/sample_dataset/.
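
An input file in this layout could be generated along the following lines. This is a sketch under stated assumptions: the header names `target`, `mirna`, and `label` are hypothetical (check the sample dataset for the names the loader actually expects), the helper names are illustrative, and mapping `U` to `T` and allowing `N` are assumptions based on the DNA-alphabet example above.

```python
import csv

# Assumed alphabet: the example sequences above use T rather than U.
DNA_ALPHABET = set("ACGTN")


def normalize(seq: str) -> str:
    """Uppercase a sequence and map RNA 'U' to 'T' (an assumption, see above)."""
    seq = seq.strip().upper().replace("U", "T")
    unexpected = set(seq) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def write_sites_tsv(pairs, path, header=("target", "mirna", "label")):
    """Write (target_site, mirna) pairs as TSV rows with a placeholder label of 0."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # Hypothetical column names; match them to the sample dataset.
        writer.writerow(header)
        for target, mirna in pairs:
            writer.writerow([normalize(target), normalize(mirna), 0])
```

For example, `write_sites_tsv([(site, "TGTGCAAATCTATGCAAAACTGA")], "your_sites.tsv")` writes one row per candidate site, where `site` is your own ≈50 nt target window.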

2. Run inference

cd code/pairwise_binding_site_model

python -m inference.predict \
    --model_path ../../models/pairwise_onehot_model_20260105_200141.pt \
    --input_file path/to/your_sites.tsv \
    --output_file predictions.tsv \
    --model_type pairwise_onehot \
    --batch_size 32
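
To score several input files in one go, the same command can be driven from Python. The sketch below uses only the flags shown above; `build_predict_command` and `predict_all` are hypothetical helpers, not part of the repository, and the output naming scheme is an assumption.

```python
import subprocess
import sys
from pathlib import Path

# Relative to code/pairwise_binding_site_model, as in the command above.
MODEL_PATH = "../../models/pairwise_onehot_model_20260105_200141.pt"


def build_predict_command(input_file, output_file, batch_size=32):
    """Assemble the documented inference.predict invocation as an argument list."""
    return [
        sys.executable, "-m", "inference.predict",
        "--model_path", MODEL_PATH,
        "--input_file", str(input_file),
        "--output_file", str(output_file),
        "--model_type", "pairwise_onehot",
        "--batch_size", str(batch_size),
    ]


def predict_all(input_dir, output_dir, workdir="code/pairwise_binding_site_model"):
    """Run inference once per *.tsv file in input_dir (hypothetical batch helper)."""
    # Resolve to absolute paths because the subprocess runs from workdir.
    output_dir = Path(output_dir).resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    for tsv in sorted(Path(input_dir).resolve().glob("*.tsv")):
        out = output_dir / f"{tsv.stem}_predictions.tsv"
        subprocess.run(build_predict_command(tsv, out), cwd=workdir, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and `check=True` stops the batch at the first failed run.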

The output TSV is your input plus two columns:

prediction_score — binding probability in [0, 1]
predicted_class — 1 if prediction_score > 0.5, else 0
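
If the default 0.5 cutoff does not suit your use case, the scores can be re-thresholded after the fact. A minimal sketch, assuming the output TSV has a header row containing the `prediction_score` column described above; `load_predictions` and `classify` are hypothetical helper names.

```python
import csv


def load_predictions(path):
    """Read the output TSV into a list of dicts with a float prediction_score."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    for row in rows:
        row["prediction_score"] = float(row["prediction_score"])
    return rows


def classify(rows, threshold=0.5):
    """Return 0/1 calls; threshold=0.5 mirrors the predicted_class rule above."""
    return [1 if row["prediction_score"] > threshold else 0 for row in rows]
```

For example, `classify(load_predictions("predictions.tsv"), threshold=0.8)` keeps only higher-confidence sites. The comparison is strict, so a score exactly equal to the threshold is called 0.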

There is also a ready-to-edit wrapper script at analysis/pairwise_binding_site_model/inference.sh.

Quick start: predicting gene-level repression

See analysis/gene_level_model/README.md for the full walkthrough. Briefly:

# install gene-level model dependencies
pip install -r analysis/gene_level_model/requirements.txt

# download the training/eval data
bash analysis/gene_level_model/download_data.sh

# evaluate on a test set (or train your own — see analysis/gene_level_model/train.sh)
bash analysis/gene_level_model/evaluate.sh

The gene-level model takes a full 3'UTR sequence (up to several thousand nt) and a miRNA sequence and predicts a scalar fold change.

Explainability

The binding-site model supports SHAP-based attribution (via Captum's GradientShap). See code/pairwise_binding_site_model/README.md for the SHAP, clustering, and aggregation pipelines.

Downloading the public datasets

To reproduce the published results or train from scratch:

bash data/scripts/run_zenodo_downloader.sh

This pulls the AGO2 eCLIP Manakov 2022 train / test / leftout splits from Zenodo into data/chimeric_datasets/.

Repository layout

code/ — model definitions, encoders, training and inference scripts.
analysis/ — runnable wrapper scripts (train.sh, inference.sh, etc.) for each model.
data/ — placeholder; populated by the download scripts above.
models/ — trained model checkpoints.

Models leaderboard

We track model performance on the Manakov22 test and leftout datasets. Models are ranked by the sum of their Average Precision (AP) scores, AP(test) + AP(leftout).

Rank	Model	AP(test)	AP(leftout)	Checkpoint	Code	Date	Authors
1	Pairwise encoding with conservation (+2 channels)	85.93	82.26	model	code	2025-03-27	Dimos, David, Panos
2	Pairwise encoding CNN	84.97	83.08	model	code	2025-03-19	David, Panos
3	Retrained miRBind CNN (miRBench)	84.00	81.00	—	—	2025-03-19	Eva
4	TargetScanCNN	77.00	76.00	—	—	2025-03-19	TargetScan
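
The ranking can be reproduced from the AP values in the table. The sketch below is illustrative only; `rank_by_ap_sum` is a hypothetical helper, and the entries are deliberately listed out of order to show the sort.

```python
# (model name, AP(test), AP(leftout)) copied from the table above.
leaderboard = [
    ("TargetScanCNN", 77.00, 76.00),
    ("Pairwise encoding CNN", 84.97, 83.08),
    ("Retrained miRBind CNN (miRBench)", 84.00, 81.00),
    ("Pairwise encoding with conservation (+2 channels)", 85.93, 82.26),
]


def rank_by_ap_sum(entries):
    """Sort (name, ap_test, ap_leftout) tuples by descending AP(test) + AP(leftout)."""
    return sorted(entries, key=lambda e: e[1] + e[2], reverse=True)


for rank, (name, ap_test, ap_leftout) in enumerate(rank_by_ap_sum(leaderboard), start=1):
    print(f"{rank}\t{name}\t{ap_test + ap_leftout:.2f}")
```

Ranking by the sum explains why the conservation model sits above the plain pairwise CNN despite a lower AP(leftout): its combined score is 168.19 against 168.05.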

Transcript repression predictions

Transcript repression predictions are stored on Google Drive.

Citation

If you use miRBind 2.0 in your work, please cite the corresponding manuscript: miRBind2 enables sequence-only prediction of miRNA binding and transcript repression.
