[ENH] making `AptaTrans` sklearn-like and notebook for `Benchmarking` #165

satvshr · 2025-09-30T05:41:31Z

Stacks on #114
closes #164 and closes #190

Adds a tutorial notebook for benchmarking.


satvshr · 2025-11-09T10:50:05Z

@NennoMP for whenever you have time, given I think I have done most of what I could do:

1. Check out `predict`: it doesn't use Trainer.test; at the moment it uses MCTS' recommend.
2. If you run the current benchmarking notebook you will get this error, which is something to look into: RuntimeError: The size of tensor a (0) must match the size of tensor b (128) at non-singleton dimension 1
3. Update the tests to match the present pipeline implementation, if you find the current work satisfactory.

NennoMP · 2025-11-09T18:32:08Z

> @NennoMP for whenever you have time, given I think I have done most of what I could do:
>
> 1. Check out `predict`: it doesn't use Trainer.test; at the moment it uses MCTS' recommend.
> 2. If you run the current benchmarking notebook you will get this error, which is something to look into: RuntimeError: The size of tensor a (0) must match the size of tensor b (128) at non-singleton dimension 1
> 3. Update the tests to match the present pipeline implementation, if you find the current work satisfactory.

I'll give it a look this week.

Where is bug (2.) occurring specifically? Can you provide the traceback? The AptaTrans notebook is working on my side, so the problem could be in the benchmark class and/or the benchmark notebook.

satvshr · 2025-11-09T18:45:56Z

> Where is bug (2.) occurring specifically? Can you provide the traceback? The AptaTrans notebook is working on my side, so the problem could be in the benchmark class and/or the benchmark notebook.

I do not think I implemented AptaTrans correctly😅 It is failing before benchmarking in this cell in the benchmarking notebook:

# specify the target protein sequence here
target_protein = (
    "STEYKLVVVGADGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAM"
    "RDQYMRTGEGFLCVFAINNTKSFEDIHHYREQIKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQDLARSYGIPFIETSAKTR"
    "QGVDDAFYTLVREIRKHKEKMSK"
)

pipeline = AptaTransPipeline(
    device=device,
    model=model,
    prot_words=prot_words,
    depth=1,  # depth of the search (i.e., length of generated candidates)
    n_iterations=1,  # higher is better but slower, suggested: 1000
)
candidates = pipeline.recommend(
    target=target_protein,
    n_candidates=1,  # number of candidates to generate
    verbose=True,
)

with this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[8], line 15
      2 target_protein = (
      3     "STEYKLVVVGADGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAM"
      4     "RDQYMRTGEGFLCVFAINNTKSFEDIHHYREQIKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQDLARSYGIPFIETSAKTR"
      5     "QGVDDAFYTLVREIRKHKEKMSK"
      6 )
      8 pipeline = AptaTransPipeline(
      9     device=device,
     10     model=model,
   (...)     13     n_iterations=1,  # higher is better but slower, suggested: 1000
     14 )
---> 15 candidates = pipeline.recommend(
     16     target=target_protein,
     17     n_candidates=1,  # number of candidates to generate
     18     verbose=True,
     19 )

File c:\Users\satvm\miniconda3\envs\pyaptamer-latest\Lib\site-packages\torch\utils\_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    113 @functools.wraps(func)
    114 def decorate_context(*args, **kwargs):
    115     with ctx_factory():
--> 116         return func(*args, **kwargs)
...
---> 85 out = x + self.pe[:, : x.shape[1], :]
     86 if self.dropout:
     87     out = self.dropout(out)

RuntimeError: The size of tensor a (0) must match the size of tensor b (128) at non-singleton dimension 1

satvshr · 2025-11-13T12:39:04Z

I think we can close #166 with this too, @NennoMP?

NennoMP · 2025-11-18T16:48:29Z

> Check out `predict`: it doesn't use Trainer.test; at the moment it uses MCTS' recommend.

This method was intended as a quick, easy way to evaluate a single (candidate, target protein) pair in string format. You could keep it and simply rename it to evaluate(...). Then add a new method predict(...) that holds the Trainer.test(...) logic (you can get this from the notebook). As I mentioned, the pipeline for AptaTrans wasn't designed for the same purpose as the pipeline for AptaNet.
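
A rough sketch of that split, purely for illustration. `PipelineSketch`, `score_fn`, and the 0.5 `threshold` are hypothetical and not part of pyaptamer; in the real class, `predict` would hold the Trainer.test(...) logic from the notebook rather than looping over `evaluate`.

```python
class PipelineSketch:
    """Hypothetical stand-in for the AptaTrans pipeline (not the real class)."""

    def __init__(self, score_fn):
        # score_fn: callable(candidate: str, target: str) -> float.
        # Placeholder for the trained model's scoring step (assumption).
        self.score_fn = score_fn

    def evaluate(self, candidate, target):
        """Score one (candidate, target protein) pair given as strings."""
        return float(self.score_fn(candidate, target))

    def predict(self, X, threshold=0.5):
        """sklearn-style batch prediction over (candidate, target) pairs.

        Returns one 0/1 label per pair; the threshold is an assumption.
        """
        return [int(self.evaluate(c, t) >= threshold) for c, t in X]


# Toy usage with a dummy scoring function based on candidate length.
pipeline = PipelineSketch(lambda candidate, target: len(candidate) / 10)
labels = pipeline.predict([("ACGUACGU", "MK"), ("AC", "MK")])
```

Keeping `evaluate` for single string pairs and `predict` for batches would let the benchmarking code call `predict` the same way it does for other sklearn-like estimators.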

> If you run the current benchmarking notebook you will get this error, which is something to look into: RuntimeError: The size of tensor a (0) must match the size of tensor b (128) at non-singleton dimension 1

I just ran this and you are right. The issue comes from setting the MCTS tree search depth to 1 (i.e., depth=1). Specifically, it seems the positional encodings in AptaTrans break for depth <= 2. I will try to look into this separately; for now, just use depth >= 3.

This isn't really an issue though. I think you would never want an aptamer candidate as short as length 1 or 2.

File \pyaptamer\pyaptamer\aptatrans\layers\_encoder.py:85, in PositionalEncoding.forward(self, x)
     68 """Forward pass.
     69 
     70 Parameters
   (...)     79     positional encodings applied.
     80 """
     81 assert x.shape[1] <= self.max_len, (
     82     f"Input sequence length {x.shape[1]} exceeds maximum length {self.max_len}."
     83 )
---> 85 out = x + self.pe[:, : x.shape[1], :]
     86 if self.dropout:
     87     out = self.dropout(out)

RuntimeError: The size of tensor a (0) must match the size of tensor b (128) at non-singleton dimension 1
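
Until that is fixed, a guard along these lines could fail early with a clearer message than the tensor-size error above. This is only a sketch: `check_depth` and `MIN_DEPTH` are hypothetical names, not part of pyaptamer, and the minimum of 3 is taken from the observation that depth <= 2 currently fails.

```python
MIN_DEPTH = 3  # assumption: depth <= 2 currently breaks the positional encodings


def check_depth(depth, min_depth=MIN_DEPTH):
    """Validate the MCTS search depth before building the pipeline.

    Returns the depth unchanged if it is acceptable, otherwise raises.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(depth, bool) or not isinstance(depth, int):
        raise TypeError(f"depth must be an int, got {type(depth).__name__}")
    if depth < min_depth:
        raise ValueError(
            f"depth={depth} is not supported at the moment; use depth >= {min_depth}"
        )
    return depth


# Example: validate the value before passing it on as depth=...
depth = check_depth(3)
```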

By the way, the branch is a few commits behind main.

satvshr · 2025-11-18T20:26:35Z

Oh, what I meant to imply at the last meeting was that you could take over on this 😅 The points I listed were the things still left to do, for you to address. If you cannot pick this up, let me know and I will try to have a look at it; I suggested it because I felt it would be a much easier task for you than for me.

Edit: @NennoMP I just realised we are yet to make AptaTrans regression-friendly! (maybe part of another PR though)

satvshr added 30 commits July 7, 2025 13:21

Added the pseaac encoding algorithm

e37135c

Added Aptanet implementation

6ea5ff7

Made pseaac to a class and made the functions private, still working on tests and bug fixing

a5f01e0

Made a few readability changes

3773a90

Edited tests

9b9a3da

Added pytest to tests

2dfe0c7

Added numpy style docstrings and ruff formatting

1e182d3

Added docstrings, made functions pvt and made code more clean

20d7e37

Removed AptaNet from root

fc2f051

Added example

62f6c42

Removed AptaNet from root

848fc9b

Made requested changes

1515efe

Merge branch 'main' into issue28

75d4efb

Made requested changes and updated tests

733f908

Made suggested changes

04ab599

Removed lint. from pyproject, will push it as a separate PR

dc78e44

Refactored code

c347988

Added pandas as a dependancy

d9537f4

Renamed parent folder name to put it in the same level as AptaNet

1c46c55

Merge remote-tracking branch 'origin/main' into issue13

a716872

Refactored code and made architecture flexible

7781441

Edited docstrings and directory structure

e762cc8

Merge branch 'main' into issue28

e844d4f

weird rename experiment

f9392ef

weird rename experiment pt. 2

beb45ec

Made requested changes

d603d07

Made requested changes

6ecf576

Made requested changes

b91c511

chore: dummy commit to retrigger CI

b2428b0

Added missing init file to utils

2982954

satvshr added 8 commits September 28, 2025 02:32

Update _csv_loader.py

51ccc41

Update pyproject.toml

4dd92bd

Merge branch 'main' into issue109

bb80a81

fixed docstring and example

18833f0

Merge branch 'issue150' into issue164

9949dbc

Create benchmarking_tutorial.ipynb

d7b72cd

Revert "Merge branch 'issue150' into issue164"

50d1279

This reverts commit 9949dbc, reversing changes made to 18833f0.

Update pyproject.toml

08f5ee6

fkiraly assigned satvshr Sep 30, 2025

satvshr mentioned this pull request Oct 15, 2025

[ENH] Benchmarking framework #114

Merged

merged main

67790be

satvshr changed the title ~~[ENH] Notebook for Benchmarking~~ [ENH] making AptaTrans sklearn-like and notebook for Benchmarking Nov 6, 2025

satvshr added 3 commits November 6, 2025 13:22

AptaTrans changes start

d0aa150

Update benchmarking_tutorial.ipynb

0d57e1f

Delete test_csv_loader.py

94c6c64

satvshr marked this pull request as draft November 6, 2025 12:57

satvshr added 5 commits November 9, 2025 02:55

Updated pipeline

5034a6f

Update _pipeline.py

4f0d94b

Update _pipeline.py

96a0043

Update _pipeline.py

e18456f

Updated pipeline and notebook

9bbf35f

satvshr assigned NennoMP Nov 9, 2025

NennoMP marked this pull request as ready for review November 18, 2025 16:33

NennoMP marked this pull request as draft November 18, 2025 16:34
