Conversation

@satvshr satvshr commented Aug 25, 2025

closes #141

This PR:

  • Fixes a small bug in AptaNetPipeline
  • Makes AptaNetPipeline inherit from BaseObject to prevent errors during benchmarking
  • Removes an unnecessary test (test_pfoa); the loader is already covered by test_loaders
  • Adds the benchmarking framework

@satvshr satvshr requested a review from fkiraly September 29, 2025 20:59
@satvshr satvshr changed the title from [ENH] Benchmarking framework to [ENH] Benchmarking framework and csv loader Sep 30, 2025
from pyaptamer.datasets._loaders import load_pfoa_structure


def test_pfoa_loader():
Contributor

why are we deleting this file?

Collaborator Author

This is explained in the PR description.


fkiraly commented Sep 30, 2025

could you kindly make sure you write a good PR description in the first post?

@satvshr satvshr requested a review from fkiraly September 30, 2025 10:13

satvshr commented Sep 30, 2025

could you kindly make sure you write a good PR description in the first post?

Done.


@fkiraly fkiraly left a comment


We are making changes to AptaNetPipeline - is this required for benchmarking? Would this not interact with other PRs, e.g., #153?

I would recommend to make this a separate PR.


NennoMP commented Oct 2, 2025

While investigating #144, I noticed that the benchmark class performs cross-validation on the given dataset (training data) to generate validation splits on which the model is then evaluated. However, the purpose of cross-validation is to maximize the use of training data during model selection to select the best hyperparameters. Here there is no model selection though.

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?


satvshr commented Oct 6, 2025

is this required for benchmarking?

If you read the PR description, the BaseObject part is required for benchmarking; the bug, on the other hand, should be patched and is small, so I thought I'd include it here too.


satvshr commented Oct 6, 2025

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?

We can use StratifiedSplit if we want a train-test split for evaluation, and "generic" CV to ensure that a particular split is not the reason one model performs better than another; evaluating over n iterations also gives a more accurate result.
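The idea above can be sketched with plain scikit-learn (pyaptamer's own benchmark class and StratifiedSplit are not shown here; the estimator, dataset, and split counts below are illustrative assumptions):

```python
# Sketch: repeated stratified CV evaluation, so that no single
# train/test split decides which model looks better.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Toy stand-in for a real aptamer-protein interaction dataset
X, y = make_classification(n_samples=200, random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0)

# 3 repeats of stratified 5-fold CV -> 15 scores in total
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(len(scores))  # 15
print(scores.mean())
```

Averaging over the 15 scores smooths out split-to-split variation, which is the "more accurate result over n iterations" point.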

@satvshr satvshr requested a review from fkiraly October 8, 2025 14:01

fkiraly commented Oct 8, 2025

is this required for benchmarking?

If you read the PR description, the BaseObject part is required for benchmarking, the bug on the other hand should be patched and is small so I thought id put it here too.

What exactly is the bug?


fkiraly commented Oct 8, 2025

While investigating #144, I noticed that the benchmark class performs cross-validation on the given dataset (training data) to generate validation splits on which the model is then evaluated. However, the purpose of cross-validation is to maximize the use of training data during model selection to select the best hyperparameters. Here there is no model selection though.

Shouldn't the purpose of the benchmarking class be to train the model on training data using the best hyperparameters previously identified, and then perform a final evaluation on held-out test data (i.e., the test_li2014.csv file from AptaTrans)?

@NennoMP, re-sampling or CV can be used both in benchmarking and in tuning - it is even possible to combine both, to benchmark a model including the tuning algorithm. For benchmarking, one has to be careful about error bars etc, the CV-based error bars are not reliable (due to sample correlation from the cv splits).
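The caveat about CV-based error bars can be illustrated with a minimal sketch (generic scikit-learn, not pyaptamer's benchmark API; the dataset and model are placeholders):

```python
# Sketch: fold scores from one CV run are correlated, because every
# pair of folds shares most of its training data, so the naive
# standard error computed from them is an optimistic error bar.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=1)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)

mean = scores.mean()
# Treats the 10 fold scores as independent draws -- they are not:
# in 10-fold CV any two folds share ~8/9 of their training data.
naive_se = scores.std(ddof=1) / np.sqrt(len(scores))
print(f"{mean:.3f} +/- {naive_se:.3f} (naive, likely too small)")
```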


satvshr commented Oct 9, 2025

What exactly is the bug?

FunctionTransformer takes a dictionary as input to kw_args; the test was only passing because I was not passing anything to kw_args. Once I did, the tests started failing.


@fkiraly fkiraly left a comment


merge conflicts with main

@satvshr satvshr changed the title from [ENH] Benchmarking framework and csv loader to [ENH] Benchmarking framework Oct 14, 2025
@satvshr satvshr requested a review from fkiraly October 14, 2025 08:12

@fkiraly fkiraly left a comment


Requests have not been addressed. Repeating them here

  • May I request to add a short notebook with a small benchmarking experiment and some reasonably chosen dataset? - first change request #114 (review)
    • as said there, this is necessary for me to understand your design
  • I asked for the bugfix to be moved to a separate PR, and for a description of the bug you claim to fix. #114 (comment)


satvshr commented Oct 15, 2025

  • as said there, this is necessary for me to understand your design

#165 is waiting for the AptaTrans notebook before adding AptaTrans to it.

I asked for the bugfix to be moved to a separate PR, and for a description of the bug you claim to fix. #114 (comment)

FunctionTransformer takes a dictionary as input to kw_args; the test was only passing because I was not passing anything to kw_args. Once I did, the tests started failing.

You asked me what the bug was, not to move it to another PR. The bug is as described above and the fix is very small. I understand it is better to move it to another PR, but it's only 2 lines, and I thought it easier to get it done here than to open an issue and a PR for a 2-line fix that is not that important right now.


fkiraly commented Oct 16, 2025

you still have not explained what you think the bug is/was.


satvshr commented Oct 17, 2025

you still have not explained what you think the bug is/was.

My bad, I realised I was not clear enough. FunctionTransformer takes a dictionary as input to kw_args, but currently it's only being given a constant k as input.
The tests pass at the moment because, in the tests for AptaNet, the k value is not being passed to AptaNetPipeline, which I also fix as part of this PR.
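For context, the shape of the bug can be reproduced with a minimal sklearn example (the function add_k and the value 3 are illustrative, not AptaNetPipeline's actual internals):

```python
# Sketch of the kw_args issue: FunctionTransformer forwards kw_args as a
# dict of keyword arguments, so passing a bare constant breaks at
# transform time instead of being applied as a parameter.
import numpy as np
from sklearn.preprocessing import FunctionTransformer

def add_k(X, k=0):
    return X + k

X = np.array([[1.0], [2.0]])

# Wrong (the reported bug shape): a bare value instead of a dict
# FunctionTransformer(add_k, kw_args=3)  # errors when transforming

# Correct: wrap the parameter in a dict
ft = FunctionTransformer(add_k, kw_args={"k": 3})
print(ft.transform(X))  # [[4.] [5.]]
```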

@satvshr satvshr requested a review from fkiraly October 17, 2025 07:39
@fkiraly fkiraly merged commit 68777eb into main Oct 26, 2025
15 checks passed

Labels

enhancement New feature or request
