Closed
Description
We already have some tests for data preprocessing. However, those are more integration tests that capture the behaviour of the tool as a whole than unit tests for specific functions.
In order to efficiently test the different preprocessing functionalities, we need to add some smaller-scale unit tests. Those should not include real data, but sample input values that can be generated from scratch.
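As a rough illustration of what such a small-scale test could look like, here is a minimal sketch using `unittest` with inputs built from scratch. The function `pad_ragged` is a hypothetical stand-in, not part of the code base; the real tests would import the actual function under test.

```python
import unittest


def pad_ragged(rows, pad_value=0):
    """Hypothetical stand-in for a function under test: pad rows to equal length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [pad_value] * (width - len(row)) for row in rows]


class TestPadRagged(unittest.TestCase):
    def test_pads_short_rows(self):
        # Sample input generated from scratch, no real data involved.
        rows = [[1, 2, 3], [4], [5, 6]]
        self.assertEqual(pad_ragged(rows), [[1, 2, 3], [4, 0, 0], [5, 6, 0]])

    def test_empty_input(self):
        # Edge case: an empty batch should yield an empty result.
        self.assertEqual(pad_ragged([]), [])
```

Saved in a test module, tests like these can be run with `python -m unittest`.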
Here are the classes / functions that should be covered (from the implementation in the `protein_prediction` branch):

`reader.py`:
- `DataReader`: `to_data()`
- `ChemDataReader`: `_read_data()`
- `DeepChemDataReader`: `_read_data()`
- `SelfiesReader`: `_read_data()`
- `ProteinDataReader`: `_read_data()`

`collate.py`:
- `DefaultCollator`: `__call__()`
- `RaggedCollator`: `__call__()`, `process_label_rows()`

`datasets/base.py`:
- `XYBaseDataModule`: `_filter_labels()`
- `DynamicDataset`: `get_test_split()`, `get_train_val_splits_given_test()`

`datasets/chebi.py`:
- `_ChEBIDataExtractor`: `_extract_class_hierarchy()`, `_graph_to_raw_dataset()`, `_load_dict()`, `_setup_pruned_test_set()`
- `ChEBIOverX`: `select_classes()`
- `ChEBIOverXPartial`: `extract_class_hierarchy()`
- `term_callback()`

`datasets/go_uniprot.py`:
- `_GOUniprotDataExtractor`: `_extract_class_hierarchy()`, `term_callback()`, `_graph_to_raw_dataset()`, `_get_swiss_to_go_mapping()`, `_load_dict()`
- `_GoUniProtOverX`: `select_classes()`

`datasets/tox21.py`:
- `Tox21MolNet`: `setup_processed()`, `_load_data_from_file()`
- `Tox21Challenge`: `setup_processed()`, `_load_data_from_file()`, `_load_dict()`
Some functions need to read from or write to files. Instead of real files, I would suggest using mock objects (see e.g. this comment).
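One possible way to do this is `unittest.mock.mock_open` from the standard library, which replaces `open()` with an in-memory fake. The sketch below assumes a hypothetical reader function `read_smiles_file` and a made-up tab-separated format; neither is taken from the code base.

```python
import unittest
from unittest.mock import mock_open, patch


def read_smiles_file(path):
    """Hypothetical function under test: parse 'id<TAB>smiles' lines from a file."""
    with open(path, "r") as handle:
        lines = handle.read().splitlines()
    return [tuple(line.split("\t")) for line in lines if line.strip()]


class TestReadSmilesFile(unittest.TestCase):
    def test_reads_rows_without_touching_disk(self):
        # Fake file content defined inline; no file is created on disk.
        fake_content = "1\tCCO\n2\tc1ccccc1\n"
        with patch("builtins.open", mock_open(read_data=fake_content)) as mocked:
            rows = read_smiles_file("ignored/path.tsv")
        self.assertEqual(rows, [("1", "CCO"), ("2", "c1ccccc1")])
        # The function should have opened exactly the path it was given.
        mocked.assert_called_once_with("ignored/path.tsv", "r")
```

Patching `builtins.open` works here because the function calls the built-in directly. If a function under test reads files through another library, the patch target would need to be adjusted accordingly.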