python-chebai-proteins
repository for protein prediction and classification, built on top of the python-chebai
codebase.
To install, follow these steps:
- Clone the repository:
git clone https://github.com/ChEB-AI/python-chebai-proteins.git
- Install the package:
cd python-chebai
pip install .
To combine configuration files from both python-chebai
and python-chebai-proteins
, structure your project like this:
my_projects/
├── python-chebai/
│ ├── chebai/
│ ├── configs/
│ └── ...
└── python-chebai-proteins/
├── chebai_proteins/
├── configs/
└── ...
This setup enables shared access to data and model configurations.
Before running any training scripts, ensure the environment is correctly configured:
-
Either:
-
Install the
python-chebai
repository as a package using:pip install .
-
-
OR
-
Manually set the
PYTHONPATH
environment variable if working across multiple directories (python-chebai
andpython-chebai-proteins
):-
If your current working directory is
python-chebai-proteins
, set:export PYTHONPATH=path/to/python-chebai
or vice versa.
-
If you're working within both repositories simultaneously or facing module not found errors, we recommend configuring both directories:
# Linux/macOS export PYTHONPATH=path/to/python-chebai:path/to/python-chebai-proteins # Windows (use semicolon instead of colon) set PYTHONPATH=path\to\python-chebai;path\to\python-chebai-proteins
-
-
🔎 See the PYTHONPATH Explained section below for more details.
Assuming your current working directory is python-chebai-proteins
, run the following command to start training:
python -m chebai fit --trainer=../configs/training/default_trainer.yml --trainer.callbacks=../configs/training/default_callbacks.yml --trainer.logger.init_args.name=scope50 --trainer.accumulate_grad_batches=4 --trainer.logger=../configs/training/wandb_logger.yml --trainer.min_epochs=100 --trainer.max_epochs=100 --data=configs/data/scope/scope50.yml --data.init_args.batch_size=32 --data.init_args.num_workers=10 --model=../configs/model/electra.yml --model.train_metrics=../configs/metrics/micro-macro-f1.yml --model.test_metrics=../configs/metrics/micro-macro-f1.yml --model.val_metrics=../configs/metrics/micro-macro-f1.yml --model.pass_loss_kwargs=false --model.criterion=../configs/loss/bce.yml --model.criterion.init_args.beta=0.99
Same command can be used for DeepGO just by changing the config path for data.
PYTHONPATH
is an environment variable that tells Python where to search for modules that aren't installed via pip
or not in your current working directory.
If your config refers to a custom module like:
class_path: chebai_proteins.preprocessing.datasets.scope.scope.SCOPe50
...and you're running the code from python-chebai
, Python won't know where to find chebai_proteins
(from another repo like python-chebai-proteins/
) unless you add it to PYTHONPATH
.
Python looks for imports in this order:
- Current directory
- Standard library
- Paths in
PYTHONPATH
- Installed packages (
site-packages
)
You can inspect the full search paths:
python -c "import sys; print(sys.path)"
export PYTHONPATH=/path/to/python-chebai-graph
echo $PYTHONPATH
set PYTHONPATH=C:\path\to\python-chebai-graph
echo %PYTHONPATH%
💡 Note: This is temporary for your terminal session. To make it permanent, add it to your system environment variables.