A simple Python framework for unsupervised and semi-supervised anomaly detection. It implements deep and shallow machine learning models published in the following papers.
| Model | Paper |
|---|---|
| ALAD | Adversarially Learned Anomaly Detection |
| DAGMM | [Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection](https://bzong.github.io/doc/iclr18-dagmm.pdf) |
| DeepSVDD | Deep One-Class Classification |
| DROCC | DROCC: Deep Robust One-Class Classification |
| DSEBM | Deep Structured Energy Based Models for Anomaly Detection |
| GOAD | Classification-Based Anomaly Detection for General Data |
| SOM-DAGMM | Self-Organizing Map assisted Deep Autoencoding Gaussian Mixture Model for Intrusion Detection |
| MemAE | Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection |
| NeuTraLAD | Neural Transformation Learning for Deep Anomaly Detection Beyond Images |
| OC-SVM | Support Vector Method for Novelty Detection |
| LOF | LOF: identifying density-based local outliers |

Using conda:
conda env create -f environment.yaml

Using pip:
pip install -r requirements.txt

The experiments are defined in YAML configuration files under config. Place your data in the data folder and clean it with the preprocessing scripts, or edit the variables in _data.yaml to point to a different path. Configurations are split across three files: data, trainer, and model.
python main.py --config=./config/<data_fname> --config=./config/<trainer_fname> --config=./config/<model_fname>

where <data_fname>, <trainer_fname>, and <model_fname> refer respectively to the data, trainer, and model configuration files.

Example training an AutoEncoder on Arrhythmia:
python main.py --config=./config/_data.yaml --config=./config/_trainer.yaml --config=./config/autoencoder.yaml

We also provide a utility PowerShell script to automate training of multiple models on the same dataset. To launch it, run the following commands:
cd scripts
conda activate <my-env>
./train.ps1 <absolute-path-to-config> <data-config-fname> <trainer-config-fname>

where <my-env>, <absolute-path-to-config>, <data-config-fname>, and <trainer-config-fname> refer respectively to the name of your conda environment, the absolute path to the configuration folder, the name of the data configuration file, and the name of the training configuration file.

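On systems without PowerShell, a similar batch loop can be sketched in Python. This is only an illustration, not a port of train.ps1: the function names are hypothetical, and it assumes that every YAML file in the configuration folder other than the data and trainer files is a model configuration.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir, data_fname, trainer_fname):
    """Return one `python main.py ...` argument list per model config found."""
    config_dir = Path(config_dir)
    shared = {data_fname, trainer_fname}
    commands = []
    # Assumption: every other .yaml file in the folder is a model config.
    for model_cfg in sorted(config_dir.glob("*.yaml")):
        if model_cfg.name in shared:
            continue
        commands.append([
            sys.executable, "main.py",
            f"--config={config_dir / data_fname}",
            f"--config={config_dir / trainer_fname}",
            f"--config={model_cfg}",
        ])
    return commands


def run_all(config_dir, data_fname, trainer_fname, dry_run=True):
    """Print each command; execute it as well unless dry_run is set."""
    for cmd in build_commands(config_dir, data_fname, trainer_fname):
        print(" ".join(cmd))
        if not dry_run:
            # Must be launched from the repository root, where main.py lives.
            subprocess.run(cmd, check=True)
```

Calling `run_all(<absolute-path-to-config>, "_data.yaml", "_trainer.yaml")` only prints the commands; pass `dry_run=False` to actually launch the training runs one after another.
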
Example on Arrhythmia:
cd scripts
conda activate pyad
./train.ps1 C:\Users\me\path\to\pyad\config\arrhythmia _data.yaml _trainer.yaml

Optionally, you can log your experiments using Neptune. To do so, add a logger key in the init_args of the trainer and define NEPTUNE_API_TOKEN and NEPTUNE_PROJECT as environment variables.
# _trainer.yaml
trainer:
  class_path: pyad.models.trainer.ModuleTrainer
  init_args:
    ...params
    logger:
      class_path: pyad.loggers.NeptuneLogger
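
Before launching a run with the Neptune logger enabled, it can help to confirm that both environment variables are visible to Python. The following is a minimal sketch; the helper `missing_neptune_vars` is hypothetical and not part of pyad.

```python
import os

# The two variables named above for the Neptune logger.
REQUIRED_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def missing_neptune_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_neptune_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

The check only reports which names are absent; it does not validate the token or contact Neptune.
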