This repo contains the official code of the Orthrus paper.
@inproceedings{jian2025,
title={{ORTHRUS: Achieving High Quality of Attribution in Provenance-based Intrusion
Detection Systems}},
author={Jiang, Baoxiang and Bilot, Tristan and El Madhoun, Nour and Al Agha, Khaldoun and Zouaoui, Anis and Iqbal, Shahrear and Han, Xueyuan and Pasquier, Thomas},
booktitle={Security Symposium (USENIX Sec'25)},
year={2025},
organization={USENIX}
}
[2025.06.06] Orthrus is now available in PIDSMaker!
[2025.06.05] Orthrus' weights are available.
[2025.06.04] The installation guidelines are now simplified. The DARPA TC databases can be downloaded and installed directly, so there is no longer any need to populate them locally.
git clone --recurse-submodules https://github.com/ubc-provenance/orthrus.git
We have made the installation of DARPA TC/OpTC easy and fast: simply follow these guidelines.
The following commands should be executed within the pids container.
Launching Orthrus is as simple as running:
python src/orthrus.py [dataset] [config args...]

By default, orthrus.py runs the graph_construction, edge_featurization, detection and attack_reconstruction tasks as configured in the config/orthrus.yml file. This configuration can be updated directly in the YML file or from the CLI, as shown above.
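To illustrate how a dotted CLI argument can address a nested entry of the YML configuration, here is a minimal sketch. It is not the parser used by Orthrus: `apply_override`, `parse_value` and the starting values in `config` are hypothetical and only show the idea of walking a key path such as `detection.gnn_training.lr`.

```python
from typing import Any


def parse_value(raw: str) -> Any:
    """Interpret the value as an int, then a float, else keep it as a string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_override(config: dict, arg: str) -> None:
    """Apply one '--a.b.c=value' style argument to a nested dict, in place."""
    key_path, _, raw_value = arg.lstrip("-").partition("=")
    *parents, leaf = key_path.split(".")
    node = config
    for name in parents:
        # Create intermediate levels when they do not exist yet.
        node = node.setdefault(name, {})
    node[leaf] = parse_value(raw_value)


# Hypothetical starting values, standing in for what a YML file might contain.
config = {"detection": {"gnn_training": {"lr": 0.01, "num_epochs": 12}}}
apply_override(config, "--detection.gnn_training.lr=0.001")
apply_override(config, "--detection.gnn_training.num_epochs=20")
print(config)
```

The real set of accepted keys is whatever config/orthrus.yml defines; the sketch accepts any path and would need validation before being used for anything serious.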
Note
The original results could not be exactly replicated due to a missing PYTHONHASHSEED affecting Gensim's Word2Vec, though the following experiments yield similar results in most cases.
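For readers unfamiliar with PYTHONHASHSEED, the sketch below shows why it matters. Python randomizes string hashes per interpreter launch unless the variable is set, and Gensim's documentation lists it among the requirements for reproducible Word2Vec runs. The helper name `string_hash_in_fresh_interpreter` and the sample string are illustrative only and are not part of Orthrus.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, hash_seed=None):
    """Return hash(text) as computed by a newly started Python process."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # start from a clean state
    if hash_seed is not None:
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# With a fixed seed, every launch computes the same hash.
seeded = [string_hash_in_fresh_interpreter("execve", hash_seed=0) for _ in range(2)]
# Without it, the hash usually differs from one launch to the next.
unseeded = [string_hash_in_fresh_interpreter("execve") for _ in range(2)]
print("PYTHONHASHSEED=0:", seeded)
print("unset:           ", unseeded)
```

This is why the replication commands in this README are prefixed with `PYTHONHASHSEED=0`.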
| Name | TP | FP | TN | FN | Precision | MCC |
|---|---|---|---|---|---|---|
| CADETS_E3_full | 22 | 10 | 268,075 | 46 | 0.69 | 0.47 |
| CADETS_E3_ano | 15 | 0 | 268,085 | 53 | 1.00 | 0.47 |
| THEIA_E3_full | 22 | 0 | 699,177 | 96 | 1.00 | 0.43 |
| THEIA_E3_ano | 2 | 0 | 699,177 | 116 | 1.00 | 0.13 |
| CADETS_E5_full | 3 | 1,318 | 3,132,823 | 120 | 0.00 | 0.01 |
| CADETS_E5_ano | 1 | 2 | 3,134,139 | 122 | 0.33 | 0.05 |
| THEIA_E5_full | 13 | 2 | 747,381 | 56 | 0.86 | 0.40 |
| THEIA_E5_ano | 2 | 0 | 747,383 | 67 | 1.00 | 0.17 |
| CLEARSCOPE_E3_full | 1 | 647 | 110,715 | 40 | 0.00 | 0.00 |
| CLEARSCOPE_E3_ano | 1 | 5 | 111,357 | 40 | 0.17 | 0.06 |
| CLEARSCOPE_E5_full | 4 | 8 | 150,666 | 47 | 0.33 | 0.16 |
| CLEARSCOPE_E5_ano | 2 | 5 | 150,669 | 49 | 0.29 | 0.10 |
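The Precision and MCC columns can be recomputed from the four counts. The sketch below uses the standard definitions of precision and the Matthews correlation coefficient; the function names are illustrative, and returning 0.0 for an empty denominator is a convention chosen here, not necessarily the one used by Orthrus.

```python
import math


def precision(tp, fp):
    """TP / (TP + FP); returns 0.0 when nothing was flagged."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# CADETS_E3_full row: TP=22, FP=10, TN=268,075, FN=46
print(f"precision={precision(22, 10):.2f}  mcc={mcc(22, 10, 268_075, 46):.2f}")
# prints: precision=0.69  mcc=0.47
```

Because true negatives dominate every row, MCC is more informative here than accuracy would be.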
These experiments use pre-trained weights of Orthrus.
CADETS_E3
PYTHONHASHSEED=0 python src/orthrus.py CADETS_E3 --from_weights --detection.gnn_training.encoder.graph_attention.dropout=0.25 --detection.gnn_training.node_hid_dim=256 --detection.gnn_training.node_out_dim=256 --detection.gnn_training.lr=0.001 --detection.gnn_training.num_epochs=20 --seed=4
THEIA_E3
PYTHONHASHSEED=0 python src/orthrus.py THEIA_E3 --from_weights --detection.gnn_training.encoder.graph_attention.dropout=0.1 --seed=2
CLEARSCOPE_E3
PYTHONHASHSEED=0 python src/orthrus.py CLEARSCOPE_E3 --from_weights --graph_construction.build_graphs.time_window_size=1.0 --detection.gnn_training.encoder.graph_attention.dropout=0.1 --seed=2
CADETS_E5
PYTHONHASHSEED=0 python src/orthrus.py CADETS_E5 --from_weights --detection.gnn_training.node_out_dim=128 --detection.gnn_training.lr=0.0001 --detection.gnn_training.encoder.graph_attention.dropout=0.1 --graph_construction.build_graphs.time_window_size=1.0
THEIA_E5
PYTHONHASHSEED=0 python src/orthrus.py THEIA_E5 --from_weights
CLEARSCOPE_E5
PYTHONHASHSEED=0 python src/orthrus.py CLEARSCOPE_E5 --from_weights --detection.gnn_training.lr=0.0001 --detection.gnn_training.encoder.graph_attention.dropout=0.1 --detection.gnn_training.node_out_dim=64
After the first run, the preprocessed datasets are stored in the ROOT_ARTIFACT_DIR path set in config.py, so there is no need to recompute them. To skip the graph_construction and edge_featurization tasks, Orthrus can be run directly from the detection task using the --run_from_training arg.
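Conceptually, resuming from the detection task amounts to dropping the preprocessing stages from the task list. The sketch below only illustrates that idea: the task order comes from this README, while `PIPELINE`, `tasks_to_run` and the assumption that attack_reconstruction still runs after detection are hypothetical and may not match the actual implementation.

```python
# Task order as listed in this README; the selection logic is an assumption.
PIPELINE = [
    "graph_construction",
    "edge_featurization",
    "detection",
    "attack_reconstruction",
]


def tasks_to_run(run_from_training=False):
    """Return the tasks to execute, skipping preprocessing when resuming."""
    start = PIPELINE.index("detection") if run_from_training else 0
    return PIPELINE[start:]


print(tasks_to_run())
print(tasks_to_run(run_from_training=True))
```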
python src/orthrus.py CADETS_E3 --run_from_training

W&B is used as the default interface to visualize experiments and keep a history of them. First log into your account from the CLI using:
wandb login

Set your API key, which can be found on the website. Then you can push the logs and results of experiments to the interface using the --wandb arg.
The preferred solution is to run the run.sh script, which directly logs the experiments to the W&B interface.
python src/orthrus.py THEIA_E3 --wandb

See licence.