CASCADE stands for Causality-Aware Single-Cell Adaptive Discover/Deduction/Design Engine. It is a deep learning-based bioinformatics tool for causal gene regulatory network discovery, counterfactual perturbation effect prediction, and targeted intervention design based on high-content single-cell perturbation screens.
Trained on single-cell perturbation data, CASCADE models the causal gene regulatory network as a directed acyclic graph (DAG) and leverages differentiable causal discovery (DCD) to transform the search over discrete network structures into a tractable continuous optimization problem. We scale causal discovery to thousands of genes by incorporating a scaffold graph built from context-agnostic, coarse prior regulatory knowledge, which constrains the search space and improves computational efficiency in an evidence-guided manner. Technical confounding covariates and gene-wise perturbation latent variables encoded from gene ontology (GO) annotations are also included to account for effects not explained by the causal structure. The complete CASCADE model is constructed within a Bayesian framework, allowing for the estimation of causal uncertainty under the limited data regimes typical of practical biological experiments.
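To illustrate how a discrete "is this graph acyclic?" question can become a smooth quantity to optimize, here is a minimal sketch of a trace-exponential acyclicity penalty in the style of the NOTEARS family of DCD methods, truncated to the first d terms. Whether CASCADE uses this exact penalty is an assumption; the function names and the toy weight matrices are hypothetical and not part of the cascade-reg API.

```python
from math import factorial


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def acyclicity_penalty(w):
    """Return sum_{k=1..d} tr((W*W)^k) / k!, where W*W is the element-wise square.

    tr(A^k) sums the weights of closed walks of length k, so for a
    non-negative A the penalty is zero exactly when the graph has no cycle.
    """
    d = len(w)
    a = [[x * x for x in row] for row in w]  # non-negative edge strengths
    power = a
    total = 0.0
    for k in range(1, d + 1):
        total += sum(power[i][i] for i in range(d)) / factorial(k)
        power = matmul(power, a)
    return total


# Hypothetical 3-gene weight matrices: entry [i][j] is the effect of gene i on gene j.
dag = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, -0.5],
    [0.0, 0.0, 0.0],
]
cyclic = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, -0.5],
    [0.3, 0.0, 0.0],  # closes the loop 0 -> 1 -> 2 -> 0
]

print(acyclicity_penalty(dag))         # zero for the acyclic graph
print(acyclicity_penalty(cyclic) > 0)  # positive once a cycle exists
```

Because the penalty is a polynomial in the edge weights, it can be added to a data-fitting loss and minimized with gradient-based optimizers instead of enumerating graph structures.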
Using the inferred causal regulatory network, CASCADE supports two types of downstream inference. First, it performs counterfactual deduction of unseen perturbation effects by iteratively propagating perturbation effects following the topological order of the causal graph. Second, because this deduction process remains end-to-end differentiable, it can be inverted into intervention design by treating the gene intervention as an optimizable parameter trained to minimize the deviation between the counterfactual outcome and the desired target transcriptome.
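The two inference modes can be sketched on a toy example. The snippet below assumes a hypothetical three-gene linear network (CASCADE's actual model is a Bayesian deep learning model, not this linear one) and propagates a do-style intervention in topological order. For the design step it swaps in a brute-force search over candidate interventions in place of the gradient-based optimization described above, purely to keep the sketch dependency-free. All names here are illustrative and not part of the cascade-reg API.

```python
from graphlib import TopologicalSorter

# Hypothetical toy network: weights[child][parent] = linear effect of parent on child.
weights = {
    "A": {},
    "B": {"A": 0.5},
    "C": {"A": -1.0, "B": 2.0},
}
baseline = {"A": 1.0, "B": 0.0, "C": 0.0}  # exogenous input of each gene


def propagate(weights, baseline, interventions=None):
    """Compute expression gene by gene, visiting parents before children."""
    interventions = interventions or {}
    # TopologicalSorter expects node -> predecessors, so parents are emitted first.
    graph = {gene: set(parents) for gene, parents in weights.items()}
    expr = {}
    for gene in TopologicalSorter(graph).static_order():
        if gene in interventions:
            # do(gene = value): the intervened gene ignores its parents.
            expr[gene] = interventions[gene]
        else:
            expr[gene] = baseline[gene] + sum(
                w * expr[parent] for parent, w in weights[gene].items()
            )
    return expr


def design(weights, baseline, target, candidates):
    """Pick the candidate whose counterfactual outcome is closest to the target."""

    def loss(intervention):
        outcome = propagate(weights, baseline, intervention)
        return sum((outcome[g] - v) ** 2 for g, v in target.items())

    return min(candidates, key=loss)


control = propagate(weights, baseline)
knockout_b = propagate(weights, baseline, {"B": 0.0})
best = design(
    weights,
    baseline,
    target={"C": -1.0},
    candidates=[{"A": 0.0}, {"B": 0.0}, {"C": 0.0}],
)
print(control, knockout_b, best)
```

In a differentiable setting, the `min` over a fixed candidate list would be replaced by gradient descent on a continuous intervention parameter, with the same squared deviation serving as the loss.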
For more details, please check out our preprint at TODO.
CASCADE is implemented in the cascade-reg package. It can be installed via conda/mamba:

    mamba install bioconda::cascade-reg

Or using pip:

    pip install cascade-reg

To avoid potential dependency conflicts, installing within a conda environment is recommended.
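After installing, one quick way to confirm that the package is visible to the active Python environment is to look up its distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `cascade-reg` is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("cascade-reg")
if found is None:
    print("cascade-reg is not installed in this environment")
else:
    print(f"cascade-reg {found} is installed")
```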
Proceed to our documentation site for how to use the cascade-reg package.
- Check out the repository to branch replicate:

      git checkout replicate

- Create a local conda environment using the env.sh script:

      ./env.sh create

- Activate the local conda environment:

      mamba activate ./conda

- Use scripts in data/download to prepare necessary data
- Use scripts in data/scaffold to prepare the scaffold graphs
- Use pipeline in evaluation for running systematic benchmarks
- Use notebooks in experiments for intervention design case studies
The instructions below are for development purposes only.
Use the following commands to manage the development environment:

    ./env.sh create  # Create new environment based on config files
    ./env.sh export  # Export environment changes to config files
    ./env.sh update  # Update environment based on config files

Use the following commands to activate and deactivate the environment:

    mamba activate ./conda
    mamba deactivate

Use the following command to build the HTML documentation:

    sphinx-build -b html -D language=en docs docs/_build/html/en