
Commit e5ed946

Filled in the documentation
1 parent 2821745 commit e5ed946

3 files changed: +45 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -49,6 +49,7 @@ Steps marked with the boat icon are not yet implemented. For the other steps, th
- [scrublet](https://scanpy.readthedocs.io/en/stable/api/generated/scanpy.pp.scrublet.html)
- [DoubletDetection](https://doubletdetection.readthedocs.io/en/v2.5.2/doubletdetection.doubletdetection.html)
- [SCDS](https://bioconductor.org/packages/devel/bioc/vignettes/scds/inst/doc/scds.html)
7. Cell cycle scoring ([Tirosh et al. 2015](https://doi.org/10.1038/nature14590))
2. Sample aggregation
1. Merge into a single h5ad file
2. Present QC for merged counts ([`MultiQC`](http://multiqc.info/))

docs/output.md

Lines changed: 4 additions & 0 deletions
@@ -25,6 +25,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [scrublet](https://scanpy.readthedocs.io/en/stable/api/generated/scanpy.pp.scrublet.html)
- [DoubletDetection](https://doubletdetection.readthedocs.io/en/v2.5.2/doubletdetection.doubletdetection.html)
- [SCDS](https://bioconductor.org/packages/devel/bioc/vignettes/scds/inst/doc/scds.html)
7. Cell cycle scoring ([Tirosh et al. 2015](https://doi.org/10.1038/nature14590))
2. Sample aggregation
1. Merge into a single h5ad file
2. Present QC for merged counts ([`MultiQC`](http://multiqc.info/))
@@ -60,6 +61,9 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- `(doubletdetection|scds|scrublet|solo)/`: Results of doublet detection. Each directory contains a filtered `h5ad`/`rds` and a `csv`/`pkl` file with the doublet annotations.
- `${sample_id}.h5ad`: The h5ad without doublets.
- `qc_preprocessed/`: QC plots for the preprocessed data.
- `cell_cycle/`: Cell cycle scoring results.
- `${sample_id}_cellcycle.pkl`: `S_score`, `G2M_score`, and `phase` columns for each cell. Merged into the final h5ad via `FINALIZE_QC_ANNDATAS`.
- `${sample_id}_cellcycle.h5ad`: Intermediate h5ad with cell cycle scores added, available for inspection.

</details>

docs/usage.md

Lines changed: 40 additions & 0 deletions
@@ -138,6 +138,46 @@ monaco_immune,label.fine,/path/to/monaco_immune.tar

Example tar archives can be found [here](https://github.com/nf-core/test-datasets/tree/scdownstream/singleR).

### Cell cycle scoring

Cell cycle scoring assigns each cell an S-phase score, a G2M-phase score, and a predicted cell cycle phase (`S`, `G2M`, or `G1`) based on the expression of curated marker genes (Tirosh et al. 2015; the same gene sets used by Seurat). The scores are stored in `adata.obs` as `S_score`, `G2M_score`, and `phase`, and are available as covariates in downstream integration steps.
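
To make the relationship between the two scores and the `phase` call concrete, here is a minimal sketch of the decision rule that Scanpy's `score_genes_cell_cycle` applies. Whether this pipeline calls that function is an assumption, and `assign_phase` and the example cells are hypothetical names used only for illustration.

```python
def assign_phase(s_score: float, g2m_score: float) -> str:
    """Call a phase from the two scores.

    Assumed rule (as in Scanpy): G1 if both scores are negative,
    otherwise whichever of S or G2M has the higher score (S on ties).
    """
    if s_score < 0 and g2m_score < 0:
        return "G1"
    if g2m_score > s_score:
        return "G2M"
    return "S"


# Hypothetical (S_score, G2M_score) pairs for three cells.
cells = {
    "cell_a": (0.42, -0.10),
    "cell_b": (0.05, 0.31),
    "cell_c": (-0.20, -0.08),
}
phases = {name: assign_phase(s, g2m) for name, (s, g2m) in cells.items()}
print(phases)
```

Under this rule, a cell is only called `G1` when neither marker set is expressed above its background, so `G1` here effectively means "not detectably cycling".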

Cell cycle scoring is enabled by default. To skip it:

```bash
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results --cell_cycle_scoring false
```

#### Species

Bundled gene lists are provided for human and mouse. Select the appropriate species with `--species`:

```bash
# mouse
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results --species mouse
```

#### Custom gene lists

For other organisms (e.g. rat, zebrafish), you can provide your own gene lists (one gene symbol per line) via `--s_genes` and `--g2m_genes`:

```bash
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results \
  --s_genes /path/to/my_s_genes.txt \
  --g2m_genes /path/to/my_g2m_genes.txt
```

The bundled gene lists can be found in [`assets/cell_cycle_genes/`](../assets/cell_cycle_genes/) and serve as templates for custom lists.
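
Before passing a custom list to the pipeline, it can help to check how many of its symbols actually occur in your data, since symbols that are absent cannot contribute to a score. The sketch below assumes only the one-symbol-per-line format described above. The helper names, the file contents, and the dataset gene names are hypothetical, and whether the pipeline itself skips blank lines or matches symbols case-sensitively is not stated in the documentation.

```python
import tempfile
from pathlib import Path


def read_gene_list(path):
    """Read one gene symbol per line, ignoring blank lines and surrounding whitespace."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def split_by_presence(genes, dataset_genes):
    """Return (present, missing) gene symbols, preserving the order of the list."""
    known = set(dataset_genes)
    present = [g for g in genes if g in known]
    missing = [g for g in genes if g not in known]
    return present, missing


# Hypothetical example: a three-gene S-phase list and a tiny set of dataset genes.
with tempfile.TemporaryDirectory() as tmp:
    s_path = Path(tmp) / "my_s_genes.txt"
    s_path.write_text("Mcm5\nPcna\n\nTyms\n")
    s_genes = read_gene_list(s_path)

present, missing = split_by_presence(s_genes, ["Actb", "Mcm5", "Pcna"])
print(f"{len(present)}/{len(s_genes)} S-phase genes found; missing: {missing}")
```

The comparison is an exact string match, so a list written with human-style capitalization (`MCM5`) would report every gene as missing against mouse-style symbols (`Mcm5`).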

#### Using scores in downstream analysis

The `S_score` and `G2M_score` columns can be passed to integration tools as continuous covariates to regress out cell cycle effects:

```bash
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results \
  --scvi_continuous_covariates S_score,G2M_score
```

### Reference mapping

The pipeline supports mapping new samples into the latent space of an existing scVI/scANVI model.
