3 changes: 0 additions & 3 deletions .github/workflows/main.yml
@@ -80,9 +80,6 @@ jobs:
unzip qsv-0.112.0-x86_64-unknown-linux-musl.zip
cp qsv_musl-1.2.3 /usr/local/bin/qsv

- name: Install annonars
run: |
sudo bash ./utils/install-annonars.sh
- name: Install python package
run: |
pip install -e .
478 changes: 229 additions & 249 deletions Snakefile

Large diffs are not rendered by default.

64 changes: 45 additions & 19 deletions download_urls.yml
@@ -57,16 +57,17 @@
count: null

# dbNSFP v4.5a
- url: https://dbnsfp.s3.amazonaws.com/dbNSFP4.5a.zip
- url: https://usf.box.com/shared/static/2hzcx5s6p1xui7oen16xqzndfrkt8l9l
excerpt_strategy:
strategy: manual
count: null
# dbNSFP v4.5c
- url: https://dbnsfp.s3.amazonaws.com/dbNSFP4.5c.zip
- url: https://usf.box.com/shared/static/03xsrpna0nzgrytfo2pzk326t8jad4oc
excerpt_strategy:
strategy: manual
count: null
- url: ftp://dbnsfp:[email protected]/dbscSNV1.1.zip
# dbscSNV v1.1
- url: https://usf.box.com/shared/static/ffwlywsat3q5ijypvunno3rg6steqfs8
skip_upstream_check: true # does not work reliably in tests
excerpt_strategy:
strategy: manual
@@ -115,10 +116,10 @@

- url: https://www.deciphergenomics.org/files/downloads/HI_Predictions_Version3.bed.gz

- url: ftp://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh37.tsv
- url: ftp://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh38.tsv
- url: ftp://ftp.clinicalgenome.org/ClinGen_gene_curation_list_GRCh37.tsv
- url: ftp://ftp.clinicalgenome.org/ClinGen_gene_curation_list_GRCh38.tsv
- url: https://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh37.tsv
- url: https://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh38.tsv
- url: https://ftp.clinicalgenome.org/ClinGen_gene_curation_list_GRCh37.tsv
- url: https://ftp.clinicalgenome.org/ClinGen_gene_curation_list_GRCh38.tsv

- url: https://storage.googleapis.com/adult-gtex/annotations/v8/metadata-files/GTEx_Analysis_v8_Annotations_SampleAttributesDS.txt
excerpt_strategy:
@@ -141,38 +142,63 @@
url: https://search.clinicalgenome.org/kb/reports/curation-activity-summary-report
skip_upstream_check: true # does not work reliably in tests

- url: https://github.com/varfish-org/clinvar-data-jsonl/releases/download/clinvar-weekly-20240612/clinvar-data-extract-vars-20240612+0.17.0.tar.gz
- url: https://github.com/varfish-org/clinvar-data-jsonl/releases/download/clinvar-weekly-20250410/clinvar-data-extract-vars-20250410+0.18.5.tar.gz
excerpt_strategy:
strategy: manual
count: null

- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.4.4/mehari-data-txs-grch37-0.4.4.bin.zst
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch37-ensembl-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.4.4/mehari-data-txs-grch38-0.4.4.bin.zst
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch38-ensembl-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null

- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2024-01-16/hp.obo
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch37-refseq-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2024-01-16/phenotype.hpoa
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch38-refseq-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2024-01-16/phenotype_to_genes.txt

- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch37-ensembl-and-refseq-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch38-ensembl-and-refseq-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null

- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch37-ensembl-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2024-01-16/genes_to_phenotype.txt
- url: https://github.com/bihealth/mehari-data-tx/releases/download/v0.10.3/mehari-data-txs-grch38-ensembl-0.10.3.bin.zst
excerpt_strategy:
strategy: no-excerpt
count: null

- url: https://ftp.ensembl.org/pub/current_README
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2025-03-03/hp.obo
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2025-03-03/phenotype.hpoa
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2025-03-03/phenotype_to_genes.txt
excerpt_strategy:
strategy: no-excerpt
count: null
- url: https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2025-03-03/genes_to_phenotype.txt
excerpt_strategy:
strategy: no-excerpt
count: null

- comment: The UCSC listing is used for checking the versions for GRCh37.
url: https://hgdownload.cse.ucsc.edu/goldenpath/hg19/database
@@ -199,11 +225,11 @@
count: 10000
- url: https://hgdownload.cse.ucsc.edu/goldenpath/hg38/multiz100way/alignments/knownGene.exonAA.fa.gz

- url: http://3dgenome.fsm.northwestern.edu/downloads/hg19.TADs.zip
- url: https://3dgenome.fsm.northwestern.edu/downloads/hg19.TADs.zip
excerpt_strategy:
strategy: no-excerpt
count: null
- url: http://3dgenome.fsm.northwestern.edu/downloads/hg38.TADs.zip
- url: https://3dgenome.fsm.northwestern.edu/downloads/hg38.TADs.zip
excerpt_strategy:
strategy: no-excerpt
count: null
@@ -250,8 +276,8 @@
strategy: head
count: 10000

- url: 'https://ensembl.org/biomart/martservice?query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query><Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" ><Dataset name = "hsapiens_gene_ensembl" interface = "default" ><Attribute name = "ensembl_gene_id" /><Attribute name = "ensembl_transcript_id" /><Attribute name = "entrezgene_id" /><Attribute name = "external_gene_name" /></Dataset></Query>'
- url: 'https://ftp.ebi.ac.uk/pub/databases/genenames/hgnc/json/hgnc_complete_set.json'
- url: 'https://may2024.archive.ensembl.org/biomart/martservice?query=<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query><Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" ><Dataset name = "hsapiens_gene_ensembl" interface = "default" ><Attribute name = "ensembl_gene_id" /><Attribute name = "ensembl_transcript_id" /><Attribute name = "entrezgene_id" /><Attribute name = "external_gene_name" /></Dataset></Query>'
- url: 'https://storage.googleapis.com/public-download-files/hgnc/json/json/hgnc_complete_set.json'
skip_upstream_check: true # does not work reliably in tests
excerpt_strategy:
strategy: manual
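Several hunks in download_urls.yml move entries from ftp:// or http:// to https://. A check along the following lines could flag any entries that still use another scheme. This is only a sketch: the `ENTRIES` list and the `non_https_urls` helper are hypothetical and not part of this repository, and the sample is written inline so that no YAML parser is needed.

```python
from urllib.parse import urlsplit

# Hypothetical sample that mirrors the shape of entries in download_urls.yml.
# The URLs are the old and new forms of two entries shown in the diff above.
ENTRIES = [
    {"url": "ftp://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh37.tsv"},
    {"url": "https://ftp.clinicalgenome.org/ClinGen_region_curation_list_GRCh37.tsv"},
    {"url": "http://3dgenome.fsm.northwestern.edu/downloads/hg19.TADs.zip"},
    {"url": "https://3dgenome.fsm.northwestern.edu/downloads/hg19.TADs.zip"},
]


def non_https_urls(entries):
    """Return the URLs whose scheme is anything other than https."""
    return [e["url"] for e in entries if urlsplit(e["url"]).scheme != "https"]


if __name__ == "__main__":
    for url in non_https_urls(ENTRIES):
        print(f"non-HTTPS URL: {url}")
```

Run against the real file, the same function would take the parsed list of entries in place of `ENTRIES`.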
16 changes: 8 additions & 8 deletions environment.yml
@@ -33,23 +33,23 @@ dependencies:
- jq
# Tools for file downloads.
- aria2 >=1.36.0
- wget
# Tool for processing BED files.
- bedops =2
# VCF/BCF/HTSlib/Samtools.
- bcftools =1.17
- htslib =1.17
- samtools =1.17
- bcftools =1.21
- htslib =1.21
- samtools =1.21
# Parallel (de)compression.
- pigz
# Varfish related
# - annonars =0.41.3 # current versions not on bioconda due to build issue, but docker images are available
- viguno =0.3.1
- mehari =0.25.5
- varfish-server-worker =0.13.0
- annonars =0.44.0
- viguno =0.4.0
- mehari =0.35.1
- varfish-server-worker =0.17.2
# S3 uploads
- s5cmd =2.1.0
# async HTTP requests
- httpx =0.25.0
- httpcore =0.18.0
- trio
- qsv
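The version bumps in environment.yml can be summarised with a small comparison of exact pins. The sketch below is illustrative only: `parse_pin` and `changed_pins` are hypothetical helpers, and the old/new split of the pins is inferred from the order of lines in this diff, since the +/- markers are not shown. It handles only exact `name =version` pins, not range specs such as `aria2 >=1.36.0`.

```python
def parse_pin(spec):
    """Split an exact conda-style pin such as 'bcftools =1.21' into (name, version)."""
    name, _, version = spec.partition("=")
    return name.strip(), version.strip() or None


def changed_pins(old_specs, new_specs):
    """Map each package in new_specs to (old_version, new_version) if it changed."""
    old = dict(parse_pin(s) for s in old_specs)
    new = dict(parse_pin(s) for s in new_specs)
    return {
        name: (old.get(name), version)
        for name, version in new.items()
        if old.get(name) != version
    }


# Assumed reading of the diff: the first block of each pair is the old side.
OLD = [
    "bcftools =1.17", "htslib =1.17", "samtools =1.17",
    "viguno =0.3.1", "mehari =0.25.5", "varfish-server-worker =0.13.0",
]
NEW = [
    "bcftools =1.21", "htslib =1.21", "samtools =1.21", "annonars =0.44.0",
    "viguno =0.4.0", "mehari =0.35.1", "varfish-server-worker =0.17.2",
]

if __name__ == "__main__":
    for name, (before, after) in changed_pins(OLD, NEW).items():
        print(f"{name}: {before} -> {after}")
```

A package that was previously commented out, such as annonars here, shows up with `None` as its old version.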

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/0106e2c5435e5972/url.txt

This file was deleted.

Git LFS file not shown
3 changes: 3 additions & 0 deletions excerpt-data/03531c89f88e4ce8/genes_to_phenotype.txt
Git LFS file not shown
3 changes: 3 additions & 0 deletions excerpt-data/03531c89f88e4ce8/url.txt
Git LFS file not shown

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/03c9dba47d8d1fbc/url.txt

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/05d48443127f19c1/url.txt

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/05e93e6f1f5d60e6/gnomad.v4.0.sv.chr4.vcf.gz

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/05e93e6f1f5d60e6/gnomad.v4.0.sv.chr4.vcf.gz.tbi

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/05e93e6f1f5d60e6/url.txt

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/066e9189e40704c2/url.txt

This file was deleted.

Git LFS file not shown

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/083f4075b9190a42/url.txt

This file was deleted.

Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
3 changes: 0 additions & 3 deletions excerpt-data/0be9b2561c9397f2/gnomad.v4.0.sv.chr15.vcf.gz

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/0be9b2561c9397f2/gnomad.v4.0.sv.chr15.vcf.gz.tbi

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/0be9b2561c9397f2/url.txt

This file was deleted.

Git LFS file not shown
Git LFS file not shown
Git LFS file not shown
3 changes: 3 additions & 0 deletions excerpt-data/0ccc4915e7ecfd38/url.txt
Git LFS file not shown

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/0e7eb7069eb4d354/url.txt

This file was deleted.

2 changes: 1 addition & 1 deletion excerpt-data/111d8c6e08038f62/20
Git LFS file not shown
Git LFS file not shown

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/13652bb20d0252c1/url.txt

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/13affcfaed12d83b/url.txt

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/13d4b1406e769b80/gnomad.v4.0.sv.chr3.vcf.gz

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/13d4b1406e769b80/gnomad.v4.0.sv.chr3.vcf.gz.tbi

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/13d4b1406e769b80/url.txt

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/141f65a2d306a79f/url.txt

This file was deleted.

Git LFS file not shown

This file was deleted.

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/172c2a003f154e5a/url.txt

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/17f0d5f9c4671d95/url.txt

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/182806147755e799/gnomad.v4.0.sv.chr5.vcf.gz

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/182806147755e799/gnomad.v4.0.sv.chr5.vcf.gz.tbi

This file was deleted.

3 changes: 0 additions & 3 deletions excerpt-data/182806147755e799/url.txt

This file was deleted.
