Commit f32edca

kishwarshafin authored and copybara-github committed
Update case studies for 1.9
PiperOrigin-RevId: 757960591
1 parent 7f285b2 commit f32edca

3 files changed: 22 additions and 23 deletions

docs/deepvariant-fast-pipeline-case-study.md

Lines changed: 10 additions & 10 deletions
@@ -48,13 +48,13 @@ Please refer to the following documentation for more details.
 [Installing the NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
 
 For this case study we used the
-[script](https://github.com/google/deepvariant/blob/r1.8.0/scripts/install_nvidia_docker.sh)
+[script](https://github.com/google/deepvariant/blob/r1.9/scripts/install_nvidia_docker.sh)
 that automates the CUDA and container tools kit installation.
 
 Please note that the script takes about 30 minutes to run.
 
 ```bash
-wget https://raw.githubusercontent.com/google/deepvariant/refs/heads/r1.8.0/scripts/install_nvidia_docker.sh
+wget https://raw.githubusercontent.com/google/deepvariant/refs/heads/r1.9/scripts/install_nvidia_docker.sh
 chmod +x install_nvidia_docker.sh
 ./install_nvidia_docker.sh
 ```
@@ -64,7 +64,7 @@ chmod +x install_nvidia_docker.sh
 ### Get DeepVariant Docker image
 
 ```bash
-BIN_VERSION="1.8.0"
+BIN_VERSION="1.9.0"
 sudo docker pull google/deepvariant:"${BIN_VERSION}-gpu"
 ```
 
@@ -217,9 +217,9 @@ variants.gvcf.chr20.vcf
 With the same settings the pipeline takes approximately 10 minutes.
 
 ```
-real 8m15.252s
-user 0m0.007s
-sys 0m0.035s
+real 12m45.795s
+user 0m0.018s
+sys 0m0.038s
 ```
 
 ## Benchmark output
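
Reviewer note: the hunk above raises the recorded wall-clock time, while the unchanged context line still says "approximately 10 minutes". A minimal sketch for quantifying the change follows. The helper name `to_seconds` is hypothetical (it is not part of DeepVariant) and it assumes the `XmY.YYYs` format that bash's `time` prints.

```bash
# Convert a `time`-style duration such as "12m45.795s" to seconds.
# Hypothetical helper for illustration only.
to_seconds() {
  awk -v t="$1" 'BEGIN { split(t, p, /[ms]/); printf "%.3f\n", p[1] * 60 + p[2] }'
}

old=$(to_seconds 8m15.252s)    # "real" value on the removed line
new=$(to_seconds 12m45.795s)   # "real" value on the added line

# Report both values, the absolute difference, and the ratio new/old.
awk -v o="$old" -v n="$new" \
  'BEGIN { printf "old=%.3fs new=%.3fs delta=%.3fs ratio=%.2fx\n", o, n, n - o, n / o }'
```

Only the `real` line is compared here because `user` and `sys` measure the `docker` client process, not the work done inside the container.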
@@ -256,8 +256,8 @@ time sudo docker run \
 ```
 Benchmarking Summary:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 10628 10543 85 22403 74 11375 40 29 0.992002 0.993290 0.507744 0.992646 NaN NaN 1.748961 2.138647
-INDEL PASS 10628 10543 85 22403 74 11375 40 29 0.992002 0.993290 0.507744 0.992646 NaN NaN 1.748961 2.138647
-SNP ALL 70166 70101 65 105602 71 35342 12 12 0.999074 0.998989 0.334672 0.999032 2.296566 1.713281 1.883951 1.503192
-SNP PASS 70166 70101 65 105602 71 35342 12 12 0.999074 0.998989 0.334672 0.999032 2.296566 1.713281 1.883951 1.503192
+INDEL ALL 10628 10553 75 22560 72 11522 37 28 0.992943 0.993477 0.510727 0.993210 NaN NaN 1.748961 2.180292
+INDEL PASS 10628 10553 75 22560 72 11522 37 28 0.992943 0.993477 0.510727 0.993210 NaN NaN 1.748961 2.180292
+SNP ALL 70166 70106 60 102415 69 32148 9 9 0.999145 0.999018 0.313899 0.999081 2.296566 1.72911 1.883951 1.442237
+SNP PASS 70166 70106 60 102415 69 32148 9 9 0.999145 0.999018 0.313899 0.999081 2.296566 1.72911 1.883951 1.442237
 ```
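
Reviewer note: the updated summary rows can be sanity-checked by recomputing the `METRIC.*` columns from the raw counts. The sketch below is not hap.py code. The function name `happy_metrics` is hypothetical, and it assumes that recall is `TRUTH.TP / (TRUTH.TP + TRUTH.FN)`, that precision is taken over the query calls that are not `QUERY.UNK`, and that F1 is the harmonic mean of the two. Those assumptions reproduce the values in the added rows.

```bash
# Recompute hap.py-style summary metrics from the raw counts in one row.
# Arguments: TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK
# Hypothetical helper; the formulas are assumptions stated in the note above.
happy_metrics() {
  awk -v tp="$1" -v fn="$2" -v qtotal="$3" -v fp="$4" -v unk="$5" 'BEGIN {
    recall    = tp / (tp + fn)
    assessed  = qtotal - unk                 # query calls that are not UNK
    precision = (assessed - fp) / assessed
    f1        = 2 * precision * recall / (precision + recall)
    printf "recall=%.6f precision=%.6f frac_na=%.6f f1=%.6f\n", recall, precision, unk / qtotal, f1
  }'
}

happy_metrics 10553 75 22560 72 11522    # counts from the added INDEL row
happy_metrics 70106 60 102415 69 32148   # counts from the added SNP row
```

If a recomputed value differs from the table by more than rounding in the last digit, the row was probably copied incorrectly.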

docs/deepvariant-training-case-study.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -534,7 +534,7 @@ sudo docker run --gpus 1 \
 --disable_small_model
 ```
 
-Starting in v1.8.0, by default we use a small model to classify some
+We use a small model to classify some
 candidates. In this example, we set `--disable_small_model` so
 that small model is disabled. This allows us to run all examples
 through the model we just trained.

docs/deepvariant-vg-case-study.md

Lines changed: 11 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -172,7 +172,6 @@ Get the same reference we used for
 
 ```bash
 FTPDIR=ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids
-
 curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz | gunzip > ${DATA_DIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
 samtools faidx ${DATA_DIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
 ```
@@ -184,7 +183,7 @@ And then, run DeepVariant.
 [DeepVariant Case Study](deepvariant-case-study.md).)
 
 ```bash
-BIN_VERSION="1.8.0"
+BIN_VERSION="1.9.0"
 
 sudo docker pull google/deepvariant:"${BIN_VERSION}"
 
@@ -204,9 +203,9 @@ time sudo docker run \
 
 Stage | Time (minutes)
 -------------------------------- | -----------------
-make_examples | 59m19.845s
-call_variants | 49m41.643s
-postprocess_variants (with gVCF) | 7m46.195s
+make_examples | 81m11.112s
+call_variants | 38m27.228s
+postprocess_variants (with gVCF) | 9m13.565s
 
 
 ### Run hap.py
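
Reviewer note: the per-stage times move in different directions (`make_examples` slower, `call_variants` faster), so the end-to-end effect is easier to see as a total. The sketch below sums the table entries. `sum_durations` is a hypothetical helper, not a DeepVariant tool, and it assumes every entry uses the `XmY.YYYs` format shown in the table.

```bash
# Sum `time`-style durations (e.g. "81m11.112s") and print the total
# in the same XmY.YYYs format. Hypothetical helper for illustration only.
sum_durations() {
  printf '%s\n' "$@" | awk '
    { split($0, p, /[ms]/); total += p[1] * 60 + p[2] }
    END { m = int(total / 60); printf "%dm%.3fs\n", m, total - m * 60 }'
}

sum_durations 59m19.845s 49m41.643s 7m46.195s   # stage times on the removed lines
sum_durations 81m11.112s 38m27.228s 9m13.565s   # stage times on the added lines
```

The two totals are only directly comparable if both runs used the same machine type, which this diff does not state.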
@@ -244,16 +243,16 @@ Output:
 ```
 Benchmarking Summary:
 Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
-INDEL ALL 504501 502210 2291 954974 1522 429900 956 362 0.995459 0.997101 0.450169 0.996279 NaN NaN 1.489759 1.942299
-INDEL PASS 504501 502210 2291 954974 1522 429900 956 362 0.995459 0.997101 0.450169 0.996279 NaN NaN 1.489759 1.942299
-SNP ALL 3327496 3316336 11160 3823082 4229 500683 1696 356 0.996646 0.998727 0.130963 0.997686 2.102576 1.990152 1.535137 1.449299
-SNP PASS 3327496 3316336 11160 3823082 4229 500683 1696 356 0.996646 0.998727 0.130963 0.997686 2.102576 1.990152 1.535137 1.449299
+INDEL ALL 504501 502342 2159 956579 1444 431515 881 290 0.995721 0.99725 0.451102 0.996485 NaN NaN 1.489759 1.924206
+INDEL PASS 504501 502342 2159 956579 1444 431515 881 290 0.995721 0.99725 0.451102 0.996485 NaN NaN 1.489759 1.924206
+SNP ALL 3327496 3319188 8308 4031912 5621 705300 1705 469 0.997503 0.99831 0.174929 0.997907 2.102576 1.889869 1.535137 1.312185
+SNP PASS 3327496 3319188 8308 4031912 5621 705300 1705 469 0.997503 0.99831 0.174929 0.997907 2.102576 1.889869 1.535137 1.312185
 ```
 
 This can be compared with
-https://github.com/google/deepvariant/blob/r1.8/docs/metrics.md#accuracy.
+https://github.com/google/deepvariant/blob/r1.9/docs/metrics.md#accuracy.
 
 Which shows that `vg giraffe` improves F1:
 
-- Indel F1: 0.995945 --> 0.996279
-- SNP F1: 0.996213 --> 0.997686
+- Indel F1: 0.995845 --> 0.996485
+- SNP F1: 0.996133 --> 0.997907
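
Reviewer note: F1 values this close to 1 are hard to compare by eye, so one option is to express each improvement as the relative reduction in `1 - F1`. This framing is an assumption made here for illustration and is not a metric the docs report. `error_reduction_pct` is a hypothetical helper.

```bash
# Relative reduction in (1 - F1), in percent, from a baseline F1 to a new F1.
# Hypothetical helper for illustration only.
error_reduction_pct() {
  awk -v b="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * ((1 - b) - (1 - n)) / (1 - b) }'
}

echo "Indel: $(error_reduction_pct 0.995845 0.996485)% lower 1 - F1"
echo "SNP:   $(error_reduction_pct 0.996133 0.997907)% lower 1 - F1"
```

The inputs are the baseline and `vg giraffe` F1 values from the two added bullet lines.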
