@@ -172,7 +172,6 @@ Get the same reference we used for

```bash
FTPDIR=ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids
-
curl ${FTPDIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz | gunzip > ${DATA_DIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
samtools faidx ${DATA_DIR}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
```
@@ -184,7 +183,7 @@ And then, run DeepVariant.
[DeepVariant Case Study](deepvariant-case-study.md).)

```bash
- BIN_VERSION="1.8.0"
+ BIN_VERSION="1.9.0"

sudo docker pull google/deepvariant:"${BIN_VERSION}"
@@ -204,9 +203,9 @@ time sudo docker run \

Stage                            | Time (minutes)
-------------------------------- | -----------------
- make_examples                    | 59m19.845s
- call_variants                    | 49m41.643s
- postprocess_variants (with gVCF) | 7m46.195s
+ make_examples                    | 81m11.112s
+ call_variants                    | 38m27.228s
+ postprocess_variants (with gVCF) | 9m13.565s


### Run hap.py
@@ -244,16 +243,16 @@ Output:
```
Benchmarking Summary:
Type Filter TRUTH.TOTAL TRUTH.TP TRUTH.FN QUERY.TOTAL QUERY.FP QUERY.UNK FP.gt FP.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
- INDEL ALL 504501 502210 2291 954974 1522 429900 956 362 0.995459 0.997101 0.450169 0.996279 NaN NaN 1.489759 1.942299
- INDEL PASS 504501 502210 2291 954974 1522 429900 956 362 0.995459 0.997101 0.450169 0.996279 NaN NaN 1.489759 1.942299
- SNP ALL 3327496 3316336 11160 3823082 4229 500683 1696 356 0.996646 0.998727 0.130963 0.997686 2.102576 1.990152 1.535137 1.449299
- SNP PASS 3327496 3316336 11160 3823082 4229 500683 1696 356 0.996646 0.998727 0.130963 0.997686 2.102576 1.990152 1.535137 1.449299
+ INDEL ALL 504501 502342 2159 956579 1444 431515 881 290 0.995721 0.99725 0.451102 0.996485 NaN NaN 1.489759 1.924206
+ INDEL PASS 504501 502342 2159 956579 1444 431515 881 290 0.995721 0.99725 0.451102 0.996485 NaN NaN 1.489759 1.924206
+ SNP ALL 3327496 3319188 8308 4031912 5621 705300 1705 469 0.997503 0.99831 0.174929 0.997907 2.102576 1.889869 1.535137 1.312185
+ SNP PASS 3327496 3319188 8308 4031912 5621 705300 1705 469 0.997503 0.99831 0.174929 0.997907 2.102576 1.889869 1.535137 1.312185
```

This can be compared with
- https://github.com/google/deepvariant/blob/r1.8/docs/metrics.md#accuracy.
+ https://github.com/google/deepvariant/blob/r1.9/docs/metrics.md#accuracy.

Which shows that `vg giraffe` improves F1:

- - Indel F1: 0.995945 --> 0.996279
- - SNP F1: 0.996213 --> 0.997686
+ - Indel F1: 0.995845 --> 0.996485
+ - SNP F1: 0.996133 --> 0.997907
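
The updated F1 values can be cross-checked against the Recall and Precision columns of the hap.py summary, since F1 is the harmonic mean of the two. The following is a minimal sketch, assuming `awk` is available on the machine; `f1_score` is a hypothetical helper name introduced here for illustration and is not part of hap.py or DeepVariant.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of hap.py or DeepVariant):
# F1 = 2 * recall * precision / (recall + precision).
f1_score() {
  local recall="$1" precision="$2"
  awk -v r="$recall" -v p="$precision" \
    'BEGIN { printf "%.6f\n", 2 * r * p / (r + p) }'
}

# Recall and precision copied from the updated (+) rows of the summary.
echo "INDEL F1: $(f1_score 0.995721 0.99725)"
echo "SNP F1:   $(f1_score 0.997503 0.99831)"
```

Because the Recall and Precision columns are already rounded to six decimals, a value recomputed this way may differ from the reported `METRIC.F1_Score` in the last digit.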