pgsc_calc can't generate scorefile #416
Hello! I'm running pgsc_calc on a VCF file generated by the TOPMed2 imputation server (hg38). The VCF contains all chromosomes without the "chr" prefix and about 503 samples. Here's my command:

sudo nextflow run pgscatalog/pgsc_calc \
-profile docker \
-w .nextflow/work \
--input samplesheet.csv \
--target_build GRCh38 \
--pgs_id PGS002124,PGS002738,PGS001910,PGS001049,PGS003358,PGS002222,PGS001932,PGS003724,PGS004521,PGS003753,PGS000327,PGS002786,PGS000193,PGS002098,PGS000133 \
--run_ancestry ancestry/pgsc_HGDP+1kGP_v1.tar.zst \
--outdir ./pgsc_calc_results \
--keep_multiallelic true \
--keep_ambiguous true \
-r v2.0.0-alpha.5

Then, I get the following error:

executor > local (8)
[ec/6e658b] PGS…000327 PGS002786 PGS000193 PGS002098 PGS000133, pgp_id:, trait_efo:]) [100%] 1 of 1 ✔
[2d/d5c00d] PGSCATALOG_PGSCCALC:PGSCCALC:INPUT_CHECK:COMBINE_SCOREFILES (1) [100%] 1 of 1 ✔
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELBIM -
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_RELABELPVAR -
[skipped ] PGS…_PGSCCALC:PGSCCALC:MAKE_COMPATIBLE:PLINK2_VCF (rocgda chromosome ALL) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:EXTRACT_DATABASE (1) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGS…:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_VARIANTS (rocgda chromosome ALL) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGS…OG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FILTER_VARIANTS (rocgda GRCh38) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGS…LOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_MAKEBED_REF (reference) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:INTERSECT_THINNED (rocgda) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGS…LOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:RELABEL_IDS (rocgda null pvar) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGS…LOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_MAKEBED_TARGET (rocgda) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:PLINK2_ORIENT (rocgda) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FRAPOSA_PCA (reference) [100%] 1 of 1, stored: 1 ✔
[skipped ] PGSCATALOG_PGSCCALC:PGSCCALC:ANCESTRY_PROJECT:FRAPOSA_PROJECT (rocgda) [100%] 1 of 1, stored: 1 ✔
[41/c99788] PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_VARIANTS (rocgda chromosome ALL) [100%] 1 of 1 ✔
[c3/b6f7b0] PGSCATALOG_PGSCCALC:PGSCCALC:MATCH:MATCH_COMBINE (rocgda) [100%] 1 of 1 ✔
[dc/231529] PGS…C:PGSCCALC:APPLY_SCORE:RELABEL_SCOREFILES (rocgda additive scorefile) [ 50%] 1 of 2, failed: 1
[skipped ] PGS…TALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:RELABEL_AFREQ (rocgda null afreq) [100%] 1 of 1, stored: 1 ✔
[1e/393bca] PGS…PLY_SCORE:PLINK2_SCORE (rocgda chromosome ALL effect type additive 1) [ 50%] 1 of 2
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:SCORE_AGGREGATE -
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:ANCESTRY_ANALYSIS -
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:REPORT:SCORE_REPORT -
[- ] PGSCATALOG_PGSCCALC:PGSCCALC:DUMPSOFTWAREVERSIONS -
Execution cancelled -- Finishing pending tasks before exit
ERROR ~ Error executing process > 'PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:RELABEL_SCOREFILES (rocgda additive scorefile)'
Caused by:
Process `PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:RELABEL_SCOREFILES (rocgda additive scorefile)` terminated with an error exit status (1)
Command executed:
relabel_ids --maps rocgda_ALL_matched.txt.gz --col_from ID_TARGET --col_to ID_REF --target_file rocgda_ALL_additive_0.scorefile.gz --target_col ID --dataset rocgda.scorefile --verbose --split --combined
cat <<-END_VERSIONS > versions.yml
RELABEL_SCOREFILES:
pgscatalog_utils: $(echo $(python -c 'import pgscatalog_utils; print(pgscatalog_utils.__version__)'))
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
root: 2025-04-16 14:15:36 DEBUG Verbose logging enabled
pgscatalog_utils.relabel.relabel_ids: 2025-04-16 14:15:36 DEBUG Writing split output enabled
pgscatalog_utils.relabel.relabel_ids: 2025-04-16 14:15:36 DEBUG Writing combined output enabled
pgscatalog_utils.relabel.relabel_ids: 2025-04-16 14:15:36 DEBUG Reading map file rocgda_ALL_matched.txt.gz with gzip.open
pgscatalog_utils.relabel.relabel_ids: 2025-04-16 14:16:25 DEBUG Creating split output, current chrom: 1
pgscatalog_utils.relabel.relabel_ids: 2025-04-16 14:16:25 DEBUG Opening rocgda.scorefile_1_relabelled.gz and writing header
Traceback (most recent call last):
File "/venv/bin/relabel_ids", line 8, in <module>
sys.exit(relabel_ids())
File "/venv/lib/python3.10/site-packages/pgscatalog_utils/relabel/relabel_ids.py", line 179, in relabel_ids
[_relabel_target(args=args, mapping=mapping, split_output=x) for x in split_output]
File "/venv/lib/python3.10/site-packages/pgscatalog_utils/relabel/relabel_ids.py", line 179, in <listcomp>
[_relabel_target(args=args, mapping=mapping, split_output=x) for x in split_output]
File "/venv/lib/python3.10/site-packages/pgscatalog_utils/relabel/relabel_ids.py", line 107, in _relabel_target
_relabel(in_target=io.TextIOWrapper(reader), mapping=mapping, split_output=split_output, args=args)
File "/venv/lib/python3.10/site-packages/pgscatalog_utils/relabel/relabel_ids.py", line 149, in _relabel
line[i_target_col] = mapping[line[i_target_col]] # revalue column
KeyError: '1:1929306:C:T'

Only the "match" directory is created, containing the files

Would be great to have some help :)
Replies: 1 comment 3 replies
I think this might be a similar problem to #393
I'd suggest removing --keep_ambiguous/--keep_ambiguous and upgrading to -r v2.0.1 to see if that helps
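
For anyone who wants to check what the KeyError is pointing at: the traceback shows relabel_ids looking up each score file ID in a mapping built from the match file, so the error means at least one ID in the score file (here '1:1929306:C:T') has no entry in that mapping. Below is a minimal diagnostic sketch, not part of pgsc_calc or pgscatalog_utils. It assumes both files are gzipped, tab-delimited, and have a header row, which may not match the real layout. The column names ID_TARGET and ID come from the failing command; the helper names and the toy columns effect_allele and weight are hypothetical.

```python
import csv
import gzip
import io


def missing_ids(map_lines, score_lines, col_from="ID_TARGET", target_col="ID"):
    """Return score file IDs that have no entry in the mapping.

    Assumes tab-delimited text with a header row (an assumption, not
    confirmed against the real pgsc_calc file formats).
    """
    mapped = {row[col_from] for row in csv.DictReader(map_lines, delimiter="\t")}
    return [
        row[target_col]
        for row in csv.DictReader(score_lines, delimiter="\t")
        if row[target_col] not in mapped
    ]


def missing_ids_from_files(map_path, score_path, **kwargs):
    """Hypothetical convenience wrapper for gzipped files on disk."""
    with gzip.open(map_path, "rt") as m, gzip.open(score_path, "rt") as s:
        return missing_ids(m, s, **kwargs)


# Toy data only: one variant is mapped, the other is not.
toy_map = io.StringIO("ID_TARGET\tID_REF\n1:100:A:G\t1:100:A:G\n")
toy_score = io.StringIO(
    "ID\teffect_allele\tweight\n"
    "1:100:A:G\tG\t0.1\n"
    "1:1929306:C:T\tT\t0.2\n"
)
result = missing_ids(toy_map, toy_score)
print(result)  # ['1:1929306:C:T']
```

On a real run, something like missing_ids_from_files("rocgda_ALL_matched.txt.gz", "rocgda_ALL_additive_0.scorefile.gz") inside the failed task's work directory would list the IDs that trigger the error, provided the format assumptions above hold.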