PRS Scoring File Low Overlap #417
Unanswered
Lord-of-Bugs asked this question in Q&A
Replies: 1 comment
The matching doesn't work on rsIDs; it works on chromosome, position, and then the alleles. Could you extract some rows from the df and the VCF that matched, and some that didn't? I might be able to help that way.
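If it helps, here is a rough pandas sketch of one way to pull out those matched and unmatched rows. It is not how the pipeline itself does its matching, only a manual check under some assumptions: the sample dataframe is assumed to have VCF-style columns CHROM, POS, REF, ALT, the scoring file is assumed to have chr_name, chr_position, effect_allele, other_allele, positions are assumed to be present and on the same genome build in both, and the function name split_by_match is made up for this example. It also ignores strand flips, so a variant reported on the opposite strand would show up as an allele mismatch here.

```python
import pandas as pd


def split_by_match(df, score_df):
    """Join scoring-file variants to sample variants on chromosome and
    position, then compare alleles. Returns (matched, unmatched).

    Column names are assumptions for this sketch, not a fixed schema.
    """
    # Normalise chromosome labels ("chr1" -> "1") and position dtypes so
    # the join keys are comparable on both sides.
    target = df.assign(
        CHROM=df["CHROM"].astype(str).str.replace("^chr", "", regex=True),
        POS=df["POS"].astype(int),
    )
    score = score_df.assign(
        chr_name=score_df["chr_name"].astype(str).str.replace("^chr", "", regex=True),
        chr_position=score_df["chr_position"].astype(int),
    )

    # Left join keeps every scoring-file variant; "_merge" records whether
    # a sample variant exists at the same chromosome and position.
    merged = score.merge(
        target,
        how="left",
        left_on=["chr_name", "chr_position"],
        right_on=["CHROM", "POS"],
        indicator=True,
    )
    merged["position_hit"] = merged["_merge"] == "both"

    # Alleles agree if effect/other equal REF/ALT in either order.
    # Comparisons against missing values evaluate to False.
    same_order = (merged["effect_allele"] == merged["REF"]) & (
        merged["other_allele"] == merged["ALT"]
    )
    swapped = (merged["effect_allele"] == merged["ALT"]) & (
        merged["other_allele"] == merged["REF"]
    )
    merged["allele_hit"] = same_order | swapped

    ok = merged["position_hit"] & merged["allele_hit"]
    return merged[ok], merged[~ok]
```

Calling `matched, unmatched = split_by_match(df, breast_cancer_df)` and comparing `len(matched)` with `len(breast_cancer_df)` gives a rough overlap figure. In `unmatched`, the `position_hit` column separates variants that are missing entirely from those that are present at the same position but with different alleles, and the second group is usually the more informative one to post.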
Hi there,
I would appreciate someone helping me pin down the issue here. I am trying to prepare my own sample individual data for calculating PRS. Here's how I prepared my data:
My Nextflow pipeline is set up just fine, but every run fails because neither score passes the match rate threshold:
I ran a manual inspection of my data. Basically, I took the rsIDs present in the PGS000001 scoring file and checked whether they exist in the GSA v3 build 38 manifest, and it seems that more than 90% of the markers in the scoring file are present in my data.
For example, here are my called markers, matched to the manifest file and processed to be almost like a VCF (df):
Scoring file dataframe (breast_cancer_df):
And
So now I am completely unsure why the pipeline does not meet the overlap threshold. I ran the following commands:
Would greatly appreciate someone taking a stab at this!!!