
Duplicate marking produces malformed QS tags in SAM output #145

@eboyden


With -so, every read gets a QS tag, including duplicates, when the output is BAM (file or stdout). When the output is SAM (file or stdout), reads marked as duplicates appear to be missing the QS tag: the record ends with a trailing tab, i.e. an empty field, where the QS tag should be. This can crash downstream tools when they append additional tags, because the empty field then sits between tags and looks like a "tag" with an improper format.

Command:

/snap/v2.0.0/snap-aligner paired /snap/index/ -pairedInterleavedFastq - -t 8 -mrl 30 -s 0 500 -fs -so -R "@RG\tID:id\tLB:lb\tPL:ILLUMINA\tPU:pu\tSM:sm" -o -sam NOINPUT.sam

Output:

cat -et NOINPUT.sam | egrep -v "^@" | head -11
SN0796:846:HK7WNBCX3:1:1201:20613:31700^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGAA^IHIIIIHG@HHHHEC@HDHEGHHFCHCE=HHEGH^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:1:1207:5416:83002^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129245^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGTA^IIGIIIIIIIIIIIIIIHIIIIIHIHHIIHIIII^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:1:1213:4603:47822^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGAA^IHHG@G<1<<@HEFCEHCC<1CD1HGE=HD111H^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:1:2104:2125:28883^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGCA^IHIIIIIIIIIHIIHIIIHIHHEH?GHHIHIIII^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:1:2104:5543:45261^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129245^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGTA^IHIHIIHHHIHIIIHIIIIIIIIIIIIIIIIIII^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:1:2215:16958:24783^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGGA^IHIHD1FIIIHF1IHGG<<1CC<<<0C/<G?HEH^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:2:1110:10832:12681^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129245^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGTA^IIIIIIIIIIHIIIIIIIIIHHHHIIIIIIHIHH^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:2:1115:14120:23567^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGGA^IGDHGIHHIHHG1HHGF@HHIIHHEHC<<IIIHH^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:2:1115:5463:88475^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGAA^IFIHGCFEEHCHHIIIHHIHHHEGC0C=FHHIII^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:2:1201:13401:6342^I1171^Ichr6^I80129075^I70^I27M6S^I=^I80129245^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGTA^IFIIIHH@?HHIIHIHHHEGHGHGHIHCCHC@EH^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^I$
SN0796:846:HK7WNBCX3:2:2203:7573:55404^I147^Ichr6^I80129075^I70^I27M6S^I=^I80129246^I132^IGTAAACATTTAAATTACTTATTTCTCAGCGGGA^IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII^IPG:Z:SNAP^INM:i:0^IRG:Z:id^ILB:Z:lb^IPL:Z:ILLUMINA^IPU:Z:pu^ISM:Z:sm^IQS:i:1240$
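
To make the pattern above easier to check on a larger file, here is a minimal Python sketch (not part of SNAP) that reports optional fields that are not of the form TAG:TYPE:VALUE and tests the duplicate bit (0x400) in FLAG. The function names and the simplified tag pattern are my own assumptions; the pattern only checks the overall shape of a field and does not validate VALUE against its TYPE.

```python
import re

# Simplified shape of a SAM optional field: two-character TAG, one-letter TYPE, VALUE.
# Assumption: this only checks the overall TAG:TYPE:VALUE shape, not VALUE per TYPE.
TAG_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:.*$")
DUPLICATE_FLAG = 0x400  # FLAG bit 1024: PCR or optical duplicate


def malformed_optional_fields(sam_line):
    """Return (column_index, text) for each optional field that is not TAG:TYPE:VALUE."""
    fields = sam_line.rstrip("\r\n").split("\t")
    # Columns 0-10 are the mandatory fields; everything after is an optional field.
    return [(i, f) for i, f in enumerate(fields[11:], start=11) if not TAG_RE.match(f)]


def is_duplicate(sam_line):
    """True if the duplicate bit is set in the FLAG column."""
    return bool(int(sam_line.split("\t")[1]) & DUPLICATE_FLAG)


# Two records from the output above, rebuilt with real tabs.
COMMON_TAGS = ["PG:Z:SNAP", "NM:i:0", "RG:Z:id", "LB:Z:lb",
               "PL:Z:ILLUMINA", "PU:Z:pu", "SM:Z:sm"]

dup_record = "\t".join(
    ["SN0796:846:HK7WNBCX3:1:1201:20613:31700", "1171", "chr6", "80129075", "70",
     "27M6S", "=", "80129246", "132",
     "GTAAACATTTAAATTACTTATTTCTCAGCGGAA", "HIIIIHG@HHHHEC@HDHEGHHFCHCE=HHEGH"]
    + COMMON_TAGS + [""]  # trailing empty field where QS should be
) + "\n"

ok_record = "\t".join(
    ["SN0796:846:HK7WNBCX3:2:2203:7573:55404", "147", "chr6", "80129075", "70",
     "27M6S", "=", "80129246", "132",
     "GTAAACATTTAAATTACTTATTTCTCAGCGGGA", "IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII"]
    + COMMON_TAGS + ["QS:i:1240"]
) + "\n"

for record in (dup_record, ok_record):
    print(is_duplicate(record), malformed_optional_fields(record))
```

On these two records, the duplicate (FLAG 1171) reports an empty field in column 18, and the non-duplicate (FLAG 147) reports no malformed fields.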
