Description
Naive sequence reconstruction is introducing a stop codon in a clonal family that originally had no stops.
I'm guessing that the reconstruction doesn't take the reading frame into account.
No big deal if you don't want to address this, but it was a surprise for me.
Issue Summary
File: /fh/fast/matsen_e/shared/bcr-mut-sel/working/data/loris/jaffe-paired/jaffe_donor2_memory.csv.gz
Family: 100150-IGK-100150 (donor 2, single-member family)
Problem: Naive reconstruction introduces a TAG stop codon at nucleotide positions 303-305 (0-based) of the heavy chain
Details
The original memory sequence translates cleanly with no stop codons:
Memory sequence CDR3: ...CARDSVQWDRLIGYFQHW...
But the reconstructed naive sequence contains a premature stop:
Naive sequence CDR3: ...CARDSV*WELLLGYFQHW... (stop codon at codon position 101, 0-based)
The stops_heavy: False field in the output correctly reflects the memory sequence, but the naive sequence itself contains the TAG stop codon.
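For anyone who wants to reproduce the check, here is a minimal sketch that scans the CDR3 region of both heavy-chain sequences for in-frame stop codons. The two nucleotide excerpts are copied from the CSV row below. The helper name `find_stop_codons` and the constant `CDR3_START` are made up for this example and are not part of any pipeline; the offset 285 is an assumption that follows from the stop being the 7th codon of the excerpt and sitting at 0-based nucleotide 303.

```python
# In-frame stop-codon scan over the heavy-chain CDR3 excerpts from this issue.
# find_stop_codons and CDR3_START are hypothetical names used only here.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_stop_codons(seq, offset=0):
    """Return 0-based (codon_index, nt_start) pairs for in-frame stop codons.

    `offset` is the 0-based position of seq[0] within the full sequence.
    It must be a multiple of 3 so the excerpt stays in frame.
    """
    if offset % 3 != 0:
        raise ValueError("offset must be a multiple of 3 to preserve the frame")
    hits = []
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if seq[i:i + 3].upper() in STOP_CODONS:
            hits.append(((offset + i) // 3, offset + i))
    return hits


# Assumed 0-based start of the excerpts (the Cys codon) in the full sequence.
CDR3_START = 285

# Observed memory sequence, translates to CARDSVQWDRLIGYFQHW
MEMORY_CDR3 = "TGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGG"
# Reconstructed naive sequence, translates to CARDSV*WELLLGYFQHW
NAIVE_CDR3 = "TGTGCGAGAGATAGCGTATAGTGGGAGCTACTACTAGGCTACTTCCAGCACTGG"

print("memory stops:", find_stop_codons(MEMORY_CDR3, CDR3_START))
print("naive stops:", find_stop_codons(NAIVE_CDR3, CDR3_START))
```

Under these assumptions the memory excerpt reports no stops and the naive excerpt reports a single stop at codon 101 (nucleotide 303), matching the positions quoted above.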
Full CSV row
d2,100150-IGK-100150,AGATTGCAGCCCAGCT-1-1279054_contig_h,CAGGTGCACCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCTTGTGCAGCCTCTGGATTCACCATAAGTACTTATGCTATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTACATTATATGATGGAAGCAATACATACTATGGAGATTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACCCTGTATCTGCAAATGAACGGCCTGAGAGCTGAGGACACGGCTCTATATTATTGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,CAGGTGCACCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCTTGTGCAGCCTCTGGATTCACCATAAGTACTTATGCTATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTACATTATATGATGGAAGCAATACATACTATGGAGATTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACCCTGTATCTGCAAATGAACGGCCTGAGAGCTGAGGACACGGCTCTATATTATTGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,CAGGTGCAGCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCCTGTGCAGCCTCTGGATTCACCTTCAGTAGCTATGGCATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTATATCATATGATGGAAGTAATAAATACTATGCAGACTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACGCTGTATCTGCAAATGAACAGCCTGAGAGCTGAGGACACGGCTGTGTATTACTGTGCGAGAGATAGCGTATAGTGGGAGCTACTACTAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,IGH,IGHV3-30*03,IGHJ1*01,CARDSVQWDRLIGYFQHW,True,False,False,0,75,96,150,171,288,333,75,87,156,165,297,330,23,AGATTGCAGCCCAGCT-1-1279054_contig_l,AAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGGGGAAAGAGCCACCCTCTCCTGCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAGCCTGGCCAGCCTCCCAGGCTCCTCATCTATGATGCATCCTACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAGCTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,AAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGGGGAAAGAGCCACCCTGTCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAGCCTGGCCAGCCTCCCAGGCTCCTCATCTATGATGCATCCTACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAGCTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,GAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGG
GGAAAGAGCCACCCTCTCCTGCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAACCTGGCCAGGCTCCCAGGCTCCTCATCTATGATGCATCCAACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAACTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,IGK,IGKV3-11*01,IGKJ4*01,CQQRSSWPLTF,True,False,False,0,78,93,147,153,264,288,75,93,147,132,270,285,5.0