Naive sequence reconstruction is introducing a stop codon #332

@matsen

Naive sequence reconstruction is introducing a stop codon in a clonal family that originally had no stops.
I'm guessing that the reconstruction doesn't know about frame.
No big deal if you don't want to address this, but it was a surprise for me.

Issue Summary

File: /fh/fast/matsen_e/shared/bcr-mut-sel/working/data/loris/jaffe-paired/jaffe_donor2_memory.csv.gz
Family: 100150-IGK-100150 (donor 2, single-member family)
Problem: Naive reconstruction introduces a TAG stop codon at nucleotide positions 303-305 (0-based) of the heavy chain

Details

The original memory sequence translates cleanly with no stop codons:

Memory sequence CDR3: ...CARDSVQWDRLIGYFQHW...

But the reconstructed naive sequence contains a premature stop:

Naive sequence CDR3:  ...CARDSV*WELLLGYFQHW...  (stop at codon 101, 0-based, i.e. nucleotides 303-305)

The stops_heavy: False field in the output correctly reflects the memory sequence, but the naive sequence itself contains the TAG stop codon.
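
To make the problem easy to reproduce, here is a minimal sketch that translates the CDR3 region of both heavy-chain sequences in frame. The two nucleotide fragments are copied from the CSV row below, starting at the conserved cysteine codon (TGT). The helper names `translate` and `stop_codon_offsets` are hypothetical and are not part of the reconstruction code; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt, frame=0):
    """Translate a nucleotide string in the given frame; unknown codons become 'X'."""
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def stop_codon_offsets(nt, frame=0):
    """Return 0-based nucleotide offsets of in-frame stop codons."""
    return [frame + 3 * i for i, aa in enumerate(translate(nt, frame)) if aa == "*"]


# CDR3 fragments copied from the heavy-chain columns of the CSV row,
# starting at the conserved cysteine codon (TGT).
MEMORY_CDR3_NT = "TGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGG"
NAIVE_CDR3_NT = "TGTGCGAGAGATAGCGTATAGTGGGAGCTACTACTAGGCTACTTCCAGCACTGG"

print(translate(MEMORY_CDR3_NT))          # CARDSVQWDRLIGYFQHW
print(translate(NAIVE_CDR3_NT))           # CARDSV*WELLLGYFQHW
print(stop_codon_offsets(NAIVE_CDR3_NT))  # [18], offset within the fragment
```

The stop arises where the memory sequence has CAG (Q) and the naive sequence has TAG, a single-nucleotide difference. A check like `stop_codon_offsets` on the reconstructed naive sequence would be one way to flag such cases, assuming the reading frame is known.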

Full CSV row

d2,100150-IGK-100150,AGATTGCAGCCCAGCT-1-1279054_contig_h,CAGGTGCACCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCTTGTGCAGCCTCTGGATTCACCATAAGTACTTATGCTATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTACATTATATGATGGAAGCAATACATACTATGGAGATTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACCCTGTATCTGCAAATGAACGGCCTGAGAGCTGAGGACACGGCTCTATATTATTGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,CAGGTGCACCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCTTGTGCAGCCTCTGGATTCACCATAAGTACTTATGCTATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTACATTATATGATGGAAGCAATACATACTATGGAGATTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACCCTGTATCTGCAAATGAACGGCCTGAGAGCTGAGGACACGGCTCTATATTATTGTGCGAGAGATAGCGTACAGTGGGACCGACTAATAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,CAGGTGCAGCTGGTGGAGTCTGGGGGAGGCGTGGTCCAGCCTGGGAGGTCCCTGAGACTCTCCTGTGCAGCCTCTGGATTCACCTTCAGTAGCTATGGCATGCACTGGGTCCGCCAGGCTCCAGGCAAGGGGCTGGAGTGGGTGGCAGTTATATCATATGATGGAAGTAATAAATACTATGCAGACTCCGTGAAGGGCCGATTCACCATCTCCAGAGACAATTCCAAGAACACGCTGTATCTGCAAATGAACAGCCTGAGAGCTGAGGACACGGCTGTGTATTACTGTGCGAGAGATAGCGTATAGTGGGAGCTACTACTAGGCTACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCA,IGH,IGHV3-30*03,IGHJ1*01,CARDSVQWDRLIGYFQHW,True,False,False,0,75,96,150,171,288,333,75,87,156,165,297,330,23,AGATTGCAGCCCAGCT-1-1279054_contig_l,AAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGGGGAAAGAGCCACCCTCTCCTGCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAGCCTGGCCAGCCTCCCAGGCTCCTCATCTATGATGCATCCTACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAGCTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,AAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGGGGAAAGAGCCACCCTGTCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAGCCTGGCCAGCCTCCCAGGCTCCTCATCTATGATGCATCCTACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAGCTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,GAAATTGTGTTGACACAGTCTCCAGCCACCCTGTCTTTGTCTCCAGG
GGAAAGAGCCACCCTCTCCTGCAGGGCCAGTCAGAGTGTTAGCAGCTACTTAGCCTGGTACCAACAGAAACCTGGCCAGGCTCCCAGGCTCCTCATCTATGATGCATCCAACAGGGCCACTGGCATCCCAGCCAGGTTCAGTGGCAGTGGGTCTGGGACAGACTTCACTCTCACCATCAGCAGCCTAGAGCCTGAAGATTTTGCAGTTTATTACTGTCAGCAGCGTAGCAACTGGCCTCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAA,IGK,IGKV3-11*01,IGKJ4*01,CQQRSSWPLTF,True,False,False,0,78,93,147,153,264,288,75,93,147,132,270,285,5.0
