Description
Hi! Thanks for developing this amazing tool. I have a conceptual question regarding the filtering step.
In the paper, you mentioned:
Here, by default, the 10x cellranger clonotype definitions are filtered to remove spurious chain sharing and merge split clonotypes (for example, due to partial recovery of a second TCRα transcript).
This is also reflected by the default stringent=True for the make_10x_clones_file function.
However, this filtering step results in the majority of my paired data being dropped.
repeat?? 1598 36 ('TRAV12-1*01', 'TRAJ41*01', 'CVVNGNSGYALNF', 'tgtgtggtgaacggcaattccgggtatgcactcaacttc') ('TRAV12-2*01', 'TRAJ41*01', 'CVVNGNSGYALNF', 'tgtgtggtgaacggcaattccgggtatgcactcaacttc')
...
old_unpaired_barcodes: 44 old_paired_barcodes: 131357 new_stringent_paired_barcodes: 238
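To put a number on the drop, here is a minimal sketch that parses the summary line above and computes the fraction of paired barcodes that survive the stringent filter. The `parse_counts` helper is my own illustration and not part of the tool; it only assumes the `label: integer` layout seen in the output.

```python
import re

# Summary line copied from the output above.
summary = (
    "old_unpaired_barcodes: 44 old_paired_barcodes: 131357 "
    "new_stringent_paired_barcodes: 238"
)


def parse_counts(line):
    """Return {label: count} for every 'label: integer' pair in the line.

    Hypothetical helper for illustration only; not a function of the tool.
    """
    return {key: int(value) for key, value in re.findall(r"(\w+):\s*(\d+)", line)}


counts = parse_counts(summary)

# Fraction of originally paired barcodes that remain after stringent filtering.
retained = counts["new_stringent_paired_barcodes"] / counts["old_paired_barcodes"]
print(f"paired barcodes retained after stringent filtering: {retained:.2%}")
```

In other words, well under 1% of the paired barcodes are kept, which is why I suspect something unusual about the input rather than ordinary filtering.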
Setting stringent=False would obviously circumvent this problem, but I wonder what this actually means and whether something is seriously wrong with my data. From the output, I assume there are duplicates, but wouldn't clonally expanded cells also count as duplicates and thus be removed? Anyway, it would be great to know exactly what the stringent criteria are filtering for.
Thanks in advance!