Hello,
I have a question about pooling and chimera filtering. I understand that the rule of thumb is to use pooled chimera filtering only on datasets that were denoised in pooled mode (i.e. removeBimeraDenovo(method = "pooled") on dada(pool = TRUE)), and non-pooled chimera filtering on datasets that were not pooled during denoising (i.e. removeBimeraDenovo(method = "consensus") on dada(pool = "pseudo") or dada(pool = FALSE)). But what do you do when your dataset is made up of several individually denoised datasets that were merged with mergeSequenceTables(), assuming they were all denoised in pooled mode? Chimera filtering should perform better on a larger dataset, but is it still recommended in this situation, and in which mode?
And a follow-up question: what if some of the datasets making up the merged dataset were denoised in pooled mode and others in pseudo-pooled mode?
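To make the scenario concrete, here is a minimal sketch of the merged-table situation. The file paths and object names are hypothetical placeholders, each run is assumed to have been denoised separately and saved as a sequence table, and the `method` argument is exactly the choice in question, not a recommendation.

```r
library(dada2)

# Hypothetical paths: one sequence table per run, each produced earlier by
# dada(..., pool = TRUE) followed by makeSequenceTable().
st_run1 <- readRDS("run1/seqtab.rds")
st_run2 <- readRDS("run2/seqtab.rds")

# Combine the per-run tables into a single samples-by-sequences table.
st_all <- mergeSequenceTables(st_run1, st_run2)

# Reads per sample, to see how much sequencing depth differs across runs.
summary(rowSums(st_all))

# Chimera filtering on the merged table. Whether "pooled" or "consensus"
# is appropriate here is the open question.
st_nochim <- removeBimeraDenovo(st_all, method = "consensus",
                                multithread = TRUE, verbose = TRUE)
```

The alternative would be to call `removeBimeraDenovo()` on each run's table before merging, which keeps chimera detection within a single run's depth range.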
Another problem with merging datasets is that they may differ greatly in sequencing depth, which can interfere with chimera detection. The obvious example is merging a NovaSeq dataset with a MiSeq dataset and then doing chimera filtering, which I guess isn't advisable. But even datasets sequenced on the same type of flow cell can vary widely in sequencing depth depending on how many samples were multiplexed on the flow cell, so what is the recommended way of dealing with that? When should you do chimera filtering on the merged dataset and when should you not?