paulzierep
Collaborator

We need a quick fix, since we are running a training tomorrow.
The issue is that MultiQC shows only one file / collection if the file names are the same.
As a workaround, we now run MultiQC twice.

@paulzierep paulzierep requested a review from a team as a code owner March 7, 2024 10:00
> - {% icon param-repeat %} *"Insert FastQC output"*
> - *"Type of FastQC output?"*: `Raw data`
> - {% icon param-files %} *"FastQC output"*: 4 `Raw data` outputs of **FastQC** {% icon tool %}
> 4. {% tool [MultiQC](toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy0) %} with the following parameters:
Contributor

Why do you not use fastp for fastp output?

Such a change should probably also be reflected in the workflow that comes with the tutorial, right?

Contributor

Yeah, totally. We have the fastp step in the workflow included in the training material. Its output shows everything, including the number of reads before and after trimming, which trainers have to check to answer the training question.

Collaborator Author

Do you mean fastp as input to MultiQC, or looking at the output of fastp?

Collaborator Author

The fastp output only shows the difference between before and after fastp, but there is also a Porechop step in between. FastQC gives you the difference between the raw reads and the reads after all QC steps.

Contributor

Then you should also include the other FastQC output. Currently the tutorial runs FastQC twice: once before Porechop and fastp, and once after. Just add the additional FastQC report, right?

You can probably do this in the same MultiQC run (maybe name the different FastQC reports so that you can distinguish before and after).

Collaborator Author

@paulzierep paulzierep Mar 7, 2024

In order to show multiple FastQC reports in MultiQC, the input elements for FastQC need to be renamed, since this part:

>>Basic Statistics	pass
#Measure	Value
Filename	Barcode11_Spike2b_fastq_gz.gz <<<<<<<<<<<<<<<<
File type	Conventional base calls
Encoding	Sanger / Illumina 1.9

prevents reports with duplicate file names from being rendered.
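
To check up front whether two FastQC raw data reports would clash, one can compare the `Filename` row they embed. The following is only a minimal sketch: it assumes the tab-separated layout shown in the excerpt above, the helper names and the `before_qc` / `after_qc` labels are hypothetical, and it does not reproduce MultiQC's own logic.

```python
from __future__ import annotations

from collections import defaultdict


def fastqc_filename(raw_data: str) -> str | None:
    """Return the value of the `Filename` row of a FastQC raw data report."""
    for line in raw_data.splitlines():
        key, _, value = line.partition("\t")
        if key == "Filename":
            return value.strip()
    return None


def colliding_reports(reports: dict[str, str]) -> dict[str | None, list[str]]:
    """Group report labels by embedded file name; keep names used more than once."""
    by_name = defaultdict(list)
    for label, raw_data in reports.items():
        by_name[fastqc_filename(raw_data)].append(label)
    return {name: labels for name, labels in by_name.items() if len(labels) > 1}


# Hypothetical inputs: the same sample before and after the QC steps.
header = ">>Basic Statistics\tpass\n#Measure\tValue\n"
before = header + "Filename\tBarcode11_Spike2b_fastq_gz.gz\n"
after = header + "Filename\tBarcode11_Spike2b_fastq_gz.gz\n"

collisions = colliding_reports({"before_qc": before, "after_qc": after})
print(collisions)
```

Any name that shows up in `collisions` belongs to reports that would need to be renamed before they can appear side by side.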

That would mean using Extract element identifiers, a regex, and Relabel for each additional MultiQC input, which looks rather complicated and difficult for the students to follow. It would also take me too long to fix for tomorrow.
But I think @EngyNasr will fix it for IWC, and then we can update it here as well.
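
For reference, the renaming idea boils down to building an old-to-new identifier table per QC stage. The sketch below is only an illustration under assumptions: the `relabel` helper, the `raw` / `trimmed` stage names, and the extension-stripping regex are all hypothetical, and whether a Galaxy tool can consume such a two-column table directly is not confirmed here.

```python
import re


def relabel(identifiers, stage):
    """Map each element identifier to a new one carrying a stage suffix."""
    mapping = {}
    for old in identifiers:
        # Hypothetical pattern: drop one trailing .fastq/.fq and/or .gz extension.
        stem = re.sub(r"(\.fastq|\.fq)?(\.gz)?$", "", old, count=1)
        mapping[old] = f"{stem}_{stage}"
    return mapping


identifiers = ["Barcode11_Spike2b_fastq_gz.gz"]
raw_map = relabel(identifiers, "raw")
trimmed_map = relabel(identifiers, "trimmed")

# Two-column, tab-separated table: old identifier, new identifier.
table = "\n".join(f"{old}\t{new}" for old, new in raw_map.items())
print(table)
```

With distinct suffixes per stage, the reports from before and after the QC steps no longer share a name.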

@paulzierep paulzierep marked this pull request as draft March 8, 2024 07:42
@paulzierep
Collaborator Author

Since we are too late for today's training anyway, I will collect some more changes here and fix it the right way after all.
Next issue: using the Kalamari DB for host filtering. Kalamari does not contain all host reads, so this does not really make sense. It could still be used, but it needs to be explained correctly.

@paulzierep
Collaborator Author

Next one:
Parse parameter value does not work outside a workflow (Gene based pathogenic identification step).
