Dear developers,
Whilst attempting to use your software, I ran into a number of problems, which I fixed manually. Perhaps these notes will be of some help.
The first problem I encountered was in the blastn and tblastn validation lines of the bin/bg7 shell script. My ncbi-blast install is version 2.2.28+, and I found that the generated .xml output files all had a blank line at the bottom. Therefore, the validation lines:
Line 241: rnaBlastOk=$(tail -n 1 ${rnasVsContigsOutputPath} | grep -c '</BlastOutput>')
Line 279: proteinBlastOk=$(tail -n 1 ${proteinsVsContigsOutPath} | grep -c '</BlastOutput>')
had to be changed to:
Line 241: rnaBlastOk=$(tail -n 2 ${rnasVsContigsOutputPath} | grep -c '</BlastOutput>')
Line 279: proteinBlastOk=$(tail -n 2 ${proteinsVsContigsOutPath} | grep -c '</BlastOutput>')
The -n argument of "tail" had to change from 1 to 2, so that the last two lines are read and the closing tag is still found despite the blank line at the end of the file.
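A check that does not depend on how many blank lines the BLAST version appends could look like the sketch below. This is only a suggestion, not code from bg7: the function name blast_xml_complete is hypothetical, and it assumes the closing '</BlastOutput>' tag sits on the last non-blank line of a complete output file.

```shell
# Hypothetical helper: succeeds if the last non-blank line of the
# given BLAST XML file contains the closing </BlastOutput> tag.
blast_xml_complete() {
    # Drop blank (or whitespace-only) lines, then keep the final one.
    last_line=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1)
    case "$last_line" in
        *'</BlastOutput>'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use in the script (variable name taken from the lines above):
#   if blast_xml_complete "${proteinsVsContigsOutPath}"; then
#       proteinBlastOk=1
#   else
#       proteinBlastOk=0
#   fi
```

With this approach the same line works whether the output ends with zero, one, or several blank lines.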
The second problem I found is within the directory structure of BG7. The BG7 jar file is jars/BG7.jar. However, the bg7 shell script looks for it as "jar/bg7.jar". I had to change this:
cp $BG7_HOME/jar/bg7.jar $output_folder/
echo "running bg7 now!"
java -d64 -Xmx6G -Xms1G -jar $output_folder/bg7.jar
rm -f $output_folder/bg7.jar
To this:
cp $BG7_HOME/jars/BG7.jar $output_folder/
echo "running bg7 now!"
java -d64 -Xmx6G -Xms1G -jar $output_folder/BG7.jar
rm -f $output_folder/BG7.jar
This corrects both the directory name mismatch (jar vs. jars) and the case of the jar file name (bg7.jar vs. BG7.jar).
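If it helps, the script could also tolerate both layouts instead of hard-coding one. The following is a rough sketch under my own assumptions: find_bg7_jar is a hypothetical function name, and the only two candidate paths are the ones mentioned in this report.

```shell
# Hypothetical helper: print the path of the BG7 jar found under the
# given BG7 home directory, trying the two layouts named in this issue.
find_bg7_jar() {
    for candidate in "$1/jars/BG7.jar" "$1/jar/bg7.jar"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "BG7 jar not found under $1" >&2
    return 1
}

# Possible use in the script:
#   bg7_jar=$(find_bg7_jar "$BG7_HOME") || exit 1
#   cp "$bg7_jar" "$output_folder/"
```

Failing early with a clear message also avoids the less obvious error that java prints when the copied jar is missing.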
The third problem was that the template execution file for the PredictGenes task was lacking a parameter, which I added using the default DIF_SPAN value (indicated as 30 in the PredictGenes.java file). To do this, I changed the following lines in the bg7 script:
<class_full_name>com.era7.bioinfo.annotation.PredictGenes</class_full_name>
<arguments>
<argument>${name}_proteins_tBLASTn.xml</argument>
<argument>${name}_sequences.fna</argument>
<argument>${name}_PredictedGenes.xml</argument>
<argument>400</argument>
<argument>true</argument>
</arguments>
To:
<class_full_name>com.era7.bioinfo.annotation.PredictGenes</class_full_name>
<arguments>
<argument>${name}_proteins_tBLASTn.xml</argument>
<argument>${name}_sequences.fna</argument>
<argument>${name}_PredictedGenes.xml</argument>
<argument>400</argument>
<argument>true</argument>
<argument>30</argument>
</arguments>
The script now passes this last argument when invoking BG7.jar in its final part. I also changed the "executionsTemplate.xml" file that comes in the BG7 directory, adding the same parameter line in the same place. So this:
<execution>
<class_full_name>com.era7.bioinfo.annotation.PredictGenes</class_full_name>
<arguments>
<argument>XX_proteins_tBLASTn.xml</argument>
<argument>XX_sequences_header_fixed.fna</argument>
<argument>XX_PredictedGenes.xml</argument>
<argument>400</argument>
<argument>false</argument>
</arguments>
</execution>
was changed to this:
<execution>
<class_full_name>com.era7.bioinfo.annotation.PredictGenes</class_full_name>
<arguments>
<argument>XX_proteins_tBLASTn.xml</argument>
<argument>XX_sequences_header_fixed.fna</argument>
<argument>XX_PredictedGenes.xml</argument>
<argument>400</argument>
<argument>false</argument>
<argument>30</argument>
</arguments>
</execution>
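To catch this kind of mismatch before launching the jar, the number of PredictGenes arguments in an executions file can be counted with a few lines of awk. This is a minimal sketch, assuming one <argument> element per line as in the excerpts above; count_predictgenes_args is a hypothetical name, and the expected count of six is simply what worked for me.

```shell
# Hypothetical helper: print the number of <argument> lines that follow
# the PredictGenes class name, up to the closing </arguments> tag.
# Assumes each <argument> element is on its own line.
count_predictgenes_args() {
    awk '
        /<class_full_name>com.era7.bioinfo.annotation.PredictGenes<\/class_full_name>/ {
            inside = 1; n = 0; next
        }
        inside && /<argument>/ { n++ }
        inside && /<\/arguments>/ { print n; inside = 0 }
    ' "$1"
}

# Possible use:
#   n=$(count_predictgenes_args executionsTemplate.xml)
#   [ "$n" = "6" ] || echo "PredictGenes has $n arguments, expected 6" >&2
```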
Finally, after doing all of the above, I managed to get bg7 working properly. I hope this is of some help to the team.
Out of curiosity, when do you expect to publish a release with these fixes, so that BG7 works "out of the box"?
Thanks for your time!