Uniformly poor read quality

Andrew Veale

Dec 6, 2016, 3:18:00 AM

to solexaqa-users

I just ran my Illumina reads through the SolexaQA analysis, and it indicates that my read quality is uniformly bad. I had previously received a FastQC report which indicated the quality was pretty good...

I have attached some of the outputs from solexaqa.

I believe (and hope) that this is down to a mismatch in encoding the quality values, so that solexaqa believes the reads are worse than they actually are.

Any ideas would be greatly appreciated!

C9PNDANXX-2178-01-34-1_S24_L002_R1_001.fastq.quality.pdf

C9PNDANXX-2178-01-34-1_S24_L002_R1_001.fastq.matrix.pdf

Murray Cox

Dec 6, 2016, 3:23:18 AM

to solexaq...@googlegroups.com

Hi Andrew,

Yes, superficially that dataset does look bad.

SolexaQA should automatically detect the file format. You don't mention how you ran the software (i.e., the command line), but try using the default automatic file format detection option.

Alternatively, if you have generated this data set recently, it is almost certainly in 'Sanger' format. (The 'Illumina' format is an old variant once used by Illumina, and the name can sometimes lead people astray.)
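
To illustrate why the format matters, here is a minimal Python sketch of how one quality string decodes under the two conventions. It assumes the usual ASCII offsets (33 for 'Sanger', 64 for the old 'Illumina' format); the quality string is a made-up example, not taken from the attached files, and this is not SolexaQA's own code.

```python
def decode(quality_string, offset):
    """Convert a FASTQ quality string to integer scores for a given ASCII offset."""
    return [ord(ch) - offset for ch in quality_string]

# Hypothetical quality string, as might appear in a recent Sanger-format file.
qual = "IIIIHHGG"

as_sanger = decode(qual, 33)    # read as 'Sanger' (offset 33)
as_illumina = decode(qual, 64)  # read as old 'Illumina' (offset 64)

print(as_sanger)    # [40, 40, 40, 40, 39, 39, 38, 38]
print(as_illumina)  # [9, 9, 9, 9, 8, 8, 7, 7]
```

Every score drops by 31 when a Sanger-format string is read as 'Illumina', which would make good reads look uniformly poor.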

If the issue is not one of the above, then we would need to think further...

-Murray


Mauro Truglio

Dec 6, 2016, 12:34:53 PM

to solexaqa-users

Hi Andrew,

I can add something to this. In the past, another user had a similar issue, i.e. a huge discrepancy between FastQC and SolexaQA. Investigating the problem, I found the following.
We coded SolexaQA to guess the quality encoding only when it has enough evidence to do so safely, and to warn the user otherwise. FastQC does not do this: in the absence of evidence for a safe guess, it defaults to one of the encodings (I don't remember which one at the moment) without warning the user. This is quite dangerous, and leads to a complete misinterpretation of the data. I suspect this could be your case too.

Let us know if this is the case. You could check, for example, by forcing the quality encoding in SolexaQA if you know it, and/or by trying a third QC program.
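
As an independent check, something like the following Python sketch reports the range of quality characters in a FASTQ file and makes a cautious guess only when that range is unambiguous. The thresholds are a common rule of thumb (offset-33 files usually top out around 'J', offset-64 files never go below ';'), not the actual logic of SolexaQA or FastQC, and the function names are made up for illustration. It also assumes plain four-line FASTQ records.

```python
import gzip
import sys
from itertools import islice

def quality_range(path, max_reads=10000):
    """Return (lowest, highest) ASCII code seen in the quality lines of a FASTQ file.

    Assumes four-line records, so quality strings sit on lines 4, 8, 12, ...
    """
    opener = gzip.open if path.endswith(".gz") else open
    lo, hi = 255, 0
    with opener(path, "rt") as handle:
        for line in islice(handle, 3, 4 * max_reads, 4):
            codes = [ord(ch) for ch in line.rstrip("\r\n")]
            if codes:
                lo = min(lo, min(codes))
                hi = max(hi, max(codes))
    return lo, hi

def guess_encoding(lo, hi):
    """Rule-of-thumb guess; returns 'ambiguous' rather than picking a default."""
    if lo > hi:
        return "no quality data found"
    if lo < 33 or hi > 126:
        return "not a valid FASTQ quality range"
    if lo < 59:
        # Characters below ';' cannot occur with an offset of 64.
        return "offset 33 ('Sanger')"
    if hi > 74:
        # Characters above 'J' are unusual for offset-33 Illumina data.
        return "probably offset 64 (old 'Illumina' or 'Solexa')"
    return "ambiguous"

if __name__ == "__main__" and len(sys.argv) > 1:
    low, high = quality_range(sys.argv[1])
    print(f"quality characters span {chr(low)!r} to {chr(high)!r}: {guess_encoding(low, high)}")
```

If this reports offset 33 while a QC tool was told (or assumed) the file was in the old 'Illumina' format, that would explain uniformly poor scores.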

Cheers

Mauro
