M. Gooch
Yes, I think you've gotten a bit confused about what the programs
actually do.
SolexaQA:
Reports on sequence quality in a lane of Illumina data. SolexaQA does
not modify the original sequences in any way, including by trimming
them. However, SolexaQA does report what the sequences might look
like after trimming with DynamicTrim.
DynamicTrim:
Trims each read individually based on base quality scores.
LengthSort:
Sorts reads into size bins, for example reads of 50 bp and longer
versus reads shorter than this size threshold. For paired-end data,
LengthSort returns two paired files with the forward and reverse reads
that both meet the threshold, a single file with unpaired reads that
meet the threshold (their mates did not), and a single file of
discarded reads shorter than the threshold. (There is more information
in the DynamicTrim manual.)
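
As a rough illustration of the paired-end binning described above, here
is a small Python sketch. It is not the actual LengthSort code (which is
Perl); the function name, the in-memory lists standing in for the four
output files, and the choice of "length >= threshold" as the keep rule
are all my assumptions for the example.

```python
def length_sort_pairs(pairs, min_len=50):
    """Bin (forward, reverse) read pairs by length.

    Hypothetical helper: the four returned lists stand in for the two
    paired files, the singles file, and the discard file.
    """
    paired_fwd, paired_rev, singles, discarded = [], [], [], []
    for fwd, rev in pairs:
        fwd_ok = len(fwd) >= min_len  # assumed rule: keep if length >= threshold
        rev_ok = len(rev) >= min_len
        if fwd_ok and rev_ok:
            # Both mates pass, so the pair stays together.
            paired_fwd.append(fwd)
            paired_rev.append(rev)
        else:
            # Otherwise each mate is binned on its own.
            for read, ok in ((fwd, fwd_ok), (rev, rev_ok)):
                (singles if ok else discarded).append(read)
    return paired_fwd, paired_rev, singles, discarded
```

With a threshold of 50, a pair of 60-bp and 10-bp reads would put the
60-bp read in the singles bin and the 10-bp read in the discard bin.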
In more detail, the SolexaQA package implements two trimming
algorithms. The first returns the longest
contiguous read segment for which the quality score at each base is
greater than a user-supplied quality cutoff. The second returns the
result of the trimming algorithm implemented in BWA:
http://bio-bwa.sourceforge.net/bwa.shtml
The two trimming algorithms are described here:
http://solexaqa.sourceforge.net/questions.htm#trim
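
To make the two approaches concrete, here is a Python sketch of both,
working on a list of integer quality scores for one read. This is only
an illustration under assumptions, not the Perl code in DynamicTrim: the
function names are made up, and the second function follows the formula
given in the BWA manual page (trim to the x that maximises the sum of
cutoff minus quality over the bases after x, applied only when the last
base is below the cutoff). The real implementations may differ in
details, such as a minimum retained read length.

```python
def trim_longest_segment(quals, cutoff):
    """Return (start, end) of the longest run where every quality > cutoff.

    The slice quals[start:end] is the kept segment; (0, 0) means nothing passed.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, q in enumerate(quals):
        if q > cutoff:
            if run_start is None:
                run_start = i  # a new passing run begins here
            if i + 1 - run_start > best_end - best_start:
                best_start, best_end = run_start, i + 1
        else:
            run_start = None  # a failing base ends the current run
    return best_start, best_end


def trim_bwa_style(quals, cutoff):
    """Return the number of leading bases to keep, per the BWA manual formula."""
    length = len(quals)
    if length == 0 or quals[-1] >= cutoff:
        return length  # last base is good enough, so no trimming
    best_len, best_sum, running = length, 0, 0
    # Walk in from the 3' end, accumulating (cutoff - quality).
    for x in range(length - 1, -1, -1):
        running += cutoff - quals[x]
        if running > best_sum:
            best_sum, best_len = running, x
    return best_len
```

Note the difference in behaviour: the first can trim from both ends and
will drop a good stretch that is separated by a single bad base, while
the second only ever trims from the 3' end.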
All of the above is implemented in plain Perl. R and matrix2png are
only used in the SolexaQA program, and then only to produce graphical
plots of data quality.
It might be worth creating some toy test files and playing around with
DynamicTrim and LengthSort. This should quickly show you how the
programs behave under any given scenario.
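
If it helps, something like the following Python sketch would generate
a small toy FASTQ file to experiment with. The read names, sequences,
output file name, and the Phred+33 quality encoding are all assumptions
for the example; check which encoding your data actually uses before
drawing conclusions.

```python
def fastq_record(name, seq, quals, offset=33):
    """Format one FASTQ record from integer Phred scores (offset is assumed)."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    qual_string = "".join(chr(q + offset) for q in quals)
    return f"@{name}\n{seq}\n+\n{qual_string}\n"


def toy_fastq():
    """Build three made-up 10-bp reads with different quality patterns."""
    records = [
        ("good_read", "ACGTACGTAC", [35] * 10),
        ("bad_tail", "ACGTACGTAC", [35] * 6 + [5] * 4),
        ("bad_middle", "ACGTACGTAC", [35] * 4 + [5] * 2 + [35] * 4),
    ]
    return "".join(fastq_record(n, s, q) for n, s, q in records)


if __name__ == "__main__":
    # "toy.fastq" is a hypothetical file name for the experiment.
    with open("toy.fastq", "w") as handle:
        handle.write(toy_fastq())
```

Running DynamicTrim and then LengthSort on a file like this, and
comparing the output with the input, should make the behaviour clear.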
Best
-Murray