The 'sliding_window' mode slides the search string along the sequence, counts the mismatches between the two in each window, and does not print the current record to STDOUT if the number of mismatches is below a specified number. Type:
$ ./remove_adapter_reads.pl -h
for more detail, and also have a look inside the script. Since indel artifacts are very rare on Illumina sequencers, counting only mismatches (and ignoring indels) should still make this a very sensitive programme, depending on the specified fuzziness.
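To make the idea concrete, here is a minimal Python sketch of the sliding-window comparison. It is not the code from 'remove_adapter_reads.pl' (which is Perl); the function names, the parameter name "mismatch_threshold" and the handling of reads shorter than the search string are my own assumptions for illustration only.

```python
def min_mismatches(read, search):
    """Smallest number of mismatches between `search` and any window of `read`
    that has the same length as `search` (no indels are considered)."""
    k = len(search)
    if len(read) < k:
        # Assumption: a read too short to hold a full window counts as "no match".
        return k
    return min(
        sum(a != b for a, b in zip(read[i:i + k], search))
        for i in range(len(read) - k + 1)
    )


def keep_read(read, search, mismatch_threshold):
    """Return False (drop the record) if some window has fewer mismatches
    than `mismatch_threshold`, True (print the record) otherwise."""
    return min_mismatches(read, search) >= mismatch_threshold


if __name__ == "__main__":
    adapter = "ACGT"  # hypothetical search string
    for read in ("TTACGATT", "TTTTTTTT"):
        print(read, "keep" if keep_read(read, adapter, 2) else "drop")
```

A larger threshold means more fuzziness: more reads are treated as containing the search string and are therefore dropped.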
Feel free to experiment with this file and improve it. The following will download my git repo "scripts_for_RAD", fetch the *experimental* branch and check out that branch, which contains the "experimental" version of 'remove_adapter_reads.pl':
$ git clone https://github.com/claudiuskerth/scripts_for_RAD.git
$ cd scripts_for_RAD
$ git fetch origin experimental
$ git checkout experimental
============================
[3] vmatch: http://www.vmatch.de/
============================
It seems like this programme can do almost anything related to string matching, but I haven't had time yet to try it out.
hope that helps,
claudius