I have a huge FASTQ file and I want to analyze only the reads that contain a particular sequence, say "AAGTTGATAACGGACTAGCCTTATTTT". When I run
grep "AAGTTGATAACGGACTAGCCTTATTTT" file.fastq
I get the matching reads but lose the other lines of each FASTQ record. Can you suggest an easy fix that retains the FASTQ format in the output? Thanks
--
You received this message because you are subscribed to the Google Groups "Unix and Perl for Biologists" group.
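## usage: perl grep_fastq.pl pattern file.fastq
use strict;
use warnings;
use MCE::Flow;

die "Not enough arguments given\n" if @ARGV < 2;

# Compile the search pattern once.
my $regex = shift;
$regex = qr/$regex/;

# Print the whole record if the pattern matches anywhere in it
# (header and quality lines included, not only the sequence).
sub user_func {
    my ($mce, $slurp_ref, $chunk_id) = @_;
    $mce->print(${ $slurp_ref }) if ${ $slurp_ref } =~ $regex;
}

# RS => "\n@" splits the input at newline + "@"; chunk_size => 1 passes
# one record per call. Caveat: a quality line can also start with "@".
for my $filename (@ARGV) {
    mce_flow_f { RS => "\n@", chunk_size => 1, use_slurpio => 1 },
        \&user_func, $filename;
}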
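
If MCE is not installed, the same filtering can be sketched in plain shell with awk. This is only a minimal sketch, not a replacement for the script above: it assumes every record is exactly four lines (no wrapped sequences), it matches the pattern as a literal substring of the sequence line only, and it runs in a single process. The function name grep_fastq and the awk variable names are made up for illustration.

```shell
# Hypothetical helper: print every 4-line FASTQ record whose sequence line
# (line 2 of the record) contains the given literal pattern.
# Assumes strict 4-line records; wrapped FASTQ will not work.
grep_fastq() {
    pattern=$1
    shift
    awk -v pat="$pattern" '
        FNR % 4 == 1 { header = $0 }   # @read_id line
        FNR % 4 == 2 { bases  = $0 }   # sequence line
        FNR % 4 == 3 { plus   = $0 }   # "+" separator line
        FNR % 4 == 0 && index(bases, pat) {
            # $0 is the quality line; emit the full record unchanged
            print header; print bases; print plus; print $0
        }
    ' "$@"
}

# Example call (file names are placeholders):
#   grep_fastq AAGTTGATAACGGACTAGCCTTATTTT file.fastq > matched.fastq
```

Because only the second line of each record is tested, a quality line that starts with "@" or a header that happens to contain the pattern does not cause false splits or false matches.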