Hello, I'm a master's student who has just started working with bioinformatics. I want to read a SAM file containing many reads, select only the reads that contain a specific SNP, and save them in a new SAM file. I found some good information to start with here:
http://seqanswers.com/forums/showthread.php?t=26713 and I have also read the wiki and picked up the basics of the grab and find_SNPs commands.
However, if I try to run something like:
read_sam -i dario.sam |
find_SNPs |
grab -e 'POS == 6535' |
grab -e 'A>G'|
write_sam -o dario2.sam
or even just:
read_sam -i file.sam | find_SNPs
I get this error:
/home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/sam.rb:157:in `align_pair': undefined method `scan' for nil:NilClass (NoMethodError)
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/sam.rb:86:in `to_bp'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:55:in `block (4 levels) in <main>'
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/sam.rb:213:in `block in each'
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/sam.rb:195:in `each_line'
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/sam.rb:195:in `each'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:54:in `block (3 levels) in <main>'
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/filesys.rb:76:in `open'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:53:in `block (2 levels) in <main>'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:52:in `each'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:52:in `block in <main>'
from /home/dario/Bioinformatic_programs/biopieces/code_ruby/lib/maasha/biopieces.rb:89:in `open'
from /home/dario/Bioinformatic_programs/biopieces/bp_bin/read_sam:46:in `<main>'
Is it just that I need to read more and learn how to write the commands properly, or is there a problem with my Ruby installation? If I run bp_test, everything passes as far as Ruby is concerned (I have the latest version installed, and I use Ubuntu 14.04).
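In case it helps to clarify the goal, here is a rough sketch in plain Python (not Biopieces) of the selection I am after: keep the header lines plus every mapped read whose base at reference position 6535 is G. The function names base_at and select_reads are made up for this illustration. It only uses the standard SAM columns (FLAG, POS, CIGAR, SEQ), it ignores the reference name, and it only checks the read base, not that the reference base is A, so please treat it as a description of the intent rather than a tested tool.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(pos, cigar, seq, ref_pos):
    """Return the read base aligned to 1-based reference position ref_pos,
    or None if the read does not cover it (or covers it with a deletion)."""
    if cigar == "*" or seq == "*":
        return None
    ref = pos   # 1-based reference coordinate of the next aligned base
    read = 0    # 0-based index into seq
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= ref_pos < ref + length:
                return seq[read + (ref_pos - ref)]
            ref += length
            read += length
        elif op in "IS":           # consumes read only
            read += length
        elif op in "DN":           # consumes reference only
            if ref <= ref_pos < ref + length:
                return None
            ref += length
        # H and P consume neither the read nor the reference
    return None


def select_reads(lines, ref_pos, base):
    """Yield header lines and the alignment lines whose base at ref_pos equals base."""
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or int(fields[1]) & 4:   # malformed or unmapped
            continue
        if base_at(int(fields[3]), fields[5], fields[9], ref_pos) == base:
            yield line


# Possible usage with the file names from my pipeline above:
# with open("dario.sam") as src, open("dario2.sam", "w") as dst:
#     dst.writelines(select_reads(src, 6535, "G"))
```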
Another thing I'm interested in is grab -r. On the wiki I found that I can use grab -r to grab records that contain a specific sequence. Does that mean I can grab reads with a specific sequence from a FASTA, SAM, or FASTQ file that contains thousands of reads?
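To show what I mean by "reads with a specific sequence", this is a minimal sketch in plain Python of the behaviour I am hoping grab -r gives me on a SAM file. The name reads_containing is made up for the example. The only fact it relies on is that SEQ is the 10th tab-separated column of a SAM alignment line, and it matches the motif only as written, not its reverse complement.

```python
def reads_containing(lines, motif):
    """Yield SAM alignment lines whose SEQ field contains motif (case-insensitive)."""
    motif = motif.upper()
    for line in lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # SEQ is the 10th column (index 9) of an alignment line.
        if len(fields) >= 10 and motif in fields[9].upper():
            yield line
```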
Where can I find more information on how to use grab? For example, for grab -e 'A>G', where can I find documentation on the arguments that the -e option accepts?
Thanks in advance for the help, and sorry for the many questions. I'm trying to read and build up my knowledge, but without a bioinformatics background it's not easy.