I am now trying to process the data (MiSeq paired-end data with mixed orientation) as discussed, and I have a few more questions. I ran the analysis as follows.
join_paired_ends.py -f /usit/abel/u1/sunilm/LSP/L_R1.fastq -r /usit/abel/u1/sunilm/LSP/L_R2.fastq -o /usit/abel/u1/sunilm/LSP/fastq-join_joined/
convert_fastaqual_fastq.py -c fastq_to_fastaqual -f /usit/abel/u1/sunilm/LSP/fastq-join_joined/fastqjoin.join.fastq
split_libraries.py -o /usit/abel/u1/sunilm/LSP/filtered_forward/ -f /usit/abel/u1/sunilm/LSP/fastqjoin.join.fna -q /usit/abel/u1/sunilm/LSP/fastqjoin.join.qual -m /usit/abel/u1/sunilm/LSP/mapfile_LSP.txt -w 50 -s 25 -H 8 -l 200 -L 450 -a 0 -b variable_length -d -z truncate_only
split_libraries.py -o /usit/abel/u1/sunilm/LSP/filtered_reverse/ -f /usit/abel/u1/sunilm/LSP/fastqjoin.join.fna -q /usit/abel/u1/sunilm/LSP/fastqjoin.join.qual -m /usit/abel/u1/sunilm/LSP/mapfile_LSP_rev.txt -w 50 -s 25 -H 8 -l 200 -L 450 -a 0 -b variable_length -d -z truncate_only
adjust_seq_orientation.py -i /usit/abel/u1/sunilm/LSP/filtered_reverse/seqs.fna
Merging the forward- and reverse-orientation reads:
cat filtered_forward/seqs.fna filtered_reverse/seqs_rc.fna > merge/merged_seqs.fna
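To make clear what I expect the orientation step to do before the two files are concatenated, here is a minimal sketch of reverse-complementing FASTA records. It is only an illustration, not the code of adjust_seq_orientation.py; the helper names (reverse_complement, reverse_complement_fasta) are hypothetical, and it assumes the sequences contain only A, C, G, T and N.

```python
# Minimal sketch: reverse-complement FASTA records.
# Assumption: sequences contain only A, C, G, T, N (upper or lower case).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_complement_fasta(lines):
    """Yield FASTA lines with every sequence line reverse-complemented.

    Assumes one sequence line per record, as in a QIIME seqs.fna file.
    Header lines (starting with '>') are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line
        elif line:
            yield reverse_complement(line)


example = [">sample1_0 read_a", "AAGC", ">sample1_1 read_b", "ACGTN"]
for out_line in reverse_complement_fasta(example):
    print(out_line)
```

After this step all reads should be in the same orientation, so concatenating the two files with cat should be safe.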
#####
I did not manage to run:
identify_chimeric_seqs.py -m usearch61 -i /usit/abel/u1/sunilm/LSP/merge/merged_seqs.fna --suppress_usearch61_ref --keep_intermediates --usearch61_abundance_skew 2.0 --usearch61_mindiv 1.0 -o usearch61_chimera_checking/
######
pick_otus.py -s 0.97 -i /usit/abel/u1/sunilm/LSP/merge/merge.unique.pick.fna -m uclust --optimal -o /usit/abel/u1/sunilm/LSP/merge/uclust_97/
############
Issue 1. Is this workflow OK?
Issue 2. After joining the paired-end reads, I converted them into .fna and .qual files, then used split_libraries.py for demultiplexing and quality control. I am not sure how best to do quality control on the converted quality scores. The read quality scores do not look very good after conversion to the .qual file. In some papers on Illumina data, researchers selected reads with a Phred score > 35. What is the comparable 454 quality score for Illumina data? In split_libraries.py I kept reads with a minimum score of 25 (-s 25). Is this good enough?
Issue 3. When running identify_chimeric_seqs.py, I got an error saying that usearch61 is not installed. When I talked to the people who manage the cluster, they said it is installed. How can I use it? Is there an alternative way to run it?
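For Issue 3, this is a small check I can run on the cluster to see whether the executable is visible from my own shell environment. It is only a sketch: I am assuming that QIIME looks for an executable named exactly usearch61 on PATH, and the helper name find_executable is hypothetical.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it can be found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Assumption: the binary must be called "usearch61" to be picked up.
    path = find_executable("usearch61")
    if path is None:
        print("usearch61 was not found on PATH")
    else:
        print("usearch61 found at", path)
```

If this reports that the binary is not found, the program may be installed under a different name or in a directory that is not on my PATH.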
Looking forward to hearing from you.
Regards
Sunil