Error with align_sequences.py -- Using Blast+ in QIIME?

28 views
Skip to first unread message

Christian Gray

unread,
Feb 28, 2017, 9:36:34 PM2/28/17
to Qiime 1 Forum
Hi,

I have been trying to use QIIME's assign_taxonomy.py with the database I created from Christopher Baker's entrez_qiime.py command, but when I do, QIIME claims that it cannot find the blastall command, presumably because it wants to use Legacy BLAST as opposed to BLAST+. Is there a way I can get QIIME to use BLAST+ instead of legacy BLAST? If there is, I think that would be the simplest way to fix the problems I am having
If not, here is what I have done with legacy BLAST. I already tried using legacy BLAST with the database I had created using makeblastdb, but I came up with no BLAST hits for my sequences. I tested the command with 3 sequences that were used to make my database, and still ended up with no BLAST hits. I thought that it might be a problem with the fact that my database was formatted using makeblastdb instead of formatdb, so I attempted to run formatdb with the command:

formatdb -i output/sequence.fasta -p F -o T -n nifH_db_formatdb

But when I run it, I get the following error:

WARNING: Cannot add sequence number 66112 (lcl|FTPD00000000.1) because it has zero-length.


[formatdb 2.2.22] FATAL ERROR: Fatal error when adding sequence to BLAST database.


Is there a way to get formatdb to ignore these "zero-length" sequences? makeblastdb made a similar comment when I used it, but it just skipped over them, so they didn't cause a problem.

Is it possible that the no BLAST hits is a result of the database being formed with makeblastdb, and then ran with legacy BLAST?


Sorry if there are too many questions here. Let me know if I should upload anything else.

Thanks for the help!

jonsan

unread,
Mar 1, 2017, 3:55:58 PM3/1/17
to Qiime 1 Forum
Hi Christian!

I've emailed Chris to see if he has any insights... Stay tuned!

cheers,
-jon

jonsan

unread,
Mar 2, 2017, 1:17:00 PM3/2/17
to Qiime 1 Forum
Hi Christian,

Chris writes back:

...I did come across what would seem to be the equivalent problem when using makeblastdb. Namely that makeblastdb baulks at blank lines and deflines without sequences in the FASTA file. If that is the problem here with formatdb, as it appears to be, it’s unclear to me why he didn’t also have that problem with makeblastdb. Or maybe he did but it didn’t throw an error, and his blast database was somehow corrupted, contributing to his no blast results problem?


Anyway, I would suggest pre-cleaning the FASTA file before running either of those utilities. I used the following in bash to clean mine and it seemed to do the trick:


# get rid of blank lines ...

grep -v '^$' ./fungi2017.fna > ./fungi2017_noblanks.fna

# ... and deflines without sequences

awk -v RS=">" -v FS="\n" -v ORS="" ' { if ($2) print ">"$0 } ' ./fungi2017_noblanks.fna > ./fungi2017_noblanks_noempty.fna

# get rid of the intermediate file

rm ./fungi2017_noblanks.fna


Hope that helps!


Let me know if that works out for you!

Cheers,
-jon 

Christian Gray

unread,
Mar 5, 2017, 5:46:59 PM3/5/17
to Qiime 1 Forum
Hi Jon,

Thanks for your help! That fixed my problem as far as creating the database goes, but I am still, unfortunately, not getting any hits on my sequences. I used the following command to create the database:

formatdb -i output_noblanks/nifH2017_noblanks_noempty.fasta -p F -o T -n nifH_db_formatdb


And then I used assign taxonomy.py, and still got no blast hits with both my set of sequences and a subset of sequences I used to create the database:

assign_taxonomy.py -i Christian_nif_joined_seqs_Q30_all.fna -t output_noblanks/nifH2017_noblanks_noempty_accession_taxonomy.txt -m blast -b nifH_db_formatdb


Any ideas for how I should go about solving this problem / troubleshooting to see what the problem is with the database? 


Thanks,


Christian

jonsan

unread,
Mar 6, 2017, 10:20:52 PM3/6/17
to Qiime 1 Forum
Hi Christian,

Well, progress at least. :) 

If you call blast directly, rather than as part of Qiime, do you get any hits to the newly formatted db?


-j
Reply all
Reply to author
Forward
0 new messages