align_seqs.py fungal ITS seqs


Charles Hauser

Jun 18, 2015, 5:44:47 PM
to qiime...@googlegroups.com
Hi

We have some fungal ITS sequences that we are trying to analyze with QIIME, and we have run into a problem when aligning them to a reference set (its_12_11):

align_seqs.py -i otus/rep_set/sc_union_rep_set.fna -m pynast -t $QIIME_DIR/its_12_11_otus/rep_set/97_otus.fasta -o otus/pynast_aligned_its_sc/
Traceback (most recent call last):
  File "/seu/cs/home/project/binf/bin/align_seqs.py", line 211, in <module>
    main()
  File "/seu/cs/home/project/binf/bin/align_seqs.py", line 194, in main
    log_path=log_path, failure_path=failure_path)
  File "/seu/cs/home/project/binf/lib/qiime/align_seqs.py", line 250, in __call__
    template_alignment, DNASequence, validate=True)
  File "/seu/cs/home/project/binf/lib/skbio/alignment/_alignment.py", line 143, in from_fasta_records
    return cls(data, validate=validate)
  File "/seu/cs/home/project/binf/lib/skbio/alignment/_alignment.py", line 998, in __init__
    super(Alignment, self).__init__(seqs, validate)
  File "/seu/cs/home/project/binf/lib/skbio/alignment/_alignment.py", line 163, in __init__
    "%s failed to validate." % self.__class__.__name__)
skbio.alignment._exception.SequenceCollectionError: Alignment failed to validate.


I'm not sure what the issue is and would appreciate any suggestions.

thanks

charles

Jai Ram Rideout

Jun 19, 2015, 12:08:54 PM
to qiime...@googlegroups.com
Hi Charles,

The file passed via -t has characters that aren't valid IUPAC DNA characters. Looks like it has two sequences that each contain an underscore character.
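
If it helps to track down the offending records, here is a minimal sketch that scans a FASTA file and reports which sequences contain characters outside an allowed set. It is not what scikit-bio runs internally: the function name find_invalid_chars is hypothetical, and the allowed set (IUPAC DNA codes plus the gap characters "-" and ".", compared case-insensitively) is an assumption about what the validator accepts.

```python
import sys

# Assumption: allowed characters are the IUPAC DNA codes plus the gap
# characters '-' and '.'; lowercase input is upper-cased before checking.
IUPAC_DNA = set("ACGTRYSWKMBDHVN")
GAP_CHARS = set("-.")
ALLOWED = IUPAC_DNA | GAP_CHARS


def find_invalid_chars(fasta_lines):
    """Yield (seq_id, sorted list of invalid characters) for each FASTA
    record that contains at least one character outside ALLOWED."""
    seq_id = None
    bad = set()
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record; report it if needed.
            if seq_id is not None and bad:
                yield seq_id, sorted(bad)
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            bad = set()
        else:
            bad.update(set(line.upper()) - ALLOWED)
    # Report the final record, which has no following header.
    if seq_id is not None and bad:
        yield seq_id, sorted(bad)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py path/to/template.fasta
    with open(sys.argv[1]) as handle:
        for seq_id, chars in find_invalid_chars(handle):
            print("%s\t%s" % (seq_id, "".join(chars)))
```

Running it on the file passed via -t should print one line per problem sequence: the sequence ID, a tab, and the unexpected characters found in it.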

Note that you likely won't be able to use PyNAST to align your ITS sequences because PyNAST requires a template alignment. For now we recommend suppressing the alignment and tree-building steps of the OTU picking workflows and using non-phylogenetic diversity metrics. For example, if you're using pick_open_reference_otus.py you can pass --suppress_align_and_tree to skip those steps. Then you can pass --nonphylogenetic_diversity to core_diversity_analyses.py.

For more details about aligning ITS sequences (with some suggestions for alternative approaches):


This PeerJ preprint may also be of interest:


Hope this helps,
Jai
