How to set my reference sequences (sequence.keep)

Miao Sun

Feb 19, 2015, 10:16:33 AM
to phl...@googlegroups.com
Hi All,

Does anybody have experience setting up reference sequences properly for a PHLAWD run? What are the criteria for choosing them? Should they be representative of the sequences or of the taxa?

Should they be aligned, or is it fine to download them straight from GenBank? Is any particular length required?

I'm really puzzled by these issues, because it seems PHLAWD fails with a "Segmentation fault" if the reference sequences contain errors.

Thanks!

Miao

Cody Hinchliff

Feb 19, 2015, 10:50:45 AM
to phl...@googlegroups.com
The segmentation fault suggests that your sequences contain illegal characters, like ? or -. Keep sequences should only contain nucleotide coding characters. Try removing everything except the letters A, C, G, and T from your keep seqs and see if that solves your problem.
--
You received this message because you are subscribed to the Google Groups "phlawd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to phlawd+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
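
The advice above (keep only A, C, G, and T in the keep sequences) can be sketched in Python. This is a minimal sketch and not part of PHLAWD: the function name `clean_keep_fasta` is hypothetical, and uppercasing the sequence before filtering is an assumption. Note that IUPAC ambiguity codes such as N, R, and Y are removed too, which shortens the sequence.

```python
import re

# Anything that is not one of the four unambiguous nucleotide letters.
NON_ACGT = re.compile(r"[^ACGT]")


def clean_keep_fasta(text: str) -> str:
    """Return FASTA text whose sequence lines contain only A, C, G, T.

    Header lines (starting with '>') are copied unchanged.
    Sequence lines are uppercased (an assumption) and every other
    character, including gaps, '?', and ambiguity codes, is dropped.
    """
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            out.append(line)
        else:
            cleaned = NON_ACGT.sub("", line.strip().upper())
            if cleaned:  # skip lines that end up empty
                out.append(cleaned)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = ">seq1\nACG-T?NN\nacgtRY\n"
    print(clean_keep_fasta(example), end="")
```

To apply it to a real keep file, read the file's text, pass it through `clean_keep_fasta`, and write the result to a new file so the original stays available for comparison.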

Miao Sun

Feb 19, 2015, 11:03:46 AM
to phl...@googlegroups.com
Thanks Cody!

Now things are getting weirder. I don't think I have any illegal characters in my reference sequences:

gi|54111867|gb|AY752474.1| Geranium sessiliflorum var. arenarium 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, complete sequence; and 28S ribosomal RNA gene, partial sequence
GGCGGTTCGCTGCCGGCGACGTCGCGAGAAGTCCACTGAACCTTATCATTTAGAGGAAGGAGAAGTCGTA
ACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGTCGAACCCTGCACAGCAGAACGACCCGCGAAC
TCGTTAACAAACCGCGGGGAGCGGGTGGCGCTTGCGCCCCCCGCAACCCGATGTCGGGGGCTTGGGCGGA
AGCCCACGCTGCCTGACAAAAAAGTACCCCCGGCGCGGTCCGCGCCAAGGAATCGAAACGAAGCAATGTG
TGCCGTCCGCCCCGTTCGCGGGAAGTGGACGGCAACACGGTCTTCCAATGTATACTAAACGACTCTCGGC
AACGGATATCTCGGCTCTCGCATCGATGAAGAACGTAGCGAAATGCGATACTTGGTGTGAATTGCAGAAT
CCCGTGAACCATCGAGTTTTTGAACGCAAGTTGCGCCCGAAGCCATTAGGCCGAGGGCACGCCTGCCTGG
GCGTCACGCGCTCCGTCGCCCCGCAACCCCGAACCCCGAAACGGGCCAGGGTGCTTGTGGTGCGGACATT
GGTCTCCCGTGTGCCTTGCTCGCGGCTGGCCTAAAATTGAGTCCCGGACGCTCTGTTCTGCGGCCGACGG
TGGTTGAGAAGCCCTCGAAAACGTGCCGCTGCAGTGCTGCCATATGCGGACCCCTTGACCCTTGCGCGAC
CTCTCCCATTTGGGTGAGGGAGCTCCATCTGCGACCCCAGGTCAGGCGGGGCTACCCGCTGAATTTAAGC
ATATCAATAAG
>gi|124390016|gb|EF219365.1| Abutilon menziesii strain PCMB2763 internal transcribed spacer 1, partial sequence; 5.8S ribosomal RNA gene, complete sequence; and internal transcribed spacer 2, partial sequence
TGCGGAAGGATCATTGTCAGAAACCTGCCTAGCAGAACCACCCGTGAATGTGTTATCATACAAAACAGCG
AGAGGGTGCGGATGCAATATTGTGCCAACCCCTCTCGATGCCTTGGTGCGTTTGGTCTTGCCTCAACCCA
CCTCGTGTGGGTGAGCTGCAAGTTCCATCCACTCCAAGGCAAAACTAACAACCCCCGGCGCGAATTGCGC
CAAGGAATTTAAATRAAAAGAGTGCACGTCATTGTCGCCGACCCGTTCGCGGTGTTTGTGCGGGAGTGTC
GTAGCTAACTTTGTCGTGAAATACAAAACGACTCTCGGCAACGGATATCTCGGCTCTCGCATCGATAAAG
AACGTAGCGAAATGCGATACTTGGTGTGAATTGCAGAATCCCGTGAACCATCAAGTCTTTGAACGCAAGT
TGCGCCCCAAGCCATTAGGCCGAGGGCACGTCTGCCTGGGTGTCACGCATCGTTGCCCCAATCAAGCCTC
GAGCTTATTCTGCTCAGGTCAAATTAYGGGCGGATATTGGCTTCCCGTTCGCTCACCGTGCGCGGTTGGC
CTAAAAATGAGTCTTYGGCGATGAAGTGCCGCGACAATCGGTGGGAATACTTACAGTTGTCTCGTTTGAA
GTCGTGTGCACTYGTTGATTTAGACCCTATGACCCTTTTGGCATCACATCGTTGGTGCTCGCATCGCGAC
CCCAGGTCAGGCGGGAT
>gi|686985121|gb|KM037616.1| Rubus ulmifolius x Rubus caesius isolate UxC_col06 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, complete sequence; and 26S ribosomal RNA gene, partial sequence
TCCGTAGGTGAACCTGCGGAAGGATCATTGTCGAAACCTGCCCAGCAGAACGACCCGAGAACATGTTTCA
ACGCTTGGGGGCGAAGGGTCTTTCGGCTCCTCGTCCCTTTTCTCGGGAGGCAATCGTCTTGTGCGTTGCA
TCTCGATGCTCGCACTTGAACGACCCTCTCGGGCGTACAAACGAACACCGGCGTGTATTGCGCCAAGGAA
CTTGAATGAAAGAGCGTTCCCCCGCCGCCCCGGAAACGGTGTGCGTGCGGTGGGTTACGTCATCTTCAAT
ATGTCTAAACGACTCTCGGCAACGGATATCTCGGCTCTCGCATCGATGAAGAACGTAGCGAAATGCGATA
CTTGGTGTGAATTGCAGAATCCCGTGAACCATCGAGTCTTTGAACGCAAGTTGCGCCCGAAGCCATTAGG
CCGAGGGCACGCCTGCCTGGGCGTCACACGTCGTTGCCCCCCCCCAAACCCCTCGGGAGTTGGGCGGGAC
GGATGATGGCCTCCCGTGTGCTCTGTCATGCGGTTGGCATAAAAACAAGTCCTCGGCGACTAACGCCACG
ACAATCGGTGGTTGTCAAACCTCTGTTGCCTATCGTGTGCGCGTGTCGAGCGAGGGCTCAACAAACCATG
TTGCATCGATTCGTCGATGCTTTCAACGCGACCCCAGGTCAGGCGGGGTTACCCGCTGAATTTAAGCATA
TCAATAAGCGGAGGA

--
Dr. Miao Sun
Tel: 352 (284) 0928                 Email: cac...@ufl.edu
Florida Museum of Natural History, University of Florida
Dickinson Hall, 1659 Museum Rd,PO. Box 117800, Gainesville, FL 32611-7800, USA.
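
One way to double-check a keep file against the "A, C, G, T only" rule is to scan it record by record. As pasted in this thread, the Abutilon record contains the IUPAC ambiguity codes R and Y (for example in AAATRAAAAG and TTAYGGG), and the first header line has no leading ">" (which may only be a copy-paste artifact). Whether either of these is what triggers the segmentation fault is not established in the thread. The sketch below is an illustration only; the function name `report_non_acgt` and the "<no header>" placeholder are hypothetical.

```python
from collections import Counter


def report_non_acgt(text: str) -> dict:
    """Map each FASTA header to a Counter of characters other than A, C, G, T.

    Records with clean sequences are omitted from the result.
    Text that appears before any '>' header is reported under the
    placeholder key '<no header>' (a hypothetical convention).
    """
    problems = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line
            continue
        if header is None:
            header = "<no header>"
        bad = Counter(ch for ch in line.upper() if ch not in "ACGT")
        if bad:
            problems.setdefault(header, Counter()).update(bad)
    return problems


if __name__ == "__main__":
    sample = ">a\nACGT\n>b\nAAATRAAAAG\nTTAYGG\n"
    for name, counts in report_non_acgt(sample).items():
        print(name, dict(counts))
```

A header line that is missing its ">" is treated as sequence text here, so it shows up in the report with many unexpected characters, which makes that kind of formatting slip easy to spot.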

Cody Hinchliff

Feb 19, 2015, 11:21:04 AM
to phl...@googlegroups.com
Try getting rid of the long ID strings. I would just cut out everything except the gi number and give it another try.
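
Trimming the headers down to the gi number can be scripted. The following is a minimal Python sketch under stated assumptions: headers follow the GenBank style `>gi|NUMBER|...` seen earlier in this thread, and the shortened header is just `>` followed by the number. The exact header form PHLAWD expects is not stated in the thread, and the function names are hypothetical.

```python
def shorten_header(header: str) -> str:
    """Reduce a '>gi|NUMBER|...' FASTA header to '>NUMBER'.

    Headers that do not match that pattern are returned unchanged.
    """
    fields = header.lstrip(">").split("|")
    if len(fields) >= 2 and fields[0] == "gi" and fields[1].isdigit():
        return ">" + fields[1]
    return header


def shorten_headers(text: str) -> str:
    """Apply shorten_header to every header line of FASTA text."""
    lines = [
        shorten_header(line) if line.startswith(">") else line
        for line in text.splitlines()
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    record = ">gi|124390016|gb|EF219365.1| Abutilon menziesii strain PCMB2763\nTGCGGAAGG\n"
    print(shorten_headers(record), end="")
```

Only lines that already start with ">" are rewritten, so a header that lost its ">" would be left as it is and still needs fixing by hand.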

Miao Sun

Feb 19, 2015, 11:28:17 AM
to phl...@googlegroups.com
Thanks, I'll redo it, and let you know the updates.

Miao

Miao Sun

Feb 19, 2015, 11:59:32 AM
to phl...@googlegroups.com
Hi,

I still get the same "segmentation fault" (see the attached screenshot).

Thanks!

Miao

Miao Sun

Feb 20, 2015, 9:29:59 AM
to phl...@googlegroups.com, codiferous
Hi Cody,

I'm losing sleep over this. I've tried many times with different sequences and various modifications (e.g., removing all the tags, removing parts of the sequences, or changing to "search = internal transcribed spacer"), and before I set them as reference sequences, I always make sure they contain no characters other than A, T, C, and G.

So I wonder:
1. How exactly does PHLAWD work internally to capture the patterns in the reference sequences? Is it really that strict, or does ITS evolve so fast that there is too much variation and the procedure fails?

2. I think Stephen used ITS in his mega-phylogeny paper (Smith, S. A., J. Beaulieu, and M. J. Donoghue. 2009. Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches. BMC Evol Biol. 9: 37), right? How did he set up the ITS reference sequences? Could I have a copy of his ITS reference file, just to get PHLAWD working?

Thanks!

Miao