create-priors problem: "FATAL: The order of sequences in the input wiggle file differs from the order of sequences in the intput FASTA file."

qin wang

Dec 10, 2018, 1:30:15 AM
to MEME Suite Q&A
I am facing the problem "FATAL: The order of sequences in the input wiggle file differs from the order of sequences in the input FASTA file." when I try to use create-priors. 

The full output, including the error message, is:

"""
Creating priors with alpha=1.0 and beta=10000
Sequence size is 1498357 bases.
Scores were observed for 1498357 bases.
Minimum score is 2.77268
Median score is 3.80554
Maximum score is 15.6714
Sum of scores is 6.4026e+06
Sum of un-normalized priors is 415825
FATAL: The order of sequences in the input wiggle file differs from the order of sequences in the intput FASTA file.
"""

And my command is "create-priors --oc ./test_fimo_with_prior_files ./test_fimo_with_prior_files/ESC.5000.signal.sequence.fa ./test_fimo_with_prior_files/ESC.5000.signal.wig"

Also, here are the first several lines of the two input files:
1. test_fimo_with_prior_files/ESC.5000.signal.sequence.fa

>chr6
TGGGAGAGGAGGGTGCGAAGCCAGAGGCTGAGGGCGAAGCTCAAGGGCCCAGAGGGCAGAGCCCGGATGTGGAGAAATAGGCTGAGGGCGGGGCTTGGTGCGCGAGAGGCGGAGCCCAGAGGGGAGCAACAGTTATGAGGGCGGAACTTAGAAACGATGAAGGCTGTAGTAATTGGGCTGAGGGCGGGGCTTAGGAGCACAGAAAGTGGAGTCCAGGTGTAGAGCAATTAGATTGAGGGAGGGGCTTGGAGCATGAGAGGCGGAGTCCGAGTGTGGAGCAACAGGACGAGGGCAGAATTTAAGAGCGCAGAAGGCGGAGCCCCTGCGTGGGCCAGCAGTCTGAGACCG
>chr7
CGGTGTCATGGCCGACGCTCCTCGCACTCCCTTGGAGACTCTAAGTCACACTGCCCTGCTCCCTAGAGGAGAGAGGAAAAGTACTGAAAGGAAGCAGCCAAATTCTTTGACATCCAATCAGAAGCAGGCATGAAGCCAAGAGTTGCCCAATCAGATTAGCGGAAATAGCGAAGGAGTCTCCTGGAAGCAGAGTTGCCATGACAACGCGAGTTTCTAGGGTTCTCTGCTTGCGCCTAGCGCTTGGCAACGCATGGCCACCTCCCTGATGAAATGCAGCCAATAAGAGGGACCCATTTTGTGACGTCAGTAGTTGCCAAGCCGGCTTGCAGAAGCACGCGAGCCGCCCCCTC
>chr3
CAGGGGACCAAAACGTCTACAAGGTAAAGCTGCAGCCTGAAGCGATTTTTTGACCCCCGACACTTAATGGGCTGTCGAAGACCCGGATGTGTCTGGGACTGGCCGAGGTACCCATTAACGCGCCCCCGCTAGACCATCCCCCTTCTCCATTTATGATTGGATTTTCTTAACGCCAATCACTACTTCTCCAGCCACCCATTCCTCAAGGGTGGGGTGAAGTCTTGGTATCCCGCTTTCAATTGGTAGGAGCAAGAAGAGGAGGGCGGGTAGCCGGCTAAGTTCCTCCAATGGCCGTCGTGGCGTGTGGTTAAGGCGGAGTCAAGGCGCTTGGCAGCTGCCCAATCAGCGT
>chr7
TCATTCTCCTTCCCATTTTCTGTTTAAGTTCGGGCGTCCCAGCCCCGCCCAAACAGTGATATTCCCGCCCACTGGATTACCTTCGCTCAAAACAAACCGAAGTTTCCCTTTTGTGTCTCAGCTGGAATCCAAAAAGTTCAGGCTTTTCCCATTGGCTAGGCACCCCTTTCTAGTCCCGCCCACATGGCCTTCTACCTGACATTGTAAACTCCGCCTAACCCAGGTACCTTAGCCTTAGCCCCGCCTACTCACTTTCCTCCCTCTGATTTGCTCCGAATCACGCCCCTCCCACTCTTGGGCGAGCTAGGCGCCTTAGCCCCGCCCACAGCCAGCTTT
>chr4
TTGCTGCATAATCCGGACTCCACAGGTAGGAGGGCTGGGCATTCTAGTGAATGTTGTCGGGAACAAAGGAAGGCCTGGCCCAGCGCCCTCACACACCTCTTACACCGAGCATGCCCGGGCGTCCCGGAACCAGCGAGAACTGGACTTCAGACCCGCCAAGCTGAAAGAAATCCTCATTTCCCCGAAAAGGAGCCGCTCCCAGTGGGAACAAAGAAGAAGCGGGGTGGGGTCTAGGTGGGGCTGAGCCAGGAGGAGCTGGAGAAGCACAGGGGGCGGGTCTAGGAGGAGCCGGTCGTGGGCGGGTCCAAAAGGCGG

2. ESC.5000.signal.wig
variableStep chrom=chr6 span=348
3182967 15.6714
variableStep chrom=chr7 span=350
44405671 15.0179
variableStep chrom=chr3 span=349
90292834 14.5617
variableStep chrom=chr7 span=336
27330690 14.4946
variableStep chrom=chr4 span=315
130520079 12.9171


cegrant

Dec 18, 2018, 11:35:39 PM
to meme-...@googlegroups.com
Hi Qin,

FATAL: The order of sequences in the input wiggle file differs from the order of sequences in the intput FASTA file

create-priors and FIMO make a single pass through the wiggle file and the FASTA file. Furthermore, the regions covered by the wiggle file have to be a subset of the regions covered by the FASTA file. This means that the entries in each file have to be ordered by sequence and by region within each sequence. Your sequences and wiggle file jump from chr7 to chr3, and then back to chr7. Also, you haven't provided genomic coordinates in your FASTA file, but you do in your wiggle file. If you don't provide genomic coordinates in the FASTA file, the programs in the MEME Suite will assume that the sequences start at position 1.
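
The single-pass constraint can be checked before running create-priors. The following is a minimal Python sketch, not the check that create-priors itself performs: the function names `wiggle_regions` and `first_order_violation` are made up for illustration, only variableStep wiggle files are handled, and the rule applied is the one described in this reply (a chromosome must not reappear after the file has moved on to another one, and starts must not decrease within a chromosome).

```python
import re


def wiggle_regions(lines):
    """Yield (chrom, start) for each data line of a variableStep wiggle file."""
    chrom = None
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional track/browser/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("variableStep"):
            match = re.search(r"chrom=(\S+)", line)
            chrom = match.group(1) if match else None
            continue
        if line.startswith("fixedStep"):
            raise ValueError("fixedStep blocks are not handled in this sketch")
        yield chrom, int(line.split()[0])


def first_order_violation(regions):
    """Return the first (chrom, start) that breaks single-pass order, or None."""
    finished = set()  # chromosomes the file has already moved past
    prev_chrom, prev_start = None, None
    for chrom, start in regions:
        if chrom != prev_chrom:
            if chrom in finished:
                return chrom, start  # chromosome reappears later in the file
            if prev_chrom is not None:
                finished.add(prev_chrom)
            prev_chrom, prev_start = chrom, start
        elif start < prev_start:
            return chrom, start  # region goes backwards on the same chromosome
        else:
            prev_start = start
    return None


if __name__ == "__main__":
    # Path taken from the command in the question.
    with open("./test_fimo_with_prior_files/ESC.5000.signal.wig") as handle:
        print(first_order_violation(wiggle_regions(handle)))
```

For the wiggle lines shown in the question, this reports the second chr7 block (start 27330690) as the first violation, because chr3 appears between the two chr7 blocks.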

Your FASTA file would need to look more like:

>chr3:90292834-90300000
CAGGGGACCAAAACGTCTACAAGGTAAAGCTGCAGCCTGAAGCGATTTTTTGACCCCCGACACTTAATGGGCTGTCGAAGACCCGGATGTGTCTGGGACTGGCCGAGGTACCCATTAACGCGCCCCCGCTAGACCATCCCCCTTCTCCATTTATGATTGGATTTTCTTAACGCCAATCACTACTTCTCCAGCCACCCATTCCTCAAGGGTGGGGTGAAGTCTTGGTATCCCGCTTTCAATTGGTAGGAGCAAGAAGAGGAGGGCGGGTAGCCGGCTAAGTTCCTCCAATGGCCGTCGTGGCGTGTGGTTAAGGCGGAGTCAAGGCGCTTGGCAGCTGCCCAATCAGCGT
>chr4:130520079-130550000
TTGCTGCATAATCCGGACTCCACAGGTAGGAGGGCTGGGCATTCTAGTGAATGTTGTCGGGAACAAAGGAAGGCCTGGCCCAGCGCCCTCACACACCTCTTACACCGAGCATGCCCGGGCGTCCCGGAACCAGCGAGAACTGGACTTCAGACCCGCCAAGCTGAAAGAAATCCTCATTTCCCCGAAAAGGAGCCGCTCCCAGTGGGAACAAAGAAGAAGCGGGGTGGGGTCTAGGTGGGGCTGAGCCAGGAGGAGCTGGAGAAGCACAGGGGGCGGGTCTAGGAGGAGCCGGTCGTGGGCGGGTCCAAAAGGCGG
>chr6:3182967-3184000
TGGGAGAGGAGGGTGCGAAGCCAGAGGCTGAGGGCGAAGCTCAAGGGCCCAGAGGGCAGAGCCCGGATGTGGAGAAATAGGCTGAGGGCGGGGCTTGGTGCGCGAGAGGCGGAGCCCAGAGGGGAGCAACAGTTATGAGGGCGGAACTTAGAAACGATGAAGGCTGTAGTAATTGGGCTGAGGGCGGGGCTTAGGAGCACAGAAAGTGGAGTCCAGGTGTAGAGCAATTAGATTGAGGGAGGGGCTTGGAGCATGAGAGGCGGAGTCCGAGTGTGGAGCAACAGGACGAGGGCAGAATTTAAGAGCGCAGAAGGCGGAGCCCCTGCGTGGGCCAGCAGTCTGAGACCG
>chr7:27330690-27340000
TCATTCTCCTTCCCATTTTCTGTTTAAGTTCGGGCGTCCCAGCCCCGCCCAAACAGTGATATTCCCGCCCACTGGATTACCTTCGCTCAAAACAAACCGAAGTTTCCCTTTTGTGTCTCAGCTGGAATCCAAAAAGTTCAGGCTTTTCCCATTGGCTAGGCACCCCTTTCTAGTCCCGCCCACATGGCCTTCTACCTGACATTGTAAACTCCGCCTAACCCAGGTACCTTAGCCTTAGCCCCGCCTACTCACTTTCCTCCCTCTGATTTGCTCCGAATCACGCCCCTCCCACTCTTGGGCGAGCTAGGCGCCTTAGCCCCGCCCACAGCCAGCTTT
>chr7:44405671-44410000
CGGTGTCATGGCCGACGCTCCTCGCACTCCCTTGGAGACTCTAAGTCACACTGCCCTGCTCCCTAGAGGAGAGAGGAAAAGTACTGAAAGGAAGCAGCCAAATTCTTTGACATCCAATCAGAAGCAGGCATGAAGCCAAGAGTTGCCCAATCAGATTAGCGGAAATAGCGAAGGAGTCTCCTGGAAGCAGAGTTGCCATGACAACGCGAGTTTCTAGGGTTCTCTGCTTGCGCCTAGCGCTTGGCAACGCATGGCCACCTCCCTGATGAAATGCAGCCAATAAGAGGGACCCATTTTGTGACGTCAGTAGTTGCCAAGCCGGCTTGCAGAAGCACGCGAGCCGCCCCCTC


Though of course you'd have to provide sequence data for the entire region described by the coordinates.
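
If each FASTA record already holds exactly the bases of one wiggle block, one way to keep the header coordinates and the sequence data consistent is to derive each end coordinate from the sequence length. The Python sketch below does that and sorts the records by chromosome name and start. It rests on several assumptions: `format_sorted_fasta` is a hypothetical helper, the start positions are 1-based and come from the matching wiggle blocks, the end coordinate is inclusive, and plain string ordering of chromosome names is acceptable as long as the wiggle file is sorted the same way.

```python
def format_sorted_fasta(records):
    """Build FASTA text with chrom:start-end headers, sorted by chrom and start.

    records: iterable of (chrom, start, sequence) tuples, where start is the
    1-based genomic position of the first base of the sequence.
    """
    lines = []
    # Sort by chromosome name (as a string), then by numeric start position.
    for chrom, start, seq in sorted(records, key=lambda r: (r[0], r[1])):
        end = start + len(seq) - 1  # inclusive end covering the whole sequence
        lines.append(f">{chrom}:{start}-{end}")
        lines.append(seq)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy sequences for illustration only; real records would carry the
    # full sequence of each region.
    demo = [
        ("chr7", 44405671, "CGGTGTCATG"),
        ("chr3", 90292834, "CAGGGGACCA"),
        ("chr7", 27330690, "TCATTCTCCT"),
    ]
    print(format_sorted_fasta(demo), end="")
```

Sorting both files with the same key is what matters for the single pass. Note that string ordering puts chr10 before chr3, so the wiggle blocks need to be sorted with the same rule.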

Your wiggle file would have to look like:

variableStep chrom=chr3 span=349
90292834 14.5617
variableStep chrom=chr4 span=315
130520079 12.9171
variableStep chrom=chr6 span=348
3182967 15.6714
variableStep chrom=chr7 span=336
27330690 14.4946
variableStep chrom=chr7 span=350
44405671 15.0179