Combining predictions doesn't yield any output


Bas Verbruggen

Nov 12, 2015, 5:48:51 AM
to EVidenceModeler-users
Hello,

I have been trying to run EVM, and I do get some predictions in the partition/scaffold directories. But when I try to combine all predictions into a single file using recombine_EVM_partial_outputs.pl, I get no output. I don't get any errors either, or any hint that anything is actually happening.

Can you help me figure out what is going wrong?


I'm running the following command:
$EVM_HOME/EvmUtils/recombine_EVM_partial_outputs.pl --partitions $evm_dir'partitions_list.out' --output_file_name evm.out    # ($evm_dir is the full path to my evm directory)

My structure is as follows:
$evm_dir/partitions_list.out                              # list of all evm directories
$evm_dir/EVM_partitions/                                # directory with all my partitions
$evm_dir/EVM_partitions/scaffold_*/                 # individual scaffold directories
$evm_dir/EVM_partitions/scaffold_*/evm.out      # output file name from the evm command

My partitions_list.out file ([...] is the whole path to my evm directory, like $evm_dir):
scaffold_1      [...]/EVM_partitions/scaffold_1     N
scaffold_2      [...]/EVM_partitions/scaffold_2     N

Example evm.out:
!! Predictions spanning range 1645 - 2013 [R1]
# EVM prediction: Mode:STANDARD S-ratio: 1.20 1645-2013 orient(+) score(2214.00) noncoding_equivalent(1845.00) raw_noncoding(1845.00) offset(0.00)
1645    2013    internal+       1       3       {ev_type:CMaenas-SOAP95-BESST_500bp.fa/ID=match.scaffold_1;CMaenas-SOAP95-BESST_500bp.fa}

!! Predictions spanning range 10202 - 10386 [R3]
# EVM prediction: Mode:STANDARD S-ratio: 1.00 10202-10386 orient(+) score(924.00) noncoding_equivalent(925.00) raw_noncoding(925.00) offset(0.00)
10202   10386   internal+       2       3       {GeneMark.hmm_39221_t;GeneMark.hmm}

Thank you,

Bas

Bas Verbruggen

Nov 18, 2015, 8:34:52 AM
to EVidenceModeler-users
Nobody here?

Brian Haas

Nov 18, 2015, 9:29:03 AM
to EVidenceModeler-users
Hi Bas,

If you want to package up a small set of data files that I can use to troubleshoot this over here, I'll do what I can to help.

best,

~b

Bas Verbruggen

Nov 18, 2015, 11:05:06 AM
to EVidenceModeler-users
Hi Brian,

Thank you for looking into this. I attached my files, see README for explanation.

best wishes,

Bas
EVM_test_files.rar

Brian Haas

Nov 18, 2015, 11:06:31 AM
to Bas Verbruggen, EVidenceModeler-users
Sounds good.  I'll aim to look through this later this evening and get back to you by tomorrow.

best,

~brian


--
You received this message because you are subscribed to the Google Groups "EVidenceModeler-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to evidencemodeler-...@googlegroups.com.
To post to this group, send email to evidencemo...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/evidencemodeler-users/b010fd9e-e3cd-4a98-acb6-f94fa8cc6f77%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Brian Haas

Nov 18, 2015, 8:40:37 PM
to Bas Verbruggen, EVidenceModeler-users
Hi Bas,

Thanks for sending me some example files.  The formats for the input files look good.  There were a few other issues, though.  

EVM doesn't accept multiple input files for a given parameter, so you need to combine the input files of each type into a single file (e.g., combine all your gene prediction files into a single 'gene_predictions.gff3'). I think this was the only critical piece. The other issues were minor.
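Combining the files can be as simple as concatenating them. A minimal sketch in Python, assuming the inputs are already valid GFF3; the function name and the file names in the usage comment are hypothetical, not part of EVM:

```python
from pathlib import Path


def combine_gff3(inputs, output):
    """Concatenate several GFF3 files into one, preserving record order."""
    with open(output, "w") as out:
        for path in inputs:
            text = Path(path).read_text()
            out.write(text)
            # Guard against a missing final newline so that the last record
            # of one file does not fuse with the first record of the next.
            if text and not text.endswith("\n"):
                out.write("\n")


# Hypothetical usage:
# combine_gff3(["genemark.gff3", "other_predictor.gff3"], "gene_predictions.gff3")
```

Each prediction program keeps its own name in the source column, so the records stay distinguishable after merging.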

Be sure to include an entry in your evm.weights file for the pasa data:
TRANSCRIPT      gmap-sample_mydb_pasa   5

And the last bit is that the 'recombine_EVM_partial_outputs.pl' script only does something useful if the contigs are long enough that they need to be segmented into smaller chunks. It apparently runs very quietly and gives no indication when it has no work to do (something I'm only realizing now while working with smaller contig files).
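One way to see in advance whether the recombine step has anything to do is to look at the partitions list. The sketch below assumes, which this thread does not confirm, that the third column of partitions_list.out is a Y/N flag marking whether a contig was segmented; the function name is hypothetical:

```python
def count_segmented(partitions_list_text):
    """Return (segmented, unsegmented) row counts for a partitions list.

    Assumption: the third whitespace-separated column is "Y" when the
    contig was split into chunks and "N" otherwise.
    """
    segmented = 0
    unsegmented = 0
    for line in partitions_list_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        if fields[2] == "Y":
            segmented += 1
        else:
            unsegmented += 1
    return segmented, unsegmented
```

On the two rows quoted in the original post, both flagged "N", this returns `(0, 2)`, which would be consistent with the script finishing silently.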

There was one problem that's fatal with the current version of EVM: there were some alignments or genewise predictions (I don't remember exactly which) where the start and end coordinates in the gff file were out of order (i.e., end < start; these are the 4th and 5th columns, counting from 1). The attached version of EVM will just swap the coordinates when this happens, so feel free to use it as a drop-in replacement.
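The same normalization can also be applied to the input files before running EVM. The following is a sketch of that idea only, not the logic of the attached evidence_modeler.pl; it assumes tab-separated GFF with start and end in columns 4 and 5 (zero-based indices 3 and 4), and the function name is hypothetical:

```python
def swap_reversed_coords(line):
    """Return a GFF line with start and end swapped when end < start."""
    # Leave comments, directives and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line  # not a feature line
    start, end = int(fields[3]), int(fields[4])
    if end < start:
        fields[3], fields[4] = str(end), str(start)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

Lines whose coordinates are already in order pass through unchanged, so the function can be mapped over every line of a file.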

I ran it on the data you sent me and it appeared to work.  Please let me know how it goes for you.


best,

~brian




 

evidence_modeler.pl

erin.hami...@gmail.com

Sep 21, 2017, 12:15:34 AM
to EVidenceModeler-users
Hi Brian, 
Just wondering whether you can combine multiple gene predictions in one file (as you've described above) but still assign the different prediction programs different weights in the weights file?

Cheers, 

Erin

Brian Haas

Sep 21, 2017, 7:13:08 AM
to erin.hami...@gmail.com, EVidenceModeler-users
This should be fine as long as the 2nd column of the gff3 file matches up with the prediction type in the weights file.
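A quick consistency check along these lines, as a sketch: it assumes the weights file has three whitespace-separated columns (class, source name, weight), as in the TRANSCRIPT example earlier in the thread, and that the gff3 is tab-separated. The function name is hypothetical:

```python
def missing_weight_sources(gff3_text, weights_text):
    """Return sorted GFF3 source names (column 2) with no weights entry."""
    weighted = set()
    for line in weights_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and not line.startswith("#"):
            weighted.add(fields[1])  # second column: source name

    sources = set()
    for line in gff3_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.split("\t")
        if len(fields) >= 2:
            sources.add(fields[1])

    return sorted(sources - weighted)
```

An empty result means every source in the combined file has a weight assigned; any names returned would need their own line in the weights file.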

Best,

-Brian
(by iPhone)
