Hello,
I tested EVM to combine different gene predictions. However, even after partitioning, I end up with very large memory usage (>200 GB), which looks excessive.
"""
partition_EVM_inputs.pl --genome "$GENOME" \
--gene_predictions denovo.test.gff3 --protein_alignments homology.test.gff3 \
--transcript_alignments transcript.test.gff3 \
--segmentSize 100000 --overlapSize 10000 --partition_listing partitions_list.out
write_EVM_commands.pl --genome "$GENOME" --weights "$WEIGHT" \
--gene_predictions denovo.test.gff3 --protein_alignments homology.test.gff3 \
--transcript_alignments transcript.test.gff3 \
--output_file_name "$OUTPUT" --partitions partitions_list.out >commands.list
"""
I tried running a single command to check how much RAM would be necessary:
"""
evidence_modeler.pl --genome sspace.final.scaffolds.fasta \
--gene_predictions denovo.gff3 --weights weight.txt \
--transcript_alignments transcript.gff3 --protein_alignments homology.gff3 \
--exec_dir scaffold1size567472/scaffold1size567472_450001-550000 \
> scaffold1size567472/scaffold1size567472_450001-550000/EVM_output.txt \
2> scaffold1size567472/scaffold1size567472_450001-550000/EVM_output.txt.lo
"""
"""
=>> PBS: job killed: mem 211812812kb exceeded limit 209715200kb
"""
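For scale, the two figures in the PBS message can be converted to GiB. This is only a minimal sketch: `kb_to_gib` is a hypothetical helper name, and it assumes that PBS `kb` means units of 1024 bytes.

```shell
# Hypothetical helper: convert a PBS memory figure such as "211812812kb" to GiB.
# Assumption: PBS "kb" is 1024 bytes, so 1 GiB = 1048576 kb.
kb_to_gib() {
    local kb="${1%kb}"   # strip the trailing "kb" suffix if present
    awk -v kb="$kb" 'BEGIN { printf "%.1f\n", kb / 1048576 }'
}

kb_to_gib 211812812kb   # memory in use when the job was killed -> 202.0
kb_to_gib 209715200kb   # limit set for the job                 -> 200.0
```

So the single partition command went just past the 200 GiB limit before being killed.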
The files included in this partition have a reasonable size:
3.8 Feb 28 23:33 scaffold1size567472/scaffold1size567472_450001-550000/denovo.test.gff3
0 Mar 1 01:29 scaffold1size567472/scaffold1size567472_450001-550000/EVM_output.txt
171 Mar 1 01:29 scaffold1size567472/scaffold1size567472_450001-550000/EVM_output.txt.log
0 Feb 28 23:33 scaffold1size567472/scaffold1size567472_450001-550000/homology.test.gff3
100K Feb 28 23:33 scaffold1size567472/scaffold1size567472_450001-550000/sspace.final.scaffolds.fasta
17K Feb 28 23:33 scaffold1size567472/scaffold1size567472_450001-550000/transcript.test.gff3
Do you have any idea what I might be doing wrong here?
Thank you
J.