Re: [trinityrnaseq-users] Which files are deletable after running trinity

Brian Haas

Oct 23, 2016, 7:45:49 PM
to Ina, trinityrnaseq-users
Hi,

The only file you really need to keep is the final Trinity.fasta file for each assembly. If you run Trinity with the --full_cleanup parameter, it'll automatically take care of cleaning up and retaining just this Trinity.fasta output file.

best,

~brian


On Sun, Oct 23, 2016 at 8:34 AM, 'Ina' via trinityrnaseq-users <trinityrn...@googlegroups.com> wrote:
Hi all,
I assembled six transcriptomes with Trinity and everything went fine.

Now I have roughly 3 terabytes of data (the limit on the server is 3.1T). Which files/folders are not necessary for downstream analysis and can therefore be deleted?

Thanks in advance.

Best,
Ina

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-users+unsub...@googlegroups.com.
To post to this group, send email to trinityrnaseq-users@googlegroups.com.
Visit this group at https://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Ken Field

Oct 24, 2016, 11:44:34 AM
to Brian Haas, Ina, trinityrnaseq-users
Brian-
Does full_cleanup delete all the intermediate files if Trinity does not complete? I've always been hesitant to use it because I want to be able to restart at an intermediate step.

Ken

--
Ken Field, Ph.D.
Professor of Biology
Program in Cell Biology/Biochemistry
Bucknell University
Room 203A Biology Building

Brian Haas

Oct 24, 2016, 12:22:36 PM
to Ken Field, Ina, trinityrnaseq-users
Hi Ken,

If the job gets interrupted, then it won't delete all files. Essentially, it'll only delete directories that it created during that same run. This is a safety measure to ensure it doesn't delete something that was created by some other process or other personal data.

~b




Will Holtz

Oct 24, 2016, 12:31:50 PM
to Brian Haas, Ken Field, Ina, trinityrnaseq-users
Hi Brian,

Let's say I start a run with --full_cleanup, and it gets through one or more checkpoints but does not fully complete. I then restart the run with --full_cleanup and it runs to completion. In this scenario, do none of the files created during the first execution get deleted?

thanks,
-Will




--
The information contained in this e-mail message or any attachment(s) may be confidential and/or privileged and is intended for use only by the individual(s) to whom this message is addressed.  If you are not the intended recipient, any dissemination, distribution, copying, or use is strictly prohibited.  If you receive this e-mail message in error, please e-mail the sender at who...@lygos.com and destroy this message and remove the transmission from all computer directories (including e-mail servers).

Please consider the environment before printing this email.

Brian Haas

Oct 24, 2016, 12:36:08 PM
to Will Holtz, Ken Field, Ina, trinityrnaseq-users
The 'trinity_out_dir' won't get deleted, and a bunch of the large files that tend to exist in the trinity_out_dir will be left lying around.

Note, though, it won't be nearly as bad as running Trinity with the --no_cleanup parameter. Instead, it would just run as if you didn't have the '--full_cleanup' parameter set. For '--full_cleanup' to do its thing, there would have to be a seamless end-to-end run of the Trinity pipeline.

Note, you can always implement your own cleanup strategy regardless of how you run Trinity.  Just wrap Trinity so that you just capture the Trinity.fasta file and then purge everything you don't want to keep.  That's effectively what --full_cleanup would normally do.

best,

~b
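
A minimal sketch of the "wrap Trinity" approach described above, assuming Trinity is on the PATH and that the final assembly is written as Trinity.fasta inside the output directory, as this thread describes. Where the final FASTA lands can differ between Trinity releases, so check your version and adjust fasta_name or the path accordingly. The function names run_trinity and keep_only_assembly are hypothetical helpers, not part of Trinity.

```python
import shutil
import subprocess
from pathlib import Path


def run_trinity(trinity_args, out_dir):
    """Run Trinity (assumed to be on PATH); raises CalledProcessError on failure."""
    subprocess.run(["Trinity", *trinity_args, "--output", str(out_dir)], check=True)


def keep_only_assembly(out_dir, keep_path, fasta_name="Trinity.fasta"):
    """Copy the final assembly out of out_dir, then delete out_dir entirely.

    Nothing is deleted unless a non-empty assembly was found and copied.
    """
    out_dir = Path(out_dir).resolve()
    keep_path = Path(keep_path).resolve()

    # Refuse to save the assembly inside the directory we are about to purge.
    if out_dir in keep_path.parents:
        raise ValueError("keep_path must be outside the directory being purged")

    # Assumption: the assembly sits at out_dir/Trinity.fasta (see note above).
    fasta = out_dir / fasta_name
    if not fasta.is_file() or fasta.stat().st_size == 0:
        raise FileNotFoundError(f"no non-empty {fasta_name} in {out_dir}; nothing deleted")

    keep_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(fasta, keep_path)

    # Cheap sanity check on the copy before removing the original.
    if keep_path.stat().st_size != fasta.stat().st_size:
        raise OSError(f"copy of {fasta} looks incomplete; nothing deleted")

    shutil.rmtree(out_dir)
    return keep_path
```

For an assembly that has already finished, something like keep_only_assembly("trinity_out_dir", "assemblies/sample1.Trinity.fasta") would keep the one file and purge the rest. Because the purge only happens after Trinity has exited successfully and the copy has been checked, an interrupted run keeps its intermediate files and can still be restarted.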


Wyclif Odago

Aug 7, 2024, 3:31:39 PM
to trinityrnaseq-users
Hi Brian,
I was running Trinity assemblies on multiple samples, and after a few samples Butterfly returned errors. I am confused because I am not sure which part of the assembly crashed. The error is shown here:

Trinity run failed. Must investigate error above.
warning, cmd: /software/trinityrnaseq-v2.13.2/util/support_scripts/../../Trinity --single "/home/odago/hoya/03.trinity/trinity_output/trinity_Hoya_australis_ZCF_combined/read_partitions/Fb_0/CBin_764/c76472.trinity.reads.fa" --output "/home/odago/hoya/03.trinity/trinity_output/trinity_Hoya_australis_ZCF_combined/read_partitions/Fb_0/CBin_764/c76472.trinity.reads.fa.out" --CPU 1 --max_memory 1G --run_as_paired --seqType fa --trinity_complete --full_cleanup --no_salmon   failed with ret: 65280, going to retry.
succeeded(146333), failed(1)   47.2044% completed.    

We are sorry, commands in file: [failed_butterfly_commands.74150.txt] failed.  :-(



Error encountered::  <!----
CMD: /software/trinityrnaseq-v2.13.2/trinity-plugins/BIN/ParaFly -c /home/odago/hoya/03.trinity/trinity_output/trinity_Hoya_australis_ZCF_combined/read_partitions/Fb_0/CBin_764/c76472.trinity.reads.fa.out/chrysalis/butterfly_commands -shuffle -CPU 1 -failed_cmds failed_butterfly_commands.74150.txt  2>tmp.74150.1723047681.stderr

Errmsg:
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "String.length()" because "res" is null
at SeqVertex.getShortSeq(SeqVertex.java:431)
at SeqVertex.getShortSeqWID(SeqVertex.java:478)
at SeqVertex.toString(SeqVertex.java:260)
at java.base/java.lang.String.valueOf(String.java:4216)
at java.base/java.lang.StringBuilder.append(StringBuilder.java:173)
at TransAssembly_allProbPaths.removeLightFlowEdges(TransAssembly_allProbPaths.java:14640)
at TransAssembly_allProbPaths.removeLightEdges(TransAssembly_allProbPaths.java:14596)
at TransAssembly_allProbPaths.main(TransAssembly_allProbPaths.java:809)
How do I solve this problem?

Thank you.

Trinity run failed. Must investigate.txt

Brian Haas

Aug 7, 2024, 3:37:23 PM
to Wyclif Odago, trinityrnaseq-users
Hi,

When the assembly finishes (with error), you can try rerunning it to see whether it picks up where it left off and completes that failed job. If not, there'll be a failed commands file in the Trinity output directory that contains the problem entry.

If you want to bypass any failure (and examine those failed reads later separately), then rerun Trinity with the --FORCE option and it'll package up what it was able to assemble ok.

best,

Brian
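
To see which entries actually failed before deciding whether to rerun or use --FORCE, the failed commands files can be collected with a few lines of Python. This is only a sketch: the glob pattern is an assumption based on the file name in the log above (failed_butterfly_commands.74150.txt), the search is recursive because the file may sit in a subdirectory or in the directory Trinity was launched from, and list_failed_commands is a hypothetical helper, not a Trinity utility.

```python
from pathlib import Path


def list_failed_commands(search_dir, pattern="failed_*commands*.txt"):
    """Map each non-empty failed-commands file under search_dir to its commands.

    The default pattern is a guess from the file name seen in the log;
    adjust it if your Trinity version names these files differently.
    """
    failed = {}
    for path in sorted(Path(search_dir).rglob(pattern)):
        # One command per line; skip blank lines.
        commands = [line.strip() for line in path.read_text().splitlines() if line.strip()]
        if commands:
            failed[path] = commands
    return failed
```

Printing the result for the Trinity output directory shows each problem command, which can then be inspected or rerun by hand. If the same entry keeps failing on a rerun, adding --FORCE to the original Trinity command, as suggested above, packages up whatever assembled successfully.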

To view this discussion on the web visit https://groups.google.com/d/msgid/trinityrnaseq-users/cc70418a-285d-4e58-aba7-0b20f53f8a65n%40googlegroups.com.