How to create matrix.normalized.FPKM or matrix.TMM_normalized.FPKM for a heatmap?

Farbod Emami

Mar 22, 2015, 5:54:31 PM
to trinityrn...@googlegroups.com
Dear Brian,
I have just completed the volcano plot step (run_DE_analysis.pl) for my samples from two conditions (J and M, each with 3 biological replicates), and now I want to run analyze_diff_expr.pl to make heatmaps, but I cannot find the matrix.normalized.FPKM or matrix.TMM_normalized.FPKM file that it requires. run_DE_analysis.pl produced just 3 files: Trinity_trans.counts.matrix.conditionJ_vs_conditionM.edgeR.DE_results.MA_n_Volcano.pdf, Trinity_trans.counts.matrix.conditionJ_vs_conditionM.edgeR.DE_results and Trinity_trans.counts.matrix.conditionJ_vs_conditionM.conditionJ.vs.conditionM.EdgeR.Rscripts, plus 3 similar files for my gene-level comparison, but none of them has "normalized" or "TMM_normalized" in its name.
Please help me with the command below:

$TRINITY_HOME/Analysis/DifferentialExpression/analyze_diff_expr.pl --matrix matrix.TMM_normalized.FPKM -P 1e-3 -C 2
Thank you for all of your support.



Brian Haas

Mar 22, 2015, 6:24:01 PM
to Farbod Emami, trinityrn...@googlegroups.com
Hi Farbod,

When you ran the 'abundance_estimates_to_matrix.pl' step, it should have created a TMM-normalized FPKM matrix file along with your counts.matrix file. Use the TMM-normalized matrix from that step.

best,

~b

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Farbod Emami

Mar 22, 2015, 6:36:25 PM
to trinityrn...@googlegroups.com, farbo...@gmail.com
Dear Brian,
I used the command below:

perl abundance_estimates_to_matrix.pl --est_method RSEM  --out_prefix Trinity_trans '/home2/RNA-seq_F/J1_RSEMalign/J1.RSEM.isoforms.results' '/home2/RNA-seq_F/J2_RSEMalign/J2.RSEM.isoforms.results' '/home2/RNA-seq_F/J3_RSEMalign/J3.RSEM.isoforms.results' '/home2/RNA-seq_F/M1_RSEMalign/M1.RSEM.isoforms.results' '/home2/RNA-seq_F/M2_RSEMalign/M2.RSEM.isoforms.results' '/home2/RNA-seq_F/M3_RSEMalign/M3.RSEM.isoforms.results'

It created just 4 files in this step for transcripts: Trinity_trans.counts.matrix, Trinity_trans.TMM.fpkm.matrix, Trinity_trans.not_cross_norm.fpkm.tmp and Trinity_trans.not_cross_norm.fpkm.tmp.TMM_info.txt, and 4 similar files for genes.

It seems that none of them is a "normalized" file.
What should I do?


Brian Haas

Mar 22, 2015, 6:38:23 PM
to Farbod Emami, trinityrn...@googlegroups.com
Trinity_trans.TMM.fpkm.matrix is the file that you want.

~b

Farbod Emami

Mar 22, 2015, 6:46:00 PM
to trinityrn...@googlegroups.com, farbo...@gmail.com
Thank you very much!

So, does the absence of the word "normalized" from my output file names indicate a problem? And what are these "not_cross_norm" files?

Do I have to rename the file to matrix.normalized.FPKM, or can I use the Trinity_trans.TMM.fpkm.matrix name directly in the command?

Thank you again.

Brian Haas

Mar 22, 2015, 6:49:45 PM
to Farbod Emami, trinityrn...@googlegroups.com
The other files were just intermediates generated before running the TMM normalization. You can ignore them.   The .TMM.fpkm.matrix is definitely the file you'll want to use for making heatmaps and other expression-based plots.

best,

~b
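
As a rough illustration of the answer above, the following Python sketch picks the normalized matrix out of a list of output file names and assembles the analyze_diff_expr.pl call from the original question. The helper names (`pick_normalized_matrix`, `build_heatmap_command`) are hypothetical, and the two suffixes are simply the ones that appear in this thread; other Trinity releases may name the file differently.

```python
# Suffixes of the cross-sample-normalized matrix as they appear in this thread.
# Assumption: other Trinity releases may use different names.
NORMALIZED_SUFFIXES = (".TMM.fpkm.matrix", ".TMM.EXPR.matrix")


def pick_normalized_matrix(filenames):
    """Return the first file name ending in a known normalized suffix, or None."""
    for suffix in NORMALIZED_SUFFIXES:
        for name in filenames:
            if name.endswith(suffix):
                return name
    return None


def build_heatmap_command(matrix_name):
    """Assemble the analyze_diff_expr.pl call from the question, using the
    actual matrix file name instead of the documentation's placeholder."""
    return [
        "$TRINITY_HOME/Analysis/DifferentialExpression/analyze_diff_expr.pl",
        "--matrix", matrix_name,
        "-P", "1e-3",
        "-C", "2",
    ]


# File names reported for the transcript-level run in this thread.
outputs = [
    "Trinity_trans.counts.matrix",
    "Trinity_trans.TMM.fpkm.matrix",
    "Trinity_trans.not_cross_norm.fpkm.tmp",
    "Trinity_trans.not_cross_norm.fpkm.tmp.TMM_info.txt",
]
matrix = pick_normalized_matrix(outputs)
print(" ".join(build_heatmap_command(matrix)))
```

Since --matrix takes a file path, the file should not need to be renamed; the name produced by abundance_estimates_to_matrix.pl can be passed as it is.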


Rohan Mellick

Oct 25, 2015, 7:30:01 PM
to trinityrnaseq-users
Hey Brian,
 
I used the attached R script to normalise for ERCCs prior to TMM normalisation, but when I run abundance_estimates_to_matrix.pl, the process is interrupted before matrix.TMM.fpkm.matrix is generated.
 
I get
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 1 did not have 245 elements
Calls: read.table -> scan
Execution halted
Error, cmd: R --vanilla -q < __tmp_runTMM.R 1>&2  died with ret (256)  at /apps/trinity/r2014-07-17/util/support_scripts/run_TMM_scale_matrix.pl line 98.
Error, CMD: /apps/trinity/r2014-07-17/util/support_scripts/run_TMM_scale_matrix.pl --matrix ERCC_all_matrix.not_cross_norm.fpkm.tmp > ERCC_all_matrix.TMM.fpkm.matrix died with ret 6400 at /apps/trinity/r2014-07-17/util/abundance_estimates_to_matrix.pl line 240.
 
When I don't use the R script and use the raw RSEM files generated I don't have a problem. I have scrutinised the R script but am not sure why TMM fails.
 
Any advice would be greatly appreciated.
 
Thanks
NormERCC.R
ERCCnorm_error-9412151.out

Tiago Hori

Oct 25, 2015, 7:36:53 PM
to Rohan Mellick, trinityrnaseq-users
It is most likely a problem with your input files and not with the script. This error often happens when there are discrepancies between the RSEM result files. 90% of the time, it happens because people have used different Trinity assemblies for different files, by accident or by design.

If you describe your process in more detail, we may be able to help better. How many individuals do you have, what does your samples description file look like, and what command did you use?

T.
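
One way to test for the discrepancy described above is to compare the IDs listed in each result file. The Python sketch below is only an illustration under stated assumptions: it assumes tab-delimited files with a header line and the gene or transcript ID in the first column, the function names are hypothetical, and the tiny tables are made up to stand in for real *.genes.results files (which have more columns).

```python
def first_column_ids(lines, sep="\t"):
    """Collect first-column IDs from a tab-delimited table, skipping the header."""
    ids = set()
    for line in lines[1:]:
        row = line.rstrip("\r\n")
        if row:
            ids.add(row.split(sep, 1)[0])
    return ids


def compare_to_reference(tables):
    """Compare every table's ID set with that of the first table.

    `tables` maps a sample name to the list of lines read from its file.
    Returns {sample: (missing_ids, extra_ids)} for the samples that differ.
    """
    names = list(tables)
    reference = first_column_ids(tables[names[0]])
    differences = {}
    for name in names[1:]:
        ids = first_column_ids(tables[name])
        if ids != reference:
            differences[name] = (sorted(reference - ids), sorted(ids - reference))
    return differences


# Made-up, simplified tables (two columns only) for illustration.
tables = {
    "sampleA": ["gene_id\tFPKM\n", "g1\t1.0\n", "g2\t2.0\n"],
    "sampleB": ["gene_id\tFPKM\n", "g1\t0.5\n", "g2\t4.0\n"],
    "sampleC": ["gene_id\tFPKM\n", "g1\t0.5\n", "g3\t4.0\n"],
}
print(compare_to_reference(tables))  # prints {'sampleC': (['g2'], ['g3'])}
```

With real data, each list of lines could come from `open(path).readlines()`. An empty result would suggest that all files were quantified against the same set of IDs.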

"Profanity the is the only language all programmers understand" 
Sent from my iPhone, the universal excuse for my poor spelling.

Rohan Mellick

Oct 25, 2015, 8:24:57 PM
to trinityrnaseq-users, rohan....@gmail.com
Thanks Tiago,
 
No, I'm sure the input files from RSEM are all OK, because when I use these raw files in abundance_estimates_to_matrix.pl the FPKM matrix is produced without errors. The problem is that when I normalise for the ERCCs using these raw RSEM files and then run abundance_estimates_to_matrix.pl, I get
 
Execution halted
Error, cmd: R --vanilla -q < __tmp_runTMM.R 1>&2  died with ret (256)  at /apps/trinity/r2014-07-17/util/support_scripts/run_TMM_scale_matrix.pl line 98.
Error, CMD: /apps/trinity/r2014-07-17/util/support_scripts/run_TMM_scale_matrix.pl --matrix ERCC_all_matrix.not_cross_norm.fpkm.tmp > ERCC_all_matrix.TMM.fpkm.matrix died with ret 6400 at /apps/trinity/r2014-07-17/util/abundance_estimates_to_matrix.pl line 240.
 
 
I have 245 samples; one of the unaltered RSEM output files is attached, and the commands I am using for abundance_estimates_to_matrix.pl are
 
module load trinity
module load R/3.1.3
/apps/trinity/r2014-07-17/util/abundance_estimates_to_matrix.pl --est_method RSEM --cross_sample_fpkm_norm TMM *genes.results --out_prefix matrix
See my previous post for the abundance_estimates_to_matrix.pl error and for the R script I am using to alter the working RSEM files (one is attached) and normalise for ERCCs.
 
Thanks
RSEM_out_RNAseqI01L1.genes.results

Tiago Hori

Oct 25, 2015, 8:43:08 PM
to Rohan Mellick, trinityrnaseq-users
I will look at your R script in detail to try to figure it out. However, it still sounds to me like you have a truncated and/or incomplete file. What is crashing is most likely the scan call with nmax. This should be happening towards the end, and it should be checking that there is data for all individuals in the matrix. It is finding that there isn't. I am not sure why.
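
The R message "line 1 did not have 245 elements" comes from read.table/scan meeting a row that does not have the number of fields it expects. A quick way to look for such rows is sketched below in Python; the function name is hypothetical and the sample lines are made up. Because R's read.table accepts a header with one field fewer than the data rows (it then treats the first column as row names), the sketch skips the header and only compares data rows with each other.

```python
def find_ragged_rows(lines, sep="\t"):
    """Return (line_number, n_fields) for data rows whose field count differs
    from that of the first data row. Line 1 is treated as the header."""
    ragged = []
    expected = None
    for number, line in enumerate(lines, start=1):
        if number == 1:
            continue  # header: may legitimately be one field shorter in R
        row = line.rstrip("\r\n")
        if not row:
            continue  # ignore blank lines
        n_fields = len(row.split(sep))
        if expected is None:
            expected = n_fields  # the first data row sets the expected width
        elif n_fields != expected:
            ragged.append((number, n_fields))
    return ragged


# Made-up matrix with one truncated row (line 3).
matrix_lines = [
    "\tsample1\tsample2\n",
    "g1\t1.0\t2.0\n",
    "g2\t3.0\n",
    "g3\t4.0\t5.0\n",
]
print(find_ragged_rows(matrix_lines))  # prints [(3, 2)]
```

The same check could be run on the lines of ERCC_all_matrix.not_cross_norm.fpkm.tmp, the file named in the error message. Note that if the first data row is itself the truncated one, every other row will be reported instead.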

T.

Rohan.mellick

Oct 26, 2015, 12:15:27 AM
to Tiago Hori, trinityrnaseq-users
Thank you Tiago,

R

Kelli Anderson

Oct 10, 2016, 12:36:42 AM
to trinityrnaseq-users, farbo...@gmail.com
Hi Brian,

I ran the 'abundance_estimates_to_matrix.pl' step and there were two output files 1) xxxxx.count.matrix and 2) xxxxx.TMM.EXPR.matrix

I'm further down the pipeline now, and want to upload my DE results to SQL. The script is asking for:

--fpkm_matrix Trinity_trans.counts.matrix.TMM_normalized.FPKM

None of the previous scripts has produced a .TMM_normalized.FPKM output. I thought of using the .TMM.EXPR.matrix file, but I read that the file contains normalised TPM, not FPKM.

Are you able to advise a way forward?

Cheers,
Kelli




Elena

Oct 11, 2016, 10:05:03 AM
to trinityrnaseq-users, farbo...@gmail.com


Hi,

I am also trying to populate my Trinotate web database and ran into the same question.

Looking at the script usage, I realized that our TMM.EXPR file is the one you want:
+ # Usage:  $TRINITY_HOME/util/abundance_estimates_to_matrix.pl --est_method <method>  sample1.results sample2.results ...
+ # Required:
+ #
+ #  --est_method <string>           RSEM|eXpress  (needs to know what format to expect)
+ #
+ #
+ # Options:
+ #
+ #  --cross_sample_fpkm_norm <string>    TMM|UpperQuartile|none   (default: TMM)

https://sourceforge.net/u/djinnome/jamg/ci/fc6f599378ac92e90056779f3c13529cf470b352/tree/3rd_party/trinityrnaseq_r20140413/docs/analysis/diff_expression_analysis.asciidoc?barediff=65a16ce0eb66bf5b65147d7387c3f441ae3cf586

But further on, I found this.

(thanks Brian for documenting everything!)

https://github.com/trinityrnaseq/NaplesWorkshop2016/wiki/Day_3#populate-the-expression-data-into-the-trinotate-database

hope it helps!



Brian Haas

Oct 12, 2016, 2:55:07 AM
to Elena, trinityrnaseq-users, Farbod Emami
That's right - the TMM.EXPR file is what you want. We changed Trinity from using fpkm values to instead using TPM values, and some of the parameter names need to be updated to reflect this - but it should still work.

best,

~b
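
For readers wondering how the two units relate: within a single sample, TPM is FPKM rescaled so that the values sum to one million. The Python sketch below shows only that relationship. The function name is hypothetical, and the conversion applies to one sample's values before any cross-sample (TMM) scaling, so it is not meant as a way to reproduce Trinity's own matrices.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to 1e6 (TPM)."""
    total = sum(fpkm_values)
    if total == 0:
        # Nothing expressed: avoid dividing by zero.
        return [0.0 for _ in fpkm_values]
    return [value / total * 1e6 for value in fpkm_values]


# Made-up FPKM values for three transcripts in one sample.
sample_fpkm = [1.0, 3.0, 4.0]
print(fpkm_to_tpm(sample_fpkm))  # prints [125000.0, 375000.0, 500000.0]
```

Both units rank transcripts identically within a sample, which is consistent with the reply above that the TMM.EXPR file can stand in where an FPKM matrix is requested.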
