Error in calcNormFactors.DGEList(exp_study) : NAs not permitted

Bent Petersen

Jul 8, 2015, 4:46:40 AM
to trinityrn...@googlegroups.com
Hi There,

I have been following this guideline:

And I am now trying to follow this:

The only problem is that it fails almost immediately: at the first step, where I combine the matrices, I get this error:
Error in calcNormFactors.DGEList(exp_study) : NAs not permitted

Can anyone tell me how to proceed? I have followed the pipeline exactly up to this point.

I use the command below:

/services/tools/ngs/trinityrnaseq_r20140717/util/abundance_estimates_to_matrix.pl --est_method RSEM --out_prefix Trinity_genes ./Lt1160414/Lt11604.RSEM.genes.results ./Lt1180314/Lt1180314_estimate_out/Lt1180314.RSEM.genes.results ./St1180314/St1180314_estimate_out/St1180314.RSEM.genes.results ./St3160414/St3160414_estimate.results --name_sample_by_basedir

-reading file: ./Lt1160414/Lt1160414_estimate_out/Lt1160414.RSEM.genes.results
-reading file: ./Lt1180314/Lt1180314_estimate_out/Lt1180314.RSEM.genes.results
-reading file: ./St1180314/St1180314_estimate_out/St1180314.RSEM.genes.results
-reading file: ./St3160414/St3160414_estimate_out/St3160414.RSEM.genes.results


* Outputting combined matrix.

/services/tools/ngs/trinityrnaseq_r20140717/util/support_scripts/run_TMM_scale_matrix.pl --matrix Trinity_genes.not_cross_norm.fpkm.tmp > Trinity_genes.TMM.fpkm.matrix
CMD: R --vanilla -q < __tmp_runTMM.R 1>&2
> library(edgeR)
Loading required package: limma
> rnaseqMatrix = read.table("Trinity_genes.not_cross_norm.fpkm.tmp", header=T, row.names=1, com='', check.names=F)
> rnaseqMatrix = round(rnaseqMatrix)
> exp_study = DGEList(counts=rnaseqMatrix, group=factor(colnames(rnaseqMatrix)))
> exp_study = calcNormFactors(exp_study)
Error in calcNormFactors.DGEList(exp_study) : NAs not permitted
Calls: calcNormFactors -> calcNormFactors.DGEList
Execution halted
Error, cmd: R --vanilla -q < __tmp_runTMM.R 1>&2  died with ret (256)  at /services/tools/ngs/trinityrnaseq_r20140717/util/support_scripts/run_TMM_scale_matrix.pl line 98.
Error, CMD: /services/tools/ngs/trinityrnaseq_r20140717/util/support_scripts/run_TMM_scale_matrix.pl --matrix Trinity_genes.not_cross_norm.fpkm.tmp > Trinity_genes.TMM.fpkm.matrix died with ret 6400 at /services/tools/ngs/trinityrnaseq_r20140717/util/abundance_estimates_to_matrix.pl line 240.


My combined matrix does contain NAs, but why is that not allowed? Surely not all genes are expected to be present in all samples?

Here is the head of each matrix:

head Trinity_genes.counts.matrix

        Lt1160414_estimate_out  Lt1180314_estimate_out  St1180314_estimate_out  St3160414_estimate_out
c46397_g2       47.33   NA      NA      NA
c17494_g1       30.79   4.50    24.56   2004.71
c35331_g1       68.42   220.91  174.29  135.64
c35772_g2       NA      1.02    NA      0.00
c38800_g1       668.13  2275.87 613.69  4213.08
c13549_g1       14.26   7.66    NA      13.00
c73778_g1       0.00    3.00    0.00    0.00
c40067_g2       13.19   NA      NA      118.29

head Trinity_genes.not_cross_norm.fpkm.tmp

        Lt1160414_estimate_out  Lt1180314_estimate_out  St1180314_estimate_out  St3160414_estimate_out
c46397_g2       2.42    NA      NA      NA
c17494_g1       1.48    0.62    0.81    13.64
c35331_g1       1.31    3.59    4.41    1.20
c35772_g2       NA      0.24    NA      0.00
c38800_g1       23.57   33.77   9.32    58.64
c13549_g1       0.51    0.24    NA      1.12
c73778_g1       0.00    0.75    0.00    0.00
c40067_g2       1.88    NA      NA      0.63


I hope somebody will be able to help :)

Cheers,
Bent



Tiago Hori

Jul 8, 2015, 4:50:18 AM
to Bent Petersen, trinityrn...@googlegroups.com
You have NAs. You can't have NAs. Not my rules :) That often happens when samples have been aligned to different assemblies, by design or by mistake. The value should be 0 if the transcript was not expressed in that sample; NA means the transcript wasn't in that sample's results at all.

T.

"Profanity the is the only language all programmers understand" 
Sent from my iPhone, the universal excuse for my poor spelling.
--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.

Bent Petersen

Jul 8, 2015, 4:52:15 AM
to trinityrn...@googlegroups.com
Btw, my *RSEM.genes.results and *RSEM.isoforms.results files do not contain any NAs:

head Lt1160414.RSEM.isoforms.results
transcript_id   gene_id length  effective_length        expected_count  TPM     FPKM    IsoPct
c10000_g1_i1    c10000_g1       866     695.11  15.60   0.47    0.40    43.61
c10000_g1_i2    c10000_g1       853     682.11  19.80   0.61    0.51    56.39
c10002_g1_i1    c10002_g1       1819    1648.11 48.24   0.61    0.52    100.00
c10003_g1_i1    c10003_g1       545     374.12  14.30   0.80    0.68    100.00
c10006_g1_i1    c10006_g1       269     100.99  2.00    0.41    0.35    100.00
c10007_g1_i1    c10007_g1       416     245.23  0.00    0.00    0.00    0.00
c10007_g1_i2    c10007_g1       460     289.16  11.44   0.83    0.70    100.00
c10008_g1_i1    c10008_g1       250     83.32   7.00    1.75    1.49    100.00
c1000_g1_i1     c1000_g1        255     87.92   2.67    0.63    0.54    100.00

head Lt1160414.RSEM.genes.results
gene_id transcript_id(s)        length  effective_length        expected_count  TPM     FPKM
c10000_g1       c10000_g1_i1,c10000_g1_i2       858.67  687.78  35.40   1.07    0.91
c10002_g1       c10002_g1_i1    1819.00 1648.11 48.24   0.61    0.52
c10003_g1       c10003_g1_i1    545.00  374.12  14.30   0.80    0.68
c10006_g1       c10006_g1_i1    269.00  100.99  2.00    0.41    0.35
c10007_g1       c10007_g1_i1,c10007_g1_i2       460.00  289.16  11.44   0.83    0.70
c10008_g1       c10008_g1_i1    250.00  83.32   7.00    1.75    1.49
c1000_g1        c1000_g1_i1     255.00  87.92   2.67    0.63    0.54
c10010_g1       c10010_g1_i1    294.00  124.84  12.67   2.12    1.79
c10011_g1       c10011_g1_i1    1193.00 1022.11 54.03   1.10    0.93

Tiago Hori

Jul 8, 2015, 4:54:15 AM
to Bent Petersen, trinityrn...@googlegroups.com
Still the same problem: it means the files don't contain the same transcripts, so when you put them together you get NAs.
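
To make the mechanism concrete, here is a minimal Python sketch of what happens when per-sample tables with different gene IDs are merged on the union of their IDs. This is not the actual logic of abundance_estimates_to_matrix.pl; the function name combine_samples is made up for illustration, and the values are copied from the first two columns of the FPKM matrix posted above. A gene that is missing from one sample's table has nothing to fill the cell with, so it comes out as NA rather than 0.

```python
def combine_samples(samples):
    """Merge per-sample {gene_id: value} tables on the union of gene IDs.

    A gene that is absent from a sample's table gets None (printed as "NA"),
    which is different from a gene that is present with a value of 0.
    """
    all_ids = sorted(set().union(*(table.keys() for table in samples.values())))
    names = list(samples)
    matrix = {
        gene: [samples[name].get(gene) for name in names]
        for gene in all_ids
    }
    return names, matrix


# Values taken from the posted FPKM matrix (first two sample columns only).
samples = {
    "Lt1160414": {"c46397_g2": 2.42, "c73778_g1": 0.00},
    "Lt1180314": {"c35772_g2": 0.24, "c73778_g1": 0.75},
}

names, matrix = combine_samples(samples)
print("\t".join(["gene"] + names))
for gene, row in matrix.items():
    print("\t".join([gene] + ["NA" if v is None else f"{v:.2f}" for v in row]))
```

Note that c73778_g1 is present in both tables, so its 0.00 stays a real zero, while the other two genes each pick up an NA in the sample that never saw them.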

T.

"Profanity the is the only language all programmers understand" 
Sent from my iPhone, the universal excuse for my poor spelling.

Bent Petersen

Jul 8, 2015, 4:57:52 AM
to trinityrn...@googlegroups.com, bentpe...@gmail.com
Thanks.. But do you know how to get around this?

Bent Petersen

Jul 8, 2015, 5:03:54 AM
to trinityrn...@googlegroups.com
Argh, I get it now.... My fault... Found another thread with a solution..
Thanks Tiago for answering so fast !!

Here it is:



Tiago Hori

Jul 8, 2015, 5:05:05 AM
to Bent Petersen, trinityrn...@googlegroups.com
Well. I am assuming you did align your samples to the same reference. It is hard to tell what happened if that is the case. Is there any chance you may have used multiple Trinity.fasta files by accident?

Can you try wc -l on your results files? That may start to point us in the right direction.
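
For anyone who prefers to script this check, below is a rough Python equivalent of running wc -l on each results file and comparing the numbers. It is only a sketch: the helper names (count_lines, line_counts, counts_agree) are hypothetical, and the two demo files are tiny stand-ins written to a temporary directory, not real RSEM output.

```python
import os
import tempfile


def count_lines(path):
    """Return the number of lines in a file, roughly like `wc -l`."""
    with open(path) as handle:
        return sum(1 for _ in handle)


def line_counts(paths):
    """Map each path to its line count."""
    return {path: count_lines(path) for path in paths}


def counts_agree(counts):
    """True when every file has the same number of lines."""
    return len(set(counts.values())) <= 1


# Demo with two tiny stand-in files (hypothetical content, not real RSEM output).
demo_dir = tempfile.mkdtemp()
demo_files = {
    "A.genes.results": "gene_id\tFPKM\ng1\t1.0\ng2\t0.0\n",
    "B.genes.results": "gene_id\tFPKM\ng1\t2.0\n",
}
paths = []
for name, text in demo_files.items():
    path = os.path.join(demo_dir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

counts = line_counts(paths)
for path, n in counts.items():
    print(f"{n}\t{os.path.basename(path)}")
print("same length" if counts_agree(counts) else "line counts differ")
```

Differing counts are a strong hint that the files were produced against different targets. Equal counts are weaker evidence, since two files can have the same length and still list different IDs.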

T.

"Profanity the is the only language all programmers understand" 
Sent from my iPhone, the universal excuse for my poor spelling.

Bent Petersen

Jul 9, 2015, 9:48:17 AM
to trinityrn...@googlegroups.com, bentpe...@gmail.com
I did four different assemblies, not a single one... That was the initial error, my fault :)

wc -l does indeed give different results...

Brian Haas

Jul 19, 2015, 8:27:22 PM
to Bent Petersen, trinityrn...@googlegroups.com
The NAs problem is almost always due to the abundance estimation step being run against different Trinity assemblies (or other targets) as opposed to doing abundance estimation against a single target. The system only works correctly if (in the case of Trinity) there is only a single Trinity assembly based on the combination of all reads, and each of the samples is separately mapped back (RSEM/bowtie) to that single assembly.
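
As a quick sanity check before building the matrix, one could confirm that every per-sample results table lists exactly the same IDs. The Python sketch below is one way to do that, assuming tab-delimited tables with a header row and the ID in the first column (as in the RSEM output shown earlier in the thread). The function names read_ids and missing_ids and the two small tables are hypothetical stand-ins, not part of Trinity.

```python
import csv
import io


def read_ids(handle):
    """Collect the first-column IDs from a tab-delimited table with a header row."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line
    return {row[0] for row in reader if row}


def missing_ids(ids_by_sample):
    """For each sample, list IDs seen in other samples but absent from it."""
    union = set().union(*ids_by_sample.values())
    return {
        sample: sorted(union - ids)
        for sample, ids in ids_by_sample.items()
        if union - ids
    }


# Stand-in tables (hypothetical content); real input would be opened files.
table_a = "gene_id\tFPKM\nc1_g1\t0.91\nc2_g1\t0.52\n"
table_b = "gene_id\tFPKM\nc1_g1\t1.10\nc3_g1\t0.00\n"

ids_by_sample = {
    "sample_A": read_ids(io.StringIO(table_a)),
    "sample_B": read_ids(io.StringIO(table_b)),
}
problems = missing_ids(ids_by_sample)
if problems:
    for sample, ids in problems.items():
        print(f"{sample} lacks {len(ids)} ID(s), e.g. {ids[0]}")
else:
    print("all samples share the same ID set")
```

If any sample is reported as lacking IDs, the samples were most likely quantified against different targets, and the combined matrix will contain NAs.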

(just adding the above to this already closed thread in case it helps someone else later on).

best,

~b

杨冠东

Oct 8, 2015, 1:14:59 AM
to trinityrnaseq-users
Hi Petersen
When running abundance_estimates_to_matrix.pl, I came across exactly the same error as you. However, I am still confused about how to resolve it, even after reading the solution in the other thread. I am trying to combine two assemblies without biological replicates.

Here is my command 
/export/home/tempo001/Soft/trinityrnaseq-2.0.6/util/abundance_estimates_to_matrix.pl --est_method RSEM --out_prefix Trinity_trans S1.isoforms.results F1.isoforms.results

and my error
Error in calcNormFactors.DGEList(exp_study) : NAs not permitted
Calls: calcNormFactors -> calcNormFactors.DGEList
Execution halted
Error, cmd: R --vanilla -q < __tmp_runTMM.R 1>&2  died with ret (256)  at /export/home/tempo001/Soft/trinityrnaseq-2.0.6/util/support_scripts/run_TMM_scale_matrix.pl line 98.
Error, CMD: /export/home/tempo001/Soft/trinityrnaseq-2.0.6/util/support_scripts/run_TMM_scale_matrix.pl --matrix Trinity_trans.not_cross_norm.fpkm.tmp > Trinity_trans.TMM.fpkm.matrix died with ret 6400 at /export/home/tempo001/Soft/trinityrnaseq-2.0.6/util/abundance_estimates_to_matrix.pl line 240.


Could you tell me how you fixed this issue? Thanks a lot.


acim...@tamu.edu

Jul 23, 2018, 6:50:57 PM
to trinityrnaseq-users
Hello,

I have the same NA problem. You said that we should not have used two different references. Do we have to use only one reference for all samples? Thanks.

Mark Chapman

Jul 24, 2018, 2:54:31 AM
to acim...@tamu.edu, trinityrnaseq-users
Hello,
Yes, that's correct.

If you have multiple RNA-Seq data sets that you want to compare (e.g. different tissues sampled from a single organism), be sure to generate a single Trinity assembly and then run the abundance estimation separately for each of your samples.


--
Dr. Mark A. Chapman
+44 (0)2380 594396
------------------------------------
Biological Sciences
University of Southampton
Life Sciences Building 85
Highfield Campus
Southampton
SO17 1BJ