EdgeR with trinity


Cleo Nicole

Jan 2, 2014, 4:05:12 AM
to rsem-...@googlegroups.com, cleo nicole
Hello there,

I am using Trinity to assemble my transcriptome reads, followed by RSEM to estimate transcript abundance, and now I am trying to identify differentially expressed (DE) features using edgeR.
I have a total of 8 samples: 4 conditions with 2 replicates each. I have created a tab-delimited text file containing the matrix of RNA-seq fragment counts per feature produced by RSEM.
I then ran this command:
$ Trinity_HOME/Analysis/DifferentialExpression/run_DE_analysis.pl --matrix transcripts.counts.matrix --method edgeR --output edgeR_dir --samples_file samples_described.txt

However, I got this error message:

Got 9 samples, and got: 9 data fields.
Header: Transcript_id RSEM_0C_Rep1.isoforms.results RSEM_0C_Rep2.isoforms.results RSEM_5C_Rep1.isoforms.results RSEM_5C_Rep2.isoforms.results RSEM_15C_Rep1.isoforms.results RSEM_15C_Rep2.isoforms.results RSEM_21C_Rep1.isoforms.results RSEM_21C_Rep2.isoforms.results
Next: comp20833_c0_seq1 1.00 0.00 0.00 2.00 1.00 0.00 2.00 0.00

-shifting sample indices over.
$VAR1 = {
          'RSEM_5C_Rep2' => 4,
          'RSEM_5C_Rep1' => 3,
          'RSEM_0C_Rep2' => 2,
          'RSEM_0C_Rep1' => 1,
          'RSEM_15C_Rep1' => 5,
          'RSEM_21C_Rep1' => 7,
          'RSEM_21C_Rep2' => 8,
          'RSEM_15C_Rep2' => 6
        };
$VAR1 = {
          '15C' => [
                     '15C_Rep1',
                     '15C_Rep2'
                   ],
          '0C' => [
                    '0C_Rep1',
                    '0C_Rep2'
                  ],
          '5C' => [
                    '5C_Rep1',
                    '5C_Rep2'
                  ],
          '21C' => [
                     '21C_Rep1',
                     '21C_Rep2'
                   ]
        };
Samples to compare: $VAR1 = [
          '15C',
          '0C',
          '5C',
          '21C'
        ];

Error, cannot determine column index for replicate name [0C_Rep1]$VAR1 = {
          'RSEM_0C_Rep2' => 2,
          'RSEM_5C_Rep1' => 3,
          'RSEM_5C_Rep2' => 4,
          'RSEM_0C_Rep1' => 1,
          'RSEM_15C_Rep1' => 5,
          'RSEM_21C_Rep1' => 7,
          'RSEM_21C_Rep2' => 8,
          'RSEM_15C_Rep2' => 6
        };



Can someone please shed some light on what I should do, or how to troubleshoot the problem I am facing?
I appreciate your help so much.



Regards,
Cleo Nicole

Brian Haas

Jan 2, 2014, 8:12:20 AM
to rsem-...@googlegroups.com, cleo nicole, trinityrn...@lists.sf.net
Hi Cleo,

This question should be directed to the Trinity users list (CC'd here).

It looks like there's a discrepancy between your samples.txt file description of the samples and the column headers of your matrix.

Your matrix column headers contain the prefix 'RSEM_', but the replicate names in your samples.txt file do not.

There are two possible solutions: remove the 'RSEM_' prefix from your matrix column headers, or add it to the replicate names in your samples.txt file.
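
For anyone hitting the same error, here is a minimal Python sketch of the check and of the second fix. It is not part of Trinity. The function names are hypothetical, and it assumes the samples file has two tab-separated columns (condition, then replicate name) and that the '.isoforms.results' suffix is dropped from the matrix column names before matching, as the index dump in the error output suggests.

```python
def header_columns(header_line, suffix=".isoforms.results"):
    """Return the sample column names from a matrix header line.

    The first field is the feature ID label, so it is skipped. The
    suffix is stripped on the assumption that names are matched without it.
    """
    names = header_line.split()[1:]
    return [n[:-len(suffix)] if n.endswith(suffix) else n for n in names]


def replicate_names(samples_text):
    """Return replicate names from 'condition<TAB>replicate' lines (assumed format)."""
    reps = []
    for line in samples_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or line.startswith("#"):
            continue  # skip blank, malformed, or comment lines
        reps.append(parts[1].strip())
    return reps


def missing_replicates(header_line, samples_text):
    """List replicate names that have no matching matrix column."""
    cols = set(header_columns(header_line))
    return [r for r in replicate_names(samples_text) if r not in cols]


def add_prefix(samples_text, prefix="RSEM_"):
    """Prepend `prefix` to every replicate name that lacks it."""
    out = []
    for line in samples_text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2 and not parts[1].startswith(prefix):
            parts[1] = prefix + parts[1]
        out.append("\t".join(parts))
    return "\n".join(out)
```

Running `missing_replicates` on the first line of the matrix and the contents of the samples file before launching the DE script should list every replicate name that would trigger the "cannot determine column index" error. An empty list means the names line up.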

best,

~brian





--
RSEM website: http://deweylab.biostat.wisc.edu/rsem/
---
You received this message because you are subscribed to the Google Groups "RSEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rsem-users+...@googlegroups.com.
To post to this group, send email to rsem-...@googlegroups.com.
Visit this group at http://groups.google.com/group/rsem-users.



--
Brian J. Haas
The Broad Institute
http://broad.mit.edu/~bhaas

 

Cleo Nicole

Jan 6, 2014, 10:12:54 PM
to rsem-...@googlegroups.com, trinityrn...@lists.sf.net
Hi Brian,

Thank you very much for your help. I have added the "RSEM_" prefix in the samples.txt file. I think it is working now.
However, I got this new error message instead:

CMD: R --vanilla -q < transcripts.counts.matrix.RSEM_0C_vs_RSEM_21C.RSEM_0C.vs.RSEM_21C.EdgeR.Rscript
sh: R: command not found

Error, cmd: R --vanilla -q < transcripts.counts.matrix.RSEM_0C_vs_RSEM_21C.RSEM_0C.vs.RSEM_21C.EdgeR.Rscript died with ret (32512) at /storage/bioapps/trinity/Analysis/DifferentialExpression/run_DE_analysis.pl line 439.

WARNING: This EdgeR comparison failed...

Could this be due to a problem with the R installation?



Regards,
Cleo



Brian Haas

Jan 7, 2014, 6:23:04 AM
to rsem-...@googlegroups.com, trinityrn...@lists.sf.net

R needs to be accessible through your PATH environment variable. Usually, installing R makes it available on your PATH automatically. If it is not, you will need to add the directory containing the R executable to your PATH yourself.
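
As a rough illustration, the Python sketch below decodes the status from the error message and checks whether a program can be found on the PATH. The helper names are hypothetical and this is not how Trinity itself performs the check. The reported value 32512 is a POSIX wait status whose high byte is 127, the exit code a shell returns when a command is not found.

```python
import os
import shutil


def exit_code(wait_status):
    """Extract the exit code (high byte) from a POSIX wait status."""
    return wait_status >> 8


def find_program(name, extra_dirs=()):
    """Look up `name` on PATH, searching any `extra_dirs` first.

    Returns the full path to the executable, or None if it is not found.
    """
    search = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search)


if __name__ == "__main__":
    print("exit code:", exit_code(32512))
    r_path = find_program("R")
    if r_path is None:
        print("R is not on PATH; add its directory to PATH and retry.")
    else:
        print("R found at", r_path)
```

If `find_program("R")` returns None, passing the directory where R was installed as `extra_dirs` is a quick way to confirm that adding that directory to PATH would resolve the error.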


Cleo Nicole

Jan 7, 2014, 10:14:10 PM
to rsem-...@googlegroups.com, trinityrn...@lists.sf.net
OK. Thanks Brian. Problem solved. :)