Heat map construction from TMM-normalized FPKM values


Yogesh Gupta

Jul 6, 2015, 1:55:59 AM
to trinityrn...@googlegroups.com
Dear All,

Is it possible to construct a heat map from an Excel sheet containing FPKM values at different developmental stages, for analysis of differential gene expression?

Thanks
Yogesh



--
Yogesh Gupta
Senior Research Fellow
National Agri-Food Biotechnology Institute
(Department of Biotechnology, Government of India)
C-127, Industrial Area, Phase 8, SAS Nagar
Mohali-160071 Punjab, India

Ken Field

Jul 6, 2015, 8:04:41 AM
to Yogesh Gupta, trinityrn...@googlegroups.com
Yogesh-
You really should do that within R, after importing the data from a CSV file that you saved from Excel. I like to use the Trinity code that gets generated for you when you run:
$TRINITY_HOME/Analysis/DifferentialExpression/run_DE_analysis.pl

If you would like to do it in Excel, you can use Conditional Formatting to color-code cells based on their values. For example, the following row of TMM-normalized FPKM values has been highlighted with a color scale to show their relative values:

0.049 0.032 0.202 0.063 0.026 149.837 80.964 56.518 58.63 152.937 26.019

Note that if you want to highlight each row with its own scale, you have to format each row separately. If you format the whole table at once, it takes the maximum and minimum values for the entire table to set the color scale.
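To illustrate the difference, here is a minimal Python sketch. It assumes the color scale is a simple linear min-max mapping, which is an assumption (Excel's color scales offer other options). The first row is the example above; the second row and the function names are made up for illustration.

```python
def min_max_scale(values, lo, hi):
    """Map each value onto [0, 1] given the minimum and maximum of the scale."""
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def scale_per_row(table):
    """Each row gets its own scale (format each row separately)."""
    return [min_max_scale(row, min(row), max(row)) for row in table]


def scale_whole_table(table):
    """One scale for the entire table (format the whole table at once)."""
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    return [min_max_scale(row, lo, hi) for row in table]


table = [
    # Row quoted in the message above
    [0.049, 0.032, 0.202, 0.063, 0.026, 149.837, 80.964, 56.518, 58.63, 152.937, 26.019],
    # Hypothetical second gene with a much smaller dynamic range
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0],
]

per_row = scale_per_row(table)
whole = scale_whole_table(table)

print("per-row, gene 2:    ", [round(x, 2) for x in per_row[1]])
print("whole-table, gene 2:", [round(x, 2) for x in whole[1]])
```

With per-row scaling, the second row spans the full 0 to 1 range. With a single table-wide scale, its largest value maps to roughly 0.07, so the whole row would be drawn in nearly the same color.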

Ken



--
Ken Field, Ph.D.
Associate Professor of Biology
Program in Cell Biology/Biochemistry
Bucknell University
Room 203A Biology Building

Brian Haas

Jul 19, 2015, 8:55:39 PM
to Ken Field, Yogesh Gupta, trinityrn...@googlegroups.com
If you have a matrix file containing expression values, you can use the Trinity-included PtR script for generating a heatmap:

  trinityrnaseq/Analysis/DifferentialExpression/PtR   --matrix  my_matrix.fpkm  --log2  --heatmap


Brian Haas

Jul 20, 2015, 10:58:51 AM
to Yogesh Gupta, trinityrn...@googlegroups.com

Hi Yogesh,

Try saving the xlsx file as tab-delimited text.  Then, do this to get rid of the Ctrl-M (carriage return) characters that Excel puts in there:

    cat my_matrix.fpkm.txt | perl -lane 's/\cM/\n/g; print;' > file.tab.txt

and then make your heatmap:

   trinityrnaseq/Analysis/DifferentialExpression/PtR -m  file.tab.txt --log2 --heatmap --center_rows

I'll send you the row-centered heatmap and the one without row-centering separately.  Row-centering is more useful when looking for overall patterns.
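As a rough illustration of why row-centering helps, here is a minimal Python sketch. It assumes that --log2 corresponds to log2(x + 1) and that --center_rows subtracts each row's mean; both are assumptions about PtR's behavior, not something stated in this thread. The FPKM values and function names are made up.

```python
import math


def log2_plus_one(matrix):
    """Log-transform every value; the +1 pseudocount (an assumption) avoids log2(0)."""
    return [[math.log2(v + 1.0) for v in row] for row in matrix]


def center_rows(matrix):
    """Subtract each row's mean so every row is centered on zero."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return centered


# Two hypothetical genes with the same expression pattern at different magnitudes
fpkm = [
    [1.0, 3.0, 7.0],
    [15.0, 31.0, 63.0],
]

logged = log2_plus_one(fpkm)
centered = center_rows(logged)

print("log2(x + 1): ", logged)
print("row-centered:", centered)
```

After centering, both genes have identical rows, so genes that share a pattern get the same colors regardless of their absolute expression level.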

best,

~b




On Mon, Jul 20, 2015 at 4:52 AM, Yogesh Gupta <yoges...@gmail.com> wrote:
Dear All,

I am getting an error in heat map construction using trinityrnaseq/Analysis/DifferentialExpression/PtR   --matrix  my_matrix.fpkm  --log2  --heatmap

The Excel sheet I am using is attached.

The command line is:

 /home/yogesh/softwares/TRINITY_HOME/Analysis/DifferentialExpression/PtR --matrix /sataslave/annona_illumina/hybrid_data/blast_artf_hormone_seedgene/my_matrix.fpkm --log2 --heatmap


Thanks in Advance
Yogesh

The error is:
CMD: R --vanilla -q < my_matrix.fpkm.R
> library(cluster)
> library(Biobase)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

    xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
    do.call, duplicated, eval, evalq, Filter, Find, get, intersect,
    is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int,
    rownames, sapply, setdiff, sort, table, tapply, union, unique,
    unlist, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> library(qvalue)
> NO_REUSE = F
>
> # try to reuse earlier-loaded data if possible
> if (file.exists("my_matrix.fpkm.RData") && ! NO_REUSE) {
+     print('RESTORING DATA FROM EARLIER ANALYSIS')
+     load("my_matrix.fpkm.RData")
+ } else {
+     print('Reading matrix file.')
+     primary_data = read.table("/sataslave/annona_illumina/hybrid_data/blast_artf_hormone_seedgene/my_matrix.fpkm", header=T, com='', sep="\t", row.names=1, check.names=F)
+     primary_data = as.matrix(primary_data)
+ }
[1] "Reading matrix file."
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 2 did not have 2 elements
Calls: read.table -> scan
In addition: Warning messages:
1: In read.table("/sataslave/annona_illumina/hybrid_data/blast_artf_hormone_seedgene/my_matrix.fpkm",  :
  line 1 appears to contain embedded nulls
2: In read.table("/sataslave/annona_illumina/hybrid_data/blast_artf_hormone_seedgene/my_matrix.fpkm",  :
  line 2 appears to contain embedded nulls
3: In read.table("/sataslave/annona_illumina/hybrid_data/blast_artf_hormone_seedgene/my_matrix.fpkm",  :
  line 4 appears to contain embedded nulls
Execution halted
Error, cmd: R --vanilla -q < my_matrix.fpkm.R died with ret 256 at /home/yogesh/softwares/TRINITY_HOME/Analysis/DifferentialExpression/PtR line 1568.
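The "embedded nulls" warnings mean that R found NUL bytes in the file, which plain tab-delimited text should not contain. One possible cause (an assumption, not confirmed in this thread) is that the file was saved in UTF-16 or in a binary spreadsheet format rather than as plain text. Below is a minimal Python sketch for inspecting a file's raw bytes before passing it to PtR; the function names and sample data are hypothetical.

```python
def inspect_bytes(data: bytes) -> dict:
    """Count byte patterns that commonly break tab-delimited parsing."""
    crlf = data.count(b"\r\n")
    return {
        "nul_bytes": data.count(b"\x00"),      # typical of UTF-16 or binary files
        "crlf": crlf,                          # Windows line endings
        "bare_cr": data.count(b"\r") - crlf,   # carriage return without a line feed
        "tabs": data.count(b"\t"),
    }


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# For a real file: data = open("my_matrix.fpkm", "rb").read()
utf16_sample = "gene\tS1\n".encode("utf-16-le")   # made-up UTF-16 export
cr_sample = b"gene\tS1\rg1\t5.0\r"                # made-up CR-only export

print(inspect_bytes(utf16_sample))
print(inspect_bytes(cr_sample))
print(normalize_newlines(cr_sample))
```

A nonzero `nul_bytes` count suggests re-exporting the sheet as plain tab-delimited text; a nonzero `bare_cr` or `crlf` count suggests normalizing the line endings first.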


--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Yogesh Gupta

Jul 29, 2015, 12:59:35 PM
to Brian Haas, trinityrn...@googlegroups.com
Dear Brian,

I am making heat maps as suggested, but in some cases the matrix contains more genes than appear in the heat map.

Please give suggestions.

Thanks in Advance

Yogesh

On Mon, Jul 20, 2015 at 9:14 PM, Yogesh Gupta <yoges...@gmail.com> wrote:
 --heatmap_colorscheme "green,black,red"

On Mon, Jul 20, 2015 at 8:29 PM, Brian Haas <bh...@broadinstitute.org> wrote:
pdfs attached



--

Yogesh Gupta

Aug 10, 2015, 12:26:52 AM
to Brian Haas, trinityrn...@googlegroups.com
Dear Brian,

I have some queries regarding the heat map.

Can we make a heat map without clustering?

In some cases the input file contains more genes than the heat map shows. I am attaching one example for which I constructed a heat map: there are only 32 genes in the heat map, whereas the input file has 39 genes.


Thanks in advance.
Yogesh
comparativeanalysis.txt
comparativeanalysis.tab.txt.minCol10.minRow10.log2.centered.genes_vs_samples_heatmap.pdf

Brian Haas

Aug 10, 2015, 9:15:43 AM
to Yogesh Gupta, trinityrn...@googlegroups.com
Hi Yogesh,

If you're running PtR to generate the heatmap, then include

 --min_rowSums 0

and it should include all the data points.  By default, it filters out rows whose values sum to less than 10.
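The effect of that filter can be sketched in a few lines of Python. This is only an illustration of the row-sum rule described above, not PtR's actual code; the gene names, values, and function name are made up.

```python
def filter_by_row_sum(rows, min_row_sum=10.0):
    """Keep only genes whose expression values sum to at least min_row_sum."""
    return {gene: values for gene, values in rows.items() if sum(values) >= min_row_sum}


# Hypothetical expression matrix: gene ID -> values across samples
matrix = {
    "geneA": [0.5, 1.0, 2.0],      # sum 3.5, dropped at the default threshold
    "geneB": [4.0, 3.0, 6.0],      # sum 13.0, kept
    "geneC": [120.0, 80.0, 95.0],  # sum 295.0, kept
}

print(sorted(filter_by_row_sum(matrix)))                   # threshold of 10
print(sorted(filter_by_row_sum(matrix, min_row_sum=0.0)))  # like --min_rowSums 0
```

Lowly expressed genes such as `geneA` disappear at the default threshold, which would explain a heat map showing fewer genes than the input file contains.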

best,

~b