Data suited for GRIT


lifan...@gmail.com

Mar 25, 2014, 4:32:21 AM
to grit...@googlegroups.com
Hi,
   
We only have RNA-seq data and no gene boundary data (such as CAGE or PAS-seq). Our reference annotation was produced by running GeneWise against the genome, so it only covers the protein-coding regions.

When I ran:  run_grit.py --rnaseq-reads scaffold164.bam --rnaseq-read-type backward --reference scaffold164.gtf --fasta scaffold164.fa
I got the following error:
ValueError: Either (cage reads or rampage reads) must be provided for each sample or (--use-reference-tss or --use-reference-promoters) must be set
Is GRIT not recommended when no CAGE or PAS-seq data are available?

Thanks,
Fang

Nathan Boley

Mar 25, 2014, 7:53:11 PM
to lifan...@gmail.com, grit...@googlegroups.com
Hi Fang,


> We only have rnaseq data, but don't have the gene boundary information data
> (such as cage and passeq). And we have the reference acquired by genewise
> through the genome, which means we only have the information about the
> protein-coding region.
>
> when I trying: run_grit.py --rnaseq-reads scaffold164.bam
> --rnaseq-read-type backward --reference scaffold164.gtf --fasta
> scaffold164.fa

If you already have a GTF file containing contigs, then you can just
estimate expression by running:

run_grit.py --rnaseq-reads scaffold164.bam --rnaseq-read-type backward \
    --GTF scaffold164.gtf --fasta scaffold164.fa

This will only estimate the expression of the transcripts in
scaffold164.gtf - it will not identify new transcripts.

> I got the error information back as follow:
> ValueError: Either (cage reads or rampage reads) must be provided for each
> sample or (--use-reference-tss or --use-reference-promoters) must be set
> I am wondering if it isn't recommended to use GRIT when there are no data of
> cage and passeq?

For transcript discovery, GRIT needs either a list of previously
identified transcript boundary elements (e.g. poly(A) sites or promoters)
or experimental data that identify them (e.g. CAGE or PAS-seq).

What organism are you working in?

Best, Nathan

lifan...@gmail.com

Mar 26, 2014, 2:44:20 AM
to grit...@googlegroups.com, lifan...@gmail.com
Hi, Nathan

Thanks for your reply.

>What organism are you working in? 

I am working on a fruit fly species that has not been sequenced before. We have no experimentally identified transcript boundary elements, and no CAGE or PAS-seq data.

Best, Fang

On Wednesday, March 26, 2014 at 7:53:11 AM UTC+8, Nathan Boley wrote:

Michel Moser

Mar 31, 2014, 6:25:36 AM
to grit...@googlegroups.com, lifan...@gmail.com
Hello,

I am also working with a non-model species (a plant) and have only stranded RNA-seq data.
Does that mean GRIT won't work with RNA-seq alone?
Couldn't we provide potential start sites derived from the alignment file (for example, from Cufflinks output)?

Best, 
Michel

Nathan Boley

Apr 1, 2014, 8:53:49 PM
to Michel Moser, grit...@googlegroups.com, Fang Li
Dear Michel,

> I am also working with a non-model species (plant) and possess only stranded
> RNA-seq data.
> So does that mean GRIT wont work with RNA-seq only?
> Couldnt we provide potential starting sites from the alignment file (given by cufflinks for example) ?

One could, in principle, but I worry that you would run into the same
problems that we had when we initially tried to use Cufflinks: fragmented
transcripts are very common, and it is very difficult to identify
TSSs/TESs from RNA-seq data alone.

I do have a branch that uses RNA-seq signal ratios to identify
transcript bounds, but the results are poor (although not worse than
Cufflinks; I think it really is a limitation of the assay).

Best, Nathan
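
To make Michel's suggestion concrete, here is a minimal sketch of how candidate start and end sites could be pulled out of an assembled transcript GTF (for example, one written by Cufflinks). It assumes standard nine-column GTF exon records with a transcript_id attribute; the function name transcript_bounds and the example records are hypothetical. As Nathan notes above, assembled transcripts are often fragments, so these positions are only rough candidates, and the thread does not say how such a list would be passed to GRIT.

```python
import re

def transcript_bounds(gtf_lines):
    """Map transcript_id -> (chrom, strand, tss, tes) using exon records.

    Coordinates are 1-based and inclusive, as in GTF. For '+' transcripts
    the TSS is the smallest exon start; for '-' transcripts it is the
    largest exon end. Unstranded transcripts ('.') are skipped.
    """
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tid = match.group(1)
        if tid in spans:
            # Widen the span to cover every exon seen for this transcript.
            _, _, lo, hi = spans[tid]
            spans[tid] = (chrom, strand, min(lo, start), max(hi, end))
        else:
            spans[tid] = (chrom, strand, start, end)

    bounds = {}
    for tid, (chrom, strand, lo, hi) in spans.items():
        if strand == "+":
            bounds[tid] = (chrom, strand, lo, hi)
        elif strand == "-":
            bounds[tid] = (chrom, strand, hi, lo)
    return bounds

# Hypothetical records, for illustration only.
example = [
    'scaffold164\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'scaffold164\tCufflinks\texon\t300\t450\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'scaffold164\tCufflinks\texon\t900\t1000\t.\t-\t.\tgene_id "G2"; transcript_id "T2";',
    'scaffold164\tCufflinks\texon\t600\t700\t.\t-\t.\tgene_id "G2"; transcript_id "T2";',
]

for tid, (chrom, strand, tss, tes) in sorted(transcript_bounds(example).items()):
    print(tid, chrom, strand, "TSS:", tss, "TES:", tes)
```

With real data the lines would come from an open file handle, e.g. transcript_bounds(open("transcripts.gtf")), where the file name is again a placeholder.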