Hi Fang,
> We only have rnaseq data, but don't have the gene boundary information data
> (such as cage and passeq). And we have the reference acquired by genewise
> through the genome, which means we only have the information about the
> protein-coding region.
>
> when I trying: run_grit.py --rnaseq-reads scaffold164.bam
> --rnaseq-read-type backward --reference scaffold164.gtf --fasta
> scaffold164.fa
If you already have a GTF file containing transcripts, then you can just
estimate their expression by running:
    run_grit.py --rnaseq-reads scaffold164.bam --rnaseq-read-type backward \
        --GTF scaffold164.gtf --fasta scaffold164.fa
This will only estimate the expression of the transcripts in
scaffold164.gtf - it will not identify new transcripts.
> I got the error information back as follow:
> ValueError: Either (cage reads or rampage reads) must be provided for each
> sample or (--use-reference-tss or --use-reference-promoters) must be set
> I am wondering if it isn't recommended to use GRIT when there are no data of
> cage and passeq?
For transcript discovery, GRIT needs either a list of previously
identified transcript boundary elements (e.g. promoters or poly(A) sites)
or experimental data from which to identify them (e.g. CAGE or PAS-seq).
What organism are you working with?
Best, Nathan