strange error

Georgi Marinov

Mar 20, 2014, 12:16:45 AM

to grit...@googlegroups.com

Hi,

I am trying to run GRIT and I get this message:

  File "/usr/bin/run_grit.py", line 514, in <module>
    main()
  File "/usr/bin/run_grit.py", line 486, in main
    elements = discover_elements(sample_data, args)
  File "/usr/bin/run_grit.py", line 440, in discover_elements
    sample_data.get_reads(sample_type)
  File "/usr/bin/run_grit.py", line 247, in get_reads
    rnaseq_reads = self._load_rnaseq_reads(sample_type, rep_id)
  File "/usr/bin/run_grit.py", line 188, in _load_rnaseq_reads
    assert data.paired == 'true', "RNASeq reads must be paired"
AssertionError: RNASeq reads must be paired

This happens on both TopHat and STAR-generated BAM files.

Does this mean that there is a requirement for all reads to be paired and that if you have singletons, the program exits or there is some other problem?

Thanks,

Georgi

Nathan Boley

Mar 20, 2014, 12:22:23 AM

to Georgi Marinov, grit...@googlegroups.com

> This happens on both TopHat and STAR-generated BAM files.
>
> Does this mean that there is a requirement for all reads to be paired and
> that if you have singletons, the program exits or there is some other
> problem?

No, but it does ignore RNAseq reads without proper pairs. It looks
like it's choking during the argument parsing, rather than when
processing the reads.
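
For readers unsure what "proper pairs" means here: in SAM/BAM files the pairing status is stored in the FLAG field, where bit 0x1 marks a read as paired and bit 0x2 marks it as aligned in a proper pair. The sketch below only illustrates that bit test on raw flag values. It is not GRIT's code, and the function names are made up for the example.

```python
# SAM FLAG bits as defined in the SAM format specification.
FLAG_PAIRED = 0x1       # read is one mate of a pair
FLAG_PROPER_PAIR = 0x2  # aligner considers both mates properly aligned


def is_properly_paired(flag):
    """Return True if the FLAG value marks a read in a proper pair.

    Hypothetical helper for illustration; GRIT's own filter may differ.
    """
    return bool(flag & FLAG_PAIRED) and bool(flag & FLAG_PROPER_PAIR)


def keep_proper_pairs(flags):
    """Keep only the FLAG values that pass the proper-pair test."""
    return [flag for flag in flags if is_properly_paired(flag)]
```

For instance, flags 99 and 147 (a typical properly paired mate pair) pass the test, while 73 (paired, but the mate is unmapped) and 0 (unpaired) do not.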

Are you using a control file or command line options? Can I see the
control file/exact command?

Best, Nathan

Nathan Boley

Mar 20, 2014, 12:52:22 AM

to Georgi Marinov, grit...@googlegroups.com

[cc-ing list]

> python /usr/bin/run_grit.py --threads 8 --rnaseq-read-type backward
> --rnaseq-reads wgEncodeCshlLongRnaSeqMcf7CellPapFastqRd1Rep1.STAR_2.3.0e.bam
> --ofprefix
> GRIT-1.1.0-wgEncodeCshlLongRnaSeqMcf7CellPapFastqRd1Rep1.STAR_2.3.0e &
>

It looks like I introduced another bug when I changed the argument
parsing. I'll upload a fix tonight.

Also, you'll need to provide either poly(A) and CAGE or RAMPAGE reads,
or reference poly(A) sites and promoters.

> What exactly is the CONTROL file supposed to contain and do? I could not
> find any example so I did not specify any

For now, you can use a control file of the form:

# comment line
# *'s indicate merged
#sample_type rep_id assay paired stranded read_type filename

AdMatedF_Ecl_20days_Heads rep1 rnaseq true true auto AdMatedF_Ecl_20days_Heads.biorep1.rnaseq.chr4.bam
AdMatedF_Ecl_20days_Heads rep1 cage false true auto AdMatedF_Ecl_20days_Heads.biorep1.cage.chr4.bam
AdMatedF_Ecl_20days_Heads rep1 polya false true auto AdMatedF_Ecl_20days_Heads.biorep2.passeq.chr4.bam
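
To sanity-check a control file of this form before handing it to GRIT, something like the following sketch may help. It assumes seven whitespace-separated columns in the order given by the header comment and treats lines starting with `#` as comments. It is not GRIT's own parser, and `ControlEntry`, `parse_control_lines` and `unpaired_rnaseq` are names invented for the example.

```python
from collections import namedtuple

# Column order taken from the header comment of the example control file.
ControlEntry = namedtuple(
    "ControlEntry",
    "sample_type rep_id assay paired stranded read_type filename",
)


def parse_control_lines(lines):
    """Parse control-file lines into ControlEntry tuples.

    Assumption: fields are whitespace separated, and blank lines and
    lines starting with '#' are ignored.
    """
    entries = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != len(ControlEntry._fields):
            raise ValueError(
                "line %d: expected %d fields, got %d"
                % (lineno, len(ControlEntry._fields), len(fields))
            )
        entries.append(ControlEntry(*fields))
    return entries


def unpaired_rnaseq(entries):
    """Return RNA-seq entries whose 'paired' column is not the string 'true'.

    The traceback in this thread shows run_grit.py asserting
    data.paired == 'true' for RNA-seq reads, so such entries would fail.
    """
    return [e for e in entries if e.assay == "rnaseq" and e.paired != "true"]


EXAMPLE = [
    "# comment line",
    "#sample_type rep_id assay paired stranded read_type filename",
    "",
    "AdMatedF_Ecl_20days_Heads rep1 rnaseq true true auto "
    "AdMatedF_Ecl_20days_Heads.biorep1.rnaseq.chr4.bam",
    "AdMatedF_Ecl_20days_Heads rep1 cage false true auto "
    "AdMatedF_Ecl_20days_Heads.biorep1.cage.chr4.bam",
    "AdMatedF_Ecl_20days_Heads rep1 polya false true auto "
    "AdMatedF_Ecl_20days_Heads.biorep2.passeq.chr4.bam",
]

entries = parse_control_lines(EXAMPLE)
```

With a real file, `parse_control_lines(open(path))` works the same way, and `unpaired_rnaseq(entries)` lists any RNA-seq rows that would trip the "RNASeq reads must be paired" assertion.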

I've put an example data set with a control file and README at:
http://grit-bio.org/GRIT_example.tar.gz

Note that to use read type 'auto' you'll need to provide a reference
annotation (even GENCODE chr20 will work - GRIT just needs genes so
that it can determine the gene strand).

> P.S. Am I correct in my guess the read-type should be "backward" for dUTP?

Yes.

Thanks for your interest - sorry about the bug.

Best, Nathan
