Several Issues running Tophat/Cufflinks

Matthew MacManes

May 28, 2014, 9:15:15 AM
to tuxedo-to...@googlegroups.com
I am trying to run TopHat:

tophat -p8 -o 2926 --library-type fr-firststrand pero ../pero_annotation/Pero2926.1.fastq ../pero_annotation/Pero2926.2.fastq


1st issue:

I have the development version of samtools installed (which uses htslib). TopHat does not recognize this version and errors out. There is no way around this error short of editing the code or installing an older version of samtools. From what I have read, this looks like an issue with the version check (`samtools --version`) rather than with the software itself. Can this be fixed?
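
A wrapper can avoid this kind of failure by parsing the version string leniently. The following is a minimal sketch, not TopHat's actual code: `parse_version` is a hypothetical helper, and the sample strings are made-up illustrations of a release-style and a development-style version, not captured samtools output.

```python
import re

def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints.

    Hypothetical helper: suffixes such as "-rc8" or a git hash are ignored.
    Returns None when no dotted version number is present.
    """
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

# Made-up example strings (assumptions, not real tool output).
print(parse_version("Version: 0.1.19-44428cd"))
print(parse_version("Version: 0.2.0-rc8 (using htslib)"))
```

Comparing tuples such as `parse_version(s) >= (0, 1, 18)` then works regardless of any trailing suffix.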


2nd issue:

There is a segfault in bowtie2-inspect that causes TopHat to crash. The bug was reported and claimed fixed, but it is not: https://sourceforge.net/p/bowtie-bio/bugs/314/. I found an older version of `bowtie2-inspect` that works. This is a Bowtie issue and should be fixed ASAP.

3rd issue:

tophat -p8 -o 2926 pero ../pero_annotation/Pero2926.1.fastq ../pero_annotation/Pero2926.2.fastq

[2014-05-28 07:46:31] Beginning TopHat run (v2.0.11)
-----------------------------------------------
[2014-05-28 07:46:31] Checking for Bowtie
        Bowtie version:        2.2.2.0
[2014-05-28 07:46:31] Checking for Samtools
        Samtools version:        0.1.19.0
[2014-05-28 07:46:31] Checking for Bowtie index files (genome)..
[2014-05-28 07:46:31] Checking for reference FASTA file
        Warning: Could not find FASTA file pero.fa
[2014-05-28 07:46:31] Reconstituting reference FASTA file from Bowtie index
        Executing: /share/bin/bowtie2-inspect pero > 2926/tmp/pero.fa
[2014-05-28 07:49:08] Generating SAM header for pero
[2014-05-28 07:49:08] Preparing reads
        left reads: min. length=151, max. length=151, 22922811 kept reads (8500 discarded)
        right reads: min. length=151, max. length=151, 22918434 kept reads (12877 discarded)
[2014-05-28 08:12:36] Mapping left_kept_reads to genome pero with Bowtie2
        [FAILED]
Error running bowtie:
Error while flushing and closing output
Error while flushing and closing output
terminate called after throwing an instance of 'int'
(ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)


 
When looking at the logfiles, it is clear that this command is failing:

/share/tophat-2.0.11.Linux_x86_64/bam2fastx --all 2926/tmp/left_kept_reads.bam | /share/bin/bowtie2 -k 20 -D 15 -R 2 -N 0 -L 20 -i S,1,1.25 --gbar 4 --mp 6,2 --np 1 --rdg 5,3 --rfg 5,3 --score-min C,-14,0 -p 8 --sam-no-hd -x pero - | /share/tophat-2.0.11.Linux_x86_64/fix_map_ordering --bowtie2-min-score 15 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --index-outfile 2926/tmp/left_kept_reads.mapped.bam.index --sam-header 2926/tmp/pero_genome.bwt.samheader.sam - 2926/tmp/left_kept_reads.mapped.bam 2926/tmp/left_kept_reads_unmapped.bam


error:

[sam_read1] missing header? Abort!
(ERR): bowtie2-align died with signal 13 (PIPE)






Matthew MacManes

May 28, 2014, 9:41:05 PM
to tuxedo-to...@googlegroups.com
Any ideas about these errors, reported above?

Boyko

May 29, 2014, 11:38:40 AM
to tuxedo-to...@googlegroups.com
Hi Matthew,

There are a couple of reasons why you might be seeing this error message. One is that the SAM file with chromosome information from the Bowtie index somehow got corrupted during the run. Another is that your machine ran out of RAM. The latter is much more likely. Bowtie is a very nifty tool, but its memory footprint and exception handling could be improved.

Best,
Boyko

Matthew MacManes

May 29, 2014, 12:14:00 PM
to tuxedo-to...@googlegroups.com
I don't think it is either of those.

I tried running Bowtie2 independently of TopHat and it works just fine, which suggests that the issue is not with the index. I'm working on a large workstation with 512 GB of RAM. If Bowtie2 is using more than that amount of RAM, we've got a much larger issue.

I wonder whether the "Generating SAM header for pero" step failed. As you can see from the log, it only took about 1 second, which might indicate a problem.

Thanks, Matt

Boyko

May 29, 2014, 1:53:31 PM
to tuxedo-to...@googlegroups.com
Hmm, okay. In that case, for the run(s) that failed, there should be a file called genome_genome.bwt.samheader.sam in the tmp folder within the output directory. Does it contain any @SQ lines, or only @HD / @PG lines?
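
This check can be scripted. The following is a minimal sketch in Python 3. `count_header_records` and `has_reference_sequences` are hypothetical helpers, not part of TopHat, and the demo file contents are illustrative rather than a real TopHat header.

```python
import os
import tempfile

def count_header_records(path):
    """Count SAM header record types (e.g. @HD, @SQ, @PG) found in a file."""
    counts = {}
    with open(path) as handle:
        for line in handle:
            if line.startswith("@"):
                # The record type is the first whitespace-delimited field.
                tag = line.split(None, 1)[0]
                counts[tag] = counts.get(tag, 0) + 1
    return counts

def has_reference_sequences(path):
    """True if the header lists at least one reference sequence (@SQ line)."""
    return count_header_records(path).get("@SQ", 0) > 0

# Demo on a throwaway file with illustrative contents (no @SQ lines).
with tempfile.NamedTemporaryFile("w", suffix=".sam", delete=False) as tmp:
    tmp.write("@HD\tVN:1.0\tSO:coordinate\n@PG\tID:TopHat\n")
demo_path = tmp.name
print(count_header_records(demo_path))
print(has_reference_sequences(demo_path))
os.remove(demo_path)
```

To check a real run, pass the path of the header file in the run's tmp folder to `has_reference_sequences`.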

Matthew MacManes

May 29, 2014, 2:48:51 PM
to tuxedo-to...@googlegroups.com
In that file, I have no @SQ lines:

@HD VN:1.0 SO:coordinate
@PG ID:TopHat VN:2.0.11 CL:/share/tophat-2.0.11.Linux_x86_64/tophat -p16 -o 2926 --library-type fr-firststrand pero test1.fq test2.fq


Boyko

May 29, 2014, 3:35:36 PM
to tuxedo-to...@googlegroups.com
All right, in my experience this occurs when not enough memory can be allocated for the initial task of loading the Bowtie index. It seems to happen during this step, when genome_genome.bwt.samheader.sam is created:

[2014-05-28 07:49:08] Generating SAM header for pero

One would expect the program to throw an exception at this point and exit with an error code, but alas that's not the case.
It's entirely possible that it is caused by another issue, but without going through a debugging process, I don't readily have a second guess.
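
Here is a minimal sketch of that kind of fail-fast behavior, assuming Python 3. `run_header_cmd` is a hypothetical helper, not a function in TopHat, and the demo command is a stand-in Python one-liner that prints a made-up header rather than a real Bowtie invocation.

```python
import subprocess
import sys

def run_header_cmd(cmd):
    """Run `cmd` and return the SAM header lines from its stdout.

    Raises RuntimeError if the command exits non-zero or if the header
    contains no @SQ (reference sequence) lines.
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    if proc.returncode != 0:
        raise RuntimeError("header command failed (exit %d): %s"
                           % (proc.returncode, proc.stderr.strip()))
    header = [line for line in proc.stdout.splitlines() if line.startswith("@")]
    if not any(line.startswith("@SQ") for line in header):
        raise RuntimeError("header has no @SQ lines; reference sequences are missing")
    return header

# Stand-in command (assumption): prints a tiny, made-up header.
fake_cmd = [sys.executable, "-c", "print('@HD\\tVN:1.0\\n@SQ\\tSN:chr1\\tLN:100')"]
print(run_header_cmd(fake_cmd))
```

With a check like this, an empty header would stop the run at the header step instead of surfacing later as "missing header? Abort!" during mapping.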

Just as a reference, a "healthy" genome_genome.bwt.samheader.sam file should look like this (this one is for H. sapiens of course):


@HD    VN:1.0    SO:coordinate
@SQ    SN:1    LN:249250621
@SQ    SN:10    LN:135534747
@SQ    SN:11    LN:135006516
@SQ    SN:12    LN:133851895
@SQ    SN:13    LN:115169878
@SQ    SN:14    LN:107349540
@SQ    SN:15    LN:102531392
@SQ    SN:16    LN:90354753
@SQ    SN:17    LN:81195210
@SQ    SN:18    LN:78077248
@SQ    SN:19    LN:59128983
@SQ    SN:2    LN:243199373
@SQ    SN:20    LN:63025520
@SQ    SN:21    LN:48129895
@SQ    SN:22    LN:51304566
@SQ    SN:3    LN:198022430
@SQ    SN:4    LN:191154276
@SQ    SN:5    LN:180915260
@SQ    SN:6    LN:171115067
@SQ    SN:7    LN:159138663
@SQ    SN:8    LN:146364022
@SQ    SN:9    LN:141213431
@SQ    SN:MT    LN:16569
@SQ    SN:X    LN:155270560
@SQ    SN:Y    LN:59373566
@PG    ID:TopHat    VN:2.0.11    CL:/.../tophat/tophat-2.0.11.Linux_x86_64/tophat -o /... -p 16 -G /.../Homo_sapiens/NCBI/build37.2/Annotation/Archives/archive-2013-03-06-12-22-32/Genes/genes.gtf -r 253 /.../Homo_sapiens/NCBI/build37.2/Sequence/Bowtie2Index/genome /...R1_val_1.fq.gz /...R2_val_2.fq.gz

Matthew MacManes

May 29, 2014, 7:54:23 PM
to tuxedo-to...@googlegroups.com
OK thanks. Unfortunately this issue prevents me from using Tophat. I'll have to use other software until this issue is resolved. 

Luciana Souto Mofatto

Jun 6, 2014, 12:08:55 PM
to tuxedo-to...@googlegroups.com
Hi Matthew,

We found the error in the tophat script and fixed it by editing the file with "vi tophat".
You need to search for "def get_index_sam_header" inside the tophat script and add the line "bowtie_header_cmd += ["-x"]", as shown below:

def get_index_sam_header(params, idx_prefix, name = ""):
    noSkip = currentStage >= resumeStage
    try:
        temp_sam_header_filename = tmp_dir + "temp.samheader.sam"
        temp_sam_header_file = None
        if noSkip:
            temp_sam_header_file = open(temp_sam_header_filename, "w")

        bowtie_header_cmd = [bowtie_path]

        read_params = params.read_params
        if not params.bowtie2:
            bowtie_header_cmd += ["--sam"]

        if read_params.color:
            bowtie_header_cmd.append('-C')

        bowtie_header_cmd += ["-x"]  # added line

I hope this can help you.

Regards,

Luciana Souto Mofatto

Laraib Iqbal Malik

Jun 6, 2014, 2:16:43 PM
to tuxedo-to...@googlegroups.com
Thanks a lot! That works perfectly!

Boyko

Jun 6, 2014, 2:16:49 PM
to tuxedo-to...@googlegroups.com
Hi Luciana,

I haven't had a chance to check if this change to the code would fix the specific problem Matthew mentioned, but given the intended use it should be modified to this:

if params.bowtie2:
    bowtie_header_cmd += ["-x"]

That's because the -x option is specific to Bowtie2.
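
Put together, the flag selection could look like the following minimal sketch. `build_header_cmd` is a hypothetical standalone function, not the tophat script's own code; appending the index prefix directly after the flags is an assumption, and any further arguments the real script passes are omitted.

```python
def build_header_cmd(bowtie_path, idx_prefix, use_bowtie2, color=False):
    """Assemble the command used to obtain the SAM header (illustrative only)."""
    cmd = [bowtie_path]
    if not use_bowtie2:
        cmd.append("--sam")   # Bowtie 1 needs --sam to emit SAM output
    if color:
        cmd.append("-C")
    if use_bowtie2:
        cmd.append("-x")      # -x is only added for Bowtie2
    cmd.append(idx_prefix)    # assumption: index prefix follows the flags
    return cmd

print(build_header_cmd("bowtie2", "pero", use_bowtie2=True))
print(build_header_cmd("bowtie", "pero", use_bowtie2=False))
```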

Best,
Boyko

Luciana Souto Mofatto

Jun 6, 2014, 2:59:51 PM
to tuxedo-to...@googlegroups.com
Hi Boyko,

I'll fix this in the tophat code. As I only used Bowtie2, I hadn't considered Bowtie 1. Your tip was very useful!
Thanks a lot!

Regards,

Luciana

Benjamin Deonovic

Jun 13, 2014, 11:10:13 AM
to tuxedo-to...@googlegroups.com
I ran into this same error. Using Boyko's fix I was able to get tophat to run. 

Alon Galor

Jun 15, 2020, 4:34:04 PM
to Tuxedo Tools Users
Note that this does not solve the problem when using Bowtie 1, as instructed by the tutorial at https://ccb.jhu.edu/software/tophat/fusion_tutorial.shtml