quality of Trinity assembly

hsiu-ching Ma

unread,

Mar 23, 2015, 6:44:37 PM3/23/15

to Brian Haas, Tiago Hori, trinityrn...@googlegroups.com

Hi,

I am trying to assess the quality of my assembly before I proceed further and I am wondering if there are any suggestions as to what can be explored?

So far, i am using:

1) transcript fullness assessment as suggested by Trinity using both transcripts and protein databases

2) run bowtie_PE_separate_then_join.pl

My Q:

Do you recommend mapping Trinity transcripts to a reference genome sequence and do reciprocal blastn searches?

Also, is it worth running Detonate REF-EVAL?

I am aware that I have a large amount of reads mapped to rRNA (~20%) in my output, would I better off filtering out these reads before running Trinity in silico normalisation and assembly? Will this improve the quality of the assembly?

Any suggestions will be greatly appreciated.

Best Regards

Karen

Tiago Hori

unread,

Mar 23, 2015, 6:53:02 PM3/23/15

to hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

Mapping against the genome could be trick unless you have a intron exon boundary library.

DETONATE will give a more complete picture than separate then join method.

I would also use the script that extracts all putative full length sequences and compare that to your estimated number of transcripts.

It also depends on what you are looking for. For differential analysis for example what really matters is unique mapping and fragmentation may not necessarily be a huge issue.

I also find that it is hard to determine the absolute quality of an assembly without a reference transcriptome. What I will often do is compare different assemblers and a merged assembly.

The rRNA should have been taken care at library prep, but that does not always work. The normalization should reduce the presence of those.

T.

Sent from my iPhone

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.

hsiu-ching Ma

unread,

Mar 24, 2015, 7:16:44 PM3/24/15

to Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Tiago,

Thanks for the reply. My library is a normal non-stranded library, so I suppose mapping Trinity transcripts back to the genome will not tell me anything about the quality of the Trinity transcriptome. I came across a paper in which the authors does this to access their transcriptome assembly and I was just not sure if this is something i should try.

Detonate: both RSEM-EVAL and REF-EVAL should be tested?

I have run the script that extract full length sequences , i.e. analyze_blastPlus_topHit_coverage.pl. I ran this script with the CDS sequences and only ~10% of transcripts mapped at least 80% to the CDS. I suppose this % is not good enough?

my Q:

1. I am actually trying to do 2 things with my Trinity assembly. First is to generate a good transcriptome assembly and to do annotation and second to compare gene expression between different samples. At this stage, I thought I should make sure the Trinity assembly I have is good before I proceed further.

Any suggestions as to how to proceed with these 2 types of analysis?

2. If there is a okish transcriptome available for my species, anything i can do to test the quality of my Trinity assembly?

Thanks.

Best Regards

Karen

Tiago Hori

unread,

Mar 25, 2015, 11:53:35 AM3/25/15

to hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

You would have to use RSEM eval, as REF-eval requires a reference.

You can map your reads to you transcriptome and the Trinity generated one and if the percent properly mapped read are much higher in the old transcriptome, you could try to use CAP3 or CD_HIT to merge them. CD-HIT is actually better.

Can you send the results of bowtie_PE_separate_then_join.pl?

T.

Ultimate stresses sportsmanship and fair play. Competitive play is encouraged, but NEVER AT THE EXPENSE OF RESPECT BETWEEN PLAYERS, adherence to the rules and the BASIC JOY OF PLAY.

On Mar 24, 2015, at 08:46 PM, hsiu-ching Ma <hck...@gmail.com> wrote:

Hi Tiago,

Thanks for the reply. My library is a normal non-stranded library, so I suppose mapping Trinity transcripts back to the genome will not tell me anything about the quality of the Trinity transcriptome. I came across a paper in which the authors does this to access their transcriptome assembly and I was just not sure if this is something i should try.

Detonate: both RSEM-EVAL and REF-EVAL should be tested?

bowtie_PE_separate_then_join.pl

hsiu-ching Ma

unread,

Mar 25, 2015, 2:14:35 PM3/25/15

to Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Tiago,

Thanks for your kind offer to look at the result for me, I really appreciate it. The hpc is down here and will be back tomorrow or Friday, I will send it as soon as the hpc is back and running, if this is ok.

Apart from merging with the old transcriptome, any suggestion if i want to use only the de novo assembly?

Thanks.

Best Regards

Karen

Mark Chapman

unread,

Mar 25, 2015, 6:21:42 PM3/25/15

to hsiu-ching Ma, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hello, As Tiago says, DETONATE is a pretty good way of assessing how 'good' a transcriptome is. Do you by any chance have access to the raw reads from the previous assembly? Were they from the same individual? Or completely different? You could always use CD-HIT to assess what proportion of your assembly is covered in the initial assembly, as well as vice versa to see if its worth combining them. Just some ideas :) Thanks, Mark

Dr. Mark A. Chapman

M.Ch...@soton.ac.uk

+44 (0)2380 594396

------------------------------------

Centre for Biological Sciences
University of Southampton

Life Sciences Building 85
Highfield Campus
Southampton
SO17 1BJ

Brian Haas

unread,

Mar 25, 2015, 6:34:10 PM3/25/15

to Mark Chapman, hsiu-ching Ma, Tiago Hori, trinityrn...@googlegroups.com

We definitely need to modernize the section on our web documentation on how to assess the quality of an assembly.

~b

--

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

hsiu-ching Ma

unread,

Mar 26, 2015, 7:52:27 PM3/26/15

to Mark Chapman, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Mark,

Thanks for the email. The only reference I have for the species of interest is in flybase and no, unfortunately I don't have the raw reads and I don't think the transcriptome was generated based on these second generation sequencing data. They are the same species, but not sure if they are the exact same strain.

Thanks.

Best Regards

Karen

hsiu-ching Ma

unread,

Mar 30, 2015, 7:31:36 PM3/30/15

to Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Tiago,

I was able to get the read alignment statistics from the '.o' file after running

bowtie_PE_separate_then_join.pl

However, this time when I run the analysis,

This is the kind of output I am getting:

$ perl /data/apps/trinity/r2015-2.0.3/util/SAM_nameSorted_to_uniq_count_stats.pl bowtie.nameSorted.bam

Æ20,000,000Å lines read

#read_type count pct

single 10781121 100.00

Total aligned reads: 10781121

I do not seems to be able to obtain the alignment stats as stated here:

http://trinityrnaseq.sourceforge.net/analysis/abundance_estimation.html

Any suggestions?

Thanks.

Best Regards

Karen

On Wed, Mar 25, 2015 at 8:53 AM, Tiago Hori <tiag...@me.com> wrote:

Mark Chapman

unread,

Mar 31, 2015, 2:28:47 AM3/31/15

to hsiu-ching Ma, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Karen,

Can you provide your full 'bowtie_PE_separate_then_join.pl' command as it seems you are only able to detect reads that map singly and not as pairs. Do you have 10781121 reads? Or hasnt it used the F and R? (I presume you have PE data?).

Thanks, Mark

hsiu-ching Ma

unread,

Mar 31, 2015, 2:43:17 PM3/31/15

to Mark Chapman, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Mark,

Thanks for the email. Yes, I have PE data, non-stranded and here is the command i used

perl /data/apps/trinity/r2015-2.0.3/util/bowtie_PE_separate_then_join.pl --seqType fq --left ../../../R207-P1-READ1-trimmed-paired-new.txt --right ../../../R207-P1-READ2-trimmed-paired-new.txt --target Trinity.fasta --aligner bowtie -- -p 4 --all --best --strata -m 300

Thanks.

Best Regards

Karen

Mark Chapman

unread,

Mar 31, 2015, 4:33:55 PM3/31/15

to hsiu-ching Ma, Brian Haas, Tiago Hori, trinityrn...@googlegroups.com

Hi Karen,
The first thing I see is that the filenames are .txt but you say they are .fq. Could this be causing the issue? Also you can add --SS_lib_type parameter to make it clear its strand-specific. Maybe this would fix it?
Thanks, Mark

Tiago Hori

unread,

Mar 31, 2015, 4:35:32 PM3/31/15

to Mark Chapman, hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

Also, Brian has mentioned in the past that it is a bad idea to have your data and Trinity installation in the same file tree.

T.

Sent from my iPhone

hsiu-ching Ma

unread,

Mar 31, 2015, 5:18:33 PM3/31/15

to Mark Chapman, Brian Haas, Tiago Hori, trinityrn...@googlegroups.com

Hi Mark,

My library and not stranded one and also, I have ran with the same file name, i.e.R207-P1-READ1-trimmed-paired-new.txt in the past and it didn;t give me any problems. Indeed, the content is in fastq format.

Thanks.

Karen

hsiu-ching Ma

unread,

Mar 31, 2015, 5:20:40 PM3/31/15

to Mark Chapman, Brian Haas, Tiago Hori, trinityrn...@googlegroups.com

Hi Mark,

This is the message I got:

Error, cmd: /data/apps/trinity/r2015-2.0.3/util/../util/support_scripts/merge_left_right_nameSorted_SAMs.pl --left_sam left/left.nameSorted.bam --right_sam right/right.nameSorted.bam -D 2000 | samtools view -bt target.fa.fai -S - > combined.nameSorted.pre.bam died with ret 256 at /data/apps/trinity/r2015-2.0.3/util/bowtie_PE_separate_then_join.pl line 630

Thanks.

Karen

Mark Chapman

unread,

Mar 31, 2015, 5:33:57 PM3/31/15

to hsiu-ching Ma, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Karen,
Sorry I misread about stranded/unstranded. Maybe try renaming the files to .fq to see is anything changes. Is samtools in your PATH? Maybe Brian would suggest upgrading to ver. 2.0.6 too? These are just some ideas :)
Thanks, Mark

Brian Haas

unread,

Mar 31, 2015, 6:00:16 PM3/31/15

to Mark Chapman, hsiu-ching Ma, Tiago Hori, trinityrn...@googlegroups.com

Upgrading to v2.0.6 won't hurt, but I'm not sure it's going to fix this specific problem.

Try running that failed command directly and let's see if there's a useful error message.

best,

~brian

Mark Chapman

unread,

Mar 31, 2015, 6:15:59 PM3/31/15

to Brian Haas, Tiago Hori, hsiu-ching Ma, trinityrn...@googlegroups.com

Hi Karen,
Were your sequence files .txt files when you received them? If not then is it possible they have been saved as .txt files more recently and hence the line breaks could have been changed? (Just another idea!)
--Mark

hsiu-ching Ma

unread,

Apr 3, 2015, 3:13:07 PM4/3/15

to Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi Tiago,

Finally, I have the result of bowtie_PE_separate_then_join.pl. Can you please take a look for me?

Fri Apr 03 12:10:55 hcma@compute-1-13:/bio/hcma/new_trinity_analysis/merged_maleANDf_trinity_r2015-2.0.3_output/output_with/trinity_garr/bowtie_out

1052 $ cat SAM_nameSorted_merged_with.o3264330

#read_type count pct

proper_pairs 224421146 80.03

improper_pairs 45037334 16.06

right_only 5685546 2.03

left_only 5282764 1.88

Total aligned reads: 280426790

Thanks for your time.

Best Regards

Karen

On Wed, Mar 25, 2015 at 8:53 AM, Tiago Hori <tiag...@me.com> wrote:

Tiago Hori

unread,

Apr 3, 2015, 3:23:47 PM4/3/15

to hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

Hi Karen,

That looks great. 80% proper pairs is what you want to see for a good assembly. Getting above that takes a lot depth and/or multi-assemblies.

T.

Sent from my iPhone

Ken Field

unread,

Apr 4, 2015, 11:11:56 AM4/4/15

to Tiago Hori, hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

I would add that if your read lengths overlap with your fragment lengths, most of the 'improper pairs' may in fact just be overlapping reads. In my case, I had 200 basepair fragments and 100 basepair reads and I found that almost all of the improper pairs were actually just overlaps.

It would be great if bowtie_PE_separate_then_join.pl would have a separate category of overlapping reads versus truly improper pairs (i.e. pairs that map to the same contig but in the wrong orientation relative to each other).

Ken

Ken Field, Ph.D.

Associate Professor of Biology

Program in Cell Biology/Biochemistry

Bucknell University

Room 203A Biology Building

Tiago Hori

unread,

Apr 4, 2015, 1:05:26 PM4/4/15

to Ken Field, hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

Ken,

That is a great point! I should start looking at that!

T.

Sent from my iPhone

Ken Field

unread,

Apr 5, 2015, 11:26:34 AM4/5/15

to Tiago Hori, hsiu-ching Ma, Brian Haas, trinityrn...@googlegroups.com

This thread reminds me of a question that I have had about overlapping reads and RSEM and eXpress.

When using the align_and_estimate_abundance.pl script, do overlapping reads lead to false increases in the apparent expression of the overlapped region for either RSEM or eXpress estimates of read counts?

I think that this is important to know because longer reads are becoming the norm and overlaps will be very common, even after quality trimming, when we have 125 bp paired reads on 200 bp fragments.

I know that the program flash can join these overlaps, but I ran into all kinds of problems in the downstream analysis when I tried to use flash to preprocess my reads for Trinity.

Ken

hsiu-ching Ma

unread,

Nov 5, 2015, 8:22:29 PM11/5/15

to Ken Field, Tiago Hori, Brian Haas, trinityrn...@googlegroups.com

Hi,

I ran the following script:

perl /data/apps/trinity/r2015-2.1.1/util/bowtie_PE_separate_then_join.pl --seqType fq --left ../../../R207-P2-READ1-trimmed-paired-new.txt --right ../../../R207-P2-READ2-trimmed-paired-new.txt --target ../Trinity.fasta --SS_lib_type RF --aligner bowtie -- -p $CORES --all --best --strata -m 300

Then

perl /data/apps/trinity/r2015-2.1.1/util/SAM_nameSorted_to_uniq_count_stats.pl bowtie_out.nameSorted.bam

#read_type count pct

single 1 100.00

Total aligned reads: 1

But i am getting a wired output.

Thanks.

Best Regards

Karen

Brian Haas

unread,

Nov 5, 2015, 8:28:57 PM11/5/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Karen,

If you run

samtools flagstat bowtie_out.nameSorted.bam

does it report useful stats?

If you want to share the bam file with me, I can give it a look.

best,

~b

hsiu-ching Ma

unread,

Nov 5, 2015, 8:34:44 PM11/5/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian,

Thanks for the email. I did what you say and below is what I got:

Of course, see attached my bam file.

$ samtools flagstat bowtie_out.nameSorted.bam

0 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 secondary

0 + 0 supplimentary

0 + 0 duplicates

0 + 0 mapped (-nan%:-nan%)

0 + 0 paired in sequencing

0 + 0 read1

0 + 0 read2

0 + 0 properly paired (-nan%:-nan%)

0 + 0 with itself and mate mapped

0 + 0 singletons (-nan%:-nan%)

0 + 0 with mate mapped to a different chr

0 + 0 with mate mapped to a different chr (mapQ>=5)

Thanks.

Best Regards

Karen

bowtie_out.nameSorted.bam

Brian Haas

unread,

Nov 5, 2015, 9:05:59 PM11/5/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

My best guess is that there's something about your input files that it doesn't like.

Is your R207-P2-READ2-trimmed-paired-new.txt in fastq file format?

To test whether or not the bowtie-PE process can operate given your installation, and assuming that you have the very latest Trinity version installed, you can

cd trinityrnaseq/sample_data/test_Trinity_Assembly/

then

make test_trinity

to build a small trinity assembly, and then

make test_bowtie_PE_read_estimates

to test the bowtie-PE process.

This should tell us whether or not the system is functioning as intended, and then we can focus on what's different about your data.

best,

~b

hsiu-ching Ma

unread,

Nov 5, 2015, 9:11:11 PM11/5/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

It is so strange because i ran the same script using 2.0,3 a few months ago and everything worked last time.

The R207-P2-READ2-trimmed-paired-new.txt is indeed in fastq format, but then again, i used the same format last time as well.

is Trinity 2.1.1 compatible with bowtie/0.12.8 and samtools 1,1?

Thanks.

Best Regards

Karen

Brian Haas

unread,

Nov 5, 2015, 9:16:15 PM11/5/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

It should be. Let's see how the tests go.

~b

hsiu-ching Ma

unread,

Nov 5, 2015, 9:21:43 PM11/5/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Has it got anything to do with the new Trinity naming stystem?

Output of running Trinity/2.0.3:

transcript_id gene_id length effective_length expected_count TPM FPKM IsoPct

c10000_g1_i1 c10000_g1 406 207.11 5.00 0.24 0.18 100.00

c10001_g1_i1 c10001_g1 383 184.82 8.00 0.42 0.33 100.00

Output of running Trinity/2.1.1:

transcript_id gene_id length effective_length expected_count TPM FPKM IsoPct

TRINITY_DN10000_c0_g1_i1 TRINITY_DN10000_c0_g1 378 180.67 20.91 0.58 0.81 100.00

TRINITY_DN10000_c0_g2_i1 TRINITY_DN10000_c0_g2 840 639.92 73.09 0.57 0.80 100.00

Thanks.

Best Regards

Karen

hsiu-ching Ma

unread,

Nov 5, 2015, 9:24:50 PM11/5/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Thu Nov 05 18:24:20 hcma@compute-1-13:/data/apps/trinity/r2015-2.1.1

1020 $ ls

Analysis hpc_conf Makefile README Trinity util

Butterfly Inchworm ,make.log README.md Trinity.original

Chrysalis LICENSE notes Release.Notes trinity-plugins

galaxy-plugin LICENSE.txt PerlLib sample_data trinityrnaseq.wiki

This is all i see, how can i

cd trinityrnaseq/sample_data/test_Trinity_Assembly/

Thanks

Brian Haas

unread,

Nov 5, 2015, 9:25:28 PM11/5/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

It shouldn't. Again, let's see how the test goes given the sample data provided.

hsiu-ching Ma

unread,

Nov 5, 2015, 9:33:11 PM11/5/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Sorry, i found 'sample_data', but then

Thu Nov 05 18:31:36 hcma@compute-1-13:/data/apps/trinity/r2015-2.1.1/sample_data/test_Trinity_Assembly

1028 $ ls

cleanme.pl Makefile reads2.left.fq.gz reads.right.fq.gz

__indiv_ex_sample_derived misc_run_tests reads2.right.fq.gz runMe.sh

longReads.fa README reads.left.fq.gz test_FL.sh

Thu Nov 05 18:31:37 hcma@compute-1-13:/data/apps/trinity/r2015-2.1.1/sample_data/test_Trinity_Assembly

1029 $ make test_trinity

./runMe.sh

module () { eval `/data/apps/modules/Modules/$MODULE_VERSION/bin/modulecmd bash $*`

}

#!/bin/bash -ve

if [ -e reads.right.fq.gz ] && [ ! -e reads.right.fq ]; then

gunzip -c reads.right.fq.gz > reads.right.fq

fi

./runMe.sh: line 5: reads.right.fq: Permission denied

make: *** [test_trinity] Error 1

Thanks

Brian Haas

unread,

Nov 5, 2015, 9:40:56 PM11/5/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

I see.... Yes, it's going to be difficult to do this given a central installation.

In any case, you'll find sample data there that you can explore and attempt to execute the various steps with.

hsiu-ching Ma

unread,

Nov 6, 2015, 1:28:08 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

I have asked the people who installed it to test it for me as suggested. I will let you know what they says.

Thanks.

Best Regards

Karen

Brian Haas

unread,

Nov 6, 2015, 1:45:32 PM11/6/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

sounds good. best of luck!

~b

hsiu-ching Ma

unread,

Nov 6, 2015, 2:50:56 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian,

The people from hpc installed an app to test gam files and these are the 2 results i got using a ban output that has not worked with the 'SAM_nameSorted_to_uniq_count_stats.pl'

-------------------

output using files that has NOT worked with SAM_nameSorted_to_uniq_count_stats.pl:

Number of records read = 391736570

Number of valid records = 391736570

TotalReads(e6) 391.74

MappedReads(e6) 391.74

PairedReads(e6) 391.74

ProperPair(e6) 391.74

DuplicateReads(e6) 0.00

QCFailureReads(e6) 0.00

MappingRate(%) 100.00

PairedReads(%) 100.00

ProperPair(%) 100.00

DupRate(%) 0.00

QCFailRate(%) 0.00

TotalBases(e6) 38234.95

BasesInMappedReads(e6) 38234.95

Returning: 0 (SUCCESS)

[1]+ Done bam validate --in bowtie.bam

--------------------------

output using files that has worked with SAM_nameSorted_to_uniq_count_stats.pl:

Number of records read = 568404620

Number of valid records = 568404620

TotalReads(e6) 568.40

MappedReads(e6) 568.40

PairedReads(e6) 568.40

ProperPair(e6) 568.40

DuplicateReads(e6) 0.00

QCFailureReads(e6) 0.00

MappingRate(%) 100.00

PairedReads(%) 100.00

ProperPair(%) 100.00

DupRate(%) 0.00

QCFailRate(%) 0.00

TotalBases(e6) 55461.44

BasesInMappedReads(e6) 55461.44

Returning: 0 (SUCCESS)

jobs

[1]+ Done bam validate --in bowtie.bam

-----------------------

They seems pretty similar, so does this means the problem is with the 'SAM_nameSorted_to_uniq_count_stats.pl'?

Thanks.

Best Regards

Karen

Brian Haas

unread,

Nov 6, 2015, 2:54:34 PM11/6/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Karen,

What happens when you run the SAM_nameSorted_to_uniq_count_stats.pl script on one of those bam files?

~b

hsiu-ching Ma

unread,

Nov 6, 2015, 2:59:09 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Output of running SAM_nameSorted_to_uniq_count_stats.pl script:

1) didn;t work with 2.1.1:

perl /data/apps/trinity/r2015-2.1.1/util/SAM_nameSorted_to_uniq_count_stats.pl bowtie_out.nameSorted.bam

#read_type count pct

single 1 100.00

Total aligned reads: 1

But i am getting a wired output.

2) worked with 2.0.3 in the past:

$ cat SAM_nameSorted_merged_with_new.o3279429

#read_type count pct

proper_pairs 224421146 80.03

improper_pairs 45037334 16.06

right_only 5685546 2.03

left_only 5282764 1.88

Total aligned reads: 280426790

Thanks.

Karen

hsiu-ching Ma

unread,

Nov 6, 2015, 3:01:48 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian,

The hpc people have copied the files to my dir, so i can test it.

I did 'make test_trinity' and this is what i got below:

$ make test_trinity

./runMe.sh

module () { eval `/data/apps/modules/Modules/$MODULE_VERSION/bin/modulecmd bash $*`

}

#!/bin/bash -ve

if [ -e reads.right.fq.gz ] && [ ! -e reads.right.fq ]; then

gunzip -c reads.right.fq.gz > reads.right.fq

fi

if [ -e reads.left.fq.gz ] && [ ! -e reads.left.fq ]; then

gunzip -c reads.left.fq.gz > reads.left.fq

fi

if [ -e reads2.right.fq.gz ] && [ ! -e reads2.right.fq ]; then

gunzip -c reads2.right.fq.gz > reads2.right.fq

fi

if [ -e reads2.left.fq.gz ] && [ ! -e reads2.left.fq ]; then

gunzip -c reads2.left.fq.gz > reads2.left.fq

fi

#######################################################

## Run Trinity to Generate Transcriptome Assemblies ##

#######################################################

../../Trinity --seqType fq --max_memory 2G --left reads.left.fq.gz,reads2.left.fq.gz --right reads.right.fq.gz,reads2.right.fq.gz --SS_lib_type RF --CPU 4 --no_cleanup

./runMe.sh: line 26: ../../Trinity: No such file or directory

make: *** [test_trinity] Error 1

Thanks a lot for your time.

Karen

Brian Haas

unread,

Nov 6, 2015, 3:14:10 PM11/6/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

If you make your bam file available to me, I'll give it a look. It isn't obvious to me why it's causing trouble.

hsiu-ching Ma

unread,

Nov 6, 2015, 3:18:15 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian,

Please see attached, this is the bam file (using r2015-2.1.1) that has not worked with the script 'SAM_nameSorted_to_uniq_count_stats.pl '

Thanks.

Karen

bowtie_out.nameSorted.bam

hsiu-ching Ma

unread,

Nov 6, 2015, 3:18:54 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian, did you see this?

Brian Haas

unread,

Nov 6, 2015, 3:24:03 PM11/6/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

That bam file is empty - no reads. I'll look into why the script is reporting 1 read, as that's incorrect.

If the bam file has no reads, then it sounds like a problem earlier on, and it's a different problem than the counting script not working on your other large bam files correctly.

Let's focus for a moment on the large bam files for which you have statistics indicating that there are plenty of reads. What happens when you run the counting script on one of them?

As for running the 'make' tests in Trinity, it's another can of worms, and I honestly just don't have resources to begin exploring it, unfortunately, unless it's a problem that many are encountering.

hsiu-ching Ma

unread,

Nov 6, 2015, 6:20:03 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

Hi Brian,

Sorry for the delay in replying.

Using the 'SAM_nameSorted_to_uniq_count_stats.pl' from r-2015.2.0.3

#read_type count pct

proper_pairs 224421146 80.03

improper_pairs 45037334 16.06

right_only 5685546 2.03

left_only 5282764 1.88

Total aligned reads: 280426790

However, i ran the script from trinity/r-2015.2.1.1 using the same dataset, this is what i got:

perl /data/apps/trinity/r2015-2.1.1/util/SAM_nameSorted_to_uniq_count_stats.pl bowtie_out.nameSorted.bam

Can't exec "samtools": No such file or directory at /data/apps/trinity/r2015-2.1.1/util/../PerlLib/SAM_reader.pm line 35.

Error, cannot open file bowtie_out.nameSorted.bam at /data/apps/trinity/r2015-2.1.1/util/../PerlLib/SAM_reader.pm line 34.

SAM_reader::_init('SAM_reader=HASH(0x1585768)') called at /data/apps/trinity/r2015-2.1.1/util/../PerlLib/SAM_reader.pm line 24

SAM_reader::new('SAM_reader', 'bowtie_out.nameSorted.bam') called at /data/apps/trinity/r2015-2.1.1/util/SAM_nameSorted_to_uniq_count_stats.pl line 58

Does this mean anything?

Thanks

Best Regards

Karen

hsiu-ching Ma

unread,

Nov 6, 2015, 6:24:03 PM11/6/15

to Brian Haas, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

As for running the 'make' tests in Trinity, it's another can of worms, and I honestly just don't have resources to begin exploring it, unfortunately, unless it's a problem that many are encountering............i see, so does this means that there is problem with my actual Trinity.fasta file, or if i got an '' at the end of running it, then things are fine?

Thanks.

Karen

Brian Haas

unread,

Nov 6, 2015, 9:26:51 PM11/6/15

to hsiu-ching Ma, Ken Field, Tiago Hori, trinityrn...@googlegroups.com

For some reason, the version of the script that's breaking is not finding the 'samtools' software in your PATH, but apparently the first one is. That script hasn't changed between the releases, so it sounds like it's some issue with the computing environment. If you have a version that's working, then use that one - for lack of better advice.

Reply all

Reply to author

Forward