Assembling SRA data with Trinity doesn't work!


Anuj Guruvacharya

Oct 5, 2015, 5:54:37 PM
to trinityrnaseq-users
I think assembling data obtained from SRA using Trinity is a big challenge.

I have tried all the troubleshooting methods I could find, and it's still not working.

I ran:
fastq-dump SRR1819888 --split-files

Then I ran Trimmomatic:
java -jar /home/Trimmomatic-0.33/trimmomatic-0.33.jar PE SRR1819888_1.fastq SRR1819888_2.fastq SRR1819888_1_paired_trimmed.fastq SRR1819888_1_unpaired_trimmed.fastq SRR1819888_2_paired_trimmed.fastq SRR1819888_2_unpaired_trimmed.fastq ILLUMINACLIP:/home/Trimmomatic-0.33/adapters/TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15

Then I took the paired output files from Trimmomatic for the forward and reverse reads and ran Trinity on them like this:

/home/dnarules/Downloads/trinityrnaseq-2.1.0/Trinity --seqType fq --left SRR1819888_1_paired_trimmed.fastq --right SRR1819888_2_paired_trimmed.fastq --CPU 30 --max_memory 50G

It didn't work, and I was told that there were blank spaces in the description lines of my FASTQ files, so I removed the spaces and also appended /1 and /2 to the end of each description line using sed and perl.

To append /1 and /2 to the description lines (/2 for the reverse file, /1 for the forward file), I did this:

sed '1~4 s/$/\/2/g' your_fastq_file.fastq > your_new_fastq_file.fastq

sed '1~4 s/$/\/1/g' your_fastq_file.fastq > your_new_fastq_file.fastq

To remove the blank spaces, I did this:

perl -lape 's/\s+//sg' SRR1819888_2_paired_trimmed.fastq > SRR1819888_2_paired_trimmed_mod.fastq
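
For reference, the sed and perl steps above can be combined into one pass. This is only a minimal Python sketch, not part of the original troubleshooting: the function name `tag_headers` is hypothetical, the example record is shortened, and its spaced description line is an assumption about what fastq-dump wrote before the spaces were removed. It assumes strict four-line FASTQ records.

```python
import re


def tag_headers(lines, mate):
    """Strip whitespace from every line and append /<mate> to description lines.

    Assumes strict four-line FASTQ records:
    description, sequence, '+' line, quality.
    """
    out = []
    for i, line in enumerate(lines):
        # Same effect as the perl s/\s+//sg step: drop all whitespace,
        # including the trailing newline.
        cleaned = re.sub(r"\s+", "", line)
        # Every fourth line starting at the first, like sed's 1~4 address.
        if i % 4 == 0:
            cleaned += f"/{mate}"
        out.append(cleaned)
    return out


# Shortened, illustrative record; the spaced header is an assumption.
record = [
    "@SRR1819888.1 HWI-ST1213:141:C17PWACXX:6:1101:1049:1976 length=101\n",
    "AAACTTTTCAG\n",
    "+SRR1819888.1 HWI-ST1213:141:C17PWACXX:6:1101:1049:1976 length=101\n",
    "1:DDFFDDFDF\n",
]
print("\n".join(tag_headers(record, 1)))
```

Like the one-liners, this only changes how the description lines are spelled; it does not check whether the result is what Trinity expects.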

Then my FASTQ files looked like this:
For forward :
@SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101/1
AAACTTTTCAGATAGTCATTCATAGCTTTTACGTATCCTTCATAAGAAGAATTGTCCTTGGGGTCATATGCTTTAACCTCTTTTAAAACACGCA
+SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101
1:DDFFDDFDFHICCC@IAFHGIIBGHGAHGGHBE4@FGCFHHIBHCE;FHHD?9BGH<BF(7-5@)@@CGHG=@@@AEHECCDDFB>BBA8;@
@SRR1819888.2HWI-ST1213:141:C17PWACXX:6:1101:1374:1967length=101/1
CNGTAGTTCACACATACTTTGTTTTTTGATACATCTGGGGAAGTAGAGGATAAATAGTTGTACCAGACTTTATTTATTTATTATAGGATTATTGTTTGTAG
+SRR1819888.2HWI-ST1213:141:C17PWACXX:6:1101:1374:1967length=101
@#1=BBDDAHDDHIGIGHCGGDBFHIHFFDHHGIIIIIIIIIIFGGDDAGEIICFCC8CA77@A;DEEEC=A?BCDDFFECEDE>;>CCCA@DCCCC39@@
@SRR1819888.3HWI-ST1213:141:C17PWACXX:6:1101:1412:1988length=101/1
CCTCTGACAACGAAATACGAATGCCCCCAATTGTCCCTCTTAATCATTACGTCGGTCCTGAAAACCAACAAAACGGGACCGAAGTCCTGTTCCATTATTCC

For reverse:
@SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101/2
CCGTGGCTTTCTCCATTTCCTGTACAACAAAGAGAAAGGAGAAGTAATGGGCCGAAATGGCAGGAGCTGGG
+SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101
@@BFDDADFHGGFGGIG@FHICHGIJ@CFC@:CB@FHIGHGEFE9??<BDDBFAFGGE@3=>AH=5?>.(5
@SRR1819888.2HWI-ST1213:141:C17PWACXX:6:1101:1374:1967length=101/2
CGGTGCCTAGGGAGTAGCCCTGCGCGTGCGTTGGTTTAAGTGAAAGGGGGATTGAGACTGAGACAGTATTTTTGTATGCACTTGAGCGACAACTACTCGTT
+SRR1819888.2HWI-ST1213:141:C17PWACXX:6:1101:1374:1967length=101
???DDDDFFDFD3FAEGDHGEHIJGHCGHHIGGC7(=;FG=C@@D@@CABAACCCCDCC@C@@CCD:@((5@>58ACDCCDCDDCDD@BBBDBDDCCA7<8
@SRR1819888.3HWI-ST1213:141:C17PWACXX:6:1101:1412:1988length=101/2
CTGCACATGTCCTTTAATGGCTGTGTGTAGGATTTGCGACGTTTACTTTGAAAAAATTAGAGTGTTCAAAGCAAGCTTGTAAGCTTGAATACATGAGCATG
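
A quick way to confirm that the two listings above really pair up, read for read, is to compare the description lines after dropping the /1 and /2 suffixes. The following is a minimal Python sketch under the assumption of strict four-line records; `read_name` and `first_mismatch` are hypothetical helper names, and the sequences in the example are placeholders. It checks naming only and says nothing about whether Trinity will accept the files.

```python
def read_name(header):
    """Return a description line without the leading '@' and trailing /1 or /2."""
    name = header.strip()
    if name.startswith("@"):
        name = name[1:]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def first_mismatch(left_lines, right_lines):
    """Return the 0-based index of the first unpaired record, or None if all pair up."""
    left_headers = left_lines[0::4]    # description lines of the forward file
    right_headers = right_lines[0::4]  # description lines of the reverse file
    for idx, (left, right) in enumerate(zip(left_headers, right_headers)):
        if read_name(left) != read_name(right):
            return idx
    if len(left_headers) != len(right_headers):
        # One file has extra records after the shared prefix.
        return min(len(left_headers), len(right_headers))
    return None


# Description lines taken from the listings above; sequences are placeholders.
left = ["@SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101/1",
        "ACGT", "+", "IIII"]
right = ["@SRR1819888.1HWI-ST1213:141:C17PWACXX:6:1101:1049:1976length=101/2",
         "TGCA", "+", "IIII"]
print(first_mismatch(left, right))
```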

I then ran Trinity again on these modified FASTQ files:

/home/dnarules/Downloads/trinityrnaseq-2.1.0/Trinity --seqType fq --left SRR1819888_1_paired_trimmed_mod.fastq --right SRR1819888_2_paired_trimmed_mod.fastq --CPU 30 --max_memory 50G


And it's still not working.
The error message I get says that Butterfly could not complete:

" ERROR, no butterfly assemblies reported. at /home/dnarules/Downloads/trinityrnaseq-2.1.0/Trinity line 1258. "


I don't think I have made a mistake. The test sample data works fine, so the only possible source of error should be my FASTQ files. I have followed all the procedures described above, and my files are in Sanger FASTQ format. I still don't know why Trinity is not working. I have been stuck troubleshooting this for almost two weeks now. Could someone please give me some hints that I could try?

Help!





Anuj Guruvacharya

Oct 5, 2015, 5:55:10 PM
to trinityrnaseq-users
I am running Trinity 2.1.0, by the way.

Brian Haas

Oct 5, 2015, 6:16:41 PM
to Anuj Guruvacharya, trinityrnaseq-users

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-u...@googlegroups.com.
To post to this group, send email to trinityrn...@googlegroups.com.
Visit this group at http://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Anuj Guruvacharya

Oct 6, 2015, 10:53:50 AM
to trinityrnaseq-users, anuj...@gmail.com
Thanks Brian, that worked! I wish I had known this two weeks ago.

Joshua Orvis

Oct 8, 2015, 12:27:52 PM
to trinityrnaseq-users, anuj...@gmail.com
For anyone else finding this thread, if you were like me and already extracted the SRA files, trimmed them, and ran diginorm, and NOW want to reformat your headers without redoing all that, something like this should work:

$ perl -ne 's/SR\S+ (\S+) .+/$1\/1/; print' R1.trimmed.PE.fastq > R1.trimmed.PE.fastq.newheaders

This transformed my headers from this:

@SRR1542666.1 HWI-ST915:230:C38E8ACXX:6:1101:1442:2050 length=100

To this:

@HWI-ST915:230:C38E8ACXX:6:1101:1442:2050/1

Note that the command has the /1 part hard-coded in the regex.  Just change that to /2 when doing the other file.
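
If switching the hard-coded suffix by hand feels error-prone, the same substitution can take the mate number as a parameter. This is a rough Python sketch of the regex above, not a tested replacement for it: `reformat_line` is a hypothetical name, and unlike the perl version the pattern is anchored to lines that start with `@` or `+`.

```python
import re

# Mirrors the perl pattern SR\S+ (\S+) .+ but anchored to '@' and '+' lines.
# Sequence and quality lines contain no spaces, so they can never match.
PATTERN = re.compile(r"^([@+])SR\S+ (\S+) .+$")


def reformat_line(line, mate):
    """Rewrite an SRA-style description line to '<prefix><instrument name>/<mate>'.

    Lines that do not match the pattern are returned unchanged
    (minus the trailing newline).
    """
    line = line.rstrip("\n")
    match = PATTERN.match(line)
    if match is None:
        return line
    prefix, name = match.groups()
    return f"{prefix}{name}/{mate}"


# Header taken from the example above.
header = "@SRR1542666.1 HWI-ST915:230:C38E8ACXX:6:1101:1442:2050 length=100"
print(reformat_line(header, 1))
print(reformat_line(header, 2))
```

Calling it with 1 for the first file and 2 for the second avoids editing the pattern between runs.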