Can abyss-1.5.1 take a *.sra file as input? Or is a *.fastq file 100% the same as a *.sra file?


Qiyuan Qiu

Jun 13, 2014, 12:47:47 AM
to abyss...@googlegroups.com
Dear all,

I am able to run abyss-1.5.1 with SRR498276.fastq, which was converted from SRR498276.sra (from this link).

The fastq file is much larger than the sra file.

So my question is: why do we need the files in fastq format?

Best,
Qiyuan Qiu

Tony Raymond

Jun 13, 2014, 6:09:32 PM
to Qiyuan Qiu, abyss...@googlegroups.com
Hi Qiyuan,

ABySS will handle .sra files by running fastq-dump as a separate process (run like `fastq-dump -Z --split-spot <SRA_files>`). This should all be handled automatically, but you'll need to make sure that the SRA toolkit is installed and fastq-dump is in your PATH. Check this by running `which fastq-dump`.
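The PATH check can also be scripted before launching an assembly. Here is a minimal sketch in Python using only the standard library; `find_tool` is a hypothetical helper name for this example and is not part of ABySS or the SRA toolkit.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("fastq-dump")
    if path is None:
        print("fastq-dump not found on PATH; install the SRA toolkit first")
    else:
        print("fastq-dump found at", path)
```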

What happens if you run abyss with the *.sra file instead of the *.fastq file?

Thanks,
Tony


Qiyuan Qiu

Jun 15, 2014, 12:51:44 AM
to abyss...@googlegroups.com, qiuq...@gmail.com
Hi Tony,
I can successfully convert the *.sra file into *.fastq files using the fastq-dump command from sra_sdk-2.3.5-2.
However, I then encounter the error `abyss-fixmate: error: All reads are mateless`.

The sra file I used is from the link (SRR498276).
The error happens whenever I use the exact command "fastq-dump SRR498276.sra" --> this command generates one file named SRR498276.fastq.
The error goes away if I use the exact command "fastq-dump --split-files SRR498276.sra" --> this command generates two files named SRR498276_1.fastq and SRR498276_2.fastq.

Why does ABySS need to see the same file divided into two parts?

Thank you, 
Qiyuan Qiu

Ka Ming Nip

Jun 16, 2014, 7:42:02 PM
to abyss...@googlegroups.com, qiuq...@gmail.com
Hi Qiyuan,

Thanks for using ABySS.

If you have paired-end data, the output FASTQ file from the command `fastq-dump SRR498276.sra` would be complete garbage. Basically, the two mates of a read pair would be fused to form a chimeric single-end read! You don't want to use these reads for assembly.
As you have observed, you must use the `--split-files` option to split the reads properly. The read names should have either a '/1' or a '/2' suffix. abyss-fixmate should then be able to recognize read pairs based on these suffixes and perform a paired-end assembly.
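To illustrate the idea of pairing by name suffix, here is a minimal sketch in Python. It is not the actual abyss-fixmate code; `pair_by_suffix` and the sample read names are made up for the example.

```python
from collections import defaultdict


def pair_by_suffix(read_names):
    """Group read names ending in /1 or /2 by their shared prefix.

    Returns (pairs, mateless): pairs maps a prefix to its two read names,
    mateless lists every read that has no partner.
    """
    mates = defaultdict(dict)
    mateless = []
    for name in read_names:
        prefix, sep, suffix = name.rpartition("/")
        if sep and suffix in ("1", "2"):
            mates[prefix][suffix] = name
        else:
            mateless.append(name)  # no /1 or /2 suffix: cannot be paired

    pairs = {}
    for prefix, found in mates.items():
        if len(found) == 2:
            pairs[prefix] = (found["1"], found["2"])
        else:
            mateless.extend(found.values())  # only one mate was seen
    return pairs, mateless


if __name__ == "__main__":
    print(pair_by_suffix(["read1/1", "read1/2", "read2"]))
```

If every read ends up in the mateless list, there is nothing to pair, which is the situation the "All reads are mateless" error describes.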

Hope that helps!

Ka Ming

Qiyuan Qiu

Jun 21, 2014, 12:59:19 AM
to abyss...@googlegroups.com, qiuq...@gmail.com
Hi Ka Ming,

Sorry for the late reply.
Thank you for your answer. I understand the procedure for splitting one *.sra file into two parts, but
my main question is how ABySS makes use of the fact that the input comes from two different files.

In other words, what I do not understand is this: if I provide just one file (instead of two), what information am I hiding from ABySS
that causes it to produce "complete garbage"?

Tony Raymond

Jun 23, 2014, 12:29:02 PM
to Qiyuan Qiu, abyss...@googlegroups.com
Hi,

When you use fastq-dump on paired-end data in an SRA file without any options, the output will be chimeric unpaired reads (i.e. both forward and reverse reads concatenated into one sequence). Since the reads are unpaired, you get this error. The --split-files option writes the two mates of each pair as separate fastq records, in two files. You could also use the --split-spot option to produce read pairs as separate fastq records in a single file, which seems to be how you are expecting fastq-dump to work.
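As a rough picture of the difference, the sketch below takes one "spot" whose forward and reverse reads have been concatenated and splits it back into two records. It is an illustration only, not how fastq-dump is implemented; the function name `split_spot`, the fixed read length, the toy sequence, and the '/1' and '/2' naming are all assumptions for the example.

```python
def split_spot(name, sequence, quality, first_read_length):
    """Split a concatenated spot into two (name, sequence, quality) records."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    if not 0 < first_read_length < len(sequence):
        raise ValueError("first_read_length must fall inside the spot")
    forward = (name + "/1", sequence[:first_read_length], quality[:first_read_length])
    reverse = (name + "/2", sequence[first_read_length:], quality[first_read_length:])
    return forward, reverse


if __name__ == "__main__":
    # Toy spot: 4 forward bases followed by 4 reverse bases.
    for record in split_spot("spot1", "ACGTTTGA", "IIIIHHHH", 4):
        print(record)
```

Without the split, an assembler would see a single 8-base read with no mate, so the boundary between the two mates is the information that gets lost.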

Note that if you set the SRA file as input, abyss will use the --split-spot option so you shouldn't have to create the fastq files yourself. Have you tried to set the SRA file as input?

Cheers,
Tony

Qiyuan Qiu

Jun 23, 2014, 2:33:04 PM
to abyss...@googlegroups.com, qiuq...@gmail.com
Hi Tony,

I tried using the *.sra file as input, and it succeeded.

Thank you very much,
Qiyuan Qiu