Currently have fastq.gz files but need .fna (fasta)

Patrick Rosenthal

Jan 19, 2016, 12:37:56 PM
to Qiime 1 Forum
Hello,

I am doing research for school and we have a bunch of raw data files (fastq.gz) from an Illumina MiSeq machine. However, I need these files combined into one .fna file to use with QIIME. I believe an .fna file is made up of fastq files that have been converted to fasta for processing.

Does anyone have a script or another way to create the .fna file? I actually have multiple sets of data that I need to do this for.

Thank you, any help would be greatly appreciated!

Thank you,
Patrick Rosenthal

justink

Jan 19, 2016, 5:28:20 PM
to Qiime 1 Forum
Actually, QIIME should be able to handle gzipped (.gz) fastq files directly. Take a look at split_libraries_fastq.py and see if that works.
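
If a plain conversion outside QIIME is still wanted, the idea can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names, the rule that the sample name is the text before the first underscore in the file name, and the "sample_number" label style are all illustrative choices, not something taken from QIIME. It assumes plain four-line FASTQ records, and it does no quality filtering or demultiplexing, which split_libraries_fastq.py does.

```python
import gzip
from pathlib import Path

SUFFIX = ".fastq.gz"


def fastq_gz_to_fasta(fastq_gz_path, fasta_handle, sample_id, start=0):
    """Append the reads of one gzipped FASTQ file to an open FASTA handle.

    Assumes plain four-line FASTQ records. Returns the number of reads written.
    """
    count = 0
    with gzip.open(fastq_gz_path, "rt") as fastq:
        while True:
            header = fastq.readline().rstrip("\r\n")
            if not header:
                break  # end of file (or a blank line)
            sequence = fastq.readline().rstrip("\r\n")
            fastq.readline()  # '+' separator line, ignored
            fastq.readline()  # quality string, dropped because FASTA has none
            # Assumed label style: "<sample>_<running number> <original header>"
            label = f"{sample_id}_{start + count} {header[1:]}"
            fasta_handle.write(f">{label}\n{sequence}\n")
            count += 1
    return count


def combine_folder(input_dir, output_fna):
    """Write every *.fastq.gz file found under input_dir into one FASTA file."""
    total = 0
    with open(output_fna, "w") as out:
        for path in sorted(Path(input_dir).rglob("*" + SUFFIX)):
            # Hypothetical naming rule: text before the first underscore
            sample_id = path.name[: -len(SUFFIX)].split("_")[0]
            total += fastq_gz_to_fasta(path, out, sample_id, start=total)
    return total
```

Calling combine_folder("all_fastq", "seqs.fna") would walk all subfolders of all_fastq, so 20 folders can be handled in one run. Both paths are placeholders.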

Patrick Rosenthal

Jan 19, 2016, 6:16:28 PM
to Qiime 1 Forum
Thank you for the reply!  I noticed the script that you suggested takes
separate fastq files for the barcodes.  I don't believe we have
separate fastq files for barcodes.  We were using a mapping file for
all of the scripts that were run with the .fna.  Will this affect
anything?

Will split_libraries_fastq.py create an .fna file?  If so, I have
about 20 folders worth of fastq files for one data set. Is there an
easy way to combine all of these, or would I have to manually run the
script 20 times?

Thank you!

P.S. I am not a biology major, I am actually a computer science major.
I am helping with the computer side of some biological analysis.  So I
may ask stupid questions...

Colin Brislawn

Jan 20, 2016, 12:25:08 PM
to Qiime 1 Forum
"20 folders worth of fastq files"
There's a script for that! 
http://qiime.org/scripts/multiple_split_libraries_fastq.html

Most of us came from microbiology or comp sci and are scrambling to catch up. Welcome! Ask away!

Colin Brislawn 

Patrick Rosenthal

Jan 20, 2016, 1:52:45 PM
to Qiime 1 Forum
Thank you for the response, Colin!

I am attempting to run the script you suggested but it says the file does not exist?  

How do I download the file?

Most of the other files I have used have been already built into Qiime.

Thank you,
Patrick Rosenthal

Colin Brislawn

Jan 20, 2016, 2:06:03 PM
to Qiime 1 Forum
Hello Patrick,

Can you post the full command you ran and the full error message? That will help us look for clues. 

Colin

Patrick Rosenthal

Jan 20, 2016, 2:30:50 PM
to Qiime 1 Forum
Command:
multiple_split_libraries_fastq.py -i all_fastq -o output_fna --demultiplexing_method mapping_barcode_files --mapping_indicator mapping

Error message:
multiple_split_libraries_fastq.py: command not found

I am assuming that is because I don't have the file. 
Thank you! 

Colin Brislawn

Jan 20, 2016, 6:07:44 PM
to qiime...@googlegroups.com
What version of QIIME are you using? The script was introduced recently, so an older version may not have it.

Colin

P.S. You can learn more about your qiime version by running print_qiime_config.py
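
"command not found" is the shell saying that no executable with that name is on the PATH, so it is a sign that the installed version does not ship the script. As a rough way to check several script names at once, here is a minimal sketch using Python's standard shutil.which. The helper name find_missing_scripts is made up for illustration and is not part of QIIME.

```python
import shutil


def find_missing_scripts(script_names):
    """Return the names that cannot be found as executables on the PATH."""
    return [name for name in script_names if shutil.which(name) is None]


if __name__ == "__main__":
    wanted = ["print_qiime_config.py", "multiple_split_libraries_fastq.py"]
    missing = find_missing_scripts(wanted)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All scripts were found on PATH.")
```

Run it from the same terminal session used for QIIME, since the PATH can differ between environments.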

Patrick Rosenthal

Jan 20, 2016, 6:18:23 PM
to Qiime 1 Forum
I am using Qiime 1.8.0

Is there a way to upgrade or download scripts into qiime?

I am using the virtualbox version. 

Colin Brislawn

Jan 20, 2016, 7:14:59 PM
to Qiime 1 Forum
You could download the new VM image for 1.9.1 here: http://qiime.org/home_static/dataFiles.html 

You could also deploy a newer version of qiime from inside your VM, but this would be slightly harder than using the new VM. 

Colin

Patrick Rosenthal

Jan 20, 2016, 7:20:51 PM
to Qiime 1 Forum
If I install that newer version, will I lose the files I have in my current VirtualBox image?

Colin Brislawn

Jan 20, 2016, 7:30:55 PM
to Qiime 1 Forum
You can totally have two VM images at the same time. You will have to transfer your data from the old image to the new image, but nothing is lost.

Colin

Patrick Rosenthal

Jan 20, 2016, 9:49:12 PM
to Qiime 1 Forum
Okay, I will try doing that soon!

Thank you for the advice, I will keep you posted!

Patrick Rosenthal

Jan 27, 2016, 7:23:23 PM
to Qiime 1 Forum
UPDATE: I installed the latest version of QIIME, 1.9.1. I currently have fastq files but no barcode files. I also have a mapping file that has the barcode sequences and the sample IDs that go with them. How do I run multiple_split_libraries_fastq.py with this data? I could not figure out how to do this without barcode files.

Colin Brislawn

Jan 28, 2016, 12:34:14 PM
to Qiime 1 Forum
Hello Patrick,

Take a look at the documentation for multiple_split_libraries_fastq.py. Barcodes are an optional parameter, and are not even used by default. 

I take it that the fastq files have the sample names in their file names, like this:
R1.sample23.fastq
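
To make the file-name idea concrete, here is a minimal sketch of pulling a sample name out of a name like the one above. The rule used here (strip the fastq extension, then drop a leading "R1." or "R2." read marker) is only an assumption for illustration. It is not how multiple_split_libraries_fastq.py parses names internally, and the function name is made up.

```python
from pathlib import Path

FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def sample_name_from_filename(filename):
    """Guess a sample name from a fastq file name such as R1.sample23.fastq."""
    name = Path(filename).name
    # Remove the first matching fastq extension, if any
    for suffix in FASTQ_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    # Hypothetical rule: drop a leading read marker such as "R1" or "R2"
    parts = name.split(".")
    if len(parts) > 1 and parts[0] in ("R1", "R2"):
        parts = parts[1:]
    return ".".join(parts)
```

With this rule, R1.sample23.fastq gives sample23, while a name without a read marker, such as S1.fastq, is returned as S1.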

Colin

Patrick Rosenthal

Jan 28, 2016, 2:25:12 PM
to Qiime 1 Forum
Thank you for the response. The sample names that came off the machine and appear in the fastq file names are like S1, S2, S3... However, the sample names that we actually want to assign to the sequences are ones that we made up based on location. We do not know which fastq file corresponds to which location. The mapping file we have lists the Sample ID we made for each location and the barcode that goes with that Sample ID. I am assuming the barcode for each Sample ID can be matched to a fastq file?

Sorry if that is confusing, but let me know if that makes any sense. 

Thank you!!

Colin Brislawn

Jan 28, 2016, 2:34:58 PM
to Qiime 1 Forum
Thanks for talking me through this, Patrick.

So
mapping file: yourSampleID <--> location
barcodes: barcodes <--> yourSampleID
fastq file names: fastqSampleID <--> barcodes

... but you don't have the barcode for each fastq file?
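
The barcode side of that chain can be read straight from the mapping file. Below is a minimal sketch, assuming a tab-separated QIIME 1 mapping file whose header row contains the #SampleID and BarcodeSequence columns. The function name is made up for illustration. It only builds the barcode-to-SampleID lookup, so the missing link from fastq file name to barcode still has to come from somewhere else, such as the sample sheet used for the sequencing run.

```python
import csv


def barcode_to_sample_id(mapping_path):
    """Build a {barcode: SampleID} lookup from a tab-separated mapping file."""
    lookup = {}
    with open(mapping_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        sample_col = header.index("#SampleID")
        barcode_col = header.index("BarcodeSequence")
        for row in reader:
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comment lines
            barcode = row[barcode_col].strip().upper()
            if barcode in lookup:
                raise ValueError("barcode used twice: " + barcode)
            lookup[barcode] = row[sample_col].strip()
    return lookup
```

Once each fastq file's barcode is known, looking it up in this dictionary gives the location-based SampleID for that file.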

Colin