Missing a barcode fastq file for Illumina run


lsti...@gmail.com

May 25, 2013, 6:43:08 PM
to qiime...@googlegroups.com
Hi all,
We've run a sample on a MiSeq and cannot identify the fastq file that contains our barcodes. We've been working from the QIIME Illumina tutorial page.

The MiSeq fastq files we have are:
Tag01_S1_L001_R1_001.fastq
Tag01_S1_L001_R2_001.fastq
And accompanying undetermined files.

Here is what the head command shows for each file:

qiime@qiime-VirtualBox:~/Desktop/Shared_Folder/Data/Intensities/BaseCalls$ head -n 4 Tag01_S1_L001_R1_001.fastq
@M00317:62:000000000-A3GFY:1:1101:8618:7512 1:N:0:1
CGACTACCCTCCCGACTGCCTATGATGTTTATCCTTTGGATGGTCGCCATGATGGTGGTTATTATACCGTCAAGGACTGTGTGACTATTGACGTCCTTCCCCGTACGCCGGGCAATAATGTTTATGTTGGTTTCATGGTTTGGTCTAACTT
+
>3>>ADFFFFBBGCAEEEGGGFHHFHFGHFHHFHHHHHFGGHCHHCG2ECFBGHHBAEEGHGHHHHHDEFDAAGC1FGHGFHHHGHEFHFGHGGGGGHHHHHGGGGGGGGGGGGGFHHHHHFHHFHFHHEFHHHHHHGHGGFCGHHHGDFF

qiime@qiime-VirtualBox:~/Desktop/Shared_Folder/Data/Intensities/BaseCalls$ head -n 4 Tag01_S1_L001_R2_001.fastq
@M00317:62:000000000-A3GFY:1:1101:8618:7512 2:N:0:1
TTATCGGTAGCAAGCACATCACCTTGAATGCCACCGGAGGCGGCTTTTTGACCGCCTCCAAACAATTTAGACATGGCGCCACCAGCAAGAGCAGAAGCAATACCGCCAGCAATAGCACCAAACATAAATCACCTCACTTAAGTGGCTGGAG
+
AAABBBBBBBFFGGGGGGGGGGHHHHHHHFHHHHHFACGCEGGGEEHHHGGGHGGGGGHFFFHHHHHHGGHHHHGGHEDFGGHHFEF2FFHHHHHHHGGGEGGHA/DGCGHHHFFHHHHGHHHHFHHHHHHHHHHGHHGHG<=DDEHEGGC

We received similar results for the undetermined files.

We then ran grep -c with our first two barcode sequences, using primers from here. We found 0 and 0 hits in the Tag files and at most 3 and 41 in the undetermined files. We assumed the hits in the undetermined files were false positives.

Are we just missing a barcode file because of the way the MiSeq was run? If so, should we forward your 2011 PNAS paper to the lab? We've read a few other posts describing what sound like similar, if not identical, problems, but we finally wanted to ask for some advice ourselves.

Thanks in advance,
Leland

Tony Walters

May 25, 2013, 6:56:40 PM
to qiime...@googlegroups.com
Hello Leland,

You might also try looking for the same barcode in the other three possible orientations (reverse complement, complement, and reverse) with the grep -c command.
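For anyone who would rather script this than retype each variant, here is a minimal Python sketch of the same check. The function names are made up for illustration. It assumes an uncompressed fastq with strict 4-line records, and unlike grep -c it looks only at the sequence line of each record, so a chance match in a quality string is not counted.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def orientations(barcode):
    """Return the barcode in all four orientations, keyed by name."""
    comp = barcode.translate(_COMPLEMENT)
    return {
        "forward": barcode,
        "reverse": barcode[::-1],
        "complement": comp,
        "reverse_complement": comp[::-1],
    }


def count_reads_containing(fastq_path, barcode):
    """Count reads whose sequence contains each orientation of the barcode."""
    variants = orientations(barcode)
    counts = {name: 0 for name in variants}
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the 2nd line of every 4-line fastq record.
            if line_number % 4 != 1:
                continue
            for name, seq in variants.items():
                if seq in line:
                    counts[name] += 1
    return counts
```

Calling count_reads_containing("Tag01_S1_L001_R1_001.fastq", your_barcode) returns one count per orientation, which makes it easy to see whether any orientation stands out from the others.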

It is possible that the sequencing center did not do the third read with the index primer, and sending them the paper (pointing to figure 1) might help clear up confusion on their end.

-Tony



--
 
---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

lsti...@gmail.com

May 25, 2013, 9:15:28 PM
to qiime...@googlegroups.com
Hi Tony,
Wow, thanks for the quick response. We went out for dinner; we should have just gone downstairs and grabbed some Cheetos.

So we've run the complement, reverse complement, and reverse, and we didn't get any hits. Thanks for your help, we'll let you know how it works out.

Leland

Mike

May 26, 2013, 11:05:18 AM
to qiime...@googlegroups.com
Hi Leland,

I've had this issue myself. The problem is that the MiSeq is not set up to generate barcode.fastq files by default. Below is an edited snippet of e-mailed instructions on how to enable this, with some comments on barcodes, thanks to some helpful correspondence with Sarah from Argonne National Laboratory.

Here's how to set up the MiSeq to write the barcode / index read file:
# start e-mail
"""Modify the MiSeqReporter.exe.config file, which can be found in C:\Illumina\MiSeq Reporter.

The following needs to be included in the top portion of the file (the <appSettings> section):

<add key="CreateFastqForIndexReads" value="1" />

You will then want to restart the service. This can be accomplished by right-clicking the taskbar in Windows, selecting "Start Task Manager", selecting the "Services" tab, finding MiSeq Reporter in the list, and then stopping and starting the service.

The MiSeq doesn't put barcodes in the header of the fastq unless it completes the demultiplexing (which we don't want the MiSeq to do, because we're going to use QIIME). However, CASAVA-generated fastqs, like the ones you've received from JGI, always contain barcode information in the header regardless of whether they have been demultiplexed."""
# end e-mail
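If you want to double-check the edit before restarting the service, something like the Python sketch below could be run against a copy of the config file. The function name is made up, and it assumes the usual .NET layout in which <appSettings> is a direct child of the root element; I have not verified that against every MiSeq Reporter version. It only reads the file and never modifies it.

```python
import xml.etree.ElementTree as ET

KEY = "CreateFastqForIndexReads"


def index_fastq_enabled(config_path):
    """Return True if the config file sets CreateFastqForIndexReads to "1"."""
    root = ET.parse(config_path).getroot()
    # Assumption: <appSettings> sits directly under the root element.
    app_settings = root.find("appSettings")
    if app_settings is None:
        return False
    return any(
        entry.get("key") == KEY and entry.get("value") == "1"
        for entry in app_settings.findall("add")
    )
```

A False result means the key is missing, is set to another value, or sits outside the <appSettings> section.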

Some sequencing facilities can be reluctant to make this modification to their machine, or do not even know that it is possible. So I suggest sending the above snippet of information along with the figure reference that Tony mentioned to help expedite obtaining your barcode files.

Finally, in my case, as hinted above, the data I receive from JGI do not include a separate barcode.fastq file, but they do append the barcode sequence to the read 1 and read 2 fastq headers. So I use that information to make a mock barcode.fastq file, using a modified variant of Tony's code as posted here:
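The linked code is not reproduced in this thread. As a rough illustration of the idea only (this is not Tony's script), here is a minimal Python sketch. It assumes the barcode is the last colon-separated field of each header, as in "... 1:N:0:ACGTACGTACGT", and it fills in a placeholder quality string because the header carries no quality scores for the index read. The function names and the placeholder quality character are my own choices.

```python
def barcode_records(fastq_lines, quality_char="I"):
    """Yield 4-line fastq records holding the barcode parsed from each header."""
    for line_number, line in enumerate(fastq_lines):
        # Only the 1st line of every 4-line record is a header.
        if line_number % 4 != 0:
            continue
        header = line.rstrip("\n")
        # Assumption: the barcode is the last colon-separated header field.
        barcode = header.rsplit(":", 1)[-1]
        if not barcode or set(barcode) - set("ACGTN"):
            raise ValueError(f"no barcode sequence in header: {header}")
        # Placeholder qualities: the header carries no index-read scores.
        yield f"{header}\n{barcode}\n+\n{quality_char * len(barcode)}\n"


def write_mock_barcode_fastq(reads_path, out_path):
    """Write a mock barcode fastq built from the headers of reads_path."""
    with open(reads_path) as reads, open(out_path, "w") as out:
        out.writelines(barcode_records(reads))
```

Headers that end in a sample number rather than a sequence (for example "1:N:0:1", as in the MiSeq files earlier in this thread) raise an error instead of silently producing a bogus barcode file.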

-Hope this helps! 
-Mike

lsti...@gmail.com

May 27, 2013, 11:01:02 AM
to qiime...@googlegroups.com
Thanks for your post, Mike, that's very helpful. We are fortunate to have a MiSeq in the core here at UVA, so I'm hopeful we can work with them to modify the machine as needed.

Thanks again,
Leland

Mike

May 27, 2013, 1:31:31 PM
to qiime...@googlegroups.com
Great! I am in the same boat as you: working with getting our local machine set up properly. :-)

Mike

Olivier

Jun 27, 2013, 7:32:01 AM
to qiime...@googlegroups.com
Hi all,

Thanks for the info. I configured MiSeq Reporter as described above, and it worked quite well. But now, with my 12-plex paired-end run, I have 39 files in the data folder (three files per sample and three files for undetermined). Can someone help me obtain only three files (forward, reverse, and index)? It should be simple, but I can't find how to do it.
Best
Olivier

Mike

Jul 3, 2013, 11:15:22 AM
to qiime...@googlegroups.com
Interesting, are they all fastq files? If so, you may be able to just concatenate them accordingly. Take a look at a related forum link below, perhaps Sarah can help?
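If concatenation turns out to be the way to go, a minimal Python sketch might look like the one below. It assumes uncompressed files named like the ones earlier in this thread (..._R1_001.fastq, ..._R2_001.fastq) and assumes the index reads use an _I1_ tag, which you should check against your own folder. The function name and the "combined_" output names are made up. Files are joined in sorted order, so each sample's reads land in the same position in all three outputs, provided every sample has all three files.

```python
import shutil
from pathlib import Path


def concatenate_by_read(data_dir, out_dir, tags=("R1", "R2", "I1")):
    """Concatenate per-sample fastq files into one combined file per read tag."""
    data_dir, out_dir = Path(data_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    used = {}
    for tag in tags:
        # Sorted order keeps samples in the same sequence for every tag.
        sources = sorted(data_dir.glob(f"*_{tag}_001.fastq"))
        with open(out_dir / f"combined_{tag}.fastq", "wb") as out:
            for source in sources:
                with open(source, "rb") as handle:
                    shutil.copyfileobj(handle, out)
        used[tag] = [source.name for source in sources]
    return used
```

The returned dictionary lists which files went into each combined file, so you can confirm that the three lists line up sample by sample before feeding the results to QIIME.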


-Mike

Sarah Owens

Aug 1, 2013, 12:11:02 PM
to qiime...@googlegroups.com
You're likely getting separate fastq files because you've included all of your samples and barcodes in the samplesheet.csv. Instead, just include a generic sample name and a random 12 bp barcode to tell the sequencer that you'd like it to complete a 12 bp index read.

Thanks.
Sarah

Mike

Aug 1, 2013, 12:37:10 PM
to qiime...@googlegroups.com
Thanks Sarah!

So, basically, we are pretending that we have one sample for the entire run. We provide a random 12 bp sequence for the barcode to instruct the MiSeq to sequence up to 12 bp for the index read.

Won't the MiSeq attempt to demultiplex the run by that random 12 bp barcode? If so, should that 12 bp barcode NOT match the actual barcodes being used? I assume there is a setting that instructs the MiSeq to demultiplex or not? Obviously, we do not want it to demultiplex, as we want to do this via QIIME.

-Mike

Owens, Sarah

Aug 1, 2013, 12:44:12 PM
to qiime...@googlegroups.com
Hi Mike,

You're exactly right. Just pick a barcode; you can make sure it's not one of the barcodes you're using (although it doesn't really matter, as you need the reverse complement of the barcodes for it to be demultiplexed).

There is no way to turn demultiplexing on or off specifically. If there are barcodes in the sample sheet, the MiSeq will try to demultiplex. If there aren't, it won't. If you put in just one sample as instructed, you'll get empty fastqs with the sample ID you listed, and then "undetermined" fastqs that have all of the data from the run.
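If you would rather not pick the placeholder by hand, a small Python sketch like the following could generate one. It avoids both the real barcodes and their reverse complements, to be safe about orientation. The function names are made up for illustration, and this only produces the sequence; it does not write a sample sheet.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def placeholder_barcode(real_barcodes, length=12, rng=None):
    """Return a random barcode that matches no real barcode in either orientation."""
    rng = rng or random.Random()
    forbidden = {barcode.upper() for barcode in real_barcodes}
    forbidden |= {reverse_complement(barcode) for barcode in forbidden}
    while True:
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if candidate not in forbidden:
            return candidate
```

Passing rng=random.Random(some_seed) makes the choice reproducible, which is handy if the same placeholder should go into every sample sheet.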

Thanks.
Sarah


 
 



--
Sarah M. Owens, M.S.
Technical Director, IGSB-NGS Core Facility
Argonne National Laboratory
9700 S. Cass Avenue
Bldg. 202, Rm. A353
Lemont, IL 60439
Sarah...@anl.gov
630.252.2101
http://press.igsb.anl.gov/ngs/

Mike

Aug 1, 2013, 12:55:42 PM
to qiime...@googlegroups.com, sarah...@anl.gov
It all makes perfect sense now. Thanks for clarifying! :-)

Again, thank you so much for your help Sarah!

-M

yings...@126.com

Feb 6, 2014, 3:58:51 AM
to qiime...@googlegroups.com, sarah...@anl.gov
It's helpful. Thanks.

On Friday, August 2, 2013 at 12:55:42 AM UTC+8, Mike R wrote: