Barcodes with different lengths

jose luis acosta

Sep 22, 2015, 7:22:25 PM
to AftrRAD
Dear Mike Sovic,

In my barcode file, the barcodes have different lengths, for example:

 Sample   Barcode
ALT01 AACCGAACT
ALT02 AGAGTAGGAT
ALT03 CAATCTAATA
ALT04 CGTCGCGCGGT
ALT05 GATCGTTCAGA
ALT06 GCTTGATTGGAT
ALT07 TATTGACGACTAC
ALT08 TGCGCGGATTGGT
ALT09 AAGACAGGC
ALT10 AGGCTCCTAC
ALT11 CAGCGTACAT
ALT12 CTCAGCTTCCA
ALT14 GCAGGAAAAGTT
ALT15 GGCGTCTACGGA
ALT16 TCATCCCATGGGT
ALT17 TGCTAGTGAGGGT
ALT18 AAGGTCGAT
ALT19 ATAAGCTGTA
ALT20 CCATATGCGA
ALT21 CTCGCGCCAGT
ALT22 GCCTTAACGCCT
ALT23 GGTCCGAACTTC
ALT24 TCCGGCCGGATAT
ALT25 TTACTTAGGCCAT


When I run the ./AftrRAD.pl script, I can only use barcodes of the same length.
How can I modify the ./AftrRAD.pl script to allow barcodes of different lengths?

Best wishes
Jose Luis Acosta Ph. D.

Mike Sovic

Sep 23, 2015, 8:19:07 AM
to AftrRAD
Hi Jose,

There should not be any problem with having different barcode sizes - this will be recognized from the barcode file automatically and accounted for.  There is more information about this in the FAQ document (attached).  Having said that, our lab has always used barcodes with the same length, so we haven't had the opportunity to test this function out a great deal ourselves.  I'm pretty sure other labs have run analyses with varying barcode lengths, and I'm not aware of any problems, at least up through version 4.1.  If you see any specific errors/warnings when running AftrRAD with your dataset that might be associated with the varying barcode lengths, or if you have any concerns about the dataset that is produced, please let us know.  Good luck!
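For readers wondering how barcodes of mixed lengths can be told apart at all, here is a minimal sketch in Python. It is not AftrRAD's actual code (AftrRAD is written in Perl and its internals are not shown in this thread). The function names are hypothetical, and the "sample, then barcode" layout is assumed from the listing in the first post. The idea is simply to test each read against the longest barcode length first, then the shorter ones.

```python
# Hypothetical sketch, NOT AftrRAD's implementation: assign a read to a sample
# when the barcodes in the barcode file have different lengths.

def load_barcodes(lines):
    """Parse 'SAMPLE BARCODE' lines into a {barcode: sample} dict."""
    barcodes = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or fields[0] == "Sample":
            continue  # skip blank lines and the header row
        sample, barcode = fields
        barcodes[barcode.upper()] = sample
    return barcodes


def assign_read(read, barcodes):
    """Return (sample, read_without_barcode), or (None, read) if nothing matches.

    Longer barcode lengths are tried first so that a short barcode that
    happens to be a prefix of a longer one cannot shadow it.
    """
    for length in sorted({len(b) for b in barcodes}, reverse=True):
        sample = barcodes.get(read[:length])
        if sample is not None:
            return sample, read[length:]
    return None, read


barcode_lines = [
    " Sample   Barcode",
    "ALT01 AACCGAACT",
    "ALT02 AGAGTAGGAT",
    "ALT07 TATTGACGACTAC",
]
barcodes = load_barcodes(barcode_lines)
# Made-up read: the 10-base ALT02 barcode followed by an arbitrary insert.
print(assign_read("AGAGTAGGATTGCAGGTTACC", barcodes))
```

The example read starts with the ALT02 barcode, so the sketch reports ALT02 and returns the read with those 10 bases trimmed off.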

           Mike      
AftrRAD_FAQs.pdf

jose luis acosta

Sep 24, 2015, 12:37:22 PM
to AftrRAD
Hi Mike,

I put X's at the end of the shorter barcodes. Can this work? For example:

AACCGAACTXXXX ALT01
AGAGTAGGATXXX ALT02
CAATCTAATAXXX ALT03
CGTCGCGCGGTXX ALT04
GATCGTTCAGAXX ALT05
GCTTGATTGGATX ALT06
TATTGACGACTAC ALT07
TGCGCGGATTGGT ALT08
AAGACAGGCXXXX ALT09
AGGCTCCTACXXX ALT10
CAGCGTACATXXX ALT11
CTCAGCTTCCAXX ALT12
GCAGGAAAAGTTX ALT14
GGCGTCTACGGAX ALT15
TCATCCCATGGGT ALT16
TGCTAGTGAGGGT ALT17
AAGGTCGATXXXX ALT18
ATAAGCTGTAXXX ALT19
CCATATGCGAXXX ALT20
CTCGCGCCAGTXX ALT21
GCCTTAACGCCTX ALT22
GGTCCGAACTTCX ALT23
TCCGGCCGGATAT ALT24
TTACTTAGGCCAT ALT25
AATAGCTCCXXXX COL01
ATTCAGGAACXXX COL02
CCGACTTCTCXXX COL03
CTGCCGCTCTAXX COL04
GCGAAAATATGCX COL05
GTAATGAATTCAX COL06
TCGTACGCGGAGA COL07
TTAGCTATCGGGA COL08
ACCAGGCGTXXXX COL09
ATTGGGGTGTXXX COL10
CGATTATCGTAXX COL11
CTTAAGGTTGTXX COL12
GCTCTATGAAACX COL13
GTCTATAGCGGAX COL14
TCTACCGTGTGGT COL15
TTATGTCTCAGTC COL16
ACCTATCACXXXX COL17
CAACTGTATTXXX COL18
CGGCACCAGCTXX COL19
GATATTGGCTAXX COL20
GCTGCATTAATTX COL21
TATCTAACCGAGA COL22
TGAGAGCTGTGGA COL24
TTCGATGGTACGT COL25

best regards

jose

Mike Sovic

Sep 24, 2015, 2:11:53 PM
to AftrRAD
Hi Jose,

No, putting X's at the end of the barcodes will not work.  Just remove the X's from what you have here and what is left should work - again, it should be OK if the barcodes are different lengths.
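If editing the file by hand is tedious, the padding can be stripped with a few lines of Python. This is only a sketch: the function name is made up, and it assumes the "barcode, then sample name" layout shown in the post above, with barcodes written in uppercase A/C/G/T/N.

```python
import re

# Hypothetical helper: remove trailing X padding from the barcode column.
# Assumes each line looks like "BARCODE<whitespace>SAMPLE".
PADDED_BARCODE = re.compile(r"^([ACGTN]+)X+(?=\s)")


def strip_padding(line):
    """Return the line with any run of X's after the barcode removed.

    The original delimiter and the sample name are left untouched; lines
    without padding come back unchanged.
    """
    return PADDED_BARCODE.sub(r"\1", line)


padded = [
    "AACCGAACTXXXX ALT01",
    "GCTTGATTGGATX ALT06",
    "TATTGACGACTAC ALT07",
]
for line in padded:
    print(strip_padding(line))
```

Anchoring the pattern to the start of the line keeps an X inside a sample name from being removed by accident.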

               Mike

jose luis acosta

Sep 24, 2015, 3:49:58 PM
to AftrRAD
Hi Mike,

I ran the AftrRAD.pl script and got this error:

Scoring each pairwise alignment.  Alignments exceeding 90 % similarity will be retained.
Parsing aligned sequences into candidate loci.
Finished writing the mafft output.
Identifying and removing paralogous loci.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $SecondCount in numeric ge (>=) at AftrRAD.pl line 2719.
Use of uninitialized value $ThirdCount in numeric ge (>=) at AftrRAD.pl line 2736.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value in addition (+) at AftrRAD.pl line 2709.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $CurrentLocus in concatenation (.) or string at AftrRAD.pl line 2900.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[0] in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3104, <READFILE> line 1.
Use of uninitialized value $TempSeqArray[1] in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value in concatenation (.) or string at AftrRAD.pl line 3105, <READFILE> line 1.
Use of uninitialized value $PreviousLocusNumber in concatenation (.) or string at AftrRAD.pl line 3103, <READFILE> line 1.

I had seen this error before, which is why I asked you.
Any idea?

best regards

Jose

Mike Sovic

Sep 24, 2015, 10:36:38 PM
to AftrRAD
Hi Jose,

OK - we'll try to find the problem, but we will need some more information.  At this point in the script, you should have the folder TempFiles/RawReadCountFiles.  Inside this folder, there should be a set of files named "RawReadCounts_X.txt", where X is one of your sample names.  First, check to make sure all of these files are approximately the same size.  If so, open one of them (it doesn't matter which one), and send us the last ~10-20 lines of this file.  Once we see what those look like, we'll go from there.
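The check described above can also be scripted. The following is a rough Python sketch, assuming it is run from the directory that contains TempFiles. Only the folder and file-name pattern come from the message above; the helper names are made up for illustration.

```python
from pathlib import Path


def file_sizes(folder):
    """Map each RawReadCounts_*.txt file name in `folder` to its size in bytes."""
    return {
        p.name: p.stat().st_size
        for p in sorted(Path(folder).glob("RawReadCounts_*.txt"))
    }


def tail(path, n=20):
    """Return the last n lines of a text file (reads the whole file into memory)."""
    with open(path) as handle:
        return handle.read().splitlines()[-n:]


if __name__ == "__main__":
    folder = Path("TempFiles/RawReadCountFiles")
    sizes = file_sizes(folder) if folder.is_dir() else {}
    for name, size in sizes.items():
        flag = "  <-- empty" if size == 0 else ""
        print(f"{size:>12}  {name}{flag}")
    if sizes:
        # Any file will do; show the end of the first one.
        first = next(iter(sizes))
        print("\n".join(tail(folder / first)))
```

Files of very different sizes, or empty files, stand out immediately in the size column.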

          Mike

jose luis acosta

Sep 25, 2015, 4:06:36 PM
to AftrRAD
Hi Mike,

All the files were created, but they are empty. What does this mean?

best regards

Jose

Mike Sovic

Sep 25, 2015, 4:26:04 PM
to AftrRAD
Hi Jose,

Not sure yet - next check the folder TempFiles/ErrorReadTest.  This folder should contain the files "AllReadsAndDepths.txt" and "ErrorTestOut.txt".  Check to make sure there are sequences in both of these and let me know.  If these do contain sequences, send the first ~50 lines or so from the ErrorTestOut.txt file.

              Mike 

jose luis acosta

Sep 25, 2015, 5:42:34 PM
to AftrRAD
Hi Mike,

The content of the AllReadsAndDepths.txt file is:

AAACGAAGGAGAGCTGGCGATTGCGCATCAGCGCGCCCGGCAGAAGGCCGGCTTCGAGCAGGATCTGGAAACCGTTGCACAC 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AAAGCTAGGTCAATGGTGGAAACATGCACCATACTAGCTAGCCGGACACAGAGTACCTCCCCACCAAAGGGCAGAGATCGGA 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
AAATGTAAATATATATAGGGAAGAGACTGTAGGGTTCATGGGAAATGTCTATTGGCAGTTGAGACTTTAAAGTGGGTAGTTT 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
AACATCTTAGTAGCCAGAGGAAAAGAAAGCAAAAGCGATTCCCGTAGTAGCGGCGAGCGAAATGGGAGCAGAGATCGGAAGA 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
AACCTTAACCGATGGTCAAGTAGACAAGTTTGAGAAGGGAGGATTAGCTACCATTGCAATAATGGTAGCACAGGCAGAGATC 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
AACGCGGTGATCTACCGTATGCGAACAGGGTGCCAGTGGAACCACCTTCCCAAAGAGTTTCCCGACGATGCCTCGGTGCACC 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AAGAAATAGACCAATCTTGTGCTCATCCTCTTGAACCGGCCTCTGTATCTGAGTCCGTGAAGGTTTTGACTGAGGGCCTGGC 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AAGCTCAGAAGGGAGAAGAGAACTCGGACAGCTCTCGTTTATAGAGACTCTAGTAAATATCATCCGGGACGGTAGTCCAAAG 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
AATATGGGTGTCCGATCCTTATGGACTAACGGGAAAAATACAACCTGTAAATCCGGCGTGGGGCGTGGAAGGTTTTGATCCT 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AATGGCTAGTACTGCAGTCGGCCTTGCCTAAACACTTTGATCTCAGGGCCGTGCGTCGAGAAGGAGCCGAAATTGAACATTA 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AATTAACAGCAAATGGTTATGATAGCACTTGAGCTAATTTGTTGATTTAAATTGAGCCCCACGTAGAATATACAGACAAGCT 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
AATTACCTTCAGTTTAAATGACTCCAGTTGTTTCTCTGAATCTCTTGTCAGGCGTTCAACCTCCGCACATGCAATTCTCCTT 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The ErrorTestOut.txt file is empty. This error does not occur when the barcode file is split by length, i.e., when all the barcodes in a run are the same length. Running the analysis that way does not help me.

Jose

Mike Sovic

Sep 25, 2015, 9:24:15 PM
to AftrRAD
Hi Jose,

This AllReadsAndDepths file contains every unique sequence read in the dataset, and the values represent the number of times that sequence read occurred in each of your samples.  For each read, these values are averaged (ignoring any zeros) to determine whether the read is included in the ErrorTestOut.txt file.  If the average for the read is greater than the MinDepth parameter (default is 5 I think), then the read is printed to the ErrorTestOut.txt file, and is used for the next steps.  Otherwise, the read is discarded.  All of the reads you sent have an average of 1, so they would be discarded (presumably error reads).  
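To make the filter concrete, here is a small Python sketch of the rule as described above. It is an illustration only, not the code AftrRAD runs. The function names and the two example lines are made up, the threshold of 5 is the default as recalled above, and whether the real comparison is strict or inclusive is an assumption.

```python
def parse_depth_line(line):
    """Split 'SEQUENCE d1 d2 ...' into (sequence, [per-sample depths])."""
    fields = line.split()
    return fields[0], [int(x) for x in fields[1:]]


def mean_nonzero_depth(depths):
    """Average the per-sample counts, ignoring samples with zero reads."""
    nonzero = [d for d in depths if d > 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0.0


def passes_filter(depths, min_depth=5):
    """Keep the read only if its mean non-zero depth exceeds min_depth.

    Assumption: a strict '>' comparison, following the wording above.
    """
    return mean_nonzero_depth(depths) > min_depth


# Made-up lines in the same layout as AllReadsAndDepths.txt.
lines = [
    "ACGTACGT 0 0 1 0 0",   # seen once in one sample: mean non-zero depth 1
    "TTGACCAA 0 12 0 9 6",  # seen in three samples: mean non-zero depth 9
]
for line in lines:
    seq, depths = parse_depth_line(line)
    print(seq, mean_nonzero_depth(depths), passes_filter(depths))
```

The first line behaves like the reads posted above (mean of 1, discarded); the second would be kept.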

I assume what you sent is just a sample from the AllReadsAndDepths.txt file - right?  If so, can you scroll through this file and make sure there are reads that have an average of 5 or more?  I'm guessing there are if this worked previously with a subset of your barcodes, but want to make sure before we go further.

                         Mike