demultiplexing help

safi...@gmail.com

Dec 16, 2020, 2:05:30 PM
to Stacks
Hello everyone,

I am quite sure some of you can solve this issue.
I am trying to demultiplex my paired-end ddRAD data.
My command was:
$ process_radtags -P -p ./data/ -o try/ --inline_index -b ./barcode.txt --renz_1  SbfI --renz_2 MspI -c -q -r -i gzfastq

The error says it could not find my files, although they are in the data folder. Could you please tell me what I did wrong? Thank you.
Best, Safi


Output was: 

Processing paired-end data.

Using Phred+33 encoding for quality scores.

Unable to locate any input files to process within './data/'

Found 0 paired input file(s).

Searching for single-end, inlined and paired-end, indexed barcodes.

Loaded 1 barcodes (8bp / 8bp).

Will attempt to recover barcodes with at most 1 / 1 mismatches.

Closing files, flushing buffers...

Outputing details to log: 'try/process_radtags.data.log'

0 total sequences

0 barcode not found drops (nan%)

0 low quality read drops (nan%)

0 RAD cutsite not found drops (nan%)

0 retained reads (nan%)



Safiqul Islam

Dec 16, 2020, 2:31:30 PM
to stacks...@googlegroups.com
Dear All,

I solved it, but now I am facing another problem:
process_radtags cannot find the barcodes in my sequences.

$ process_radtags --paired --interleaved -p ./data/ -o ./try/  --inline_inline -b ./barcode.txt --renz_1  SbfI --renz_2 MspI -c -q -r



Processing paired-end data.


4131032 total sequences

4131032 barcode not found drops (100.0%)

      0 low quality read drops (0.0%)

      0 RAD cutsite not found drops (0.0%)

      0 retained reads (0.0%)


The sequences look like this:

Forward:

@A01182:27:HVF75DMXX:1:1101:22779:1016 1:N:0:GAGGACTT+NATTGCTC:ATCACG

ATCACGTGCATGTGAGAATATCAGTGGTTGTAGGGCTCCTTGTCCCCCATGGCACCGATTCCTGCTCTGTGCCCACATCTTTTTGCTCGTGCTGTATTTTT

+

FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF


Reverse: 

@A01182:27:HVF75DMXX:1:1101:22779:1016 2:N:0:GAGGACTT+NATTGCTC:ATCACG

NGGTGCTGAACTCTAAGGCAGTCAGTCGTCTTGTGTTACTCGGGAGCACAACGCAGTCATGACAAAAATAAACCACGATAAAAATACAGCACGAGCAAAAA

+

#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFF,FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFF



Barcode: GAGGACTT GATTGCTC L63586



Please help me: how can I demultiplex my data?



Thank you in advance.


--
Stacks website: http://catchenlab.life.illinois.edu/stacks/


--
Safiqul Islam
Founder & CEO, wildmentor (http://www.wildmentor.org/)
Scientific Assistant at Senckenberg Naturhistorische Sammlungen Dresden (http://www.senckenberg.de/)
Doctoral Student (Phylogeography-Ornithology) at Ludwig-Maximilians-Universität München (http://www.uni-muenchen.de/)

kwojt...@gmail.com

Dec 17, 2020, 8:41:04 AM
to Stacks
Hi Safi,

In your command, the --inline_inline flag tells process_radtags to look for the barcodes inline with the sequences. Your barcodes are in the header of the read (at the end of each header line below). Change your flag to --index_index and that should solve that problem. Hope this helps.

-Kris

@A01182:27:HVF75DMXX:1:1101:22779:1016 1:N:0:GAGGACTT+NATTGCTC:ATCACG
ATCACGTGCATGTGAGAATATCAGTGGTTGTAGGGCTCCTTGTCCCCCATGGCACCGATTCCTGCTCTGTGCCCACATCTTTTTGCTCGTGCTGTATTTTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

Reverse:
@A01182:27:HVF75DMXX:1:1101:22779:1016 2:N:0:GAGGACTT+NATTGCTC:ATCACG
NGGTGCTGAACTCTAAGGCAGTCAGTCGTCTTGTGTTACTCGGGAGCACAACGCAGTCATGACAAAAATAAACCACGATAAAAATACAGCACGAGCAAAAA
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFF,FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFF
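
To see where each barcode sits, it can help to split a header line into its parts. The following is a minimal Python sketch, not part of Stacks. The function name `split_header` is hypothetical, and it assumes the comment field has the colon-separated layout seen in the reads above. The sequence is shortened to the first bases of the forward read.

```python
def split_header(header):
    """Split a FASTQ header into read ID, index barcodes, and any extra field.

    Assumed layout (taken from the reads in this thread):
    @<read id> <read>:<filtered>:<control>:<index1>+<index2>[:<extra>]
    """
    read_id, comment = header.lstrip("@").split(" ", 1)
    fields = comment.split(":")
    index1, _, index2 = fields[3].partition("+")
    extra = fields[4] if len(fields) > 4 else None
    return {
        "id": read_id,
        "read": fields[0],
        "index1": index1,
        "index2": index2,
        "extra": extra,
    }


header = "@A01182:27:HVF75DMXX:1:1101:22779:1016 1:N:0:GAGGACTT+NATTGCTC:ATCACG"
sequence = "ATCACGTGCATGTGAGAATATCAGTGG"  # first bases of the forward read

parts = split_header(header)
print(parts["index1"], parts["index2"], parts["extra"])
# Check whether the extra header field is also the start of the read itself.
print(sequence.startswith(parts["extra"]))
```

For this read the two index barcodes appear only in the header, while the extra six bases appear both in the header and at the start of the sequence.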

Safiqul Islam

Dec 17, 2020, 2:09:06 PM
to stacks...@googlegroups.com
Dear all, especially Kris,

I could not solve it. Is there anyone who can help me? I tried to demultiplex the data but did not succeed.

My command line: $ process_radtags -P -p ./ngs/ -I --index_index -o ./out/ -b ./barcode/barcode.txt --renz_1  SbfI --renz_2 MspI -c -q -r -i gzfastq

Seq: 

@A01182:27:HVF75DMXX:1:1101:32533:1000 1:N:0:ACCTAAGG+NGGCATGT:ACAGTG

ACAGTGTGCAGGACTCTAGCTCATGGAAAGCCACAGGTGTCCCAATCCATCCTCATGCAAAATCTCCTCTACCATGCCATAGAGACCTTGCTCCTAACTAG

+

FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF


Barcode seq: ACCTAAGG TGGCATGT L63678_Track-104983
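
As a rough check, the header indices can be compared with the barcode file entry above by counting mismatching positions. This Python sketch is only an illustration and not how process_radtags is implemented. The helper `mismatches` is hypothetical, and treating an `N` as an ordinary mismatch is an assumption.

```python
def mismatches(a, b):
    """Count positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


header_index = "ACCTAAGG+NGGCATGT"                     # from the FASTQ header
barcode_row = "ACCTAAGG TGGCATGT L63678_Track-104983"  # from the barcode file

h1, h2 = header_index.split("+")
b1, b2, sample = barcode_row.split()

# Compare each header index with the corresponding barcode file column.
print(sample, mismatches(h1, b1), mismatches(h2, b2))
```

Here the first index matches exactly and the second differs only at the leading `N`.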


Demultiplexing did not work; the output was:


Processing paired-end data.

Using Phred+33 encoding for quality scores.

Found 4 interleaved, paired input file(s).

Searching for single and paired-end, indexed barcodes.

Loaded 2 barcodes (8bp / 8bp).

Will attempt to recover barcodes with at most 1 / 1 mismatches.

Processing file 1 of 4 [L63680_Track-104984.R2.fastq.gz]

  Reading data from:

  ./ngs/L63680_Track-104984.R2.fastq.gz

  Processing RAD-Tags...1M...2M...3M...4M...5M...6M...7M...8M...9M...10M...11M...12M...13M...

  27264684 total reads; -27264684 ambiguous barcodes; -0 ambiguous RAD-Tags; +0 recovered; -0 low quality reads; 0 retained reads.

Processing file 2 of 4 [L63678_Track-104983.R1.fastq.gz]

  Reading data from:

  ./ngs/L63678_Track-104983.R1.fastq.gz

  Processing RAD-Tags...1M...2M...3M...

  6002336 total reads; -6002336 ambiguous barcodes; -0 ambiguous RAD-Tags; +0 recovered; -0 low quality reads; 0 retained reads.


I am also attaching the log here. Could someone please advise me?


best

Safi







process_radtags.ngs.log

Kristopher Wojtusik

Dec 18, 2020, 9:09:15 AM
to stacks...@googlegroups.com
Hi Safi,

Sorry you keep running into issues. My guess (and this may be wrong, but it's the only thing I can think of) is that the additional sequence in the header (the third sequence, ACAGTG, at the end of the header below) is throwing process_radtags off. Is that the cut site? Or an additional barcode? I'm afraid I can't be much help in that respect; I have not seen more than two indices in the header of a FASTQ file before. Sorry I can't be more help.

Seq: 

@A01182:27:HVF75DMXX:1:1101:32533:1000 1:N:0:ACCTAAGG+NGGCATGT:ACAGTG

ACAGTGTGCAGGACTCTAGCTCATGGAAAGCCACAGGTGTCCCAATCCATCCTCATGCAAAATCTCCTCTACCATGCCATAGAGACCTTGCTCCTAACTAG

+

FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF


-Kris




--
Kris Wojtusik
Graduate Research Assistant
Department of Biological Sciences
University of New Hampshire

Julian Catchen

Dec 18, 2020, 9:26:56 AM
to stacks...@googlegroups.com, Safiqul Islam
Your indexed barcodes in your FASTQ headers look weird and appear to me
to be non-standard. There seems to be a pair of index barcodes,
specified as XXXXXX-XXXXXX, but these are followed by a third barcode,
e.g. XXXXXX-XXXXXX:XXXXXX, which seems to match the sequence on the 5'
end of the single-end read.
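
One way to test that observation across many reads is to count how often the last header field equals the first bases of the read. The Python sketch below is only an illustration under stated assumptions: `count_prefix_matches` is a hypothetical helper, records are assumed to be strict four-line FASTQ with no blank lines, and the two example records are shortened versions of the forward and reverse reads quoted in this thread.

```python
import io


def count_prefix_matches(handle):
    """Return (matches, total): reads that start with the last header field."""
    matches = total = 0
    while True:
        header = handle.readline().strip()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        extra = header.rsplit(":", 1)[-1]  # text after the final colon
        total += 1
        matches += seq.startswith(extra)
    return matches, total


records = (
    "@A01182:27:HVF75DMXX:1:1101:22779:1016 1:N:0:GAGGACTT+NATTGCTC:ATCACG\n"
    "ATCACGTGCATGTGAGAATA\n"
    "+\n"
    "FFFFFFFFFFFFFFFFFFFF\n"
    "@A01182:27:HVF75DMXX:1:1101:22779:1016 2:N:0:GAGGACTT+NATTGCTC:ATCACG\n"
    "NGGTGCTGAACTCTAAGGCA\n"
    "+\n"
    "#FFFFFFFFFFFFFFFFFFF\n"
)

result = count_prefix_matches(io.StringIO(records))
print(result)
```

For a real file, the same function could be given a handle from `gzip.open(path, "rt")`. In this small example only the forward read starts with the extra field.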

It is important that you determine how your library was constructed and
what the sequencing core did with the data instead of writing a dozen
messages to this email list.

For example, have you checked whether your files are already
demultiplexed? Most libraries that only use index barcodes are
demultiplexed by the sequencer itself and you would have one barcode set
per file. In this case, your sequences would start with the RAD cut site
remnants and you don't have to specify barcodes to process_radtags.
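
A quick way to check this is to tally the index pairs found in the headers of one file. The Python sketch below is a minimal illustration, not a Stacks feature. `index_pair_counts` is a hypothetical helper; it assumes strict four-line FASTQ records and that the index pair is the fourth colon-separated field of the header comment, as in the reads posted in this thread.

```python
import io
from collections import Counter


def index_pair_counts(handle):
    """Tally the index-pair field of every header in a FASTQ stream."""
    counts = Counter()
    for line_number, line in enumerate(handle):
        if line_number % 4 == 0:  # header line of each four-line record
            comment = line.strip().split(" ", 1)[1]
            counts[comment.split(":")[3]] += 1
    return counts


# Shortened forward and reverse records from this thread.
records = (
    "@A01182:27:HVF75DMXX:1:1101:22779:1016 1:N:0:GAGGACTT+NATTGCTC:ATCACG\n"
    "ATCACGTGCATGTGAGAATA\n"
    "+\n"
    "FFFFFFFFFFFFFFFFFFFF\n"
    "@A01182:27:HVF75DMXX:1:1101:22779:1016 2:N:0:GAGGACTT+NATTGCTC:ATCACG\n"
    "NGGTGCTGAACTCTAAGGCA\n"
    "+\n"
    "#FFFFFFFFFFFFFFFFFFF\n"
)

counts = index_pair_counts(io.StringIO(records))
for pair, n in counts.most_common():
    print(pair, n)
```

If one index pair (allowing for a few variants caused by sequencing errors) accounts for nearly all reads in a file, that would be consistent with the file already having been demultiplexed by the sequencer.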

On the other hand, if your header does have a third barcode, then you
need to figure out how the library was constructed.

I highly recommend reading some of the protocol papers on how to process
RAD data.

julian
