Hello,
I'm trying to analyze some MiSeq data that look unusual (at least to me).
The data consist of three FASTQ files: R1 (reads), R2 (barcodes), and R3 (reads).
The barcodes are 8 nt long (see the example below):
@MISEQ:154:000000000-A4PK8:1:1101:15615:1331 2:N:0:
ATCACGAN
+
AABBBBA#
I've demultiplexed the data using split_libraries_fastq.py with the following commands:
split_libraries_fastq.py -i lane1_NoIndex_L001_R1_001.fastq -b lane1_NoIndex_L001_R2_001.fastq -o r1_demux_1mm -m map_corrected.txt --barcode_type 8 --max_barcode_errors 0 --store_demultiplexed_fastq
split_libraries_fastq.py -i lane1_NoIndex_L001_R3_001.fastq -b lane1_NoIndex_L001_R2_001.fastq -o r3_demux_1mm -m map_corrected.txt --barcode_type 8 --max_barcode_errors 0 --store_demultiplexed_fastq
I've tried different values for the --max_barcode_errors option, but I always get two FASTQ files that differ in their number of lines.
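To narrow down where the two outputs diverge, the read IDs in both demultiplexed files can be compared. The snippet below is only a rough sketch. The seqs.fastq paths in the usage comment and the header layout (sample label first, original instrument read ID second) are assumptions about what split_libraries_fastq.py writes, and the function names are made up for illustration.

```python
from itertools import islice


def fastq_headers(lines):
    """Yield the header (without the leading '@') of each 4-line FASTQ record."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) < 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record: %r" % record[:1])
        yield record[0][1:].rstrip("\r\n")


def original_read_id(header):
    """Extract the instrument read ID from a demultiplexed header.

    Assumption: the header looks like
    'sample1_0 MISEQ:154:...:15615:1331 1:N:0: orig_bc=...',
    so the original read ID is the second whitespace-separated field.
    If there is only one field, that field is used instead.
    """
    fields = header.split()
    if not fields:
        raise ValueError("empty FASTQ header")
    return fields[1] if len(fields) > 1 else fields[0]


def compare_read_ids(lines_a, lines_b):
    """Return (IDs only in A, IDs only in B) for two FASTQ line iterables."""
    ids_a = {original_read_id(h) for h in fastq_headers(lines_a)}
    ids_b = {original_read_id(h) for h in fastq_headers(lines_b)}
    return ids_a - ids_b, ids_b - ids_a


# Hypothetical usage (the output file names are an assumption):
# with open("r1_demux_1mm/seqs.fastq") as r1, open("r3_demux_1mm/seqs.fastq") as r3:
#     only_r1, only_r3 = compare_read_ids(r1, r3)
#     print(len(only_r1), "reads only in R1;", len(only_r3), "reads only in R3")
```

Reads that show up in only one of the two sets are the ones that were kept for one file but dropped for the other.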
I should mention that my map.txt file is correct; it has barcodes but no primer sequences:
#SampleID BarcodeSequence LinkerPrimerSequence Description
sample1 (tab) CGTGATAT (tab) (empty column here) (tab) none
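A quick sanity check of the mapping file layout described above could look like the following. This is a minimal sketch, not QIIME's own validation: check_mapping_lines is a hypothetical helper, and it only assumes tab-separated columns, the four header names shown above, and 8 nt barcodes made of A, C, G and T.

```python
def check_mapping_lines(lines, barcode_length=8):
    """Return a list of problems found in a tab-separated mapping file.

    Assumptions (not taken from QIIME itself): the first non-blank line is
    the header, the first three columns are #SampleID, BarcodeSequence and
    LinkerPrimerSequence, and the last column is Description.
    """
    rows = [ln.rstrip("\r\n").split("\t") for ln in lines if ln.strip()]
    if not rows:
        return ["mapping file is empty"]

    problems = []
    header = rows[0]
    expected = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence"]
    if header[:3] != expected or header[-1] != "Description":
        problems.append("unexpected header: %r" % header)

    seen = {}  # barcode -> first sample that used it
    for n, row in enumerate(rows[1:], start=2):
        if row[0].startswith("#"):
            continue  # skip comment rows
        if len(row) != len(header):
            problems.append("row %d: expected %d columns, found %d"
                            % (n, len(header), len(row)))
            continue
        sample, barcode = row[0], row[1]
        if len(barcode) != barcode_length or set(barcode) - set("ACGT"):
            problems.append("row %d: bad barcode %r for %s" % (n, barcode, sample))
        if barcode in seen:
            problems.append("row %d: barcode %r already used by %s"
                            % (n, barcode, seen[barcode]))
        seen.setdefault(barcode, sample)
    return problems


# Hypothetical usage:
# with open("map_corrected.txt") as handle:
#     for problem in check_mapping_lines(handle):
#         print(problem)
```

An empty LinkerPrimerSequence column passes this check as long as the tab that delimits it is still present.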
I would really appreciate some help.
Thank you very much ^^