Dear Alex,
I have run into a strange bug while using STAR to map PAS-seq reads (I expect STAR to mark the untemplated sequence as soft-clipped).
I was able to reproduce the error with a very simple sequence:
# reference sequence used for the index (test.fa)
>chr1
TCAAATTATACTCTGAATACAGAATGGCATTTTCAGAATCAAACTTTAAT
# build index
STAR --runMode genomeGenerate --genomeDir STARindex/ --genomeFastaFiles test.fa
# STAR has no problem mapping each of the following FASTQ records (individually)
STAR --runMode alignReads --genomeDir STARindex --readFilesIn test.fq
@perfect
TTATACTCTGAATACAGAAT
+
IIIIIIIIIIIIIIIIIIII
@tail1
TTATACTCTGAATACAGAATA
+
IIIIIIIIIIIIIIIIIIIII
@tail2
TTATACTCTGAATACAGAATAA
+
IIIIIIIIIIIIIIIIIIIIII
@tail3
TTATACTCTGAATACAGAATAAA
+
IIIIIIIIIIIIIIIIIIIIIII
@tail4
TTATACTCTGAATACAGAATAAAA
+
IIIIIIIIIIIIIIIIIIIIIIII
# but it gives a segmentation fault when mapping
@tail5
TTATACTCTGAATACAGAATAAAAA
+
IIIIIIIIIIIIIIIIIIIIIIIII
# this error does NOT happen for polyC/G/T
@tail5G
TTATACTCTGAATACAGAATGGGGG
+
IIIIIIIIIIIIIIIIIIIIIIIII
@tail5C
TTATACTCTGAATACAGAATCCCCC
+
IIIIIIIIIIIIIIIIIIIIIIIII
@tail5T
TTATACTCTGAATACAGAATTTTTT
+
IIIIIIIIIIIIIIIIIIIIIIIII
# when I change the last A to T/C/G, the crash goes away and the results are as expected (SAM records follow the three reads)
@tail5T
TTATACTCTGAATACAGAATAAAAT
+
IIIIIIIIIIIIIIIIIIIIIIIII
@tail5C
TTATACTCTGAATACAGAATAAAAC
+
IIIIIIIIIIIIIIIIIIIIIIIII
@tail5G
TTATACTCTGAATACAGAATAAAAG
+
IIIIIIIIIIIIIIIIIIIIIIIII
tail5T 0 chr1 6 255 20M5S * 0 0 TTATACTCTGAATACAGAATAAAAT IIIIIIIIIIIIIIIIIIIIIIIII NH:i:1 HI:i:1 AS:i:19 nM:i:0
tail5C 0 chr1 6 255 20M5S * 0 0 TTATACTCTGAATACAGAATAAAAC IIIIIIIIIIIIIIIIIIIIIIIII NH:i:1 HI:i:1 AS:i:19 nM:i:0
tail5G 0 chr1 6 255 20M5S * 0 0 TTATACTCTGAATACAGAATAAAAG IIIIIIIIIIIIIIIIIIIIIIIII NH:i:1 HI:i:1 AS:i:19 nM:i:0
# the error doesn't happen when the AAAAA is located at the 5' end
@tail5
AAAAATTATACTCTGAATACAGAAT
+
IIIIIIIIIIIIIIIIIIIIIIIII
I was able to reproduce this error in versions 2.3.0e and 2.3.1o, with both self-compiled binaries and the downloaded precompiled static version.
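In case it helps with reproducing, here is a minimal sketch of how each read can be mapped in its own STAR run, so that a crash can be tied to a single read. It assumes the index has already been built in STARindex/ as shown above and that STAR is on the PATH. The helper names `make_fastq` and `run_one`, the `runs/` directory, and the read labels `lastT` and `head5` are made up for illustration, and I am assuming `--outFileNamePrefix` behaves the same way in the versions listed.

```shell
#!/usr/bin/env bash
# Sketch: run STAR once per test read and print the exit status of each run.

# Print one FASTQ record; the quality string is 'I' repeated to the read length.
make_fastq() {
    local name=$1 seq=$2
    local qual
    qual=$(printf '%*s' "${#seq}" '' | tr ' ' 'I')
    printf '@%s\n%s\n+\n%s\n' "$name" "$seq" "$qual"
}

base=TTATACTCTGAATACAGAAT   # the 20 nt that match chr1

# Hypothetical helper: write a single-read FASTQ, map it, report STAR's exit status.
run_one() {
    local name=$1 seq=$2
    mkdir -p "runs/$name"
    make_fastq "$name" "$seq" > "runs/$name/test.fq"
    STAR --runMode alignReads --genomeDir STARindex \
         --readFilesIn "runs/$name/test.fq" \
         --outFileNamePrefix "runs/$name/" > "runs/$name/stdout.txt" 2>&1
    echo "$name exit=$?"
}

# Only attempt the runs when STAR and the index are actually present.
if command -v STAR >/dev/null 2>&1 && [ -d STARindex ]; then
    run_one perfect "$base"
    run_one tail4   "${base}AAAA"
    run_one tail5   "${base}AAAAA"   # the read that segfaults for me
    run_one tail5G  "${base}GGGGG"
    run_one lastT   "${base}AAAAT"
    run_one head5   "AAAAA${base}"
fi
```

In bash, a process killed by a segmentation fault normally shows up as exit status 139, so the failing read should stand out in the printed list.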
Thanks in advance,
Bo