>>> 1- Can we tune STAR so that, for single-end reads, only m mismatches are allowed in the first n bp?
No, this is not possible.
~~~
#!/usr/bin/env bash
# vim: set noexpandtab tabstop=2:
tmpdir=$(mktemp -d)
ref=GCACTGTCCGCCAGCCGGTGGATGTGCGGCAACAACATGTCCGCTCCGA
# query = ref with bases 15-20 (6 bp) removed, so it aligns with a 6 bp gap
query=${ref:0:14}${ref:20}
ref_fa=$tmpdir/ref.fa
query_fa=$tmpdir/query.fa
cat > "$ref_fa" <<EOF
>chr
$ref
EOF
cat > "$query_fa" <<EOF
>seq
$query
EOF
cd "$tmpdir"
mkdir -p genomeDir
source trapdebug.sh
STAR --runMode genomeGenerate --genomeDir genomeDir --genomeFastaFiles "$ref_fa"
outdir1=$tmpdir/outdir1
mkdir -p "$outdir1"
cd "$outdir1"
STAR --genomeDir "$tmpdir/genomeDir" --readFilesIn "$query_fa"
outdir2=$tmpdir/outdir2
mkdir -p "$outdir2"
cd "$outdir2"
STAR --genomeDir "$tmpdir/genomeDir" --readFilesIn "$query_fa" --alignIntronMax 1
function filter {
grep 'Uniquely mapped reads number'
}
diff <(filter < "$outdir1"/Log.final.out) <(filter < "$outdir2"/Log.final.out)
~~~
>>>
~~~
bash> diff <(filter < "$outdir1"/Log.final.out) <(filter < "$outdir2"/Log.final.out)
1c1
< Uniquely mapped reads number | 0
---
> Uniquely mapped reads number | 1
~~~
It looks like in your example the read could be mapped with splicing (look for N in the CIGAR), but when splicing is prohibited, no good (unspliced) alignment is found.
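One way to check for N in the CIGAR, as a minimal sketch: assuming STAR wrote its default `Aligned.out.sam` into each output directory, the helper below prints the read name and CIGAR of every alignment whose CIGAR contains an `N`. The function name `spliced_reads` is made up for illustration and is not part of STAR.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): read SAM on stdin and list the
# alignments whose CIGAR string (SAM column 6) contains an N operation.
spliced_reads() {
	# Skip header lines (they start with @); print read name and CIGAR.
	awk -F'\t' '!/^@/ && $6 ~ /N/ { print $1 "\t" $6 }'
}

# Usage, assuming the default output file name Aligned.out.sam:
# spliced_reads < "$outdir1"/Aligned.out.sam
# spliced_reads < "$outdir2"/Aligned.out.sam
```

If the helper prints nothing for a run, no alignment in that run was reported as spliced.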