--alignEndsType: "Local" versus "EndToEnd"

Shanrong Zhao

Dec 18, 2014, 5:08:26 PM
to rna-...@googlegroups.com
 
I want to disable soft clipping when running STAR by changing --alignEndsType from "Local" to "EndToEnd". I have two samples (from human whole blood) and asked BGI to do both stranded and non-stranded RNA sequencing for each sample, giving a total of 4 datasets. Below is the mapping summary for my analysis:
 

--alignEndsType Local

Sample             Total_reads  Uniq_Rate(%)  Multi_Rate(%)  Unmap_Rate(%)
00004_nonstranded  61617213     92.57         3.64            3.79
00005_nonstranded  63096109     93.02         3.53            3.45
00004_stranded     62893222     90.56         3.73            5.71
00005_stranded     61184208     90.17         3.66            6.17

--alignEndsType EndToEnd

Sample             Total_reads  Uniq_Rate(%)  Multi_Rate(%)  Unmap_Rate(%)
00004_nonstranded  61617213     89.44         3.40            7.16
00005_nonstranded  63096109     89.89         3.29            6.82
00004_stranded     62893222     85.53         3.37           11.10
00005_stranded     61184208     84.63         3.28           12.09

 

STAR itself is NOT strand-aware when mapping reads to the reference genome. In general:

1. Stranded vs. non-stranded sequencing: slightly more reads are uniquely mapped in non-stranded sequencing (accordingly, fewer reads are unmapped in non-stranded sequencing).

2. When "--alignEndsType" is changed from "Local" to "EndToEnd", the impact on stranded sequencing is much bigger than on non-stranded sequencing. For instance, for "uniquely mapped reads" in sample 00004:

   a. Nonstranded: 92.57 (Local) -> 89.44 (EndToEnd): ~3 percentage point difference

   b. Stranded: 90.56 (Local) -> 85.53 (EndToEnd): ~5 percentage point difference
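The same comparison can be made for all four datasets at once. The following is a minimal Python sketch, not part of the original analysis: the rates are copied by hand from the two mapping summaries above, and the names `local`, `end_to_end`, and `rate_changes` are hypothetical, chosen only for illustration.

```python
# Rates (%) copied by hand from the two mapping summaries above.
# Each tuple is (Uniq_Rate, Multi_Rate, Unmap_Rate).
local = {
    "00004_nonstranded": (92.57, 3.64, 3.79),
    "00005_nonstranded": (93.02, 3.53, 3.45),
    "00004_stranded": (90.56, 3.73, 5.71),
    "00005_stranded": (90.17, 3.66, 6.17),
}
end_to_end = {
    "00004_nonstranded": (89.44, 3.40, 7.16),
    "00005_nonstranded": (89.89, 3.29, 6.82),
    "00004_stranded": (85.53, 3.37, 11.10),
    "00005_stranded": (84.63, 3.28, 12.09),
}


def rate_changes(local_rates, e2e_rates):
    """Return {sample: (drop in unique rate, rise in unmapped rate)}.

    Both values are in percentage points, rounded to two decimals.
    """
    changes = {}
    for sample, (uniq_l, _multi_l, unmap_l) in local_rates.items():
        uniq_e, _multi_e, unmap_e = e2e_rates[sample]
        changes[sample] = (
            round(uniq_l - uniq_e, 2),    # uniquely mapped reads lost
            round(unmap_e - unmap_l, 2),  # unmapped reads gained
        )
    return changes


changes = rate_changes(local, end_to_end)
for sample, (uniq_drop, unmap_rise) in changes.items():
    print(f"{sample:<18} unique -{uniq_drop:.2f}  unmapped +{unmap_rise:.2f}")
```

With the numbers from the tables, both non-stranded datasets lose about 3.1 percentage points of uniquely mapped reads, while the stranded datasets lose about 5.0 (sample 00004) and 5.5 (sample 00005).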

 

I don't have a good explanation, especially for the second observation.

 

My command line looks like:

STAR --genomeDir /hpc/grid/shared/ngsdb/STAR/hg19_gencode19 \
  --readFilesIn /hpc/grid/ngsws/molmed/data/BGIpilotNov2014/CleanData/RC-140808-00004_stranded_1.fq.gz \
                /hpc/grid/ngsws/molmed/data/BGIpilotNov2014/CleanData/RC-140808-00004_stranded_2.fq.gz \
  --readFilesCommand zcat \
  --runThreadN 8 \
  --alignSJDBoverhangMin 1 \
  --outReadsUnmapped Fastx \
  --alignEndsType EndToEnd \
  --outFilterMismatchNoverLmax 0.05 \
  --alignIntronMax 1000000 \
  --outSAMtype BAM SortedByCoordinate
