Commands to run STAR with single-end and paired-end data


rnaFan

Sep 26, 2013, 3:32:16 AM
to rna-...@googlegroups.com
Dear all,
I started using STAR just a month ago.
It's a very fast and usable tool, but I have trouble understanding how its usage differs between single-end and paired-end data.
I read the manual, but the only options that mention paired-end data are:
outFilterScoreMinOverLread 0.66
    float: outFilterScoreMin normalized to read length (sum of mates' lengths for paired-end reads)
outFilterMatchNminOverLread 0.66
    float: outFilterMatchNmin normalized to read length (sum of mates' lengths for paired-end reads)
seedSearchStartLmaxOverLread 1.0
    float: seedSearchStartLmax normalized to read length (sum of mates' lengths for paired-end reads)

On a forum I came across this comment:
"If you have multiple files you wish to map in one run, they should be separated by commas, while paired-end mates are separated by space:
--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz"

but the official manual (ftp://ftp2.cshl.edu/gingeraslab/tracks/STARrelease/2.1.4/STARmanual_2.1.4.pdf) does not mention the use of spaces or commas for single-end or paired-end data.

For instance, with bowtie2 the options for mapping paired-end data are very clear:
-1 reads1 -2 reads2

Could anybody post the correct input commands for single-end and paired-end data?

Thanks in advance.
Regards

Alexander Dobin

Sep 26, 2013, 12:58:08 PM
to rna-...@googlegroups.com
Hi @rnaFan,

the only place where you have to specify whether you have paired-end or single-end reads is the --readFilesIn parameter:
        for single-end: --readFilesIn Reads.fastq
        for paired-end: --readFilesIn Read1.fastq Read2.fastq
So, if you specify one file name, STAR will know it's single-end, and if you specify two file names separated by a space, it will know it's paired-end.

Cheers
Alex
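
A minimal sketch of the two forms described in the reply above. It only prints the command lines, so it can be tried without STAR or a genome index; the path /path/to/star_index and the helper name star_cmd are placeholders made up for this illustration, while --genomeDir and --readFilesIn are regular STAR parameters.

```shell
# Placeholder path; point this at your own STAR genome index.
GENOME_DIR=/path/to/star_index

# Print a STAR command line for the given read file(s).
# One argument  -> single-end.
# Two arguments -> paired-end (mate 1 first, then mate 2).
star_cmd() {
    echo "STAR --genomeDir $GENOME_DIR --readFilesIn $*"
}

star_cmd Reads.fastq               # single-end
star_cmd Read1.fastq Read2.fastq   # paired-end
```

The printed lines can be copied into a terminal once the index path and file names match your data.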

Daofeng Li

Dec 5, 2013, 12:25:37 PM
to rna-...@googlegroups.com
Hi Alex,

does this mean that STAR doesn't support reads from multiple lanes, like this:

--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz

Do I have to merge the reads from different lanes before running STAR?

Thanks.

Alexander Dobin

Dec 7, 2013, 11:01:04 AM
to rna-...@googlegroups.com
Hi Daofeng,

you can use
--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz
to map multiple lanes "a", "b", ... In this list, a comma separates different lanes of the same mate (1st or 2nd), while a space separates the mates.

Cheers
Alex
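
The lane syntax above could be assembled along these lines. This is only a sketch: join_by_comma is a hypothetical helper, /path/to/star_index is a placeholder, and --readFilesCommand zcat is assumed to be available in your STAR version for reading gzipped input. The command is printed rather than executed.

```shell
# Join file names with commas and no spaces: a b -> a,b
join_by_comma() {
    ( IFS=,; printf '%s\n' "$*" )
}

MATE1=$(join_by_comma Read1a.gz Read1b.gz)   # lanes a, b of mate 1
MATE2=$(join_by_comma Read2a.gz Read2b.gz)   # lanes a, b of mate 2, same lane order

# Printed only; drop the leading "echo" to actually run STAR.
echo STAR --genomeDir /path/to/star_index \
     --readFilesCommand zcat \
     --readFilesIn "$MATE1" "$MATE2"
```

Quoting "$MATE1" and "$MATE2" keeps each comma-separated list as a single argument, so the only space in the --readFilesIn value is the one between the two mates.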

Daofeng Li

Dec 12, 2013, 12:28:25 PM
to rna-...@googlegroups.com
Yes, it works,
Thank you very much Alex.

prpven

Dec 16, 2013, 11:17:39 AM
to rna-...@googlegroups.com
Similarly, can I input mixed single and paired-end reads as follows:
--readFilesIn SingleReads.gz ReadPair1.gz,ReadPair1.gz
Thanks.

Yifang Tan

Aug 16, 2016, 5:17:45 PM
to rna-star
Hello Alex:
I came across this question when I was trying to use STAR to map a mixture of SE and PE reads from the same lane, left over after trimming.
What's the command line for this situation?


"So, if you specify one file name STAR will know it's single-end, and if you specify two file names, separated by space, that it's paired-end.
you can use: --readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz
to map multiple lanes "a","b" ... In this list comma separates different lanes of the same mate (1st or 2nd), while space separates the mates."

This is your old reply, but I am still not clear about how to handle a mixture of SE and PE.
My attempt is:
--readFilesIn read1_SE.fq.gz,read2_SE.fq.gz,PE_read1.fq.gz PE_read2.fq.gz
Is this correct?

Thanks a lot!

Yifang

Alexander Dobin

Aug 17, 2016, 12:22:08 PM
to rna-star
Hi Yifang,

you cannot mix SE and PE reads in one run - you would need to map them separately, and then merge the BAM files.

Cheers
Alex
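
One way the "map separately, then merge" advice above might look in practice. This is a sketch under several assumptions: the index path is a placeholder, the file names are taken from the question, the STAR version supports --outSAMtype BAM SortedByCoordinate, and samtools is installed for the merge step. The run helper is made up for this example and defaults to printing the commands instead of executing them.

```shell
# Dry-run helper: prints each command; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

GENOME_DIR=/path/to/star_index   # placeholder

# 1) Single-end run: both SE files go into one comma-separated list.
run STAR --genomeDir "$GENOME_DIR" --readFilesCommand zcat \
    --readFilesIn read1_SE.fq.gz,read2_SE.fq.gz \
    --outSAMtype BAM SortedByCoordinate --outFileNamePrefix SE_

# 2) Paired-end run: mate 1 and mate 2 separated by a space.
run STAR --genomeDir "$GENOME_DIR" --readFilesCommand zcat \
    --readFilesIn PE_read1.fq.gz PE_read2.fq.gz \
    --outSAMtype BAM SortedByCoordinate --outFileNamePrefix PE_

# 3) Merge the two coordinate-sorted BAM files into one.
run samtools merge merged.bam \
    SE_Aligned.sortedByCoord.out.bam PE_Aligned.sortedByCoord.out.bam
```

Distinct --outFileNamePrefix values keep the two runs from writing to the same output files.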

Catalina Aguilar Hurtado

Jan 17, 2018, 11:22:49 AM
to rna-star
Hi Alex,

I have 48 paired samples to map. Does that mean that all 48 R1 files and all 48 R2 files are listed as two groups separated by a space? Do I need to change their names to 1 and 2? Or will STAR identify that S1_F and S1_R are pairs?

e.g.
S1_F_paired.fq.gz,S10_F_paired.fq.gz,S11_F_paired.fq.gz,S12_F_paired.fq.gz,S13_F_paired.fq.gz,S14_F_paired.fq.gz,S15_F_paired.fq.gz S1_R_paired.fq.gz,S10_R_paired.fq.gz,S11_R_paired.fq.gz,S12_R_paired.fq.gz,S13_R_paired.fq.gz,S14_R_paired.fq.gz,S15_R_paired.fq.gz

Thanks,

Cata

Alexander Dobin

Jan 17, 2018, 4:26:23 PM
to rna-star
Hi Catalina,

your example string looks good to me.
To recap, you have to have two lists, one for read1 and one for read2. There should be a space between the two lists.
Within each list, the files have to be separated by commas, with no spaces. The naming of the files does not matter, but the order of the files in the read1 and read2 lists has to be exactly the same - this is how STAR matches the files.

Cheers
Alex
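
Because the two lists must be in exactly the same order, it may be safer to build them in one loop than to type them by hand. A small sketch, assuming the file-name pattern from the question; the sample names below are a hypothetical subset.

```shell
# Hypothetical sample names; file names follow the pattern in the question.
SAMPLES="S1 S10 S11 S12"

R1_LIST=""
R2_LIST=""
for s in $SAMPLES; do
    # Append the forward and reverse file in the same iteration, so that
    # position N in one list always pairs with position N in the other.
    # ${VAR:+$VAR,} adds a comma only when the list is already non-empty.
    R1_LIST="${R1_LIST:+$R1_LIST,}${s}_F_paired.fq.gz"
    R2_LIST="${R2_LIST:+$R2_LIST,}${s}_R_paired.fq.gz"
done

# The value to pass to STAR: two comma-separated lists, one space between them.
echo "--readFilesIn $R1_LIST $R2_LIST"
```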

Catalina Aguilar Hurtado

Jan 18, 2018, 5:37:45 PM
to rna-star
Thanks for the explanation, it worked!

Lizhen Wu

Jan 2, 2019, 10:44:45 AM
to rna-star
Hi, I have created lists of Read1 and Read2 files, but there is only one output file, "Aligned.out.sam". Is the output being overwritten? Thanks.

Alexander Dobin

Jan 4, 2019, 1:30:18 PM
to rna-star
Hi Lizhen,

if you use comma-separated lists in one STAR command, all the results will be output to one file.
If you want separate outputs per sample, you would need to run the samples separately, e.g. using a shell for-loop.

Cheers
Alex
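
The for-loop mentioned above might look roughly like this. It is a sketch only: the sample names, the file-name pattern, and the index path are hypothetical, star_for_sample is a helper invented for this example, and the commands are printed rather than run. --outFileNamePrefix is the STAR parameter that gives each run its own output names.

```shell
GENOME_DIR=/path/to/star_index   # placeholder
SAMPLES="S1 S10 S11"             # hypothetical sample names

# Print the STAR command for one paired-end sample.
star_for_sample() {
    sample=$1
    # A per-sample prefix yields e.g. S1_Aligned.out.sam, so the output
    # of one sample does not overwrite the output of another.
    echo STAR --genomeDir "$GENOME_DIR" \
         --readFilesCommand zcat \
         --readFilesIn "${sample}_F_paired.fq.gz" "${sample}_R_paired.fq.gz" \
         --outFileNamePrefix "${sample}_"
}

for s in $SAMPLES; do
    star_for_sample "$s"
done
```

Removing the "echo" inside the helper turns the printout into actual STAR runs, one per sample.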