Commands to run STAR with single-end and paired-end data

rnaFan

Sep 26, 2013, 3:32:16 AM

to rna-...@googlegroups.com

Dear all,
I started using STAR just one month ago.
It's a very fast and usable tool, but I have trouble understanding the difference between running STAR with single-end and with paired-end data.
I read the manual, but the only options that mention paired-end data are:
_outFilterScoreMinOverLread 0.66
float: outFilterScoreMin normalized to read length (sum of mates'
lengths for paired-end reads)
_outFilterMatchNminOverLread 0.66
float: outFilterMatchNmin normalized to read length (sum of
mates' lengths for paired-end reads)
_seedSearchStartLmaxOverLread 1.0
float: seedSearchStartLmax normalized to read length (sum of mates'
lengths for paired-end reads)

On a forum I came across this comment:
"If you have multiple files you wish to map in one run, they should be separated by commas, while paired-end mates are separated by space:
--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz"

However, the official manual (ftp://ftp2.cshl.edu/gingeraslab/tracks/STARrelease/2.1.4/STARmanual_2.1.4.pdf) does not mention the use of spaces or commas for single-end or paired-end data.

For instance, with bowtie2 the options for mapping paired-end data are very clear:
-1 reads1 -2 reads2

Can anybody post the right input commands for single-end and paired-end data?

Thanks in advance.
Regards

Alexander Dobin

Sep 26, 2013, 12:58:08 PM

to rna-...@googlegroups.com

Hi @rnaFan,

the only place where you have to specify whether you have paired-end or single-end reads is the --readFilesIn parameter:

for single-end: --readFilesIn Reads.fastq

for paired-end: --readFilesIn Read1.fastq Read2.fastq

So, if you specify one file name, STAR will know it's single-end, and if you specify two file names separated by a space, it will know it's paired-end.
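
A minimal sketch of the two cases as full command lines. The genome directory genome_index and the star_cmd helper are hypothetical placeholders, not part of STAR; the helper only prints the command so that it can be inspected before running.

```sh
# Hypothetical helper: print (not run) a STAR command for one or two read files.
# genome_index is a placeholder for a previously generated STAR genome directory.
star_cmd() {
    case $# in
        1) echo "STAR --genomeDir genome_index --readFilesIn $1" ;;     # single-end
        2) echo "STAR --genomeDir genome_index --readFilesIn $1 $2" ;;  # paired-end
        *) echo "expected 1 (single-end) or 2 (paired-end) read files" >&2
           return 1 ;;
    esac
}

star_cmd Reads.fastq
star_cmd Read1.fastq Read2.fastq
```

The number of space-separated arguments after --readFilesIn is the only thing that differs between the two printed commands.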

Cheers

Alex

Daofeng Li

Dec 5, 2013, 12:25:37 PM

to rna-...@googlegroups.com

Hi Alex,

Does this mean that STAR doesn't support reads from multiple lanes, like this:

--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz

Do I have to merge the reads from different lanes before running STAR?

Thanks.

Alexander Dobin

Dec 7, 2013, 11:01:04 AM

to rna-...@googlegroups.com

Hi Daofeng,

you can use

--readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz

to map multiple lanes "a", "b", ... In this list, a comma separates different lanes of the same mate (1st or 2nd), while a space separates the mates.
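
As a rough sketch, the two comma-separated lists can be built in the shell rather than typed by hand. The join_commas helper and the genome_index directory are hypothetical, and --readFilesCommand zcat is an assumption added here so that gzipped input is decompressed; check that your STAR version supports it.

```sh
# Hypothetical helper: join all arguments with commas and no spaces.
# The function body runs in a subshell, so the IFS change does not leak out.
join_commas() (
    IFS=,
    echo "$*"
)

mate1=$(join_commas Read1a.gz Read1b.gz)   # lanes a, b of the 1st mate
mate2=$(join_commas Read2a.gz Read2b.gz)   # lanes a, b of the 2nd mate, same order

# Dry run: print the command instead of executing it.
echo "STAR --genomeDir genome_index --readFilesCommand zcat --readFilesIn $mate1 $mate2"
```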

Cheers

Alex

Daofeng Li

Dec 12, 2013, 12:28:25 PM

to rna-...@googlegroups.com

Yes, it works,

Thank you very much Alex.

prpven

Dec 16, 2013, 11:17:39 AM

to rna-...@googlegroups.com

Similarly, can I input mixed single and paired-end reads as follows:

--readFilesIn SingleReads.gz ReadPair1.gz,ReadPair1.gz

Thanks.

Yifang Tan

Aug 16, 2016, 5:17:45 PM

to rna-star

Hello Alex,
I came across this question while trying to use STAR to map a mixture of SE and PE reads, left after trimming, from the same lane.
What's the command line for this situation?

"So, if you specify one file name, STAR will know it's single-end, and if you specify two file names separated by a space, it will know it's paired-end.

you can use: --readFilesIn Read1a.gz,Read1b.gz Read2a.gz,Read2b.gz

to map multiple lanes "a", "b", ... In this list, a comma separates different lanes of the same mate (1st or 2nd), while a space separates the mates."

This is your old reply, but I am still unclear about a mixture of SE and PE reads.

My attempt is:
--readFilesIn read1_SE.fq.gz,read2_SE.fq.gz,PE_read1.fq.gz PE_read2.fq.gz
Is this correct?

Thanks a lot!

Yifang

Alexander Dobin

Aug 17, 2016, 12:22:08 PM

to rna-star

Hi Yifang,

you cannot mix SE and PE reads in one run - you would need to map them separately, and then merge the BAM files.
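
One possible way to lay out the two runs and the merge, as a sketch only. The plan_se_pe helper, the genome_index directory and the SE_/PE_ prefixes are hypothetical; --readFilesCommand zcat and --outSAMtype BAM Unsorted are assumptions that depend on your STAR version, and samtools is assumed to be installed. The helper prints the commands rather than running them.

```sh
# Hypothetical helper: print the commands for mapping SE and PE reads
# separately and merging the resulting BAM files.
# $1 = comma-separated SE files, $2 = PE mate 1 list, $3 = PE mate 2 list
plan_se_pe() {
    common="STAR --genomeDir genome_index --readFilesCommand zcat --outSAMtype BAM Unsorted"
    # Run 1: single-end reads only (one list after --readFilesIn)
    echo "$common --readFilesIn $1 --outFileNamePrefix SE_"
    # Run 2: paired-end reads only (two lists separated by a space)
    echo "$common --readFilesIn $2 $3 --outFileNamePrefix PE_"
    # Merge the two alignments into one BAM file
    echo "samtools merge merged.bam SE_Aligned.out.bam PE_Aligned.out.bam"
}

plan_se_pe read1_SE.fq.gz,read2_SE.fq.gz PE_read1.fq.gz PE_read2.fq.gz
```

Using a different --outFileNamePrefix for each run keeps the second run from writing over the output of the first.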

Cheers

Alex

Catalina Aguilar Hurtado

Jan 17, 2018, 11:22:49 AM

to rna-star

Hi Alex,

I have 48 paired samples to map. Does that mean that all 48 R1 files and all 48 R2 files are separated by a space? Do I need to change their names to 1 and 2? Or will STAR identify that S1_F and S1_R are pairs?

e.g.

S1_F_paired.fq.gz,S10_F_paired.fq.gz,S11_F_paired.fq.gz,S12_F_paired.fq.gz,S13_F_paired.fq.gz,S14_F_paired.fq.gz,S15_F_paired.fq.gz S1_R_paired.fq.gz,S10_R_paired.fq.gz,S11_R_paired.fq.gz,S12_R_paired.fq.gz,S13_R_paired.fq.gz,S14_R_paired.fq.gz,S15_R_paired.fq.gz

Thanks,

Cata

Alexander Dobin

Jan 17, 2018, 4:26:23 PM

to rna-star

Hi Catalina,

your example string looks good to me.

To recap, you need two lists, one for read1 and one for read2, with a space between the two lists.

Within each list, the files have to be separated by commas, with no spaces. The naming of the files does not matter, but the order of the files in the read1 and read2 lists has to be exactly the same - this is how STAR matches the files.
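
Since the order is what matters, one way to avoid mismatches is to generate both lists from a single list of sample names. This is only a sketch: build_lists is a hypothetical helper, and the _F_paired.fq.gz / _R_paired.fq.gz suffixes are taken from the example in the question.

```sh
# Hypothetical helper: build the read1 and read2 lists from one list of
# sample names, so both lists are in the same order by construction.
build_lists() {
    r1=""
    r2=""
    for s in "$@"; do
        # ${r1:+$r1,} adds a comma only when the list is already non-empty
        r1="${r1:+$r1,}${s}_F_paired.fq.gz"
        r2="${r2:+$r2,}${s}_R_paired.fq.gz"
    done
    echo "$r1 $r2"   # commas inside each list, one space between the lists
}

build_lists S1 S10 S11
# Possible use (unquoted on purpose, so the output splits into two arguments):
#   STAR --genomeDir genome_index --readFilesIn $(build_lists S1 S10 S11)
```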

Cheers

Alex

Catalina Aguilar Hurtado

Jan 18, 2018, 5:37:45 PM

to rna-star

Thanks for the explanation, it worked!

Lizhen Wu

Jan 2, 2019, 10:44:45 AM

to rna-star

Hi, I have created lists of Read1 and Read2 files, but there is only one output file, "Aligned.out.sam". I wonder whether the output file is being overwritten? Thanks.

Alexander Dobin

Jan 4, 2019, 1:30:18 PM

to rna-star

Hi Lizhen,

if you use comma-separated lists in one STAR command, all the results will be written to one file.

If you want separate outputs per sample, you would need to run the samples separately, e.g. using a shell for-loop.
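
A minimal sketch of such a loop, printing one command per sample instead of running it. The per_sample_cmds helper, the genome_index directory and the _R1.fq.gz / _R2.fq.gz naming are hypothetical; --readFilesCommand zcat is an assumption for gzipped input.

```sh
# Hypothetical helper: print one STAR command per sample, each with its own
# --outFileNamePrefix so that every sample gets its own output files.
per_sample_cmds() {
    for s in "$@"; do
        echo "STAR --genomeDir genome_index --readFilesCommand zcat" \
             "--readFilesIn ${s}_R1.fq.gz ${s}_R2.fq.gz" \
             "--outFileNamePrefix ${s}_"
    done
}

per_sample_cmds S1 S2
```

With a prefix such as S1_, the alignments for that sample would go to S1_Aligned.out.sam rather than to a shared Aligned.out.sam.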