Unmapped.out.mate1 and Unmapped.out.mate2 have different order

Cedric Gobet

Jan 25, 2016, 8:27:11 AM
to rna-...@googlegroups.com
Hello,
I'm working on human total RNA-Seq data, paired-end (2x125 bp). To assess the efficiency of ribosomal RNA depletion, I first map to ribosomal sequences with the option --outReadsUnmapped Fastx, and then map the unmapped reads to the genome.
Surprisingly, for some samples, the reads in Unmapped.out.mate1 and Unmapped.out.mate2 are not in the same order in some places (e.g. the tail of the file is fine but the head is not). This leads to a very low mapping rate on the genome.
I'm using 12 threads and the latest version of STAR (2.5.1b). Could it be a concatenation issue? The files in the tmp directory are in the right order.
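
One way to confirm the symptom is to compare read names record by record. The sketch below assumes plain 4-line FASTQ records and that the read name is the first whitespace-delimited token of each header line, optionally followed by `/1` or `/2`. The helper names `read_id` and `first_mismatch` are hypothetical and not part of STAR.

```python
from itertools import islice, zip_longest


def read_id(header):
    """Return the read name from a FASTQ header line (assumed layout)."""
    fields = header.split()
    name = fields[0] if fields else ""
    if name.startswith("@"):
        name = name[1:]
    # Assumption: mates may carry a /1 or /2 suffix that should be ignored.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def first_mismatch(mate1_lines, mate2_lines):
    """Return the 0-based index of the first record whose names differ.

    Accepts any iterables of lines (open file handles work). Returns None
    when both inputs list the same read names in the same order.
    """
    headers1 = islice(mate1_lines, 0, None, 4)  # every 4th line is a header
    headers2 = islice(mate2_lines, 0, None, 4)
    for index, (h1, h2) in enumerate(zip_longest(headers1, headers2)):
        if h1 is None or h2 is None or read_id(h1) != read_id(h2):
            return index
    return None


# Possible usage on the files from the run directory:
# with open("Unmapped.out.mate1") as m1, open("Unmapped.out.mate2") as m2:
#     print(first_mismatch(m1, m2))
```

A result of `None` would mean the two files are in sync; any integer points to the first record worth inspecting.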

Best regards,

Cédric


Alexander Dobin

Jan 26, 2016, 3:30:30 PM
to rna-star
Hi Cédric

this is not supposed to happen... Could you post a minimal example (hopefully, a few hundred thousand reads should be enough) so that I can reproduce it?
Also, mapping to just rRNA is probably very slow, see this post.

Cheers
Alex

Cedric Gobet

Jan 27, 2016, 7:46:47 AM
to rna-star
Hi Alex,
Thanks for your prompt response.
I have reproduced the problem with a minimal sample of 1M reads. Is there an easy way to share these files with you?
Best,

Cédric

Alexander Dobin

Jan 28, 2016, 2:51:00 PM
to rna-star
Hi Cédric

could you zip them and send them through Google Drive or Dropbox?

Cheers
Alex

Cedric Gobet

Feb 3, 2016, 5:30:33 AM
to rna-star
Hi,
Have you received the link I sent you? (private message and e-mail)

Best

Cédric

David Soong

Mar 2, 2016, 3:41:47 PM
to rna-star
I'm aligning my RNAseq data and see the ordering problem with the unmapped reads, too.

I'm running STAR_2.4.2a_modified

Alexander Dobin

Mar 3, 2016, 11:25:16 AM
to rna-star
Hi David,

I have run Cédric's dataset, but could not reproduce this sorting problem on my system. So it seems that the problem is system-specific.

If you are willing to try to debug it, here are the things I would ask you to do:
1. Is this problem reproducible? If you run the same sample again, will you see this problem for the same reads?
If yes:
2. Check if the problem occurs with the latest version of STAR https://github.com/alexdobin/STAR/releases/tag/2.5.1b
If yes:
3. Try the different STAR executables:
    pre-compiled dynamic: bin/Linux_x86_64/STAR
    pre-compiled static: bin/Linux_x86_64_static/STAR
    compile your own with `make` in the source directory
If the problem persists for all executables:
4. Try to reduce the size of the fastq files to create the minimal example where the problem still occurs, and send me the reads, link to the genome, and Log.out file.
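
Step 4 can be sketched as follows, assuming uncompressed 4-line FASTQ input. The function names and the file names in the usage comment are hypothetical. Taking the same number of records from the top of both mate files keeps the pairs in sync, but the first N reads are not guaranteed to trigger the problem, so other slices may need to be tried.

```python
from itertools import islice


def head_records(lines, n_reads):
    """Return an iterator over the lines of the first n_reads FASTQ records."""
    return islice(lines, 4 * n_reads)  # 4 lines per record


def write_subset(src_path, dst_path, n_reads):
    """Copy the first n_reads records of src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(head_records(src, n_reads))


# Possible usage (hypothetical file names), same n_reads for both mates:
# write_subset("reads_1.fastq", "subset_1.fastq", 100_000)
# write_subset("reads_2.fastq", "subset_2.fastq", 100_000)
```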

Cheers
Alex

David Soong

Mar 3, 2016, 3:30:27 PM
to rna-star
Hi Alex,

I just ran STAR_2.5.1b_modified (Linux_x86_64_static) and the problem remained.

Since the temporary files (Unmapped.out.mate1/2.thread*) in _STARtmp/ are in the correct order, is it possible to include a command line option to keep the intermediate files? I can do the concatenation on my end.
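
A manual concatenation along these lines might look like the sketch below. It assumes the per-thread files end in a numeric suffix such as `.thread0`, `.thread1`, and so on (the thread only shows the pattern `Unmapped.out.mate1/2.thread*`), and that each thread's mate1 and mate2 files hold the same reads in the same order. All function names are hypothetical. Sorting by the numeric suffix, and using the same order for both mates, avoids `thread10` sorting before `thread2`.

```python
import os
import re
import shutil


def thread_index(filename):
    """Extract the numeric thread suffix, e.g. '...mate1.thread10' -> 10."""
    match = re.search(r"\.thread(\d+)$", filename)
    if match is None:
        raise ValueError(f"no thread suffix in {filename!r}")
    return int(match.group(1))


def ordered_parts(filenames, mate):
    """Return the per-thread files for one mate, sorted by thread number."""
    prefix = f"Unmapped.out.mate{mate}.thread"  # assumed naming scheme
    parts = [n for n in filenames if os.path.basename(n).startswith(prefix)]
    return sorted(parts, key=thread_index)


def concatenate(parts, destination):
    """Append each part to destination in the given order."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as chunk:
                shutil.copyfileobj(chunk, out)


# Possible usage from the run directory:
# import glob
# names = glob.glob("_STARtmp/Unmapped.out.mate*.thread*")
# concatenate(ordered_parts(names, 1), "Unmapped.fixed.mate1")
# concatenate(ordered_parts(names, 2), "Unmapped.fixed.mate2")
```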

Thanks,
David

Alexander Dobin

Mar 3, 2016, 6:03:11 PM
to rna-star
Hi David,

the latest patch on GitHub master has the option to keep the temporary files.
Could you please send me the output of `wc -l _STARtmp/Unmap*` and `wc -l Unmap*` from the run directory. I still hope to find the reason for the incorrect ordering.
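
The thread does not say what the line counts are expected to reveal. As a sketch, the checks below are plausible consistency tests on such counts, assuming the final files are plain concatenations of the per-thread files. The function names are hypothetical.

```python
def count_lines(lines):
    """Count lines in any iterable of lines (an open file handle works)."""
    return sum(1 for _ in lines)


def count_problems(parts1, parts2, final1, final2):
    """Return a list of inconsistencies between line counts.

    parts1/parts2: per-thread line counts for mate1/mate2, in thread order.
    final1/final2: line counts of the final Unmapped.out.mate1/mate2 files.
    """
    problems = []
    if parts1 != parts2:
        problems.append("per-thread counts differ between mates")
    if sum(parts1) != final1:
        problems.append("mate1 final count differs from sum of thread files")
    if sum(parts2) != final2:
        problems.append("mate2 final count differs from sum of thread files")
    if final1 != final2:
        problems.append("final mate1 and mate2 counts differ")
    return problems
```

An empty list would suggest the files are complete and only the order of the concatenated chunks is in question.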

Cheers
Alex

Wencheng Li

Mar 7, 2016, 12:21:13 PM
to rna-star
Hi Alex,

I recently had the same problem for 1 out of 14 samples; the other 13 samples have the correct order. I'm now re-running STAR to see whether it comes out right this time.
I'm using multiple threads, and I suspect that for the failed sample the outputs of the different threads were not concatenated in the same order.

I'm using STAR_2.4.2a.

Alexander Dobin

Mar 8, 2016, 3:13:59 PM
to rna-star
Hi Wencheng,

please try the latest patch on GitHub master. I have made some changes that should make the ordering more robust.

Cheers
Alex