Will it make a difference when mapping Illumina reads to artificial supercontigs?

liuhui

Jun 20, 2016, 2:24:21 PM

to rna-star

Hello,

I am working with a very large and very fragmented genome, which has over 1,000,000 scaffolds (http://dendrome.ucdavis.edu/ftp/Genome_Data/genome/pinerefseq/Pita/v1.01/ptaeda.v1.01.fa.masked.trimmed.gz).

I want to assemble the scaffolds into artificial supercontigs (where the real scaffolds are separated by stretches of Ns that are a few times as long as the read length, such as 100 bp) to speed up the mapping. But I am wondering whether the alignment results will differ between reads mapped to the supercontigs and reads mapped to the real scaffolds.

Any advice would be great!

Thank you very much

Alexander Dobin

Jun 20, 2016, 5:44:32 PM

to rna-star

Hi @liuhui,

Concatenating contigs into super-contigs is the right way to go when you have a large number (>100,000) of small contigs.
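A minimal sketch of such a concatenation, not taken from this thread: it packs scaffolds into super-contigs separated by runs of N and records where each scaffold lands, so coordinates can be translated back later. The function name `build_supercontigs`, the `super_N` naming scheme, and the `spacer_len` and `max_len` defaults are all assumptions chosen for illustration; reading the FASTA file into `(name, sequence)` pairs is left out.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]               # (scaffold name, sequence)
Placement = Tuple[str, str, int, int]  # (super-contig, scaffold, start, end), 0-based half-open


def build_supercontigs(
    records: Iterable[Record],
    spacer_len: int = 300,       # assumed: a few times a 100 bp read length
    max_len: int = 100_000_000,  # assumed cap on the length of one super-contig
) -> Tuple[Dict[str, str], List[Placement]]:
    """Pack scaffolds into super-contigs, separating neighbours with runs of N."""
    supercontigs: Dict[str, str] = {}
    layout: List[Placement] = []
    parts: List[str] = []
    length = 0

    def flush() -> None:
        # Close the super-contig being built and start an empty one.
        nonlocal parts, length
        if parts:
            supercontigs[f"super_{len(supercontigs) + 1}"] = "".join(parts)
        parts, length = [], 0

    for name, seq in records:
        # Start a new super-contig if this scaffold would push past max_len.
        if parts and length + spacer_len + len(seq) > max_len:
            flush()
        # Insert the N spacer only between scaffolds, never at the start.
        if parts:
            parts.append("N" * spacer_len)
            length += spacer_len
        super_name = f"super_{len(supercontigs) + 1}"
        layout.append((super_name, name, length, length + len(seq)))
        parts.append(seq)
        length += len(seq)
    flush()
    return supercontigs, layout
```

The returned `layout` is the piece worth keeping on disk: without it, alignments on the super-contigs cannot be assigned back to the original scaffolds.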

The main issue with this approach is that STAR may splice some of the alignments between different contigs, or place PE reads on different contigs, thus creating chimeric alignments.

These have to be filtered out after mapping.
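One possible way to do this filtering, sketched under assumptions rather than as a STAR feature: keep a table of where each original contig sits on its super-contig, then reject any alignment whose reference span (first to last aligned base, including spliced gaps) is not contained in a single contig, and any pair whose mates fall on different contigs. The names `make_lookup` and `keep_pair` are hypothetical, and coordinates are assumed to be 0-based half-open, so 1-based SAM positions would need converting first.

```python
from bisect import bisect_right
from typing import Callable, List, Optional, Tuple

Interval = Tuple[int, int, str]  # (start, end, contig) on one super-contig, 0-based half-open
Span = Tuple[int, int]           # reference span of one alignment, 0-based half-open


def make_lookup(layout: List[Interval]) -> Callable[[int, int], Optional[str]]:
    """Build a function mapping an alignment span to the contig that contains it."""
    ordered = sorted(layout)
    starts = [start for start, _, _ in ordered]

    def contig_of(start: int, end: int) -> Optional[str]:
        # Find the last contig that begins at or before the alignment start.
        i = bisect_right(starts, start) - 1
        if i < 0:
            return None
        c_start, c_end, name = ordered[i]
        # The whole span must fit inside that contig; otherwise the alignment
        # runs into an N spacer or a neighbouring contig.
        return name if c_start <= start and end <= c_end else None

    return contig_of


def keep_pair(contig_of: Callable[[int, int], Optional[str]], mate1: Span, mate2: Span) -> bool:
    """Keep a pair only if both mates lie entirely within the same contig."""
    a = contig_of(*mate1)
    b = contig_of(*mate2)
    return a is not None and a == b
```

For single-end reads, calling `contig_of` on the alignment span and dropping reads that return `None` is enough. The lookup is built per super-contig, so in practice one would hold a dictionary of lookups keyed by super-contig name.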

Cheers

Alex

liuhui

Jun 20, 2016, 10:34:27 PM

to rna-star

Hi, Alex,

Do you happen to know of a way to remove the chimeric alignments?

Thank you very much.

Hui Liu

On Tuesday, June 21, 2016 at 5:44:32 AM UTC+8, Alexander Dobin wrote:
