Screen out read pairs that map in any way

Gavin Douglas

Aug 29, 2017, 12:13:43 PM

to kneaddata users

Hi there,

I'm wondering what the easiest way would be to retain only read pairs where neither mate maps, i.e. to screen out pairs that map concordantly, discordantly, or with only 1 mate aligned. I'd like to be more conservative about what is retained for downstream steps, and retaining every pair that merely fails to map concordantly seems pretty lenient. I've been using bowtie2 through kneaddata to perform this mapping. I'm thinking of just running bowtie2 twice, treating the reads as single-end and then as paired-end in successive commands, but I thought maybe kneaddata could do this already.

Thanks,

Gavin

Lauren McIver

Sep 1, 2017, 11:39:44 AM

to Gavin Douglas, kneaddata users

Hi Gavin - The latest version of kneaddata (v0.6.1) has options related to discordant alignments. By default it allows discordant alignments; if you add the option "--no-discordant", it will only allow concordant alignments. This is a relatively new feature which has not yet been added to the user manual. I will get it added shortly.

Thanks!

Lauren

Gavin Douglas

Sep 1, 2017, 12:07:17 PM

to Lauren McIver, kneaddata users

Hi Lauren,

Thanks for getting back to me.

I was actually using v0.6.1 and tested out the bowtie2 "--no-discordant" option, but that’s not what I was looking for. Since the output option is "--un-conc", it doesn’t matter whether this option is used or not, from what I can tell: only read pairs that aligned concordantly are screened out either way.

I was confused about what discordant read alignments are, but it’s well described in the bowtie2 manual (in case others are interested too):

'A pair that aligns with the expected relative mate orientation and with the expected range of distances between mates is said to align "concordantly". If both mates have unique alignments, but the alignments do not match paired-end expectations (i.e. the mates aren't in the expected relative orientation, or aren't within the expected distance range, or both), the pair is said to align "discordantly".'
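For anyone who wants to check this in their own SAM output, here is a minimal Python sketch of how those categories show up in the SAM FLAG field. The bit values come from the SAM specification, and bowtie2 sets the proper-pair bit (0x2) for concordant alignments; the function name `classify_pair` and the category labels are made up for illustration. One caveat: flags alone cannot tell a discordant pair from two mates that each aligned on their own, so both end up in the same bucket here.

```python
# SAM FLAG bits (values from the SAM specification)
PAIRED = 0x1
PROPER_PAIR = 0x2      # set by bowtie2 for concordant alignments
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def classify_pair(flag: int) -> str:
    """Classify one SAM record by how its pair aligned (hypothetical helper)."""
    if not flag & PAIRED:
        return "unpaired"
    read_mapped = not flag & READ_UNMAPPED
    mate_mapped = not flag & MATE_UNMAPPED
    if read_mapped and mate_mapped:
        # Without the proper-pair bit this is either a discordant pair or
        # two mates aligned independently; FLAG alone cannot separate them.
        return "concordant" if flag & PROPER_PAIR else "both_aligned_not_concordant"
    if read_mapped or mate_mapped:
        return "one_mate_aligned"
    return "neither_aligned"
```

If I read the bowtie2 manual correctly, the optional YT:Z field in its SAM output (CP, DP, UP, UU) can be used to separate discordant pairs from independently aligned mates, which this sketch does not attempt.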

I was originally thinking that discordant alignments included cases where only 1 read mate aligns, but that’s not true, as described above. I think it would be more conservative to screen out these discordant alignments too, since they could just be due to structural variants that aren’t present in the reference genome. I also think that filtering out cases where 1 mate maps and the other doesn’t would be helpful in samples where we’re really worried about host contamination. What do you think about this?
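As a rough sketch of the stricter filter described above, the following Python keeps a SAM record only when both the read and its mate are unaligned, which drops concordant pairs, discordant pairs, and pairs where only 1 mate maps. It assumes bowtie2 SAM output read as plain text lines; `unaligned_pair_records` is a hypothetical helper, not part of kneaddata or bowtie2.

```python
from typing import Iterable, Iterator

# SAM FLAG bits (values from the SAM specification)
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100


def unaligned_pair_records(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM records whose read AND mate are both unaligned (hypothetical helper)."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        both_unmapped = flag & READ_UNMAPPED and flag & MATE_UNMAPPED
        if both_unmapped and not flag & SECONDARY:
            yield line
```

This should select the same records as `samtools view -f 12 -F 256`, which may well be the simpler route in practice.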

Cheers,

Gavin

Lauren McIver

Sep 5, 2017, 5:07:45 PM

to Gavin Douglas, kneaddata users

Hi Gavin - Thanks for posting the bowtie2 manual reference on discordant alignments. I agree it is useful to have different options for alignments. Sorry if my original email was confusing; kneaddata has an option with the same name as the bowtie2 option, "--no-discordant", which controls whether discordant alignments are allowed. It also has an option "--cat-pairs" which will allow for cases where only a single read aligns while keeping track of paired counts.