Duplication Fragments showing as deletions (2.3.72)


bradyf...@gmail.com

Jun 21, 2016, 12:04:07 PM
to igv-help
Hello,
I recently upgraded from 2.1 and have had to tweak some of the alignment settings to get my paired end reads to look the same as before. I have encountered one issue so far. These reads should be duplication fragments according to the software used in our analysis. In 2.1 they also show as duplication fragments. I have Color by set to orientation and insert size, and the insert size options set to compute with .0001 and 100. I believe this means it will default to orientation, and if orientation isn't clear then it will use colors if it falls between these percentages (which should be anything else). Am I missing something?

Thanks





James Robinson

Jun 21, 2016, 1:16:24 PM
to igv-...@googlegroups.com
That’s correct: orientation is checked first, and if it is “normal”, insert size is checked next. Insert size detection is relative to all alignments loaded, so if you use the compute option the thresholds are percentiles relative to the other alignments. If you know the expected insert size of your library, you might turn off “compute” and just set hard-coded values.
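
As a rough illustration of the precedence described above, here is a minimal Python sketch. It is not IGV's actual code: the function names, the nearest-rank percentile method, and the returned labels are all assumptions made for the example.

```python
import math

# Hypothetical sketch of "orientation first, then insert size".
# Not taken from the IGV source; names and labels are illustrative only.

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

def classify(orientation, insert_size, expected_orientation, min_size, max_size):
    """Return a label for one read pair; orientation takes precedence."""
    if orientation != expected_orientation:
        return "orientation:" + orientation
    if insert_size < min_size:
        return "insert:small"
    if insert_size > max_size:
        return "insert:large"
    return "normal"

# "Compute" mode: thresholds come from the loaded alignments themselves.
sizes = sorted([200, 150, 5000, 180, 220])
low = percentile(sizes, 0.0001)   # smallest loaded insert size
high = percentile(sizes, 100)     # largest loaded insert size
```

Under these assumptions, percentiles of 0.0001 and 100 place the thresholds at the smallest and largest loaded insert sizes, so no pair would be flagged by size alone. Hard-coded thresholds do not depend on which alignments happen to be loaded.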

The panel on the right looks like a low complexity region, with mates all over the genome,  so any events detected there would be suspect.


--

---
You received this message because you are subscribed to the Google Groups "igv-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to igv-help+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/igv-help/82507d32-504f-4d10-b4ec-85bce3549095%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

bradyf...@gmail.com

Jun 21, 2016, 1:59:09 PM
to igv-help
So here's a better example. I have color by orientation and insert size on. I am using fragments with 50bp adapters and 100bp inserts. From what I've seen online, this can be interpreted as either 100 or 200. Regardless, I think this should come through as an inversion (LL reads) as orientation is checked first. I have been playing around with the hardcoded values and I can't get the reads to show in inversion blue. If I set the thresholds very high they turn very dark blue, which doesn't look like the usual inversion or any chr color. Anything else I can try?


James Robinson

Jun 21, 2016, 2:02:37 PM
to igv-...@googlegroups.com
Hi, IGV knows nothing about inversions or any other event, it only can color based on insert size and pair orientation.   This looks like an unusual sequencing library,  so statistics to figure out insert size distribution could be off.  Why not just color by pair orientation if that is what you are interested in?


James Robinson

Jun 21, 2016, 2:06:26 PM
to igv-...@googlegroups.com
Sorry, you are correct: orientation is checked first, so coloring by orientation alone shouldn’t make a difference. To determine the expected orientation, IGV looks at the distribution of orientations of all reads it has read. If you are only loading reads with the orientations shown, that is going to be the “expected” one, i.e. these are not unusual with respect to this dataset, so they are not going to get colored.

Perhaps we could add a setting so you tell IGV explicitly the type of library.   That doesn’t exist now,  but would be a good addition.



bradyf...@gmail.com

Jun 21, 2016, 2:35:57 PM
to igv-help
I would like to be able to find all types of rearrangements detected by both orientation and color. 

James Robinson

Jun 21, 2016, 2:41:32 PM
to igv-...@googlegroups.com
Hi,

I’m not sure we’re communicating. IGV does not find rearrangements; it colors reads whose orientations or pair sizes differ from those of the majority of reads loaded. In other words, it relies on statistics of all alignments that have been read to determine library type, then colors accordingly. In your screenshot all reads are RR, so that is normal for this library as IGV sees it, and they will not get colored. Does this make sense? Can you tell me more about this library?

As I suggested earlier, for this case an explicit user setting for library type would be useful. That doesn’t currently exist, but I just added a git issue to add it: https://github.com/igvteam/igv/issues/263. That would be available in the next release.

Jim



bradyf...@gmail.com

Jun 21, 2016, 4:24:40 PM
to igv-help
Ah, I see what you mean. By "find rearrangements" I meant be able to correctly color all types of rearrangements. Also I understand how it looks at the reads loaded to determine what is considered different from the reference reads, then chooses how to identify them. In my example all reads are RR so this is the norm.

What I don't understand is what you mean by library type. I'm loading BAM files aligned from illumina paired end reads to identify abnormalities in a given sample. The locations I am loading will likely always have some rearrangements, therefore making the abnormal reads the majority.

James Robinson

Jun 21, 2016, 5:02:52 PM
to igv-...@googlegroups.com
Library type refers to expected pair orientation. It can be FR (normal Illumina type), RF, or even a rather complex FF/RR combination. Since this information is not in the BAM file, IGV tries to determine it from the reads themselves. “Some” rearrangements should not result in a majority of reads being abnormal, unless the BAM has been filtered to remove normal reads or you are using some type of capture protocol to only sequence abnormal reads.

However, this does raise the issue of what to do for BAMs such as yours. I entered a git issue to let the user specify this, similar to how you can override the computation of insert size. That should be available within a week or two. If you have a git account you can follow the issue here: https://github.com/igvteam/igv/issues/263.

As a workaround until then you might try loading a normal region first.

bradyf...@gmail.com

Jun 22, 2016, 9:26:34 AM
to igv-help
That makes sense. Thanks a lot for your help

Jim Robinson

Jun 22, 2016, 9:45:04 AM
to igv-...@googlegroups.com
Hi,

After reviewing the source code I'm still a bit puzzled. The determination is based on reads marked "proper pair", so the abnormal reads should be filtered out unless your aligner is not setting that flag. Would you be able to share a slice of the BAM corresponding to the screenshot? To do this, right-click over the alignments and select "Export Alignments..." near the bottom of the menu. I will send you a DropBox invitation where you can transfer them securely.

Secondly, could you try the "snapshot" build and tell me if it performs better? It's available at http://www.broadinstitute.org/software/igv/download_snapshot

bradyf...@gmail.com

Jun 22, 2016, 11:48:17 AM
to igv-help
Unfortunately the BAMs have sensitive data associated with them, so I am unable to send you any portion of the files. However, I will test the snapshot and report back.

bradyf...@gmail.com

Jun 22, 2016, 11:54:51 AM
to igv-help
With snapshot 595 I am still getting the same issue. Just to confirm, what would you suggest I set my insert size thresholds to?



Jim Robinson

Jun 22, 2016, 5:55:59 PM
to igv-...@googlegroups.com
Sorry, I really have no idea what's going on here. If you are interested in pair orientations I suggest just turning insert size off (just color by pair orientation). Then select "view as pairs"; if there is a pair that is not colored as you think it should be, right-click and see what the orientation is. If you find a way to scrub this data, for example by removing the read sequence from the records, so you can share a slice of the BAM (privately), I can look deeper. I can give you instructions for doing that if that is feasible.
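
One way such scrubbing could look, as a hedged sketch rather than the instructions offered above: in SAM text, the read sequence and base qualities are columns 10 and 11, and the SAM format allows `*` in both when they are not stored. The function name below is hypothetical, and the sketch assumes the slice has already been converted to SAM text.

```python
# Hypothetical helper: blank out SEQ and QUAL in SAM text records.
# Assumes tab-separated SAM lines; header lines (starting with "@") pass through.

def scrub_sam_line(line):
    """Replace SEQ and QUAL (columns 10 and 11) with '*'."""
    if line.startswith("@"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("not a SAM alignment line")
    fields[9] = "*"   # SEQ
    fields[10] = "*"  # QUAL
    return "\t".join(fields) + "\n"
```

Flags, positions, mate information, and template length are left untouched, which are the fields relevant to coloring by orientation and insert size. Optional tags such as MD (which records reference bases at mismatches) and the read group are also left in place, so whether this alone is enough for sharing is a separate decision.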

Jim


Message has been deleted

James Robinson

Sep 14, 2016, 7:55:22 AM
to igv-help
Yes,  I sent you a dropbox link to upload it.   Alternatively you can zip it and attach it here, or send us a link to download it.   I am on vacation and won't be able to look at it until after Sept 23.

On Tue, Sep 13, 2016 at 3:39 PM, <bradyf...@gmail.com> wrote:
Hi Jim,
I know this thread is older, but I have been given permission to share a nonPHI slice of a bam file with a rearrangement. Is it still possible to investigate this?

Thanks

bradyf...@gmail.com

Sep 14, 2016, 9:32:05 AM
to igv-help
I uploaded the samples to the dropbox link provided. Thanks for looking into this.

Jim Robinson

Sep 19, 2016, 9:43:57 AM
to igv-help
I'm looking into this, but have a question.  The template length values in this file are strange, for example in the record below the value is -18028632.    This should be approximately the distance between the pairs when aligned.   Are they really 18 MB apart (i.e. are the molecules sequenced really ~18 MB)?    


0f03Z9TR 177 chr10 61638608 25 49M = 43609976 -18028632 TGCGGCAAGGAATTATTTAACAATTATGACTATTCATTTAGGCCAAGCC JJIJIJJJJJJJIG>JJJJIJJJJJJJJHJJJJJJJHHHHHFFFFFCCC X0:i:1 X1:i:0 MD:Z:0A0T0A46 RG:Z:H7RRHADXX.lane0.2P_FMI_32 XG:i:0 AM:i:25 NM:i:3 SM:i:25 XM:i:3 XO:i:0 XT:A:U
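
For readers following along, the flag value in this record (177) can be unpacked with a few lines of Python. The bit meanings are those defined by the SAM format; the table and function names are just for this illustration.

```python
# Bit meanings from the SAM format; short names chosen for this example.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]

def decode_flag(flag):
    """Return the names of all bits set in a SAM flag."""
    return [name for bit, name in SAM_FLAGS if flag & bit]

print(decode_flag(177))
# ['paired', 'reverse', 'mate_reverse', 'second_in_pair']
```

So this read and its mate are both on the reverse strand (an RR pair), and the proper pair bit (0x2) is not set. The template length in the record is exactly the mate position minus the read position: 43609976 - 61638608 = -18028632.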


James Robinson

Sep 19, 2016, 10:19:23 AM
to igv-help
OK,  I'm not sure if this is an artifact of taking a small sample, but in the file you sent me there are no alignments marked "proper pair"  (sam flag 0x2).   So there are no alignments that can be used to compute an expected insert size and orientation.   If this is the case in the file as a whole you will need to turn off (uncheck) the "Compute" option and manually set the min and max thresholds to values that make sense for your sequencing library.   





Unfortunately we don't have a way to manually set the expected orientation, and it will default to FR. We will add a preference for this in a future release; in the meantime, if your sequencing library is RR and enough "proper pairs" cannot be found to compute the expected orientation, then color by orientation will not be useful.
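
The behavior described here can be pictured with a small sketch: take the most common orientation among alignments flagged as proper pairs, and fall back to FR when there are too few of them. This is an illustration only, not IGV's source; the function name, the `min_pairs` cutoff, and the input format are assumptions.

```python
from collections import Counter

PROPER_PAIR = 0x2  # SAM flag bit for "proper pair"

def expected_orientation(pairs, default="FR", min_pairs=1):
    """Guess the library's expected pair orientation.

    pairs: iterable of (flag, orientation) tuples, e.g. (99, "FR").
    Only alignments with the proper-pair bit set are counted.
    min_pairs is a hypothetical cutoff for "enough" proper pairs.
    """
    counts = Counter(o for flag, o in pairs if flag & PROPER_PAIR)
    if sum(counts.values()) < min_pairs:
        return default  # nothing usable to compute from
    return counts.most_common(1)[0][0]

# A slice with no proper pairs (flags 177 and 113 lack bit 0x2)
# falls back to the default.
filtered_slice = [(177, "RR"), (113, "RR"), (177, "RR")]
print(expected_orientation(filtered_slice))  # FR
```

With a BAM that has had its normal pairs filtered out, a scheme like this has nothing to count, so the result is the default regardless of what the remaining reads look like.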


bradyf...@gmail.com

Sep 19, 2016, 2:01:11 PM
to igv-help
Yes this 18MB is likely correct. These reads are describing a large intergenic event so this is reasonable.

bradyf...@gmail.com

Sep 19, 2016, 2:09:02 PM
to igv-help
I had a conversation with one of our engineers. It turns out that when making the BAM files we use for rearrangements, the normal read pairs are filtered out. This would explain why there are no normal FR read pairs in this slice. Is it possible to add a preference for manually specifying the expected read orientation?



James Robinson

Sep 19, 2016, 2:52:45 PM
to igv-help
Hi, what is the expected orientation? If it's FR you don't need a preference, as that is the default.

A preference for this will be added,  but I'm not sure you need one.  


James Robinson

Sep 19, 2016, 3:55:47 PM
to igv-help
And a follow-up question.  What is the approximate expected template length of fragments in your sequencing protocol?   

bradyf...@gmail.com

Sep 19, 2016, 4:46:34 PM
to igv-help
We do expect to see FR reads. 

Our expected insert size is ~200bps.

For the life of me, I can't get both translocations and intrachromosomal rearrangement colors to show correctly.

James Robinson

Sep 19, 2016, 5:59:02 PM
to igv-help
In that case everything in the file you sent looks correct. The insert size is greater than expected (much greater) for most reads, so those reads are red. There is 1 intrachromosomal rearrangement in the 2nd file, read 0105XYme, and it is colored pink for chr20. If I color by pair orientation only, reads in the first file are the "RR" color, and those in the second are the "FF" color. So I don't really know what you are looking to see, but this looks correct. If you want me to examine a specific read, send me its name.

I do suggest you uncheck the "compute" option for insert size and set the min and max parameters manually.

Jim

bradyf...@gmail.com

Sep 22, 2016, 2:09:25 PM
to igv-help
I don't believe I sent you everything you needed. The reads in those files are only half of the rearrangements. They should be inversion reads if you could see the other side.

James Robinson

Sep 23, 2016, 5:07:38 PM
to igv-help
Understood, but they are colored correctly, as I noted.  If you color by orientation you will see the inversion colors.   What I don't get actually is what you are expecting to see. 


bradyf...@gmail.com

Nov 2, 2016, 9:52:49 AM
to igv-help
That's right, when I select orientation, they will show correctly. We end up looking at a lot of rearrangements during the day and it is very beneficial to have insert size and pair orientation on. We had this color scheme set in the old version (2.1) which worked well. 

However in the most recent version, color by both makes these reads show red over the inversion blue. My goal is to be able to get that functionality back from 2.1 where orientation will override the insert size coloring.

From my understanding this is accomplished by finding the ideal setting for "insert size options", but this seems to be very tricky, as we deal with highly variable insert sizes for the rearrangements.

Hope this makes sense. I really appreciate the amount of time you put into looking into this.

