New feature--gathering data about sequencing artifacts in IGV

Noah Friedman

Sep 20, 2017, 3:16:57 PM
to igv-help
Hi there, IGV team and community. My name is Noah Friedman and I am a bioinformatics scientist for the Undiagnosed Disease Network at Stanford. One of the larger time bottlenecks in our curation process is evaluating candidate variants one by one in IGV. Our team discards a large number of the variants they evaluate in IGV after realizing that they are sequencing artifacts. To address this, I have been collecting screenshots of the variants our curation team looks at in IGV and labeling each screenshot as either a sequencing artifact or a valid variant. We had aimed to use this data as the foundation for a machine learning classifier that would learn to distinguish IGV images of artifacts from images of valid variants.
However, the volume of our data flow is too small to build up a strong set of labeled data. This is why I wanted to get in touch with you guys. Do you think it would be within the scope of IGV's development to add a feature where users could click a button to mark whether or not a variant is a sequencing artifact, and IGV would log that label along with a screenshot of the variant in question? If some version of this feature were possible, I think the resulting data could vastly improve curation efficiency for users of IGV. Let me know if this is within the scope of what you can add to IGV; if so, I'd love to talk about it and explain the work I have done toward this goal.
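To make the request concrete, here is a minimal sketch of the kind of per-variant record described above: a screenshot of the locus plus an artifact/valid label. It relies only on IGV's existing batch commands (`snapshotDirectory`, `goto`, `snapshot`). The function names `igv_batch_lines` and `log_label`, the CSV columns, and the 50 bp flank are hypothetical choices for illustration, not part of IGV and not necessarily how our current collection works.

```python
import csv
from pathlib import Path


def igv_batch_lines(chrom, pos, snapshot_dir, flank=50):
    """Return IGV batch commands that snapshot a window around one variant.

    The flank size and the file-naming scheme are assumptions for this sketch.
    """
    start = max(1, pos - flank)  # clamp so the window never starts below 1
    end = pos + flank
    name = f"{chrom}_{pos}.png"
    return [
        f"snapshotDirectory {snapshot_dir}",
        f"goto {chrom}:{start}-{end}",
        f"snapshot {name}",
    ]


def log_label(log_path, chrom, pos, is_artifact, snapshot_name):
    """Append one labeled variant to a CSV log, writing a header for a new file."""
    log_path = Path(log_path)
    new_file = not log_path.exists()
    with log_path.open("a", newline="") as fh:
        writer = csv.writer(fh)
        if new_file:
            writer.writerow(["chrom", "pos", "label", "snapshot"])
        label = "artifact" if is_artifact else "valid"
        writer.writerow([chrom, pos, label, snapshot_name])
```

The lines returned by `igv_batch_lines` could be written to a text file and run as an IGV batch script, while `log_label` would be called once per curator decision. A button inside IGV would collapse these two steps into one click.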
Thanks so much.