Building a graph database for querying with statistics


Shyam Sarkar

Jul 29, 2013, 2:21:06 AM
to bgi-...@googlegroups.com
Hello,

I am trying to build a graph database from SOAPsnp output, and I need the essential
relationships between offsets and SNPs. Can someone please suggest which fields of the
SOAPsnp output make sense for such a graph database, so that it can later be queried
statistically? The output format is:

1. Chromosome ID.
2. 1-based offset into chromosome.
3. Reference genotype.
4. Subject genotype.
5. Quality score of subject genotype.
6. Best base.
7. Average quality score of best base.
8. Count of uniquely aligned reads corroborating the best base.
9. Count of all aligned reads corroborating the best base.
10. Second best base.
11. Average quality score of second best base.
12. Count of uniquely aligned reads corroborating second best base.
13. Count of all aligned reads corroborating second best base.
14. Overall sequencing depth at the site.
15. Rank sum test P-value.
16. Average copy number of nearby region.
17. Whether the site is a known dbSNP.
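
To make the field list concrete, here is a minimal sketch of reading one output line into a typed record. It assumes the file is tab-separated with the 17 columns in the order listed above, and that the dbSNP flag is written as 0 or 1. The names SiteRecord and parse_site, the attribute names, and the choice of int versus float for each column are my own assumptions, not part of SOAPsnp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRecord:
    """One site of SOAPsnp output; attribute names are hypothetical."""
    chrom: str                  # 1. chromosome ID
    offset: int                 # 2. 1-based offset into chromosome
    ref_genotype: str           # 3. reference genotype
    subject_genotype: str       # 4. subject genotype
    genotype_quality: int       # 5. quality score of subject genotype
    best_base: str              # 6. best base
    best_avg_quality: float     # 7. average quality of best base
    best_unique_reads: int      # 8. uniquely aligned reads for best base
    best_all_reads: int         # 9. all aligned reads for best base
    second_base: str            # 10. second best base
    second_avg_quality: float   # 11. average quality of second best base
    second_unique_reads: int    # 12. uniquely aligned reads for second best base
    second_all_reads: int       # 13. all aligned reads for second best base
    depth: int                  # 14. overall sequencing depth at the site
    rank_sum_p: float           # 15. rank sum test P-value
    copy_number: float          # 16. average copy number of nearby region
    known_dbsnp: bool           # 17. whether the site is a known dbSNP


def parse_site(line: str) -> SiteRecord:
    """Parse one line, assuming 17 tab-separated columns in the order above."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 17:
        raise ValueError(f"expected 17 columns, got {len(cols)}")
    return SiteRecord(
        chrom=cols[0],
        offset=int(cols[1]),
        ref_genotype=cols[2],
        subject_genotype=cols[3],
        genotype_quality=int(cols[4]),
        best_base=cols[5],
        best_avg_quality=float(cols[6]),
        best_unique_reads=int(cols[7]),
        best_all_reads=int(cols[8]),
        second_base=cols[9],
        second_avg_quality=float(cols[10]),
        second_unique_reads=int(cols[11]),
        second_all_reads=int(cols[12]),
        depth=int(cols[13]),
        rank_sum_p=float(cols[14]),
        copy_number=float(cols[15]),
        known_dbsnp=cols[16] == "1",  # assumption: flag is written as 0/1
    )
```

Each record could then become the property set of a site node, keyed by (chromosome, offset).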

I need the essential fields for detecting a SNP so that I can create nodes and
relationships in a graph. (Alternatively, what would a query to detect a SNP look like?)
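
As a rough sketch of the kind of query and graph I have in mind: a site is treated as a candidate SNP when the subject genotype (field 4) differs from the reference genotype (field 3) and the genotype quality (field 5) clears a cutoff. The cutoff of 20, the node labels Chromosome and SNP, the relationship names HAS_SNP and NEXT_SNP, and the function names are all assumptions for illustration; plain Python dicts and tuples stand in for a real graph database.

```python
def is_candidate_snp(ref, subject, quality, min_quality=20):
    """Candidate SNP: genotype differs from reference and quality is high enough.

    min_quality=20 is an arbitrary illustrative cutoff, not a SOAPsnp default.
    """
    return subject != ref and quality >= min_quality


def build_graph(sites, min_quality=20):
    """Build nodes and edges from (chrom, offset, ref, subject, quality) tuples.

    Returns (nodes, edges): nodes maps a node key to its properties, and
    each edge is a (source_key, relationship, target_key, properties) tuple.
    """
    nodes = {}
    edges = []
    last_snp = {}  # most recent SNP node key seen on each chromosome

    # Sort by chromosome, then offset, so NEXT_SNP links follow genome order.
    for chrom, offset, ref, subject, quality in sorted(
        sites, key=lambda s: (s[0], s[1])
    ):
        if not is_candidate_snp(ref, subject, quality, min_quality):
            continue

        chrom_key = ("Chromosome", chrom)
        snp_key = ("SNP", chrom, offset)
        nodes.setdefault(chrom_key, {})
        nodes[snp_key] = {"ref": ref, "subject": subject, "quality": quality}
        edges.append((chrom_key, "HAS_SNP", snp_key, {}))

        # Link consecutive SNPs on the same chromosome, storing their distance.
        prev = last_snp.get(chrom)
        if prev is not None:
            edges.append((prev, "NEXT_SNP", snp_key, {"distance": offset - prev[2]}))
        last_snp[chrom] = snp_key

    return nodes, edges
```

With this layout, the offset relationships live on the NEXT_SNP edges, and the remaining fields (depth, read counts, rank sum P-value, dbSNP flag) could be added as further node properties for statistical queries.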

I appreciate any help.

Thanks,
Dr. Shyam Sarkar
Fremont, CA