pygr and local data sets newbie help

synt

Jul 2, 2013, 4:49:46 PM
to pygr...@googlegroups.com
Hi guys,
I'm very new to pygr and to Python, but after reading through the specs I think pygr will greatly simplify my work once I learn how to use it properly. I've done this with plain Python so far, but I want to learn this framework and improve on my current work.

I have a couple of tasks I'm trying to accomplish in preparation for some wet-lab work.

Data I will be using:
1. Disease-specific TCGA SSM data (single mutations, much like SNPs)
2. 3'UTR libraries
3. miRNA target SNP datasets (mostly from PolymiRTS)
4. SNPdb disease-specific data

Actions that will be performed:
1. Map TCGA SSM mutations to 3'UTRs
2. Map TCGA SSM mutations to PolymiRTS SNP datasets (all different formats)
3. Map TCGA SSM mutations to SNPdb disease-specific data
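
To make the mapping actions above concrete, here is a minimal plain-Python sketch of the coordinate-overlap step, without pygr. Everything in it is an assumption for illustration: the `Interval` record, the `chrom` field, the half-open coordinate convention, and the toy coordinates are hypothetical and not taken from any of the datasets listed.

```python
from collections import namedtuple

# Hypothetical record type: a named interval on a chromosome,
# using half-open coordinates [start, stop).
Interval = namedtuple("Interval", ["name", "chrom", "start", "stop"])


def overlaps(a, b):
    """Return True when two intervals on the same chromosome overlap."""
    return a.chrom == b.chrom and a.start < b.stop and b.start < a.stop


def map_mutations(mutations, regions):
    """Map each mutation name to the names of the regions it overlaps."""
    hits = {}
    for mutation in mutations:
        hits[mutation.name] = [r.name for r in regions if overlaps(mutation, r)]
    return hits


# Toy data, made up for this example only.
utrs = [
    Interval("UTR_A", "chr1", 100, 200),
    Interval("UTR_B", "chr1", 500, 650),
    Interval("UTR_C", "chr2", 100, 200),
]
ssms = [
    Interval("ssm1", "chr1", 150, 151),
    Interval("ssm2", "chr1", 300, 301),
    Interval("ssm3", "chr2", 199, 200),
]

hits = map_mutations(ssms, utrs)
```

This brute-force comparison checks every mutation against every region, which is fine for small test files but would likely be too slow for whole-genome data; an interval index is the usual next step.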

While looking through the documentation I came across material on annotations and came up with a sample TCGA SSM class:

class tcga_ssm(object):
    def __init__(self, chromosome_start, chromosome_end, chromosome_strand,
                 gene_name, ensembl_gene_id):
        # Build a composite id from the start coordinate, gene name and Ensembl gene id
        self.id = "%s|%s|%s" % (chromosome_start, gene_name, ensembl_gene_id)
        self.start = chromosome_start
        self.stop = chromosome_end
        self.orientation = chromosome_strand

The problem is that this doesn't account for the 40+ other fields that I may need later once I have matches.
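
One idea I'm considering for the extra fields, sketched in plain Python: keep the coordinate attributes explicit and collect every other column through keyword arguments. The class name `TcgaSsm`, the `extra` attribute, and the sample column names are hypothetical, and I don't know whether this is the way pygr expects it to be done.

```python
class TcgaSsm(object):
    """Hypothetical record: coordinate fields plus any extra columns."""

    def __init__(self, chromosome_start, chromosome_end, chromosome_strand,
                 gene_name, ensembl_gene_id, **extra_fields):
        self.id = "%s|%s|%s" % (chromosome_start, gene_name, ensembl_gene_id)
        self.start = chromosome_start
        self.stop = chromosome_end
        self.orientation = chromosome_strand
        # Keep every remaining column in a dict so nothing is lost
        # and no extra column can overwrite the coordinate attributes.
        self.extra = dict(extra_fields)


# Toy row, made up for this example; real files have many more columns.
row = {
    "chromosome_start": 100,
    "chromosome_end": 101,
    "chromosome_strand": 1,
    "gene_name": "GENE1",
    "ensembl_gene_id": "ENSG_TOY",
    "mutation_type": "single base substitution",
}
ssm = TcgaSsm(**row)
```

With this, `ssm.extra["mutation_type"]` is still available for reports after the coordinate matching is done.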
Also, few of the datasets include much sequence information, and the example I saw requires a DNA db to create an annoDB:

annodb = annotation.AnnotationDB(slice_db, dna_db)

I will have to overlay coordinates, grab seed sequences, and produce reports that mix and match the data above. What is the best approach? A nudge in the right direction would be awesome.
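
For the seed-sequence part, this is roughly what I mean, as a plain-Python sketch. It assumes the seed is nucleotides 2 to 8 of the mature miRNA (one common definition) and that the 3'UTR is stored as DNA; the function names and both toy sequences are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def seed(mirna, first=2, last=8):
    """Return nucleotides first..last (1-based, inclusive) of a mature miRNA."""
    return mirna[first - 1:last]


def seed_match_site(mirna):
    """Return the DNA sequence a 3'UTR would contain to pair with the seed."""
    dna_seed = seed(mirna).upper().replace("U", "T")
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(dna_seed))


# Toy sequences, not real miRNA or UTR data.
toy_mirna = "UACGGAUCCAGUCAGUCAGUCA"
toy_utr = "TTTGATCCGTAAA"

site = seed_match_site(toy_mirna)
position = toy_utr.find(site)  # 0-based offset of the site, or -1 if absent
```

A mutation whose coordinate falls inside `position` to `position + len(site)` of a UTR would then be one that touches the seed match.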

