NOTES
=====

bsddb -- what's the official "large pickle store" included with Python?

seqdb2 stuff - Alex Nolley
--------------------------

Loading big and small sequences from seqdb2

"Cache hints" to deal with situations ranging from keeping nothing in
memory to keeping an entire sequence

Q: C++/Pyrex/etc.  Chris sez: Python has fseek built in; have a Python
reader, C++/Pyrex writer?  Cross-hyb between modules.

Russell: NetCDF or HDF5?

http://fiehnlab.ucdavis.edu/staff/kind/Metabolomics/Peak_Alignment/

quality scores?  quantitative data associated with sequence?

quantitative data must be:

  1. map-pable
  2. slice-able

so, sequence data has associated "annotations" that are quantitative.
other examples: ka/ks and conservation scores.  what else?

Chris: generalize into a schema?  Design stuff.

GSoC doctests - Rachel McCreary
-------------------------------

Increase accessibility of pygr to new users: provide a bunch of examples.

newpygrIntro

Discussion about how to *create* databases!  (My) idea of making
creation & information passing explicit, vs Chris's pygr.Data/readonly
databases

Show also motif tutorial.

Make interval logic complete?

Discussion of vertex.

Chris on visualization.  Can we link these up to Jenny's ENSEMBL API?

Killer apps include GUI frontends + Web stuff + ..., NLMSA wrappers to
do large-scale transformations and manipulations of NLMSAs, etc.

GSoC ENSEMBL API - Jenny Qian
-----------------------------

Too many languages in one pipeline (SAGE and Solexa) -- no ENSEMBL API!

Basic ENSEMBL connections working.

One problem -- no standard way to get list of "interesting" columns on
TupleO (or other) objects.  Define one!

Many of Jenny's mapping classes can be replaced by internal pygr classes!

Issues of retrieving actual genomic sequence from ENSEMBL...

Some (many) of the interesting sequence features are mapped directly
onto genomic sequence, but others are mapped onto contigs.  Either
way, the only way to extract genomic sequence from their database
appears (?) to be to map genomic coordinates onto contigs and then
extract the relevant contig sequence.

Options --

 - have programmatic jiujitsu that does appropriate mapping dynamically
 - make final sequence once, programmatically, and save it
 - download genomic sequence from their Web site
 - download genomic sequence "standard" from somewhere else

Either way, extremely frustrating situation!

Unpickling saved resources isn't working too well.

Whoops, pyrex stuff isn't installed on Jenny's computer :)
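
Sketch of the "map genomic coordinates onto contigs, then extract the
contig sequence" option from the ENSEMBL notes above.  This is only an
illustration, not the ENSEMBL API or anything in pygr: the Piece record,
the extract() function and the toy contigs are all made up for the
example, coordinates are assumed to be 0-based and half-open, and
regions not covered by any contig are assumed to be filled with N.

```python
from collections import namedtuple

# Hypothetical assembly record (not an ENSEMBL or pygr type): the
# chromosome interval [chr_start, chr_end) is covered by the contig
# named `contig`, starting at contig offset ctg_start, on strand ori
# (+1 forward, -1 reverse).
Piece = namedtuple("Piece", "chr_start chr_end contig ctg_start ori")

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract(pieces, contigs, start, end):
    """Return chromosome sequence [start, end), padding gaps with N."""
    out = []
    pos = start  # next chromosome position still to be emitted
    for p in sorted(pieces, key=lambda p: p.chr_start):
        lo = max(p.chr_start, pos)
        hi = min(p.chr_end, end)
        if lo >= hi:
            continue  # piece does not overlap what is left of the request
        out.append("N" * (lo - pos))  # uncovered gap before this piece
        a = lo - p.chr_start  # offsets within the piece
        b = hi - p.chr_start
        ctg = contigs[p.contig]
        if p.ori > 0:
            out.append(ctg[p.ctg_start + a:p.ctg_start + b])
        else:
            # Reverse strand: flip the whole mapped segment, then slice.
            length = p.chr_end - p.chr_start
            seg = ctg[p.ctg_start:p.ctg_start + length]
            out.append(revcomp(seg)[a:b])
        pos = hi
    out.append("N" * (end - pos))  # uncovered tail, if any
    return "".join(out)


# Toy data: two contigs with a 2 bp gap between them on the chromosome.
contigs = {"c1": "ACGTAC", "c2": "GGGTTT"}
pieces = [
    Piece(0, 6, "c1", 0, +1),
    Piece(8, 12, "c2", 1, -1),
]

print(extract(pieces, contigs, 4, 11))  # prints ACNNAAC
```

Calling extract() on demand corresponds to the "do the mapping
dynamically" option; calling it once over the whole chromosome and
saving the result corresponds to "make final sequence once and save it".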