defining a new pygr.Data entry

C. Titus Brown

Jul 24, 2008, 10:48:54 PM
to pygr...@googlegroups.com
I can't figure out how to introduce a non-Bio resource, e.g.

import pygr.Data
pygr.Data.SomethingElse.Blah = resource

Am I limited to 'Bio' if I'm not using layers? (I could define
arbitrary names using the 'here' layer, earlier.)

thanks,
--titus
--
C. Titus Brown, c...@msu.edu

Jenny Qing Qian

Jul 24, 2008, 11:27:54 PM
to pygr...@googlegroups.com
On Thu, Jul 24, 2008 at 7:48 PM, C. Titus Brown <c...@msu.edu> wrote:

> I can't figure out how to introduce a non-Bio resource, e.g.
>
>        import pygr.Data
>        pygr.Data.SomethingElse.Blah = resource
>
> Am I limited to 'Bio' if I'm not using layers?  (I could define
> arbitrary names using the 'here' layer, earlier.)

Hey Titus, I've run into a similar problem before.  Namshin pointed this out to me in a previous thread:

I am not quite sure whether it would work or not, but what about trying Bio top-level category? In Data.py you can see,
 
# PROVIDE TOP-LEVEL NAMES IN OUR RESOURCE HIERARCHY
Bio = ResourcePath('Bio')

It seems that this is hard-coded in the Data.py module.

--jenny

C. Titus Brown

Jul 24, 2008, 11:35:22 PM
to pygr...@googlegroups.com
On Thu, Jul 24, 2008 at 08:27:54PM -0700, Jenny Qing Qian wrote:
-> On Thu, Jul 24, 2008 at 7:48 PM, C. Titus Brown <c...@msu.edu> wrote:
-> > I can't figure out how to introduce a non-Bio resource, e.g.
-> >
-> > import pygr.Data
-> > pygr.Data.SomethingElse.Blah = resource
-> >
-> > Am I limited to 'Bio' if I'm not using layers? (I could define
-> > arbitrary names using the 'here' layer, earlier.)
-> >
->
-> Hey Titus, I've run into a similar problem before. Namshin pointed this out
-> to me in a previous thread:
->
-> I am not quite sure whether it would work or not, but what about trying Bio
-> top-level category? In Data.py you can see,
->
-> # PROVIDE TOP-LEVEL NAMES IN OUR RESOURCE HIERARCHY
-> Bio = ResourcePath('Bio')
->
-> It seems that this is hard-coded in the Data.py module.

Ahh! I see some relevant-looking documentation under 2.5.15,
'ResourcePath'...

I am currently investigating the mysteries of XML-RPC service, but my
simple attempts to serve the 15-way insect alignment are not working.
Can I not serve PrefixUnionDicts with the default serverClass list?

cheers,

Namshin Kim

Jul 25, 2008, 2:22:56 PM
to pygr...@googlegroups.com
Hi Titus,
 
There is a way to use non-Bio resources.
 
pygr.Data.SomethingElse.Blah = resource
 
In the above case, you always have to *use* Bio.
 
But if you use getResource.addResource, you can use a non-Bio resource. See _megatest.py in the tests directory for more details.
 
pygr.Data.getResource.addResource('TEST.hg18', hg18)
 
Maybe this is a bug, but I have been using that in my _megatest.py scripts.
 
Yours,
Namshin Kim

Christopher Lee

Jul 25, 2008, 3:11:36 PM
to pygr...@googlegroups.com

On Jul 25, 2008, at 11:22 AM, Namshin Kim wrote:

> Hi Titus,
>
> There is a way to use non-Bio resources.
>
> pygr.Data.SomethingElse.Blah = resource
>
> In the above case, you always have to *use* Bio.
>
> But if you use getResource.addResource, you can use a non-Bio
> resource. See _megatest.py in the tests directory for more details.
>
> pygr.Data.getResource.addResource('TEST.hg18', hg18)
>
> Maybe this is a bug, but I have been using that in my _megatest.py
> scripts.
>

This is a limitation of Python module namespaces, not of pygr.Data.
pygr.Data is a Python module, so its namespace is limited by the rules
of a Python module. For any name to be available in the top-level
namespace of the module (e.g. pygr.Data.foo), it must actually exist
prior to the user's request for it. I.e. there is no way to implement
a __getattr__ method for a module object, which is Python's usual way
of finding / creating new names dynamically.
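
To make the contrast concrete, here is a minimal sketch. The class DynamicNamespace and the module name fake_data are made up for illustration and are not part of pygr. It shows an ordinary object creating names on demand via __getattr__, while a plain module object raises AttributeError for any name that was never assigned. (This describes the Python versions current when this thread was written; Python 3.7 and later also accept a module-level __getattr__ function, per PEP 562.)

```python
import types


class DynamicNamespace:
    """Ordinary object: __getattr__ runs only for names that do not exist yet."""

    def __getattr__(self, name):
        value = "resource path " + name
        setattr(self, name, value)  # cache it so later lookups are ordinary
        return value


ns = DynamicNamespace()
print(ns.SomethingElse)  # the name is created on first access

# Hypothetical stand-in for a module such as pygr.Data.
mod = types.ModuleType("fake_data")
mod.Bio = "resource path Bio"  # exists before anyone asks for it
print(mod.Bio)

try:
    mod.SomethingElse  # never assigned, and no hook creates it
except AttributeError as exc:
    print("module lookup failed:", exc)
```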

As Namshin said, if you call the addResource() or getResource()
methods, you are bypassing that namespace limitation and can add /
retrieve new top-level names to the pygr.Data namespace. Once those
top-level names exist in your resource database(s), subsequent imports
of pygr.Data will discover those top-level names and automatically add
them to the module namespace during import. So once saved in the
resource database, they should be available in the pygr.Data namespace
just like pygr.Data.Bio. This hasn't been tested much, though.

Note that this issue of adding top-level names automatically to the
module namespace is the one and only reason why pygr.Data actually
connects to the resource servers DURING import (as opposed to the
first time a user requests a resource name). I could be wrong about
this, but my impression was that you can only add to the module
namespace during import... hmm, I just did a little experiment that
suggests you CAN keep a reference to locals() and add keys to it
later, and they will successfully appear in the module name space.
This seems very pythonic, but I have no idea whether this convenient
behavior is guaranteed by the language definition...
I created a module containing the following code:
foo.py ---------
d = locals()
bob = 'Roberto'
----------------

Then I open a python session:
>>> import foo
>>> foo.bob
'Roberto'
>>> foo.d['joe'] = 'Giuseppe'
>>> foo.joe
'Giuseppe'
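
The same experiment can be reproduced in a single script, which also hints at why it works: at module level, locals() is documented to be the same dictionary as globals(), and that dictionary is the module's own namespace (its __dict__). The sketch below builds the module in memory with types.ModuleType and exec() rather than writing foo.py to disk; that substitution is only for convenience and is an assumption of this example.

```python
import types

# Build a throwaway "foo" module in memory instead of creating foo.py.
foo = types.ModuleType("foo")
exec("d = locals()\nbob = 'Roberto'", foo.__dict__)

# The saved reference is the module's namespace dictionary itself.
assert foo.d is foo.__dict__

# Adding a key later makes a new attribute appear on the module.
foo.d["joe"] = "Giuseppe"
print(foo.bob, foo.joe)

# setattr() is the more conventional spelling of the same operation.
setattr(foo, "ann", "Anna")
print(foo.d["ann"])
```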

Interesting. The obligation to connect to the resource databases
DURING import has always seemed to me one of the more annoying,
unconventional aspects of pygr.Data (for example, I think this leads
to bad interactions with the "nose" testing suite; to run various
pygr.Data tests I'm forced to reload() the pygr.Data module to run
each test... which nose doesn't seem to like). My little experiment
above suggests it may be unnecessary. Instead, to access "non-standard"
top-level names the user would first have to call an initializer function,
e.g. pygr.Data.connect(), which would connect to the resource
databases and automatically add top-level names to the namespace.
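
A rough sketch of what such an initializer might look like follows. Everything here is hypothetical: connect() does not exist in pygr.Data as of this thread, the ResourcePath class below is a simplified stand-in rather than the real one, and the list of resource IDs stands in for whatever the resource databases would report.

```python
import types


class ResourcePath:
    """Simplified stand-in: builds dotted resource names attribute by attribute."""

    def __init__(self, path):
        self._path = path

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return ResourcePath(self._path + "." + name)

    def __repr__(self):
        return "ResourcePath(%r)" % self._path


def connect(module, resource_ids):
    """Hypothetical initializer: publish each top-level name in the module."""
    namespace = vars(module)  # the module's own namespace dictionary
    for resource_id in resource_ids:
        top = resource_id.split(".")[0]
        if top not in namespace:
            namespace[top] = ResourcePath(top)


# Stand-in for the pygr.Data module with its one hard-coded name.
data = types.ModuleType("fake_data")
data.Bio = ResourcePath("Bio")

# Pretend the resource database reported these IDs after import.
connect(data, ["Bio.Seq.Genome.hg18", "TEST.hg18"])
print(data.TEST.hg18)
```

Because nothing happens until connect() is called, importing the module would no longer need to contact any resource server.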

The user always has the option of directly calling getResource() to
bypass the whole module namespace issue.

Another possible approach would be to have a simple top-level name
(e.g. "top", "root", or "all") that acts as a gateway to the root of
the namespace. In this case the user would access "Bio.Foo.Whatever"
as pygr.Data.top.Bio.Foo.Whatever. This may be the simplest answer of
all -- it totally solves the dumb problem of having to add names to
the top-level namespace. Of course, it forces the user to type a few
more characters...
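
As a sketch of the gateway idea, a single object with __getattr__ can hand out any top-level name on demand, so only one name has to exist in the module namespace. The class below is a simplified, hypothetical stand-in for ResourcePath, and "top" is just one of the candidate names mentioned above.

```python
class ResourcePath:
    """Simplified stand-in: an empty path acts as the root of the namespace."""

    def __init__(self, path=""):
        self._path = path

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        if self._path:
            return ResourcePath(self._path + "." + name)
        return ResourcePath(name)

    def __repr__(self):
        return "ResourcePath(%r)" % self._path


# The only name that would have to be defined in the module itself.
top = ResourcePath()

print(top.Bio.Foo.Whatever)
print(top.SomethingElse.Blah)  # non-Bio names need no registration step
```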

What do you think is best?

C. Titus Brown

Jul 27, 2008, 8:03:03 PM
to pygr...@googlegroups.com
-> Another possible approach would be to have a simple top-level name
-> (e.g. "top", "root", or "all") that acts as a gateway to the root of
-> the namespace. In this case the user would access "Bio.Foo.Whatever"
-> as pygr.Data.top.Bio.Foo.Whatever. This may be the simplest answer of
-> all -- it totally solves the dumb problem of having to add names to
-> the top-level namespace. Of course, it forces the user to type a few
-> more characters...
->
-> What do you think is best?

I don't mind having a limited set of public top-level names, but it'd be
nice to know what they were and how to add new ones for your own private
use (i.e. => docs).
