[pygr] bsddb unavailable


Naomi Ellen

Dec 21, 2010, 3:21:32 AM
to pygr...@googlegroups.com
Hi,

I recently installed pygr-0.8.2-py2.6-macosx-10.6-universal.egg.  

When I was trying to construct sequence objects from a fasta file, I ran into the following error:

>>> from pygr import seqdb
>>> sp = seqdb.SequenceFileDB('/Users/naomi/projects/data/MM2235/MM2235.fa')
WARNING dbfile.open_index: Falling back to hash index: unable to import bsddb
DEBUG seqdb._create_seqLenDict: Building sequence length index...
WARNING dbfile.open_index: Falling back to hash index: unable to import bsddb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/seqdb.py", line 433, in __init__
    SequenceDB.__init__(self, filepath=filepath, dbname=dbname, **kwargs)
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/seqdb.py", line 232, in __init__
    self._set_seqtype()
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/seqdb.py", line 244, in _set_seqtype
    for seqID in self: # get an iterator
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/seqdb.py", line 304, in __iter__
    return iter(self.seqInfoDict)
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/seqdb.py", line 517, in __iter__
    return iter(self.seqDB.seqLenDict)
  File "/Library/Python/2.6/site-packages/pygr-0.8.2-py2.6-macosx-10.6-universal.egg/pygr/dbfile.py", line 150, in __iter__
    Original error message was: %s''' % str(exc_value))
TypeError: cannot iterate over this dictionary.
This means that you do not have bsddb, bsddb3, or gdbm available for use by
the 'shelve' module in this Python install.  Please fix this!

Original error message was: 'dbm.dbm' object is not iterable

Could you suggest how to proceed?

Thanks,
Naomi

Namshin Kim

Dec 21, 2010, 4:42:45 AM
to pygr...@googlegroups.com
Hi Naomi,

You can find some earlier threads about this problem in the Google Groups archive.

Namshin Kim



--
You received this message because you are subscribed to the Google Groups "pygr-dev" group.
To post to this group, send email to pygr...@googlegroups.com.
To unsubscribe from this group, send email to pygr-dev+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pygr-dev?hl=en.

Naomi Ellen

Dec 22, 2010, 2:26:38 AM
to pygr...@googlegroups.com
Hi Namshin,

Can you help me with this same bug? I found a thread about it on the pygr-dev forum.

It suggested that the pygr team was going to work on a fix that uses sqlite right after version 0.8. Has that happened yet?
None of the alternatives seems to work for me: gdbm is apparently not supported on the Mac, and bsddb requires Berkeley DB, which you can only download if you have an account with Oracle.
Also, the dbm patches from the Python community do not seem to apply to Python 2.6, only to 2.7 and 3.2.

Thanks for your help,
Naomi

gawbul

Dec 22, 2010, 4:25:34 AM
to pygr-dev
Hi Naomi,

What Python version are you using? Run python -V from the command line
(note the capital V)!

Cheers,

Steve


Namshin Kim

Dec 22, 2010, 6:36:23 PM
to pygr...@googlegroups.com

Christopher Lee

Dec 23, 2010, 12:34:40 AM
to pygr...@googlegroups.com
Hi Naomi,
yes, the Apple-supplied python 2.6 unfortunately lacks working bsddb, bsddb3 and gdbm. It has dbm but unfortunately dbm does not even support the iterator protocol!

The simplest fix is to install PyBSDDB, which you can get here:
http://www.jcea.es/programacion/pybsddb.htm

Simply installing that should fix the problem: i.e. you should be able to import this in your Python interpreter via "import bsddb3", and the problem you described in your email message should go away. Pygr 0.8.2 automatically looks for bsddb3 and uses it if bsddb is missing / not working, so you do not need to rename bsddb3 as Namshin used to do...

-- Chris Lee
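
A quick way to see which of the modules named above a given interpreter can actually import is sketched below. This is only an illustration: the helper name importable and the candidate list are assumptions made for the example and are not part of pygr. The sketch is written for Python 3, where gdbm is called dbm.gnu.

```python
import importlib

# Backends mentioned in this thread; "dbm.gnu" is the Python 3 name for gdbm.
CANDIDATES = ["bsddb", "bsddb3", "gdbm", "dbm.gnu"]


def importable(names):
    """Return the subset of module names that can be imported here."""
    found = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Module is missing or its C extension could not be loaded.
            continue
        found.append(name)
    return found
```

Calling importable(CANDIDATES) in the same interpreter that runs pygr lists the backends that are really usable there, which can differ between an Apple-supplied Python and one installed from MacPorts.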

Naomi Ellen

Dec 23, 2010, 3:50:25 AM
to pygr...@googlegroups.com
Hi,

Unfortunately, your suggestion does not address my problem.

% easy_install bsddb3
Searching for bsddb3
Best match: bsddb3 5.1.1
Processing bsddb3-5.1.1.zip
Running bsddb3-5.1.1/setup.py -q bdist_egg --dist-dir /var/folders/+6/+6JHnURVE3G3kz+p4dnQfE+++TM/-Tmp-/easy_install-oBTvxP/bsddb3-5.1.1/egg-dist-tmp-lX8NCy
Can't find a local Berkeley DB installation.
(suggestion: try the --berkeley-db=/path/to/bsddb option)
error: Setup script exited with 1

My Mac does not have the Berkeley DB installed, and there doesn't seem to be any Berkeley DB build for the Mac.
Besides, you need an account with Oracle to download the source code.

Is there really no way to run pygr with gdbm on the Mac? I installed Python 2.7 and gdbm from Macports, and still,
no success:

>>> import gdbm
>>> from pygr import seqdb
>>> sp = seqdb.SequenceFileDB('/Users/naomi/projects/data/MM0240/MM0240.fa')
WARNING dbfile.open_index: Falling back to hash index: unable to import bsddb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/seqdb.py", line 433, in __init__
    SequenceDB.__init__(self, filepath=filepath, dbname=dbname, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/seqdb.py", line 232, in __init__
    self._set_seqtype()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/seqdb.py", line 244, in _set_seqtype
    for seqID in self: # get an iterator
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/seqdb.py", line 304, in __iter__
    return iter(self.seqInfoDict)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/seqdb.py", line 517, in __iter__
    return iter(self.seqDB.seqLenDict)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pygr-0.8.2-py2.7-macosx-10.6-x86_64.egg/pygr/dbfile.py", line 150, in __iter__
    Original error message was: %s''' % str(exc_value))
TypeError: cannot iterate over this dictionary.
This means that you do not have bsddb, bsddb3, or gdbm available for use by
the 'shelve' module in this Python install.  Please fix this!

Original error message was: 'dbm.dbm' object is not iterable
>>> definitions = gdbm.open('test_gdbm.dbm', 'c')

Apparently gdbm works, but pygr doesn't.

Thanks,

Naomi

Christopher Lee

Dec 23, 2010, 1:25:52 PM
to pygr...@googlegroups.com

On Dec 23, 2010, at 12:50 AM, Naomi Ellen wrote:

> My Mac does not have the Berkeley DB installed, and there doesn't seem to be any Berkeley DB build for the Mac.
> Besides, you need an account with Oracle to download the source code.

Since you raised this, I just tried downloading and building Berkeley DB on Mac OS 10.6 following these instructions, which seemed to work fine:
http://code.google.com/p/tonatiuh/wiki/InstallingBerkeleyDBForMac
(he provides a link for downloading Berkeley DB). The only annoyance I hit was that it's not obvious what rules Oracle enforces on your proposed password; it kept rejecting everything I tried. Finally I succeeded. The rule seems to be: at least one number, at least one capital letter, at least one lowercase letter, no non-alphanumeric characters, and at least 8 characters long. Also, your username has to be an email address, although they don't seem to do anything to verify it.

The build ran fine. I didn't try to actually install it, as it's my policy never to modify my Apple-supplied system files (i.e. I always install software to other locations, a la Fink).

Pygr relies on the Python standard module anydbm to return whatever dbm style hash module your install supports, including gdbm. (anydbm tries them in the order: dbhash, gdbm, dbm, dumbdbm). The Pygr error message above means that both iter(d) and d.firstkey() failed, i.e. the hash-dictionary object supports neither standard Python iteration nor gdbm's firstkey()/nextkey() iteration methods. The implication is that either

- anydbm is choosing something other than gdbm, even though you say you have gdbm.

- or somehow gdbm is misbehaving.

To figure out what's going on, try the following tests:

import anydbm
print anydbm._defaultmod
d = anydbm.open('test_anydbm.dbm', 'c')
it = iter(d)
it.next()

anydbm._defaultmod shows which hash module anydbm actually chose.
If the hash supports iteration, it.next() should simply raise StopIteration (because the hash is empty).

A gdbm hash will fail on iter(d) because it lacks standard iteration methods. In that case try the following:

k = d.firstkey()
print k

If gdbm is working normally, the value of k should be None (because the hash is empty).

Let me know what results you get.

-- Chris
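
The two checks described in this message (standard iteration first, then gdbm's firstkey()/nextkey()) can be bundled into one helper, roughly as sketched here. The name list_keys is hypothetical, and the final keys() fallback is an addition of this sketch, not something the thread says pygr does. It is written for Python 3, where anydbm has become the dbm package.

```python
def list_keys(db):
    """Return all keys of an open dbm-style object.

    Tries standard iteration, then gdbm-style firstkey()/nextkey(),
    then a plain keys() call (the last step is this sketch's own addition).
    """
    try:
        return list(iter(db))
    except TypeError:
        pass  # object does not support the iterator protocol

    if hasattr(db, "firstkey") and hasattr(db, "nextkey"):
        keys = []
        key = db.firstkey()  # None when the database is empty
        while key is not None:
            keys.append(key)
            key = db.nextkey(key)
        return keys

    if hasattr(db, "keys"):
        return list(db.keys())

    raise TypeError("cannot iterate over %r" % type(db).__name__)
```

If this helper raises TypeError for an index file, the object supports none of the three protocols, which corresponds to the situation pygr's error message reports.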

Naomi Ellen

Dec 24, 2010, 3:20:36 AM
to pygr...@googlegroups.com
Ok, I found out what the problem was with gdbm.

Since I had run pygr with Apple's built-in Python before, I already had an existing database,
/Users/naomi/projects/data/MM0240/MM0240.fa.seqlen.db,
which was in dbm format.
So, even though I am using Macports' python 2.7 with gdbm now, pygr picked up the existing dbm database and complained that it could not iterate over it.
Removing the database fixed the problem, of course.
But I wonder, is there something like a "make clean" command for pygr, which gets rid of all databases created, or at least those created for a certain .fa file?

-Naomi
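
This behaviour is consistent with the standard library: when a file already exists, anydbm.open() asks the whichdb module for its format and reopens it with that backend rather than the current default. A minimal sketch of checking an existing index before reusing it follows. It uses the Python 3 spelling dbm.whichdb (whichdb.whichdb in Python 2), and the function name index_format is hypothetical. Pass the base name of the index; some backends append a suffix such as .db on their own.

```python
import dbm


def index_format(path):
    """Describe the on-disk format of an existing dbm-style index file."""
    kind = dbm.whichdb(path)
    if kind is None:
        # whichdb returns None when the file is missing or unreadable.
        return "missing or unreadable"
    if kind == "":
        # An empty string means the file exists but its format is unknown.
        return "unrecognized format"
    return kind  # name of the module that would be used to open it
```

Comparing this result with the backend the current interpreter would choose for a new file shows whether a stale index from another Python install is about to be picked up.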



Christopher Lee

Dec 24, 2010, 12:25:45 PM
to pygr...@googlegroups.com
No, there is no such command, but we could add that as an option to the SequenceFileDB constructor.

-- Chris
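
Until such an option exists, the cleanup can be done by hand with a small helper along these lines. This is a sketch under assumptions: the ".seqlen" suffix is taken from the file name reported in this thread, the function names are hypothetical, and pygr may create other files next to the FASTA file that this helper does not know about. Written for Python 3.

```python
import glob
import os


def find_index_files(fasta_path, suffix=".seqlen"):
    """List index files that sit next to a FASTA file (assumed naming)."""
    # glob.escape keeps special characters in the path from being
    # interpreted as wildcards; only the trailing "*" is a pattern.
    pattern = glob.escape(fasta_path + suffix) + "*"
    return sorted(glob.glob(pattern))


def remove_index_files(fasta_path, dry_run=True):
    """Delete the index files for one FASTA file; return what was matched."""
    victims = find_index_files(fasta_path)
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

With dry_run left at True the helper only reports what it would delete, which is a sensible first step before removing anything; the FASTA file itself is never matched by the pattern.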

C. Titus Brown

Dec 24, 2010, 7:09:45 PM
to pygr...@googlegroups.com
Naomi,

you might also consider using screed, which stores sequences in a
sqlite database directly. See:

https://github.com/ctb/screed/blob/master/screed/tests/test_pygr_api.py

for an example of how to use it with pygr. Briefly,

from screed.pygr_api import ScreedSequenceDB

should give you a drop-in replacement for SequenceFileDB, although you'll
need to index the database separately; see

https://github.com/ctb/screed/blob/master/doc/example.txt

for information on how to do that.

cheers,
--titus
