ANNOUNCE: xsdb -- the eXtremely Simple Database goes alpha

Aaron Watters

Dec 10, 2003, 6:13:21 AM

to pytho...@python.org

Links:
Home page with docs and other links:
http://xsdb.sourceforge.net/
SourceForge project with download links:
http://sourceforge.net/projects/xsdb/

Executive Summary:

The xsdb package is an open source database implemented in Python
and hosted on SourceForge.

The xsdb package provides fundamental concurrent database functionality,
with concurrency control and recovery. Fundamental characteristics include:

- Extreme portability and ease of installation and use.
- A simple semantics of objects with associated descriptions
compatible with the relational model, object modelling methods,
and other data organizations such as OLAP.
- Multiple access paths and indices.
- Timestamp based concurrency control for safe concurrent database access.
- Commit/Rollback and recovery support.
- A variety of underlying storage implementations with configurable
features and performance characteristics.
- No intrinsic database size limitations.

The package is intended to provide what you really need from a database
for most applications, without the other stuff (among other goals).

General Technical Notes:
The xsdb package is implemented in Python, and the server mode requires
Stackless Python. An xsdb database (not in server mode) will run using
standard C Python or Java Python (Jython).

Please have a look and give it a try. Thanks very much!

-- Aaron Watters [attempt 2]

===
Even in a perfect world where everyone is equal
I'd still own the movie rights and be working on the sequel
-- Elvis Costello "Every day I write the book"

Paul Rubin

Dec 10, 2003, 7:19:53 AM

Aaron Watters <aa...@reportlab.com> writes:
> The xsdb package is implemented in Python, and the server mode
> requires stackless python.

What is it doing with stackless?

Istvan Albert

Dec 10, 2003, 10:10:44 AM

Aaron Watters wrote:

> General Technical Notes:

> The xsdb package is implemented in Python, and the server mode requires
> stackless python.

Why?

IMHO this sort of limitation severely reduces any
project's potential.

Istvan.

A.M. Kuchling

Dec 10, 2003, 12:05:13 PM

On Wed, 10 Dec 2003 10:10:44 -0500,
Istvan Albert <ZZZ-ial...@mailblocks-ZZZ.com-ZZZ> wrote:
>> The xsdb package is implemented in Python, and the server mode requires
>> stackless python.
> Why?

Probably because Stackless made it easier to write the server without
having to wrestle an async socket library such as Medusa or Twisted.

> IMHO this sort of limitation severely reduces any
> project's potential.

Yeah, but it's his code, so he can do whatever he likes.

Sheesh, Aaron announces a database system that looks really spiffy from the
examples, like ZODB without the pain of ExtensionClass, and the first two
responses are griping about one aspect of it. Sometimes folks don't know
when they're well-off. (Now if only he'd released it before I wrote the
PyCon proposal tracker using PostgreSQL...)

--amk

Istvan Albert

Dec 10, 2003, 1:26:14 PM

A.M. Kuchling wrote:

> Sheesh, Aaron announces a database system that looks really spiffy from the
> examples, like ZODB without the pain of ExtensionClass, and the first two
> responses are griping about one aspect of it

But that is not just one aspect of it; it is the most fundamental
usability question: will it work on my system?

The answer is no, I have to install another version of python
to use it.

> Probably because Stackless made it easier to write the server without
> having to wrestle an async socket library such as Medusa or Twisted.

The question that needs to be answered is whether it would
be worth wrestling with those rather than locking out
the vast majority of the potential users.

I have a lot of respect and admiration for everyone who
undertakes a project of this magnitude and complexity.
On the other hand, I think that tying the project to a
Python implementation that most of us have no compelling reason
to use will severely affect its overall impact. And that would
be a shame.

Istvan.

Skip Montanaro

Dec 10, 2003, 1:58:00 PM

to Istvan Albert, pytho...@python.org

>> Probably because Stackless made it easier to write the server without
>> having to wrestle an async socket library such as Medusa or Twisted.

Istvan> The question that needs to be answered is whether it would be
Istvan> worth wrestling with those rather than locking out the vast
Istvan> majority of the potential users.

I imagine Aaron provides the code (it is hosted on SF, after all). All you
need to do is port it to use Twisted or Medusa, then feed the diffs back to
Aaron. If it results in broader reach for xsdb without making the existing
code a nightmare to maintain, he'll probably fold it in.

Skip

Aaron Watters

Dec 10, 2003, 2:26:11 PM

Istvan Albert <ZZZ-ial...@mailblocks-ZZZ.com-ZZZ> wrote in message news:<Y-ydnYsliZv...@giganews.com>...

Because it's the Right Way (tm) to do it :).

First let me emphasize that only the server layer
uses stackless at present.

I'm using stackless because (aside from the fact that
it was the simplest way to implement the functionality)
database concurrency control requires the following:

If a young transaction tries to read something written by
an old transaction which has not yet committed, it must wait
until the old transaction decides to commit or abort.

In order to allow transactions to wait the options are:

1) Use an event loop and write the application "inside out",
much like a fortran 4 program attempting to emulate recursion.

2) Use operating system threads (which have very high overhead
and sometimes don't really work the same across different
platforms...)

3) Use stackless.

4) punt: automatically abort any transaction which needs
to wait.
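
The waiting rule behind these options can be illustrated with a minimal sketch. Everything here (`Txn`, `Item`, `must_wait`, the integer timestamps) is hypothetical and written for current Python 3; it is not taken from xsdb and only models the single rule stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Txn:
    """Hypothetical transaction: a smaller timestamp means an older transaction."""
    timestamp: int
    committed: bool = False


@dataclass
class Item:
    """Hypothetical stored item that remembers which transaction wrote it."""
    value: object
    writer: Optional[Txn] = None


def must_wait(reader: Txn, item: Item) -> bool:
    """True when a younger reader hits data written by an older, uncommitted writer."""
    w = item.writer
    return w is not None and not w.committed and w.timestamp < reader.timestamp


old = Txn(timestamp=1)
young = Txn(timestamp=2)
item = Item("payload", writer=old)

blocked_before_commit = must_wait(young, item)   # older writer still undecided
old.committed = True
blocked_after_commit = must_wait(young, item)    # writer committed, read may proceed
```

Under option (4) a server would abort `young` whenever `must_wait` is true; under (2) or (3) it would suspend `young` until the older transaction commits or aborts.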

As a first approach I went for (3) because it was easy. I don't
plan to do (1) because I treasure my sanity. I intend to
implement both (2) and (4) as server options before I call
xsdb a "beta", but I want to also keep the stackless version alive.

I'm still wishing that real stackless functionality will make it
into standard Python, but I also don't really understand the
deep implications.
-- Aaron Watters

===
I don't know if you've been loving somebody
I only know it isn't mine. -- Elvis Costello "Alison"

Paul Rubin

Dec 10, 2003, 3:33:59 PM

aa...@reportlab.com (Aaron Watters) writes:
> In order to allow transactions to wait the options are:
>
> 1) Use an event loop and write the application "inside out",
> much like a fortran 4 program attempting to emulate recursion.

> ...

Maybe you could find some clever way to do it with Python generators.

Fredrik Lundh

Dec 11, 2003, 1:29:27 AM

to pytho...@python.org

Aaron Watters wrote:

> ANNOUNCE: xsdb -- the eXtremely Simple Database goes alpha

now, how cool is this. hugunin and watters both reappear after
many years, on nearly the same day, both with stuff that shows
that they didn't really give up on Python hacking; they've just
been working on the perfect design...

> In order to allow transactions to wait the options are:
>
> 1) Use an event loop and write the application "inside out",
> much like a fortran 4 program attempting to emulate recursion.
>
> 2) Use operating system threads (which have very high overhead
> and sometimes don't really work the same across different
> platforms...)
>
> 3) Use stackless.
>
> 4) punt: automatically abort any transaction which needs
> to wait.

5) use an event loop and use a generator for the relevant code;
when you discover that you need to pause, yield to the framework.
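
A rough sketch of option 5, assuming a hypothetical round-robin `run` loop and two toy "transactions" written as generators (current Python 3; none of these names come from xsdb). Each task yields whenever it has to pause, and the loop resumes it later:

```python
from collections import deque


def run(tasks):
    """Tiny event loop: resume each generator in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)              # run the task up to its next yield
        except StopIteration:
            continue                # task finished, drop it
        ready.append(task)          # task paused, schedule it again


log = []
committed = set()


def old_txn():
    log.append("old: writing")
    yield                           # hand control back before committing
    committed.add("old")
    log.append("old: committed")


def young_txn():
    while "old" not in committed:
        log.append("young: waiting")
        yield                       # pause until the framework resumes us
    log.append("young: reading")


run([young_txn(), old_txn()])
```

The catch is that only the generator itself can yield to the loop; a plain function called from inside it cannot pause the task.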

</F>

Paul Rubin

Dec 11, 2003, 2:12:18 AM

"Fredrik Lundh" <fre...@pythonware.com> writes:
> 5) use an event loop and use a generator for the relevant code;
> when you discover that you need to pause, yield to the framework.

This kind of design really could benefit from Raymond Hettinger's PEP
of a while back, proposing being able to pass parameters from the
caller back to the yield statement of a yielded generator.
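
For reference, later Python versions did gain this ability: since Python 2.5 (PEP 342), `yield` is an expression and the caller can pass a value into a paused generator with `send()`. A small sketch in current Python 3, using a hypothetical `accumulator` generator:

```python
def accumulator():
    """Keep a running total of the values the caller sends in."""
    total = 0
    while True:
        received = yield total      # the argument of send() shows up here
        if received is None:
            break
        total += received


gen = accumulator()
next(gen)                           # advance to the first yield
first = gen.send(5)                 # running total after sending 5
second = gen.send(7)                # running total after sending 7
```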

Robin Becker

Dec 11, 2003, 4:15:00 AM

In article <7xwu93g...@ruckus.brouhaha.com>, Paul Rubin <http@?.cx>
writes

I was just about to ask if generators allow for a stream like mechanism,
but obviously if we're not allowed to change the generator state then it
seems quite hard.
--
Robin Becker

Paul Rubin

Dec 11, 2003, 4:22:37 AM

Robin Becker <ro...@jessikat.fsnet.co.uk> writes:
> I was just about to ask if generators allow for a stream like mechanism,
> but obviously if we're not allowed to change the generator state then it
> seems quite hard.

Well, there's always global state.

Aaron Watters

Dec 11, 2003, 8:38:57 AM

regarding the use of stackless in
http://xsdb.sourceforge.net
"Fredrik Lundh" <fre...@pythonware.com> wrote in message news:<mailman.56.10711241...@python.org>...

> Aaron Watters wrote:
> > 5) use an event loop and use a generator for the relevant code;
> when you discover that you need to pause, yield to the framework.

Please tell me I'm missing something, but I don't think
this will really help. The problem is that I need to "yield"
or "suspend" or "send something across a channel" from about
45 places in the code some of which are arbitrarily deep into
multiple recursions. The generator thing will only allow
me to go one level deep into a single call -- no? By contrast
the stackless.channel mechanism is a far more general construct,
allowing me to "yield" at any point without restructuring the
code at all. Stackless rules.
-- Aaron Watters
===
I'm standing in the middle of the desert
waiting for my ship to come in
-- Sheryl Crow "Leaving Las Vegas"

Duncan Booth

Dec 11, 2003, 9:03:35 AM

aa...@reportlab.com (Aaron Watters) wrote in
news:9a6d7d9d.03121...@posting.google.com:

> regarding the use of stackless in
> http://xsdb.sourceforge.net
> "Fredrik Lundh" <fre...@pythonware.com> wrote in message
> news:<mailman.56.10711241...@python.org>...
>> Aaron Watters wrote:
>> > 5) use an event loop and use a generator for the relevant code;
>> when you discover that you need to pause, yield to the framework.
>
> Please tell me I'm missing something, but I don't think
> this will really help. The problem is that I need to "yield"
> or "suspend" or "send something across a channel" from about
> 45 places in the code some of which are arbitrarily deep into
> multiple recursions. The generator thing will only allow
> me to go one level deep into a single call -- no? By contrast
> the stackless.channel mechanism is a far more general construct,
> allowing me to "yield" at any point without restructuring the
> code at all. Stackless rules.

The generator solution may not be appropriate for your task, but it isn't
entirely accurate to say that you can only go one level deep. You can (sort
of) yield from arbitrarily deep function nesting, or even from recursive
functions. The catch though is that you do have to write the code in a
slightly contorted manner in order to yield from below the first function.

The rule to follow is simply: any function which wants to yield, or which
calls a function that wants to yield has to be a generator and has to be
called from a 'for' loop which itself yields.

e.g. A generator that walks a tree recursively:

def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
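
In Python 3.3 and later, the nested `for` loops this rule calls for can be written with `yield from` (PEP 380). A sketch of the same tree walk, assuming a hypothetical `Node` class with `label`, `left` and `right` attributes:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node, used only for this illustration."""
    label: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(t):
    if t:
        yield from inorder(t.left)   # delegate to the sub-generator
        yield t.label
        yield from inorder(t.right)


tree = Node(2, Node(1), Node(3))
labels = list(inorder(tree))
```

The rule itself is unchanged: every function on the path down to the yield still has to be a generator.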

--
Duncan Booth dun...@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?

Michael Hudson

Dec 11, 2003, 9:36:27 AM

Or, if the generator is an instance method, instance state.

Cheers,
mwh

--
Academic politics is the most vicious and bitter form of politics,
because the stakes are so low. -- Wallace Sayre

Jp Calderone

Dec 11, 2003, 9:26:07 AM

to pytho...@python.org

This works, but it is even easier. All you need is top-level code to
handle it:

import types

def unroll(f, *a, **kw):
    # Keep an explicit stack of active generators instead of recursing.
    gstack = [iter(f(*a, **kw))]
    while gstack:
        try:
            e = gstack[-1].next()
        except StopIteration:
            gstack.pop()
        else:
            if isinstance(e, types.GeneratorType):
                gstack.append(e)
            else:
                yield e

def inorder(t):
    if t:
        yield inorder(t.left)
        yield t.label
        yield inorder(t.right)

unroll(inorder, t)

A bit more frameworky code, but it's all isolated in one place, which is
much nicer than having to spread it all over the place.

Jp

Duncan Booth

Dec 11, 2003, 10:50:07 AM

Jp Calderone <exa...@intarweb.us> wrote in
news:mailman.68.10711528...@python.org:

> This works, but it is even easier. All you need is top-level code
> to
> handle it:
>
>
> def unroll(f, *a, **kw):
>     gstack = [iter(f(*a, **kw))]
>     while gstack:
>         try:
>             e = gstack[-1].next()
>         except StopIteration:
>             gstack.pop()
>         else:
>             if isinstance(e, types.GeneratorType):
>                 gstack.append(e)
>             else:
>                 yield e
>
>
> def inorder(t):
>     if t:
>         yield inorder(t.left)
>         yield t.label
>         yield inorder(t.right)
>
> unroll(inorder, t)
>
>
> A bit more frameworky code, but it's all isolated in one place,
> which is
> much nicer than having to spread it all over the place.

Nice idea, provided you never want to yield a generator. Also, should it
check for a generator, or just for any iterator?

You can also go for a recursive definition of unroll and use it to unroll
itself which I think reads a bit more clearly.

import types

def unroll(iterator):
    for v in iterator:
        if isinstance(v, types.GeneratorType):
            for inner in unroll(v): yield inner
        else:
            yield v

for node in unroll(inorder(t)):
    ... do whatever ...

I wonder if this is useful enough to go in itertools?

David Mertz, Ph.D.

Dec 11, 2003, 2:23:34 PM

to Pythonistas

|Robin Becker <ro...@jessikat.fsnet.co.uk> writes:
|> I was just about to ask if generators allow for a stream like mechanism,
|> but obviously if we're not allowed to change the generator state then it
|> seems quite hard.

Paul Rubin <http://phr...@NOSPAM.invalid> wrote previously:

|Well, there's always global state.

But the state need not be global, just a mutable object yielded by a
generator. As I thought about this fact, I have come to find Raymond
Hettinger's proposals for enhancing simple generators less urgent (but I
probably still vote +1, though now moot).

>>> def echo():
...     message = [None]
...     while message[0] != "EXIT":
...         yield message
...
>>> for mess in echo():
...     if mess[0] is not None: print mess[0]
...     mess[0] = raw_input("Word: ")
...
Word: foo
foo
Word: bar
bar
Word: EXIT

This is a toy example, but the point is that we are perfectly able to
pass data back into a generator without using global state.

--
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <me...@gnosis.cx> ]--------------------------

Aaron Watters

Dec 11, 2003, 7:10:05 PM

Duncan Booth <dun...@NOSPAMrcp.co.uk> wrote in message news:<Xns944E8E75C59...@127.0.0.1>...

> aa...@reportlab.com (Aaron Watters) wrote in
> news:9a6d7d9d.03121...@posting.google.com:
>
> > regarding the use of stackless in
> > http://xsdb.sourceforge.net
> > "Fredrik Lundh" <fre...@pythonware.com> wrote in message
> > news:<mailman.56.10711241...@python.org>...
> >> Aaron Watters wrote:
> >> > 5) use an event loop and use a generator for the relevant code;
> >> when you discover that you need to pause, yield to the framework.
> >
> > Please tell me I'm missing something, but I don't think

> > this will really help. ...

>
> The generator solution may not be appropriate for your task, but it isn't

> entirely accurate to say that you can only go one level deep....

>
> The rule to follow is simply: any function which wants to yield, or which
> calls a function that wants to yield has to be a generator and has to be

> called from a 'for' loop which itself yields....

Yes. I see this would work. But this would then have to be
pervasive throughout my code -- and even in client code that
uses the xsdb code directly (but not from a remote client)....
yuck. No thanks :(. The acceptable options still are threads,
stackless, or punt.

-- Aaron Watters
===
How do zen masters walk through walls?
Doors.

Michele Simionato

Dec 12, 2003, 1:44:15 AM

me...@gnosis.cx (David Mertz, Ph.D.) wrote in message news:<mailman.85.10711713...@python.org>...

> But the state need not be global, just a mutable object yielded by a
> generator. As I thought about this fact, I have come to find Raymond
> Hettinger's proposals for enhancing simple generators less urgent (but I
> probably still vote +1, though now moot).
>
> >>> def echo():
> ...     message = [None]
> ...     while message[0] != "EXIT":
> ...         yield message
> ...
> >>> for mess in echo():
> ...     if mess[0] is not None: print mess[0]
> ...     mess[0] = raw_input("Word: ")
> ...
> Word: foo
> foo
> Word: bar
> bar
> Word: EXIT
>
> This is a toy example, but the point is that we are perfectly able to
> pass data back into a generator without using global state.

A more verbose but arguably more elegant way would be to wrap the
generator in a class. Let me repost some code I wrote some time ago.

"""An object-oriented interface to iterators-generators"""

class Iterator(object):
    """__gen__ is automatically called by __init__, so must have signature
    compatible with __init__. Subclasses should not need to override __init__:
    you can do it, but you must do it cooperatively or, at least, ensure that
    __gen__ is called correctly and its value assigned to self.iterator.
    """
    def __init__(self,*args,**kw):
        super(Iterator,self).__init__(*args,**kw)
        self.iterator=self.__gen__(*args,**kw)
    def __gen__(self,*args,**kw):
        "Trivial generator, to be overridden in subclasses"
        yield None
    def __iter__(self):
        return self
    def next(self):
        return self.iterator.next()

class MyIterator(Iterator):
    def __gen__(self):
        self.x=1
        yield self.x # will be changed outside the class
        yield self.x

iterator=MyIterator()

print iterator.next()
iterator.x=5
print iterator.next()

Wrapping the generator in the class, I can pass parameters to it (in
this case x). IOW, here the generator has an explicit "self" rather
than an implicit "__self__" as in the PEP. I am not sure if I like the
PEP; wouldn't it be easier to have a built-in iterator class?

Michele

David Mertz, Ph.D.

Dec 12, 2003, 2:52:32 AM

to Pythonistas

michele....@poste.it (Michele Simionato) wrote previously:

|A more verbose but arguably more elegant way would be to wrap the
|generator in a class. Let me repost some code I wrote some time ago.

I don't disagree, of course, with Michele's class-based approach. For
something fleshed out, his style lets you do a lot more with the
underlying class.

But there *is* something awfully elegant about the function-like
definition syntax of simple generators. Doing what I do in the simple
example--yielding a mutable object, and manipulating that object outside
the generator--feels very Pythonic to me. I use a list for this, but a
different mutable object would work similarly (a dictionary, instance,
shelve, etc.)

But obviously, keep both styles in mind; either might be a good
solution to a problem. It just depends on your needs and coding style.

Yours, David...

--
mertz@ | The specter of free information is haunting the `Net! All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
.cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------

Paul McGuire

Dec 16, 2003, 6:32:54 PM

"Skip Montanaro" <sk...@pobox.com> wrote in message
news:mailman.43.10710826...@python.org...

Well, actually, retro-installing Stackless isn't all that difficult. I had
already upgraded to Python 2.3, but wanted to experiment with xsdb. I
installed Python2.2 in its own tree, downloaded Stackless, and copied the
compiled .libs into the Python 2.2 tree. Then went to the xsdb directory,
and wrote a short setup.bat file to tweak my PATH to look at Python2.2 ahead
of Python2.3, and then run the xsdb startup script. Now I can fire up an
xsdb server using Python2.2+Stackless, and then hit at it from my mainline
Python 2.3 client environment.

Since this Stackless requirement would only be needed on server machines,
this really shouldn't impair any serious deployment. xsdb has some very
interesting features that, if they work out, will be well worth this minor
inconvenience.

-- Paul

Christian Tismer

Dec 24, 2003, 8:53:42 AM

to A.M. Kuchling, pytho...@python.org

A.M. Kuchling wrote:
...

> Sheesh, Aaron announces a database system that looks really spiffy from the
> examples, like ZODB without the pain of ExtensionClass, and the first two
> responses are griping about one aspect of it. Sometimes folks don't know
> when they're well-off. (Now if only he'd released it before I wrote the
> PyCon proposal tracker using PostgreSQL...)

Yes, too bad. By that, Stackless would sit in the PyCon
proposal tracker. (and I want to have it everywhere)

happy Xmas -- chris

--
Christian Tismer :^) <mailto:tis...@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/
14109 Berlin : PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34 home +49 30 802 86 56 mobile +49 173 24 18 776
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Paul Rubin

Dec 24, 2003, 9:10:09 AM

Christian Tismer <tis...@tismer.com> writes:
> > Sheesh, Aaron announces a database system that looks really spiffy from the
> > examples, like ZODB without the pain of ExtensionClass, and the first two
> > responses are griping about one aspect of it. Sometimes folks don't know
> > when they're well-off. (Now if only he'd released it before I wrote the
> > PyCon proposal tracker using PostgreSQL...)
>
> Yes, too bad. By that, Stackless would sit in the PyCon proposal
> tracker. (and I want to have it everywhere)

I think I started the griping; the issue is having to install a special
Python implementation built around a feature with just one maintainer.
On technical grounds Stackless is great and I hope PyPy can implement
something like it.

Wolfgang Lipp

Dec 29, 2003, 10:37:41 AM

hi,

thanks for some great ideas and implementations. i'm in the course of
testing out whether xsdb could be useful for my present project,
and i like it! one minor thing i've found is that when i create
a database, populate it, commit, and then re-open it, i get a nasty
xsdb.xsdbExceptions.RollBackError: write attempt after younger read
exception. this turned out to be avoidable by placing time.sleep(1)
between the commit and the re-open. i guess the culprit might be
on line 34 of xsdb\Connection.py: self.startTime = int(time()).
maybe this should be changed to sth like int(time()*TIME_GRANULARITY),
with the constant being a number in the hundreds or thousands?
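
A small sketch of why whole-second timestamps collide and how a finer granularity separates them. The two wall-clock values and the `coarse_stamp`/`fine_stamp` helpers are made up for illustration, and `TIME_GRANULARITY = 1000` is just one possible choice; this is not the xsdb code itself.

```python
TIME_GRANULARITY = 1000  # hypothetical: ticks per second (milliseconds)


def coarse_stamp(t):
    """Whole-second timestamp, in the style of int(time())."""
    return int(t)


def fine_stamp(t):
    """Timestamp scaled up before truncation, as suggested above."""
    return int(t * TIME_GRANULARITY)


# Two made-up wall-clock readings that fall within the same second.
commit_time = 1072712261.20
reopen_time = 1072712261.85

same_second = coarse_stamp(commit_time) == coarse_stamp(reopen_time)
ordered = fine_stamp(commit_time) < fine_stamp(reopen_time)
```

Two events closer together than one tick would still collide, so a finer granularity narrows the window rather than closing it.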

_wolf

Aaron Watters <aa...@reportlab.com> wrote in message news:<mailman.25.10710548...@python.org>...

Aaron Watters

Jan 5, 2004, 1:26:20 PM

li...@epost.de (Wolfgang Lipp) wrote in message news:<c0f3952d.03122...@posting.google.com>...
> hi,
>
> ....when i create

> a database, populate it, commit, and then re-open it, i get a nasty

> xsdb.xsdbExceptions.RollBackError....

Thank you very much for finding this problem!

Double apologies: Sorry that I made this bug and sorry that I didn't
respond to your message earlier. I decided to take a different approach
than the one you suggested, one that fixes the problem systematically
(and required a lot of changes and testing).

Basically I hadn't considered
the possibility that a program might open two direct connections
to a database within the same minute (because my machine is so slow :( ).
For cases where you do make very fast connections, I added the ability
to clear out all the timestamp information using a special keyword
argument to the connection initializer. Here is the test case (fastconnect.py):

===
"test connecting and then reconnecting to a database quickly"

def test(filename="data/fastconnect.dat"):
    import time
    now = time.time()
    from xsdb.KISSBaseHierarchy import xsdb
    print "testing fast reconnection"
    connection = xsdb(filename, startup=1)
    t = connection.transaction()
    t.add("object1", "this is the object")
    t.commit()
    t = connection.transaction()
    t.add("object2", "this is the other object")
    t.commit()
    connection = None
    # must clear the connection for fast reconnect!
    connection = xsdb(filename, startup=0, clear=1)
    print "fast reconnect completed in ", time.time()-now

if __name__=="__main__":
    test()
===

If you omit the "clear=1" above you get the exception you reported.

I just uploaded a new file release which includes this and a number
of other bugfixes, enhancements, doc changes, etcetera. Get it at:

http://xsdb.sourceforge.net

-- Aaron Watters

===
You can't talk to a man
When he don't wanna understand -- Carole King

Aaron Watters

Jan 6, 2004, 9:31:54 AM

aa...@reportlab.com (Aaron Watters) wrote in message news:<9a6d7d9d.04010...@posting.google.com>...

> Basically I hadn't considered
> the possibility that a program might open two direct connections

> to a database within the same minute...

Whoops, I meant "second" not "minute". This may not happen very
often because only one direct connection is permissible at any
given moment (but many indirect connections are okay).
-- Aaron Watters

===
TO INFINITY AND BEYOND