on the future of Schevo - feedback solicited

Matthew Scott

Sep 14, 2009, 7:42:31 PM

to Schevo List

Hello all,

First off, I'd like to say two things:

1) Schevo, while dormant, is not being abandoned.

2) Thank you to everyone who currently uses Schevo, or has used Schevo in the past.

What follows are some honest thoughts about Schevo that I've been collecting over the last few months.

I'd appreciate any and all feedback.

Here's an "executive summary":

- Schevo works well for single-process, single-thread GUI apps...

- ...but it sucks for web apps and distributed apps.

- Schevo still has a great data modeling language, and I miss using it.

- For wider adoption, it needs to support concurrency much better!

- Schevo 4 may mean a rewrite, much like Schevo 3 was.

- The data modeling language and schema evolution simplicity stays, perhaps w/ minor tweaks.

- Thankfully, several hundred tests cover the bases.

- Does anyone have an interest in helping? :)

- Otherwise, what is a good alternative when Schevo 3 doesn't meet your needs?

**Where does Schevo work?**

Schevo can work wherever there is a need for a robust structured data store for a single-process, single-threaded Python app.

Historically, Schevo was directly developed to handle the needs of engineering-oriented GUI apps, where data is entered and manipulated by one user at a time on a desktop or laptop.

In this manner, Schevo was used as a "structured file format", and not as a "database server". The app would load a database file as if it were a document on disk. Opening the same file in another copy of the app, or with a Python shell, would give a file locking error.
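
A minimal sketch of that "one opener at a time" behavior, assuming a lock file placed next to the database file. This is not Schevo's or Durus's actual locking code; `ExclusiveFile`, `DatabaseLockedError`, and the `.lock` suffix are hypothetical names used only for illustration.

```python
import os
import tempfile


class DatabaseLockedError(Exception):
    """Raised when the database file is already open somewhere else."""


class ExclusiveFile:
    """Claim `path` for one opener by creating a sibling lock file (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.lock_path = path + ".lock"
        self._fd = None

    def open(self):
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            self._fd = os.open(
                self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY
            )
        except FileExistsError:
            raise DatabaseLockedError(f"{self.path} is already open elsewhere")
        return self

    def close(self):
        if self._fd is not None:
            os.close(self._fd)
            os.remove(self.lock_path)
            self._fd = None


def demo():
    path = os.path.join(tempfile.mkdtemp(), "circuits.db")
    first = ExclusiveFile(path).open()
    try:
        # A second opener stands in for another copy of the app.
        ExclusiveFile(path).open()
        second_opened = True
    except DatabaseLockedError:
        second_opened = False
    first.close()
    # Once the first opener closes, the file can be claimed again.
    ExclusiveFile(path).open().close()
    return second_opened
```

Here `demo()` returns `False`: the second attempt is refused while the first opener holds the file, which is the document-like behavior described above.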

Looked at from another perspective, Schevo has a great data modeling language: it works well for rapidly prototyping an app, it's easy to write tests against, and it's robust enough to support complex scenarios such as the analysis of large electrical circuits.

**Where is Schevo lacking?**

Why am I not using Schevo now?

It's simple: the web wins, and distributed computing wins.

The projects I am involved with, personally and professionally, need more concurrency than Schevo offers. Their data models, however, typically suit a relational or relational-like style, and usually fit comfortably into the SQL world.

For me, as a principal author of Schevo, this is terribly unfortunate. :) I like to stay as pragmatic as possible, but I believe that many problem domains could be handily and comfortably tackled with Schevo, if only it had a better concurrency story!

**How could Schevo be more relevant?**

To be fair, the design of Schevo limits it to the "ACID" compliant side of things, so we must keep that in mind when designing its future.

I've considered two ways of approaching the concurrency problem:

- Use Durus client/server model

  - Durus server is in a separate process or separate thread

  - Multiple Schevo processes/threads access Durus via sockets

  - Would likely need some rewriting of the database core to be thread-safe

- Reposition Schevo as a layer atop SQLAlchemy

  - Specifically SQLite, so we can keep relying on "batteries included"

  - Get benefits of SQLAlchemy for sanely handling transactions

  - Schevo schema would more or less map to SQL

    - other tools could inspect data stored by Schevo

    - transaction execution & restrictions would still be enforced with Python

  - Some indexing limitations on complex fields that would be stored as blobs or JSON objects
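
To make the second approach concrete, here is a rough sketch of an entity mapped to a SQL table, with a restriction enforced in Python and a complex field stored as JSON text. It uses the standard library's `sqlite3` module rather than SQLAlchemy so that it stays self-contained, and the `person` table, `create_person`, and `get_prefs` are made-up names, not part of Schevo.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS person ("
            " id INTEGER PRIMARY KEY,"
            " name TEXT NOT NULL UNIQUE,"
            " prefs TEXT NOT NULL)"  # complex field, stored as JSON text
        )
    return conn


def create_person(conn, name, prefs):
    """A 'create' transaction: validate in Python, then write atomically."""
    # Restriction enforced in Python, before any SQL runs.
    if not name.strip():
        raise ValueError("name must not be empty")
    # The connection context manager commits on success
    # and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            "INSERT INTO person (name, prefs) VALUES (?, ?)",
            (name, json.dumps(prefs)),
        )
    return cur.lastrowid


def get_prefs(conn, name):
    """Read the complex field back and decode it."""
    row = conn.execute(
        "SELECT prefs FROM person WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

Any tool that can read SQLite can inspect the `person` table, but the contents of `prefs` are opaque text to an ordinary index, which is the indexing limitation mentioned in the last bullet.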

I really think that if Schevo could stop saying "no" to multi-thread or multi-process apps, and to some extent say "yes" to easier inspection of its data by other tools, it could have a larger audience and more relevancy.

**Does Schevo need to be more relevant?**

Is there even a need to further develop Schevo? Should it get a proper 3.1 release, with 3.1.x releases for bug fixes and slowly-but-surely improved documentation?

There seem to be several packages out in the SQLAlchemy world that could be pieced together to achieve the same thing, but I haven't found anything that offers all of these tools in one tidy set:

- data modeling w/ minimal boilerplate

- unit testing w/ minimal boilerplate

- schema migration management

- developer command line tools

So I don't know if it's best to look at how to bring the design aesthetics of Schevo to those other projects, or if it'd be better to codify those aesthetics in a "Schevo v4.0".

**Call for help**

I think Schevo is pretty nifty, and if anyone is willing to help me move it forward, I'd love to increase its activity.

FYI: I'm leaning toward making Schevo 4 a rewrite of Schevo as a layer atop SQLAlchemy, as noted above. I've even created a repository for it, just no code as of yet: http://github.com/11craft/schevoalchemy/

Finally, for what it's worth, here is a snapshot of my current TODO list for Schevo:

- all schevo projects to buildout

- integrate coverage 3 with buildbots

- schevo under cpython 3.1

- schevo under pypy testing

- Update schevo to use `__dir__` as per Python 2.6

- for tutorial use allow new entity classes to be attached to a schema without quoting with a screen

- schevo project status matrix

- schevo project docs to sphinx

- schevo unit testing docs

- release schevo 3.1

- Screencasts

  - Installation

    - Windows

    - Ubuntu

    - OSX

  - History

    - History of Schevo

    - Why I Like Schevo

- Compare other apps' ORM models to Schevo models

- Clean up schevo.org home page

- Research reentrant data storage solutions

- Create a decent web navigator!

- Create a modern PyQt 4 take on navigator and Schevo-aware widgets.

Thanks for your time!

--
Matthew R. Scott

Etienne Robillard

Sep 14, 2009, 9:07:44 PM

to sch...@googlegroups.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthew,

I've decided to use Schevo to get away from SQLAlchemy. I still like
the fact that Schevo is a very Pythonic database, thanks to Durus. :)

Could we not try making Schevo 3-4 thread-safe ? Moreover, what benefits
would switching to SQLAlchemy gives besides multi-threading support ?

If Python supports multi-thread applications out-of-the-box, why would
only multi-thread support in SQLAlchemy works out ?

I think of Schevo has a remarquable piece of technology. To my opinion,
it has nothing to be ashamed of in terms of missing features compared to
SQLAlchemy. Besides, SQLAlchemy has different goals...

For example, I'm planning to use Schevo to create a set of tools for
schema migration. Using Schevo allows me migrate the schema while
recording changes in an internal db file. To use SQLAlchemy for this
task would be horrible and counter-productive. I don't need to serve
multiple concurrent requests, since the application is not web-based.

Cheers,

Etienne

- --
Etienne Robillard <robillar...@gmail.com>
Green Tea Hackers Club <http://gthc.org/>
Blog: <http://gthc.org/blog/>
PGP Fingerprint: 178A BF04 23F0 2BF5 535D 4A57 FD53 FD31 98DC 4E57
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkqu6OAACgkQ/VP9MZjcTldqPACfdcPPWDriFzxWHINHZXUdn9rU
nWEAn2akYq2ZrJqAj5mBXCiWWflYPchQ
=rB+v
-----END PGP SIGNATURE-----

Matthew Scott

Sep 14, 2009, 10:53:09 PM

to sch...@googlegroups.com

Etienne,

Thank you for your prompt response. I knew there were current Schevo users out there, but it must work so well for its current niche that we don't need to discuss much at the moment. :)

On Mon, Sep 14, 2009 at 18:07, Etienne Robillard <robillar...@gmail.com> wrote:

> I've decided to use Schevo to get away from SQLAlchemy. I still like
> the fact that Schevo is a very Pythonic database, thanks to Durus. :)

Good to hear.

> Could we not try making Schevo 3-4 thread-safe ? Moreover, what benefits
> would switching to SQLAlchemy gives besides multi-threading support ?
>
> If Python supports multi-thread applications out-of-the-box, why would
> only multi-thread support in SQLAlchemy works out ?

A few ideas popped into my head about how to make Schevo 3 thread-safe, or at least thread-capable, after I read your email. I'll go into those after I've written some notes about them.

Pros of switching to SQLAlchemy:

- data could be read (and, carefully, written) by apps not using Schevo or even Python

- SQLite would offer a developer/small-scale, single-machine, multiple-process solution

- also supporting PostgreSQL would offer the ability to scale across machines

Cons of switching to SQLAlchemy:

- more difficulty, and probably some concessions, with supporting arbitrarily-complex field types (especially when they are used as unique keys or in an index)

- significant effort would be needed to translate Schevo schema and operations to relational algebra, which is quite different from translating them to the Durus object store. SQLAlchemy would ease this, but not make it magical. :)

- not file-compatible with Durus, of course.
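
A small sketch of the first pro and the first con together, again using the standard library's `sqlite3` instead of SQLAlchemy to stay self-contained. It assumes a complex value is stored as canonical JSON text so that a `UNIQUE` constraint can apply to it, and it uses a second connection to stand in for another process or a non-Schevo tool. The `part` table and `canonical` helper are hypothetical.

```python
import json
import os
import sqlite3
import tempfile


def canonical(value):
    """Serialize so that equal values always produce identical text."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def demo():
    path = os.path.join(tempfile.mkdtemp(), "shared.sqlite")

    writer = sqlite3.connect(path)
    with writer:
        writer.execute("CREATE TABLE part (spec TEXT NOT NULL UNIQUE)")
    with writer:
        writer.execute(
            "INSERT INTO part (spec) VALUES (?)",
            (canonical({"ohms": 470, "watts": 0.25}),),
        )

    # Same value with a different key order: the canonical text matches,
    # so the UNIQUE constraint rejects it.
    try:
        with writer:
            writer.execute(
                "INSERT INTO part (spec) VALUES (?)",
                (canonical({"watts": 0.25, "ohms": 470}),),
            )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    # A second connection stands in for another process or another tool.
    reader = sqlite3.connect(path)
    rows = [json.loads(r[0]) for r in reader.execute("SELECT spec FROM part")]
    reader.close()
    writer.close()
    return duplicate_rejected, rows
```

The concession is visible here: uniqueness only works because every writer agrees on one serialization, and range queries or indexes on individual keys inside `spec` would need extra columns or database-specific JSON support.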

> I think of Schevo has a remarquable piece of technology. To my opinion,
> it has nothing to be ashamed of in terms of missing features compared to
> SQLAlchemy. Besides, SQLAlchemy has different goals...

My personal goal with Schevo is: create prototype apps quickly, then refine them directly into production apps.

This has worked with single-user apps. Schevo 2 was built in tandem with the first app it powered. From very few lines of code, we deployed a production-ready product from vague requirement specs after about 4 different prototypes and 6 months of calendar time. :-) Schevo 3 was a rewrite of the initial effort to handle larger databases.

But this agility doesn't yet work to build web apps that can reliably be deployed and handle multiple users. :(

> For example, I'm planning to use Schevo to create a set of tools for
> schema migration. Using Schevo allows me migrate the schema while
> recording changes in an internal db file. To use SQLAlchemy for this
> task would be horrible and counter-productive. I don't need to serve
> multiple concurrent requests, since the application is not web-based.

Again, good to hear about what others are using Schevo for.

Perhaps we could shoot for this:

- Schevo 3.1 to support *some* sort of reasonable thread and process concurrency using Durus. This would keep impact to current users minimal, and also open up Schevo for use with the low-to-medium traffic web apps I'd like to create.

- Schevo4 to be, for now, an experimental rewrite of Schevo as a layer atop SQLAlchemy and SQLite/PostgreSQL. Many Schevo databases could be migrated to Schevo4 and retain their same data and behavior but allow for the SQLite --> PostgreSQL concurrency scalability path.
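
For the 3.1 goal, the coarsest way to become "thread-capable" is to let only one transaction run at a time. The sketch below shows just that idea with a plain lock around an in-memory dictionary. It is an assumption about one possible direction, not Schevo or Durus code, and `SerializedStore` is a hypothetical name.

```python
import threading


class SerializedStore:
    """In-memory stand-in for a database that runs one transaction at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def execute(self, transaction):
        # Holding the lock for the whole transaction means each one
        # sees, and leaves behind, a consistent state.
        with self._lock:
            return transaction(self._data)


def increment(data):
    """A read-modify-write transaction that would race without the lock."""
    data["count"] = data.get("count", 0) + 1
    return data["count"]


def demo(threads=8, per_thread=500):
    store = SerializedStore()

    def worker():
        for _ in range(per_thread):
            store.execute(increment)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return store.execute(lambda data: data["count"])
```

With the defaults, `demo()` returns 4000, since no increment is lost. A single lock trades throughput for simplicity, which may be acceptable for low-to-medium traffic but does nothing for multiple processes.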

As noted above, I'll be posting again shortly with some thoughts on making our multi-process story better.