[This is a repost from a message to python-dev on the users' list, because python-dev-ers thought a broader discussion of this is a good idea.]
Last December, we had a short thread discussing (in python-dev) the integration of PySQLite into Python 2.4. At the time, I was against inclusion, because I thought PySQLite was not ripe for it, mostly because I thought the API was not stable.
Now, I have started writing a new PySQLite module, which has the following key features:
- Uses iterator-style SQLite 3.x API: sqlite3_compile, sqlite3_step() etc. This way, it is possible to use prepared statements, and for large resultsets, it requires less memory, because the whole resultset isn't fetched into memory at once any longer.
- Completely incompatible with the SQLite 0.x/1.x API: I'm free to create a much better API now.
- "In the face of ambiguity, refuse the temptation to guess." - PySQLite 1.x tries to "guess" which Python type to convert to. It's pretty good at it, because it queries the column type information. This works for, I'd say 90 % of all cases at least. But as soon as you use anything fancy like functions, aggregates or expressions in SQL, the _typeless_ nature of SQLite breaks through and it will tell us nothing about the declared column type (of course, because the data is not coming from a database column).
So I decided to change the default behaviour and make PySQLite typeless by default, too. Everything will be returned as a Unicode string (the default might be user-configurable per connection).
Unless, unless of course the user explicitly activates the "guess-mode" ;-) But to do so, she must read the docs then she will be aware of the fact that it only works in 90 % of all cases.
So why am I bothering you about this?
I think that a simple embedded relational database would be a good thing to have in Python by default. And as Python 2.5 won't happen anytime soon, there's plenty of time for developing it, getting it stable, and integrating it.
Especially those of you that have used PySQLite in the past, do you have any suggestions that would make the rewrite a better candidate for inclusion into Python?
One problem I see is that even the new PySQLite will grow and try to wrap much of the SQLite API that are not directly related to the DB-API. If such a thing is too complicated/big for the standard library, then maybe it would be better to produce a much simpler PySQLite, especially for the Python standard library that leaves all the fancy stuff out. My codename would be "embsql".
So, what would you like to see? "import sqlite", "import embsql", or "pypi.install('pysqlite')" ?
On Thu, 21 Oct 2004 10:13:56 +0200, Gerhard Haering <g...@ghaering.de> wrote:
(snip)
> I think that a simple embedded relational database would be a good > thing to have in Python by default. And as Python 2.5 won't happen > anytime soon, there's plenty of time for developing it, getting it > stable, and integrating it.
I think it's a great idea.
> One problem I see is that even the new PySQLite will grow and try to > wrap much of the SQLite API that are not directly related to the > DB-API. If such a thing is too complicated/big for the standard > library, then maybe it would be better to produce a much simpler > PySQLite, especially for the Python standard library that leaves all > the fancy stuff out. My codename would be "embsql".
I think it's important that we realise that if Python ships with a default database engine, its API will rapidly become the de-facto standard, eclipsing the DB-API if it is different in any way.
Which is not to say that the current DB-API should be used. There have been discussions elsewhere about improving and simplifying the DB-API - providing iterators, getting rid of all but one of the parameter styles, that kind of thing. Perhaps the best thing would be to agree a DB-API version 3.0 over on the DB SIG, then make sure that the built in module supports that.
> So, what would you like to see? "import sqlite", "import embsql", or > "pypi.install('pysqlite')" ?
I'd like to see a package into which you could plug different SQL database engines, with SQLLite as the default. So, you might do:
>>>>> "Gerhard" == Gerhard Haering <g...@ghaering.de> writes: > Last December, we had a short thread discussing (in python-dev) the > integration of PySQLite into Python 2.4. At the time, I was against > inclusion, because I thought PySQLite was not ripe for it, mostly > because I thought the API was not stable.
Thank you for PySQLite and congratulations for reaching the 1.0 milestone. I have only experimented with PySQLite so far but it's definitely in my toolbox for any future projects :-).
> I think that a simple embedded relational database would be a good > thing to have in Python by default. And as Python 2.5 won't happen > anytime soon, there's plenty of time for developing it, getting it > stable, and integrating it.
+1. bsddb is in there and SQLite definitely deserves to be included. A free license, no setup, easy to use - what more can you ask for :-).
> Especially those of you that have used PySQLite in the past, do you > have any suggestions that would make the rewrite a better candidate > for inclusion into Python?
I have not used it extensively to be able to get meaningful suggestions. My personal peeve is not with the PySQLite API but with DB-API 2.0. As some one (Alex Martelli?) mentioned in another thread, it's high time Python standardized on one (may be two) mandatory param styles.
> One problem I see is that even the new PySQLite will grow and try to > wrap much of the SQLite API that are not directly related to the > DB-API. If such a thing is too complicated/big for the standard > library, then maybe it would be better to produce a much simpler > PySQLite, especially for the Python standard library that leaves all > the fancy stuff out. My codename would be "embsql". > So, what would you like to see? "import sqlite", "import embsql", or > "pypi.install('pysqlite')" ?
Gerhard Haering wrote: > Especially those of you that have used PySQLite in the past, do you > have any suggestions that would make the rewrite a better candidate > for inclusion into Python?
great idea!
I'm currently in the planning phase of moving an application from a relational database to SQLite via PySQL. I think embedded, SQL capable databases are "the next big thing" there is just to much pain being felt by having to manage a big database when you don't really need them ....
> One problem I see is that even the new PySQLite will grow and try to > wrap much of the SQLite API that are not directly related to the > DB-API. If such a thing is too complicated/big for the standard > library, then maybe it would be better to produce a much simpler > PySQLite, especially for the Python standard library that leaves all > the fancy stuff out. My codename would be "embsql".
> So, what would you like to see? "import sqlite", "import embsql", or
I think import sqlite would be the sensible thing. Just as you said ... ... refuse the temptation to guess ...
Great idea and about time. Python needs some type of default SQL engine. I have not reviewed what you are doing but may I suggest you take a look at what VFP did with it's built-in data engine. VFP uses the concept of a cursor along with many built-in commands/functions/statements to work with the data.
My only question is what happens when SQLLite becomes out of favor i.e. TK is today?
John Fabiani <jfabi...@yolo.com> wrote: > Great idea and about time. Python needs some type of default SQL engine. I > have not reviewed what you are doing but may I suggest you take a look at > what VFP did with it's built-in data engine. VFP uses the concept of a > cursor along with many built-in commands/functions/statements to work with > the data.
> My only question is what happens when SQLLite becomes out of favor i.e. TK > is today?
Same as what happens when BSDDB "becomes out of favor", or SAX, etc, etc. Python either keeps supporting it (likely), moves it to a separate download rather than bundling it (sensible), or drops it (unlikely but not impossible) at some major release. We're not talking about stuff that's particularly "fickle": Tkinter did its job excellently well for MANY years, after all, to take your same example.
Gerhard Haering <g...@ghaering.de> wrote about adding a PySQLite module (and perhaps the DLL for windows) to the Python standard library.
Six months ago, I would have been against the inclusion of such a module. After having worked with SQL database for the last 5 months or so, I am now in favor of adding such a module.
While there are arguments against it, I believe that as long as we follow the DB API spec(s), and recommend that people actually read the documentation (with regards to table-level locking with ATTACH, translating to Python types, etc.), we can't really go too wrong.
> So, what would you like to see? "import sqlite", "import embsql", or > "pypi.install('pysqlite')" ?
import sqlite
Where all the standard DB API stuff gets regular sqlite.names, and any extra SQLite functionality gets sqlite._names .
On Thu, 21 Oct 2004 18:14:51 +0200, alea...@yahoo.com (Alex Martelli) wrote: >John Fabiani <jfabi...@yolo.com> wrote:
>> Great idea and about time. Python needs some type of default SQL engine. I >> have not reviewed what you are doing but may I suggest you take a look at >> what VFP did with it's built-in data engine. VFP uses the concept of a >> cursor along with many built-in commands/functions/statements to work with >> the data.
>> My only question is what happens when SQLLite becomes out of favor i.e. TK >> is today?
>Same as what happens when BSDDB "becomes out of favor", or SAX, etc, >etc. Python either keeps supporting it (likely), moves it to a separate >download rather than bundling it (sensible), or drops it (unlikely but >not impossible) at some major release. We're not talking about stuff >that's particularly "fickle": Tkinter did its job excellently well for >MANY years, after all, to take your same example.
ISTM there could be a useful compromise between inclusion and separate downloads. I.e., useful addons could be included in the core distribution as proxy stubs in an add-on package, so that you could interactively type e.g.,
from addons import sqlite as sql
and if you had not previously done that, the stub would ask you if you'd like to download and install the thing in question. You could cancel, or proceed, perhaps with various options. E.g. PIL might let you exclude jpg support if you didn't want it, and so forth. IIRC&UIAM the BSD ports system has this kind of functionality -- i.e., instead of full packages, there's automated download (or off CD) install info pre-packaged.
Options just to download and view release notes or to download docs only for looking over before installing all might be useful sometimes also.
Like wise, there could be an option to install under addons or site-packages. And I'm sure people would think of other handy stuff.
There would have to be security checks with md5's or pgp signatures etc., but IWT the aproach could be made to work. Official addons could live be anywhere, even with multiple mirrors, and different components used by higher level packages could have different urls, but the location urls, sizes, and hashes would be registered at python.org, so a stub would go there to find addon info for a particular version of python. This would also guarantee compatibility (or a notice that no compatible set of required parts is available).
Simon Brunning <simon.brunn...@gmail.com> wrote: > On Thu, 21 Oct 2004 10:13:56 +0200, Gerhard Haering <g...@ghaering.de> wrote:
> (snip)
> > I think that a simple embedded relational database would be a good > > thing to have in Python by default. And as Python 2.5 won't happen > > anytime soon, there's plenty of time for developing it, getting it > > stable, and integrating it.
> I think it's a great idea.
Seconded.
> > One problem I see is that even the new PySQLite will grow and try to > > wrap much of the SQLite API that are not directly related to the > > DB-API. If such a thing is too complicated/big for the standard > > library, then maybe it would be better to produce a much simpler > > PySQLite, especially for the Python standard library that leaves all > > the fancy stuff out. My codename would be "embsql".
> I think it's important that we realise that if Python ships with a > default database engine, its API will rapidly become the de-facto > standard, eclipsing the DB-API if it is different in any way.
Good point.
> Which is not to say that the current DB-API should be used. There have > been discussions elsewhere about improving and simplifying the DB-API > - providing iterators, getting rid of all but one of the parameter > styles, that kind of thing. Perhaps the best thing would be to agree a > DB-API version 3.0 over on the DB SIG, then make sure that the built > in module supports that.
Gerhard Haering wrote: > [This is a repost from a message to python-dev on the users' list, > because python-dev-ers thought a broader discussion of this is a good > idea.]
> Last December, we had a short thread discussing (in python-dev) the > integration of PySQLite into Python 2.4. At the time, I was against > inclusion, because I thought PySQLite was not ripe for it, mostly > because I thought the API was not stable.
> Now, I have started writing a new PySQLite module, which has the > following key features:
> - Uses iterator-style SQLite 3.x API: sqlite3_compile, sqlite3_step() > etc. This way, it is possible to use prepared statements, and for > large resultsets, it requires less memory, because the whole > resultset isn't fetched into memory at once any longer.
> - Completely incompatible with the SQLite 0.x/1.x API: I'm free to > create a much better API now.
> - "In the face of ambiguity, refuse the temptation to guess." - > PySQLite 1.x tries to "guess" which Python type to convert to. It's > pretty good at it, because it queries the column type information. > This works for, I'd say 90 % of all cases at least. But as soon as > you use anything fancy like functions, aggregates or expressions in > SQL, the _typeless_ nature of SQLite breaks through and it will tell > us nothing about the declared column type (of course, because the > data is not coming from a database column).
> So I decided to change the default behaviour and make PySQLite > typeless by default, too. Everything will be returned as a Unicode > string (the default might be user-configurable per connection).
> Unless, unless of course the user explicitly activates the > "guess-mode" ;-) But to do so, she must read the docs then she will > be aware of the fact that it only works in 90 % of all cases.
> So why am I bothering you about this?
> I think that a simple embedded relational database would be a good > thing to have in Python by default. And as Python 2.5 won't happen > anytime soon, there's plenty of time for developing it, getting it > stable, and integrating it.
> Especially those of you that have used PySQLite in the past, do you > have any suggestions that would make the rewrite a better candidate > for inclusion into Python?
> One problem I see is that even the new PySQLite will grow and try to > wrap much of the SQLite API that are not directly related to the > DB-API. If such a thing is too complicated/big for the standard > library, then maybe it would be better to produce a much simpler > PySQLite, especially for the Python standard library that leaves all > the fancy stuff out. My codename would be "embsql".
> So, what would you like to see? "import sqlite", "import embsql", or > "pypi.install('pysqlite')" ?
> -- Gerhard
"import sqlite"
Do we really have to wait unitil Python 2.5 to get it? ;-)
On Fri, 22 Oct 2004 00:33:34 +0200, Alex Martelli <alea...@yahoo.com> wrote: > Young and idealistic, I assume. Care to name two more SQL engines > accepting exactly the same dialect as sqlite...?-)
Err, well, idealistic, perhaps. ;-)
In practice, as you say, you often have to tweak your SQL for different engines. But you really can minimise this by sticking to the basics. Joe Celko's "I Will Never Have To Port This Code: Debunking shortcuts and SQL myths"[1] is worth a look, but I can speak from experience here. I'm working on a system where we mirror parts of an iSeries DB2 database down to SQL Server. I can and do write even fairly complex queries that work on both.
On Fri, Oct 22, 2004 at 12:33:34AM +0200, Alex Martelli wrote: > Simon Brunning <simon.brunn...@gmail.com> wrote: > > Which is not to say that the current DB-API should be used. There have > > been discussions elsewhere about improving and simplifying the DB-API > > - providing iterators, getting rid of all but one of the parameter > > styles, that kind of thing. Perhaps the best thing would be to agree a > > DB-API version 3.0 over on the DB SIG, then make sure that the built > > in module supports that.
> Amen, hallelujah.
I'm all for it, but it's depressing that talks about it have been on the DB-SIG mailing list since years. The question is, if the DB-API 3.0 should be a usable interface for application programming itself, or if it should still only be a driver API where higher-level modules need to be built on top to make it actually useful and enjoyable. Currently, people either do that, swear and use stuff like the crappy index-based access for columns, or use nonstandard extensions to the DB-API like dictfetch() methods or PgResultSet in PySQLite/pyPgSQL.
If we really want this solved, we can try further discussion on the DB-SIG list, or maybe we could get a DB-API 3.0 sprint organized? I don't know if an IRC meeting would get us further faster, but it's worth a try.
If all else fails, and consensus cannot be reached, then IMO it's time to fork and see if the courageous or the (extremely) conservative approach is accepted better by the community.
> > [...] But you if you were using another engine, you'd only need to > > change this to:
Simon Brunning <simon.brunn...@gmail.com> wrote: > On Fri, 22 Oct 2004 00:33:34 +0200, Alex Martelli <alea...@yahoo.com> wrote: > > Young and idealistic, I assume. Care to name two more SQL engines > > accepting exactly the same dialect as sqlite...?-)
> Err, well, idealistic, perhaps. ;-)
> In practice, as you say, you often have to tweak your SQL for > different engines. But you really can minimise this by sticking to the > basics. Joe Celko's "I Will Never Have To Port This Code: Debunking > shortcuts and SQL myths"[1] is worth a look, but I can speak from
Tx, I love Celko's writing (and speaking) and I'll be sure to read it.
> experience here. I'm working on a system where we mirror parts of an > iSeries DB2 database down to SQL Server. I can and do write even > fairly complex queries that work on both.
> Eventually.
My hat's off to you. In my experience the only decent solution is always, in the end, to write a "RDBMS engine portability layer" with "higher level" operations mapping down to subtly different (or sometimes not so subtly) SQL code. But maybe this just means I'm not as good as I thought at writing good, portable, efficient SQL...
Gerhard Haering <g...@ghaering.de> writes: > So, what would you like to see? "import sqlite", "import embsql", or > "pypi.install('pysqlite')" ?
apt-get install python-sqlite ;-)
I would like to see sqlite in python, the inclusion of sqlite in php was greatly apreciated. I often use it for prototyping.
But for new users who will learn with sqlite, they will be disapointed when they will want to use an other database, because of different paramstyle, different datetime object... So i think before it could be better to improve dbapi for user application side, and make pysqlite a reference of implementation to follow (one paramstyle, return python datetime, fetchdict...). If not it shall not be include imho.
We have the same problem with the web frameworks and server. I would like to see a wsgi equivalent for database. To can plug any framework on any database.
But when we look at the db-sig ml, it seems that there is a lot of work to achieve this... btw guido was pushing to return python datetime.