Server-side cursors in psycopg2

75 views
Skip to first unread message

Jeremy Dunck

unread,
Mar 13, 2007, 12:15:59 AM3/13/07
to django-d...@googlegroups.com
Over on the psycopg2 list, I saw a mention that psycopg2 supports
optional server-side cursors, which is good for client-side memory
usage but requires more network round-trips to fetch the whole
resultset. (Apparently, by default, it fetches all results into the
client process on the first fetch.)

The short version is that when you do this:

cur = conn.cursor(name=spam)
cur.execute('select "eggs"')
This is sent to the server:
'DECLARE spam CURSOR WITHOUT HOLD FOR select "eggs";'

Is there any interest in supporting this in QuerySet when using
postgres? If so, I'll add it (and benchmarking) to my todo list.

Cheers,
Jeremy

Malcolm Tredinnick

unread,
Mar 14, 2007, 3:51:39 AM3/14/07
to django-d...@googlegroups.com
Hey Jeremy,

I'd be interested to see some performance numbers for this (with a
remote server, since network traffic times will be a non-zero portion of
the result in some cases). It probably shouldn't be too painful to hook
it up to use server-side cursors always and just time a few things.

I hadn't realised that existing cursors pulled everything over at once
-- I would have "guessed" (based on no information other than "that's
the way I would think of doing it") that they would pull a chunk of
results each time to find a balance between efficiency and memory usage.

I guess there's a bunch of questions to ask about how this performs,
such as whether destroying the cursor object correctly notifies the
server that the result can now be freed (otherwise you've transferred
any memory problems you might have to the server). It sounds interesting
for some high-volume cases, though.

Cheers,
Malcolm

Paul Boddie

unread,
Mar 14, 2007, 10:28:35 AM3/14/07
to Django developers
On 14 Mar, 08:51, Malcolm Tredinnick <malc...@pointy-stick.com> wrote:
>
> I hadn't realised that existing cursors pulled everything over at once
> -- I would have "guessed" (based on no information other than "that's
> the way I would think of doing it") that they would pull a chunk of
> results each time to find a balance between efficiency and memory usage.

In fact, pyPgSQL typically uses server-side cursors automatically, as
far as I recall. In contrast, psycopg2 requires an explicit cursor
name in order to make use of them, as described earlier. Since this
breaks the illusion of DB-API compatibility, I've made a rough patch
to psycopg2 which can be found here:

http://www.initd.org/tracker/psycopg/ticket/158

> I guess there's a bunch of questions to ask about how this performs,
> such as whether destroying the cursor object correctly notifies the
> server that the result can now be freed (otherwise you've transferred
> any memory problems you might have to the server). It sounds interesting
> for some high-volume cases, though.

Transferring the entire result set from the server is highly
undesirable for various parts of an application I am working on. It
was only when I saw that my client process was using 1.5GB that I
investigated and discovered such differences between the cursor
policies of the different database modules, but that information is
critical if you plan to work with more than a few thousand rows at a
time. My interpretation of the DB-API was that fetchone and fetchmany
did precisely what they suggest, not a fetchall behind the scenes, but
I'm clearly not in agreement with everyone on that topic.

Paul

Reply all
Reply to author
Forward
0 new messages