Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
pyodbc Cursor object
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  10 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Tzur Turkenitz  
View profile  
 More options Oct 24 2011, 4:11 am
From: Tzur Turkenitz <tz...@vision.bi>
Date: Mon, 24 Oct 2011 01:11:11 -0700 (PDT)
Local: Mon, Oct 24 2011 4:11 am
Subject: pyodbc Cursor object
Hi,
I have a question regarding the implementation of python database API.
I have read the API documentation, but it wasn't very helpful and I
was hoping for some more information.

My question is regarding the Cursor object in pyodbc.
I know that when I execute "fetchmany" I return the remaining rows in
the result set into memory (can be dangerous when the result set is
large).
but what happens when I execute a cursor, where is the result set is
kept, and If I iterate over the cursor returning one row at a time
what are the performance issues I should consider.

Thanks!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Michael Kleehammer  
View profile  
 More options Oct 25 2011, 1:05 pm
From: Michael Kleehammer <mkleeham...@gmail.com>
Date: Tue, 25 Oct 2011 10:05:47 -0700 (PDT)
Local: Tues, Oct 25 2011 1:05 pm
Subject: Re: pyodbc Cursor object

The memory is maintained by the driver, so you would have to reference your
driver's documentation.  Most seem to read a set of rows in a tightly
compressed format and buffer them in memory or on disk.  As you read rows,
pyodbc formats them as Python objects and the driver discards them.  When
it's buffer runs low, it requests more.

From a memory point-of-view, your best optimizations are always reducing
the number of rows and columns you read.  To reduce RAM, you can read
either 1 row at a time, or you can use fetchmany to read them in batches.
 The performance difference will simply be the number of times the fetch
function is called and some of its preliminary checks before it starts
formatting results.  This is usually negligible, but timing is the only way
to know for sure.  If one row at a time is noticeably slower, simply
increase your fetchmany() batch size.

Note that the  fetchall() function is the one that reads all remaining
rows, not fetchmany(), so make sure you are reading them all.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dharol Tankersley  
View profile  
 More options Oct 27 2011, 11:43 am
From: Dharol Tankersley <dtankers...@schipul.com>
Date: Thu, 27 Oct 2011 08:43:18 -0700 (PDT)
Local: Thurs, Oct 27 2011 11:43 am
Subject: Re: pyodbc Cursor object
I am fairly new to python and relational databases and struggling with
memory errors.
Accessing an MSSQL database using python 2.7.

To reduce RAM i am
1. pulling only two columns in a query
2. limiting the query to 100 rows
3. deleting the cursor query
4. closing the cursor

Somehow the cursor (or something else?) is retaining data, because it
successfully executes the first 100-row query and fails on the second
100-row query.

SCRIPT:
    invfields = ['invoiceid','datename(q,tenderdate)']

    for field in range(1,len(invfields)):
            thisdata = ();
cursor.close(); cursor = pyodbcconn.cursor()
            thisquery = 'SELECT ' + invfields[0] + ', ' +
invfields[field] + ' FROM invoices WHERE invoiceid < 100'
            print thisquery
            cursor.execute(thisquery);
            thisdata = cursor.fetchall()
            deleted = cursor.execute(thisquery); print thisdata

On Oct 25, 12:05 pm, Michael Kleehammer <mkleeham...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Dharol Tankersley  
View profile  
 More options Oct 27 2011, 11:56 am
From: Dharol Tankersley <dtankers...@schipul.com>
Date: Thu, 27 Oct 2011 08:56:00 -0700 (PDT)
Subject: Re: pyodbc Cursor object
hello.
i am fairly new to python and relational databases, struggling with
memory errors using cursor.execute to query MSSQL database. Using
python 2.7

To save ram i am:
1. querying only two columns from the table at a time
2. limiting query to 100 rows
3. deleting cursor query
4. closing cursor

the cursor (or something else?) must still be retaining data though,
because it successfully executes the first 100-row query and fails on
the second 100-row query.

#SCRIPT:
    chunksize = 100; itemEND = 100; rowST = 1; rowMAX = 1000
    invfields = ['invoiceid','datename(q,tenderdate)']

    for field in range(1,len(invfields)):
        while (itemEND <= itemMAX):
            thisdata = ();
            cursor.close();
            cursor = pyodbcconn.cursor()

            thisquery = 'SELECT ' + invfields[0] + ', ' +
invfields[field] + ' FROM invoices \
                WHERE invoiceid >= ' + str(itemST) + ' AND invoiceid <
' + str(itemEND)
            print thisquery
            cursor.execute(thisquery);
            thisdata = cursor.fetchall()
            deleted = cursor.execute(thisquery);
            print itemST, itemEND

            itemST = itemST + chunksize; itemEND = itemEND +
chunksize;

#OUTPUT & ERROR:
    SELECT invoiceid, datename(q,tenderdate) FROM invoices WHERE
invoiceid >= 1 and invoiceid < 100
    1 100
    SELECT invoiceid, datename(q,tenderdate) FROM invoices WHERE
invoiceid >= 101 and invoiceid < 200
    101 200

    Traceback (most recent call last):
    File "/Users/dharol/eclipse/workspace/mongodb/invoicequery.py",
line 48, in <module>
    thisdata = cursor.fetchall()
    MemoryError

Any help, guidance towards alternatives to cursors, or more effective
means of using cursors with mssql, greatly appreciated.
-d


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
ashaner  
View profile  
 More options Oct 27 2011, 1:54 pm
From: ashaner <asha...@chumpland.org>
Date: Thu, 27 Oct 2011 10:54:52 -0700 (PDT)
Local: Thurs, Oct 27 2011 1:54 pm
Subject: Re: pyodbc Cursor object
Your code is kind of confusing because it can't be what you're
actually running, e.g.,, it has syntax errors and you're calling
execute twice in the loop. Can you post the actual code?

You can iterate over the cursor instead of using fetchall, or use
fetchmany. I wouldn't think closing the cursor each iteration saves
much memory.

Did you read the Wiki on cursors?
http://code.google.com/p/pyodbc/wiki/Cursor

On Oct 27, 11:56 am, Dharol Tankersley <dtankers...@schipul.com>
wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
dmoney  
View profile  
 More options Oct 27 2011, 2:54 pm
From: dmoney <dtankers...@schipul.com>
Date: Thu, 27 Oct 2011 11:54:19 -0700 (PDT)
Local: Thurs, Oct 27 2011 2:54 pm
Subject: Re: pyodbc Cursor object
Apologies. Below is the actual code, as well as an additional attempt
with fetchone instead of fetchall. I did read the wiki on cursors,
have tried fetchmany, fetchone, closing and deleting cursors, even a
generator for iterating over cursor results (http://
code.activestate.com/recipes/137270-use-generators-for-fetching-large-
db-record-sets/).

All efforts yeild the same memory problem, leading me to believe that
somehow i am not clearing the cursor as i loop.

Thank you again for your help. -d

#I. ATTEMPT WITH CLOSE/ DELETE CURSOR
    import pyodbc
    from pymongo.collection import Collection
    from pymongo import Connection; mongoconn =
Connection('localhost')
    mongodb = mongoconn.t4xquery;
    url =
'DSN=NTSERVER17;UID=username;PWD=password;PORT=1433;DATABASE=dw_cmh'
    pyodbcconn = pyodbc.connect(url); cursor = pyodbcconn.cursor()
    thiscollection = Collection(mongodb, 'invoices')

    chunksize = 100; rowST = 1; rowEND = 100; rowMAX = 1000
    invfields = ['invoiceid','datename(q,tenderdate)']

    for field in range(1,len(invfields)):
        while (rowEND <= rowMAX):
            thisdata = ();
            cursor.close();
            cursor = pyodbcconn.cursor()
            thisquery = 'SELECT ' + invfields[0] + ', ' +
invfields[field] + ' FROM invoices WHERE invoiceid >= ' + str(rowST) +
' AND invoiceid < ' + str(rowEND)
            print thisquery

            cursor.execute(thisquery);
            thisdata = cursor.fetchall()
            deleted = cursor.execute(thisquery);
            print rowST, rowEND

            rowST = rowST + chunksize; rowEND = rowEND + chunksize;

#II. OUTPUT FROM CLOSE/ DELETE CURSOR ATTEMPT:
    SELECT invoiceid, datename(q,tenderdate) FROM invoices WHERE
invoiceid >= 1 AND invoiceid < 100
    1 100
    SELECT invoiceid, datename(q,tenderdate) FROM invoices WHERE
invoiceid >= 101 AND invoiceid < 200
    Traceback (most recent call last):
    File "/Users/dharol/eclipse/workspace/mongodb/shortexecute.py",
line 26, in <module>
      thisdata = cursor.fetchall()
    MemoryError

#III. ATTEMPT WITH FETCHONE:
    import pyodbc
    from pymongo.collection import Collection
    from pymongo import Connection; mongoconn =
Connection('localhost')
    mongodb = mongoconn.t4xquery;
    url =
'DSN=NTSERVER17;UID=username;PWD=password;PORT=1433;DATABASE=dw_spegcs'
    pyodbcconn = pyodbc.connect(url); cursor = pyodbcconn.cursor()
    thiscollection = Collection(mongodb, 'invoices')

    invfields = ['invoiceid','datename(q,tenderdate)']

    for field in range(1,len(invfields)):
        thisdata = ();
        cursor.close();
        cursor = pyodbcconn.cursor()
        thisquery = 'SELECT ' + invfields[0] + ', ' + invfields[field]
+ ' FROM invoices'
        cursor.execute(thisquery);
        print thisquery

        while (1):
            thisdata = cursor.fetchone()
            if thisdata == None: break
            print thisdata

#IV. OUTPUT FROM FETCHONE ATTEMPT:
    SELECT invoiceid, datename(q,tenderdate) FROM invoices
    (68797, u'3')
    (68798, u'3')
    (68799, u'3')
    Traceback (most recent call last):
    File "/Users/dharol/eclipse/workspace/mongodb/shortexecute.py",
line 53, in <module>
      thisdata = cursor.fetchone()
    MemoryError
    (68800, u'3')
    (68801, u'3')
    (68802, u'3')
    (68803, u'3')
    (68804, u'3')
    (68805, u'3')
    (68806, u'3')
    (68807, u'3')


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
ashaner  
View profile  
 More options Oct 27 2011, 4:09 pm
From: ashaner <asha...@chumpland.org>
Date: Thu, 27 Oct 2011 13:09:37 -0700 (PDT)
Local: Thurs, Oct 27 2011 4:09 pm
Subject: Re: pyodbc Cursor object
Not sure if you specified before, but you're using the latest versions
of pyodbc and your drivers? If you just simplify it down to a cleaner
example does it still give you the error? The following code does the
same thing as your second example:

import pyodbc
url =
'DSN=NTSERVER17;UID=username;PWD=password;PORT=1433;DATABASE=dw_spegcs'
pyodbcconn = pyodbc.connect(url)
cursor = pyodbcconn.cursor()
thisquery = 'SELECT invoiceid, datename(q,tenderdate) FROM invoices'
cursor.execute(thisquery)
print thisquery
while (1):
    thisdata = cursor.fetchone()
    if thisdata is None: break
    print thisdata

If that code gives you an error, then I'll defer to mkleehammer...

On Oct 27, 2:54 pm, dmoney <dtankers...@schipul.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
dmoney  
View profile  
 More options Oct 27 2011, 4:32 pm
From: dmoney <dtankers...@schipul.com>
Date: Thu, 27 Oct 2011 13:32:00 -0700 (PDT)
Subject: Re: pyodbc Cursor object
thank you for your response.

1. I am using pyodbc 2.1.13 (from git+git://github.com/mkleehammer/
pyodbc.git  )
and freeTDS 0.91

2. running your simplified code I still get the memory error:

    --------------------
    SELECT invoiceid, datename(q,tenderdate) FROM invoices
    MemoryError
    (68797, u'3')
    (68798, u'3')
    (68799, u'3')
    (68800, u'3')
    (68801, u'3')
    (68802, u'3')
    (68803, u'3')
    (68804, u'3')
    (68805, u'3')
    (68806, u'3')
    (68807, u'3')
    Traceback (most recent call last):
      File "/Users/dharol/eclipse/workspace/mongodb/shortexecute.py",
line 66, in <module>
        thisdata = cursor.fetchone()
    MemoryError
    --------------------

I know that there are 68,866 invoices, so am confused why it is able
to execute 10 and then have memory error, if the cursor is not
supposed to retain data through the iterations?
-d


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tzur Turkenitz  
View profile  
 More options Oct 31 2011, 7:47 am
From: Tzur Turkenitz <tz...@vision.bi>
Date: Mon, 31 Oct 2011 04:47:42 -0700 (PDT)
Local: Mon, Oct 31 2011 7:47 am
Subject: Re: pyodbc Cursor object
Thank you very much. This was very helpful in understanding

On Oct 25, 7:05 pm, Michael Kleehammer <mkleeham...@gmail.com> wrote:


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
dmoney  
View profile  
 More options Oct 31 2011, 1:51 pm
From: dmoney <dtankers...@schipul.com>
Date: Mon, 31 Oct 2011 10:51:20 -0700 (PDT)
Local: Mon, Oct 31 2011 1:51 pm
Subject: Re: pyodbc Cursor object
Pretty ugly code, but I was able to solve the problem by:
1. chunking data during SELECT query
2. doing date transformation in python instead of SQL query
(e.g., extracting month, year, from python objects that are the output
of the cursor query)

Neither
    closing the cursor          cursor.close()
    deleting the query          deleted=cursor.execute(thisquery)
    limiting cursor fetches     cursor.fetchone()
made any difference to memory use for me. Solution script below.

Thanks again for all of your assistance!
-d

SOLUTION SCRIPT:
chunksize = 1000
import pyodbc

url =
'DSN=myserver;UID=myusername;PWD=mypassword;PORT=1433;DATABASE=mydb'
pyodbcconn = pyodbc.connect(url);
cursor = pyodbcconn.cursor()

#GET COUNT OF INVOICES (query rows)
cursor.execute('SELECT MAX(invoiceid) FROM invoices');
counttup = cursor.fetchall(); rowMAX = counttup[0][0];

# LOOP THROUGH DATA CHUNKS
rowST = 1; rowEND = chunksize;

while (rowEND < rowMAX+chunksize):
    thisquery = 'SELECT tenderdate FROM invoices \
    WHERE invoiceid >= ' + str(rowST) + ' AND invoiceid < ' +
str(rowEND)
    cursor.execute(thisquery); thisdata = cursor.fetchall();

    for tup in thisdata:
        print tup[0]
        thisday, thismonth, thisyear =  tup[0].day, tup[0].month,
tup[0].year
        print thisday
        print thismonth
        print thisyear
    print 'Finished: updated {0} of {1} invoices'.format(str(rowEND),
str(rowMAX))
    rowST = rowST + chunksize; rowEND = rowEND + chunksize


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »