Fwd: django Views bug /PERFORMACE ISSUE

Jassin MEKNASSI

unread,

Mar 10, 2010, 12:40:14 PM3/10/10

to ibm...@googlegroups.com

Begin forwarded message:

From: Jassin MEKNASSI <jassin....@gmail.com>
Date: March 8, 2010 11:13:56 PM EST
To: Rahul Priyadarshi2 <rahul.pr...@in.ibm.com>
Cc: Ambrish Bhargava1 <abha...@in.ibm.com>, Mario Ds Briggs <mario....@in.ibm.com>, Abhigyan Agrawal1 <abhigyan...@in.ibm.com>
Subject: Re: django Views bug /PERFORMACE ISSUE

Hi,

I upgraded from 9,5 to 9.7 but there is still no improvement.

thnks for any suggestion

Regards,

Jassin

On Mar 4, 2010, at 2:17 PM, Jassin MEKNASSI wrote:

hi Rahul,

I copied the file to Library/Python/2.6/site-packages/ibm_db-1.0-py2.6-macosx-10.6-universal.egg/
delete all the pyc files
Unfortunately this still takes forever: 2037.1454041 seconds

For all the version of ibm_db_dbi.py i get:

/Library/Python/2.6/site-packages/ibm_db-1.0-py2.6-macosx-10.6-universal.egg/ibm_db_dbi.py:25: DeprecationWarning: the sets module is deprecated
from sets import ImmutableSet

On Mar 4, 2010, at 1:25 AM, Rahul Priyadarshi2 wrote:

Hi Jassin:
pleased find the attached ibm_db_dbi.py wrapper below. Replace existing ibm_db_dbi.py with this one (at location '../site-packages/ibm_db.py2.x-xxx.egg/').

Try same testing to this wrapper and let me know how much time now it takes.

Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069

Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •

Jassin MEKNASSI <jmek...@me.com>
02/03/2010 20:54

To
Rahul Priyadarshi2/India/IBM@IBMIN

cc

Subject
Re: django Views bug /PERFORMACE ISSUE

takes forever 1863.21640682 seconds :(

thnks

On Mar 2, 2010, at 1:04 AM, Rahul Priyadarshi2 wrote:

You are not using ibm_db_dbi wrapper in correct way.
Please try it as below and let me know how much time it is taking.

$ import ibm_db_dbi
$ import time
$
$ t1 = time.time()
$ conn = ibm_db_dbi.connect('node', 'user', 'pwd')
$ cur = conn.cursor()
$ sql="select sheet_id,name from nativefiles.sheetnames"
$ cur.execute(sql)
$ row = cur.fetchone()
$ wile row:
$ print row[1]
$ row = cur.fetchone()
$ elapsed = time.time() - t1
$ print elapsed

Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069

Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •

Jassin MEKNASSI <jassin....@googlemail.com>
01/03/2010 22:47

To
Rahul Priyadarshi2/India/IBM@IBMIN

cc
Jassin MEKNASSI <jassin....@googlemail.com>

Subject
Re: django Views bug /PERFORMACE ISSUE

Using ibm_db_dbi is as fast as using ibm_db
since i have never user dbi before i will paste the code i used to compare all the different modules (ibm_db,ibm_db_dbi,sqlalchemy and django):

TABLE has 15951 rows : DDL is
CREATE TABLE NATIVEFILES.SHEETNAMES (
SHEET_ID INTEGER NOT NULL,
NAME VARCHAR(50) NOT NULL,
PRIMARY KEY (SHEET_ID)
);

IBM_DB_DBI

import ibm_db_dbi
import time

t1 = time.time()
ibm_db_conn = ibm_db_dbi.connect('node', 'user', 'pwd')

sql="select sheet_id,name from nativefiles.sheetnames"
stmt = ibm_db_dbi.ibm_db.exec_immediate(ibm_db_conn, sql)
result = ibm_db_dbi.ibm_db.fetch_both(stmt)
while( result ):
result['SHEET_ID']
result['NAME']
result = ibm_db_dbi.ibm_db.fetch_both(stmt)
elapsed = time.time() - t1
print elapsed

ibm_db_dbi :3.89495897293 seconds

IBM_DB

import ibm_db
import time
ibm_db_conn = ibm_db.connect('node', 'user', 'pwd')

t1 = time.time()

sql="select sheet_id,name from nativefiles.sheetnames"
stmt = ibm_db.exec_immediate(ibm_db_conn, sql)
result = ibm_db.fetch_both(stmt)
while( result ):
result['SHEET_ID']
result['NAME']
result = ibm_db.fetch_both(stmt)

elapsed = time.time() - t1
print elapsed

ibm_db: 2.36251711845 seconds

SQL_ALCHEMY:

import ibm_db_sa.ibm_db_sa
from sqlalchemy import *
from sqlalchemy.orm import scoped_session,sessionmaker
from sqlalchemy import Table
from sqlalchemy import select
from sqlalchemy import MetaData
from sqlalchemy.orm import mapper

metadata = MetaData()

class Sheet(object):
pass

engine=create_engine('connstring')
Session = scoped_session(sessionmaker(bind=engine))
sheetnames_table=Table('SHEETNAMES', metadata,schema='NATIVEFILES',autoload=True,autoload_with=engine)

mapper(Sheet,sheetnames_table)
for s in Session.query(Sheet).all():
s.sheet_id
s.name

takes forever

DJANGO:

models .py

class SheetNames(models.Model):
sheetid = models.IntegerField(primary_key=True,db_column = 'SHEET_ID')
name=models.CharField(max_length=50,db_column='NAME')
class Meta:
db_table = '"NATIVEFILES"."SHEETNAMES"
managed=False

views.py

s=SheetNames.objects.all()
for a in s:
a.name

takes forever

Thnks for yr help

On Feb 28, 2010, at 4:37 AM, Rahul Priyadarshi2 wrote:

Hi ,
Please try the same query with ibm_db_dbi. Through this we would be able identify the main problem. Because this problem may aeries due to django or our ibm_db_dbi wrapper.

Try this and let me know how much time it is taking with same query.

Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069

Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •

Jassin MEKNASSI <jassin....@googlemail.com>
25/02/2010 21:21

To
Rahul Priyadarshi2/India/IBM@IBMIN

cc

Subject
Re: django Views bug

Thanks it works.

I noticed that sqlalchemy and django drivers are really slow compared to the ibm_db driver which seems to use the underlying client.
I m using db2 9.5 express-c.
populating values from django on a simple table (PK, char(10)) with 10k rows takes more than 10 min . with ibm_db this is almost instantaneous.
Using native sql and using execute and fetchone with django does not improve performance.

Any idea? this would be very helpful since i m now stuck with only ibm_db. Move to 9.7?

Regards

On Feb 23, 2010, at 6:15 AM, Rahul Priyadarshi2 wrote:

Hi Jassin,
You got this error not because of any django adaptor bug. This is coming due to your model definition.

In your Sheetcount model you are not specifying any attribute as primary key. So that Django automatically adds an “ID” primary key attribute. And in you db2 view there is no field of name “ID”.

If you specify any one of attribute as primary key then you will not get this error.

Please let me know this will help you or not.

By next time can you please post your issues in our forum (http://groups.google.com/group/ibm_db). Because your issues will may help our other users also. If they will get same problem.

Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069

Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •

Jassin MEKNASSI <jassin....@googlemail.com>
23/02/2010 03:26

To
Rahul Priyadarshi2/India/IBM@IBMIN

cc

Subject
django Views bug

hi im using Django-1.1.1 and and ibm_db_django-0.1.4-py2.5.egg and get SQL0206N when i use a db2 view
from django.db import models

class Sheetcount(models.Model):
name = models.CharField(max_length=50,db_column='SHEETNAME')
total = models.IntegerField(db_column='TOTAL')
samplefile = models.CharField(max_length=50,db_column='SAMPLEFILE')
class Meta:
db_table = '"TEST"."TESTTABLE"'
managed=False

[IBM][CLI Driver][DB2/LINUXX8664] SQL0206N "TEST.TESTTABLE.ID" is not valid in the context where it is used. SQLSTATE=42703 SQLCODE=-206

thnks

<ibm_db_dbi.py>

Mario Briggs

unread,

Mar 11, 2010, 12:49:26 AM3/11/10

to ibm_db

Adding the response from the email thread

>>
Hi Jassin,

We have a grasp of the issues (they are not related to server
version), which we suspected and are right. I am sorry that the patch
we sent was wrong (didnt cover what we wanted to cover.)

Here is the gist of the problem.
- the dbi driver is looking to implement the dbi spec which requires
that the # of rows affected by the query be immeditaely available
after cursor.execute()
To support this requirment, the dbi driver switches the cursor
type to static as well as request the server to prefetch the rowcount.
This is certainly a overhead for the dbi that can account for more
than half the extra time the dbi takes.
- there are couple of other small things that we have identified for
improving in the dbi wrapper

Rahul has a use-case local with the performance measurements for these
issues. He can fill you in with the details and plans on this and
send you a temp dbi wrapper to verify that these are the same issues
you are hitting.

The good point is we have identified the issues. We have to spend a
little time thinking how we want to remove these wrapper bottlenecks
as well as support someone who depends on this spec behaviour.

P.S. We will move this email chain to the google-groups forum and
continue the discussion there.

thanks
Mario
<<

On Mar 10, 10:40 pm, Jassin MEKNASSI <jassin.mekna...@googlemail.com>
wrote:
> Begin forwarded message:

Dave Hughes

unread,

Mar 11, 2010, 6:29:01 AM3/11/10

to ibm...@googlegroups.com

Hi Mario,

On Wed, 2010-03-10 at 21:49 -0800, Mario Briggs wrote:
> Adding the response from the email thread
>
> >>
> Hi Jassin,
>
> We have a grasp of the issues (they are not related to server
> version), which we suspected and are right. I am sorry that the patch
> we sent was wrong (didnt cover what we wanted to cover.)
>
> Here is the gist of the problem.
> - the dbi driver is looking to implement the dbi spec which requires
> that the # of rows affected by the query be immeditaely available
> after cursor.execute()
> To support this requirment, the dbi driver switches the cursor
> type to static as well as request the server to prefetch the rowcount.
> This is certainly a overhead for the dbi that can account for more
> than half the extra time the dbi takes.

I suggest you interpret the spec a bit differently. To quote from
PEP-249:

"""
.rowcount

This read-only attribute specifies the number of rows that the
last .execute*() produced (for DQL statements like 'select') or affected
(for DML statements like 'update' or 'insert').

The attribute is -1 in case no .execute*() has been
performed on the cursor or the rowcount of the last operation
is cannot be determined by the interface. [7]
"""

The way I've seen most database drivers handle this issue is with the
following train of thought:

* Static cursors are too slow for most practical purposes. Forward only
cursors are by far the most efficient, so let's use them.

* However, forward only cursors can't determine (in advance) the number
of rows in a result set before it's been completely enumerated (which is
one of the reasons they're so fast), so just return -1 in such cases
with the justification that it "cannot be determined by the interface".

* In the case of DML statements (INSERT, UPDATE, etc.) you can still
return the rowcount because it's easily determined either by retrieving
the sqlerrd(3) value from the SQLCA after execution, or by calling
SQLRowCount().

In fact, SQLRowCount() is particularly convenient. From the docs: "when
you use SQLRowCount() on a non-scrollable SELECT-only cursor, the
function sets the contents of RowCountPtr to -1 because the number of
rows is not available until all of the data has been fetched", which is
exactly the behaviour we're after.

If you want to get fancy you could update the rowcount property after
the result set is completely enumerated (e.g. after fetchall()) by
having the cursor object count the number of rows as they're retrieved
(but obviously not "publish" this result until the retrieval is
complete). This would be supported under footnote [7] from PEP-249:

"""
[7] The rowcount attribute may be coded in a way that updates its value
dynamically. This can be useful for databases that return usable
rowcount values only after the first call to a .fetch*() method.
"""

Side note: row counts are subtle things and all too often people rely on
them without thinking. In some cases they can even be somewhat
subjective. For example, if I delete 2 rows from a table which is the
parent of a foreign key with a CASCADE rule, which then causes 5 more
rows to be deleted, should rowcount return 2 or 7? The latter figure can
be obtained from sqlerrd(5), but most systems I've come across consider
the first value to be the answer.

Anyway this is one of the reasons that I don't think drivers should try
too hard to provide row counts (and especially not at the huge cost to
performance that static cursors bring). If people actually need a row
count, they can go and explicitly query for them (which will also allow
them to determine exactly what they mean by a rowcount without any of
the ambiguity alluded to above).

Cheers,

Dave.

Mario Briggs

unread,

Mar 11, 2010, 8:05:23 AM3/11/10

to ibm_db

Dave,
you have exactly said what we were 99% going to do :-) and drivers
(across languages too) have been always doing this

regards
Mario

Jassin MEKNASSI

unread,

Mar 12, 2010, 2:01:27 PM3/12/10

to ibm...@googlegroups.com

Thnks looking forward to it

> --
> You received this message because you are subscribed to the Google Groups "ibm_db" group.
> To post to this group, send email to ibm...@googlegroups.com.
> To unsubscribe from this group, send email to ibm_db+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/ibm_db?hl=en.
>

Jassin MEKNASSI

unread,

Mar 25, 2010, 12:56:06 PM3/25/10

to ibm...@googlegroups.com

Hi,

I executed the ibm_db_dbi test code directly from the server. This takes 3seconds.
Unfortunately it still takes forever when used from a remote client.
Stats are:

remote ibm_db : 2.9sec
remote ibm_db_dbi: forever ( >10min)
server ibm_db : 1.9sec
server ibm_db_dbi:3.3sec

remote client used ibm_db 1.0 on Macosx 10.6 (ibm_db-1.0-py2.6-macosx-10.6-universal.egg)
server is used ibm_db 1.0 on a debian box (ibm_db-1.0-py2.5-linux-x86_64.egg)

Rahul

unread,

Mar 29, 2010, 3:34:45 PM3/29/10

to ibm_db

Hi Jassin,
I am little bit confused here.
What about your previous findings, In that you have used database
locally or remotely.

Thanks,
Rahul Priyadarshi

On Mar 25, 9:56 pm, Jassin MEKNASSI <jassin.mekna...@googlemail.com>
wrote:

Rahul Priyadarshi2

unread,

Apr 8, 2010, 2:23:26 AM4/8/10

to ibm...@googlegroups.com

I have made some changes in ibm_db_dbi wrapper to increase performance. My initial testing gives very positive results.

Please pick the modified patch from attachment and replace the existing ibm_db_dbi.py with this one(at location “../site-packages/ibm_db.py2.x-xxx.egg/”).
Try this patch and let me know how it works.

Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069

Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •

From:	Rahul Priyadarshi2/India/IBM@IBMIN
To:	ibm_db <ibm...@googlegroups.com>
Date:	30/03/2010 01:01
Subject:	[ibm_db:414] Re: Fwd: django Views bug /PERFORMACE ISSUE
Sent by:	ibm...@googlegroups.com

ibm_db_dbi.py

Jassin MEKNASSI

unread,

Apr 8, 2010, 9:07:07 AM4/8/10

to ibm...@googlegroups.com

Unfortunately i do get when using sqlalchemy using the following test code:

/Library/Python/2.6/site-packages/SQLAlchemy-0.4.0-py2.6.egg/sqlalchemy/util.py:7: DeprecationWarning: the md5 module is deprecated; use hashlib instead

import md5, sys, warnings, sets

/Library/Python/2.6/site-packages/SQLAlchemy-0.4.0-py2.6.egg/sqlalchemy/util.py:7: DeprecationWarning: the sets module is deprecated

import md5, sys, warnings, sets

Traceback (most recent call last):

File "/Volumes/DATA/Development/DEV/db2_drivertest/src/sql_alchemy.py", line 20, in <module>

sheetnames_table = Table('SHEETNAMES', metadata, schema = 'MYTESTSCHEMA', autoload = True, autoload_with = engine)

File "/Library/Python/2.6/site-packages/SQLAlchemy-0.4.0-py2.6.egg/sqlalchemy/schema.py", line 112, in __call__

autoload_with.reflecttable(table, include_columns=include_columns)

File "/Library/Python/2.6/site-packages/SQLAlchemy-0.4.0-py2.6.egg/sqlalchemy/engine/base.py", line 1178, in reflecttable

self.dialect.reflecttable(conn, table, include_columns)

File "/Library/Python/2.6/site-packages/ibm_db_sa-0.1.6-py2.6.egg/ibm_db_sa/ibm_db_sa.py", line 474, in reflecttable

schema_name = self.get_default_schema_name(connection)

File "/Library/Python/2.6/site-packages/ibm_db_sa-0.1.6-py2.6.egg/ibm_db_sa/ibm_db_sa.py", line 401, in get_default_schema_name

schema_name = connection.connection.connection.get_current_schema()

File "/Library/Python/2.6/site-packages/ibm_db-1.0-py2.6-macosx-10.6-universal.egg/ibm_db_dbi.py", line 631, in get_current_schema

return self.current_schema

AttributeError: 'Connection' object has no attribute 'current_schema'

metadata = MetaData()

class Sheet(object):

pass

db_name = 'DBNAME'

schema_name = 'MYTESTSCHEMA'

engine=create_engine('ibm_db_sa://user:pwd@xxx.com:portnr/%s;CurrentSchema=%s'%(db_name, schema_name))

Session = scoped_session(sessionmaker(bind = engine))

sheetnames_table = Table('SHEETNAMES', metadata, schema = 'MYTESTSCHEMA', autoload = True, autoload_with = engine)

mapper(Sheet, sheetnames_table)

for s in Session.query(Sheet).all():

print s.sheet_id

s.name

<ibm_db_dbi.py>

Jassin MEKNASSI

unread,

Apr 8, 2010, 8:52:36 AM4/8/10

to ibm...@googlegroups.com

Great work.

I go 2.94796013832 from a remoteclient

Thank you very much

On Apr 8, 2010, at 2:23 AM, Rahul Priyadarshi2 wrote:

<ibm_db_dbi.py>

Rahul Priyadarshi2

unread,

Apr 9, 2010, 7:19:36 AM4/9/10

to ibm...@googlegroups.com

Hi Jassin,
You got this error because i removed one call to set default current schema. This call is happening at the time to creating connection. And CLI uses its local variable to return current schema, so without setting it we got empty string as current schema.

In attached dbi patch i have added that call. Replace your existing ibm_db_dbi.py to this(at location“../site-packages/ibm_db.py2.x-xxx.egg/”).

Try this patch and let me know how it works.

Thanks,
Rahul Priyadarshi

ibm_db_dbi.py

Rahul

unread,

Apr 30, 2010, 1:57:54 AM4/30/10

to ibm_db

Hi Jassin,
New released version of ibm_db(1.0.2) also contains performance issue
fix.
So, try this and let me know how it goes.

Thanks,
Rahul Priyadarshi

On Apr 9, 4:19 pm, Rahul Priyadarshi2 <rahul.priyadar...@in.ibm.com>
wrote:

> ibm_db_dbi.py
> 71KViewDownload

Reply all

Reply to author

Forward