From: Jassin MEKNASSI <jassin....@gmail.com>Date: March 8, 2010 11:13:56 PM ESTTo: Rahul Priyadarshi2 <rahul.pr...@in.ibm.com>Cc: Ambrish Bhargava1 <abha...@in.ibm.com>, Mario Ds Briggs <mario....@in.ibm.com>, Abhigyan Agrawal1 <abhigyan...@in.ibm.com>Subject: Re: django Views bug /PERFORMACE ISSUEHi,I upgraded from 9,5 to 9.7 but there is still no improvement.thnks for any suggestionRegards,JassinOn Mar 4, 2010, at 2:17 PM, Jassin MEKNASSI wrote:hi Rahul,I copied the file to Library/Python/2.6/site-packages/ibm_db-1.0-py2.6-macosx-10.6-universal.egg/delete all the pyc filesUnfortunately this still takes forever: 2037.1454041 secondsFor all the version of ibm_db_dbi.py i get:/Library/Python/2.6/site-packages/ibm_db-1.0-py2.6-macosx-10.6-universal.egg/ibm_db_dbi.py:25: DeprecationWarning: the sets module is deprecatedfrom sets import ImmutableSetOn Mar 4, 2010, at 1:25 AM, Rahul Priyadarshi2 wrote:
Hi Jassin:
pleased find the attached ibm_db_dbi.py wrapper below. Replace existing ibm_db_dbi.py with this one (at location '../site-packages/ibm_db.py2.x-xxx.egg/').
Try same testing to this wrapper and let me know how much time now it takes.
Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069
Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •
Jassin MEKNASSI <jmek...@me.com> 02/03/2010 20:54
ToRahul Priyadarshi2/India/IBM@IBMIN cc SubjectRe: django Views bug /PERFORMACE ISSUE
takes forever 1863.21640682 seconds :(
thnks
On Mar 2, 2010, at 1:04 AM, Rahul Priyadarshi2 wrote:
You are not using ibm_db_dbi wrapper in correct way.
Please try it as below and let me know how much time it is taking.
$ import ibm_db_dbi
$ import time
$
$ t1 = time.time()
$ conn = ibm_db_dbi.connect('node', 'user', 'pwd')
$ cur = conn.cursor()
$ sql="select sheet_id,name from nativefiles.sheetnames"
$ cur.execute(sql)
$ row = cur.fetchone()
$ wile row:
$ print row[1]
$ row = cur.fetchone()
$ elapsed = time.time() - t1
$ print elapsed
Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069
Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •
Jassin MEKNASSI <jassin....@googlemail.com> 01/03/2010 22:47
ToRahul Priyadarshi2/India/IBM@IBMIN ccJassin MEKNASSI <jassin....@googlemail.com> SubjectRe: django Views bug /PERFORMACE ISSUE
Using ibm_db_dbi is as fast as using ibm_db
since i have never user dbi before i will paste the code i used to compare all the different modules (ibm_db,ibm_db_dbi,sqlalchemy and django):
TABLE has 15951 rows : DDL is
CREATE TABLE NATIVEFILES.SHEETNAMES (
SHEET_ID INTEGER NOT NULL,
NAME VARCHAR(50) NOT NULL,
PRIMARY KEY (SHEET_ID)
);
IBM_DB_DBI
import ibm_db_dbi
import time
t1 = time.time()
ibm_db_conn = ibm_db_dbi.connect('node', 'user', 'pwd')
sql="select sheet_id,name from nativefiles.sheetnames"
stmt = ibm_db_dbi.ibm_db.exec_immediate(ibm_db_conn, sql)
result = ibm_db_dbi.ibm_db.fetch_both(stmt)
while( result ):
result['SHEET_ID']
result['NAME']
result = ibm_db_dbi.ibm_db.fetch_both(stmt)
elapsed = time.time() - t1
print elapsed
ibm_db_dbi :3.89495897293 seconds
IBM_DB
import ibm_db
import time
ibm_db_conn = ibm_db.connect('node', 'user', 'pwd')
t1 = time.time()
sql="select sheet_id,name from nativefiles.sheetnames"
stmt = ibm_db.exec_immediate(ibm_db_conn, sql)
result = ibm_db.fetch_both(stmt)
while( result ):
result['SHEET_ID']
result['NAME']
result = ibm_db.fetch_both(stmt)
elapsed = time.time() - t1
print elapsed
ibm_db: 2.36251711845 seconds
SQL_ALCHEMY:
import ibm_db_sa.ibm_db_sa
from sqlalchemy import *
from sqlalchemy.orm import scoped_session,sessionmaker
from sqlalchemy import Table
from sqlalchemy import select
from sqlalchemy import MetaData
from sqlalchemy.orm import mapper
metadata = MetaData()
class Sheet(object):
pass
engine=create_engine('connstring')
Session = scoped_session(sessionmaker(bind=engine))
sheetnames_table=Table('SHEETNAMES', metadata,schema='NATIVEFILES',autoload=True,autoload_with=engine)
mapper(Sheet,sheetnames_table)
for s in Session.query(Sheet).all():
s.sheet_id
s.name
takes forever
DJANGO:
models .py
class SheetNames(models.Model):
sheetid = models.IntegerField(primary_key=True,db_column = 'SHEET_ID')
name=models.CharField(max_length=50,db_column='NAME')
class Meta:
db_table = '"NATIVEFILES"."SHEETNAMES"
managed=False
views.py
s=SheetNames.objects.all()
for a in s:
a.name
takes forever
Thnks for yr help
On Feb 28, 2010, at 4:37 AM, Rahul Priyadarshi2 wrote:
Hi ,
Please try the same query with ibm_db_dbi. Through this we would be able identify the main problem. Because this problem may aeries due to django or our ibm_db_dbi wrapper.
Try this and let me know how much time it is taking with same query.
Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069
Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •
Jassin MEKNASSI <jassin....@googlemail.com> 25/02/2010 21:21
ToRahul Priyadarshi2/India/IBM@IBMIN cc SubjectRe: django Views bug
Thanks it works.
I noticed that sqlalchemy and django drivers are really slow compared to the ibm_db driver which seems to use the underlying client.
I m using db2 9.5 express-c.
populating values from django on a simple table (PK, char(10)) with 10k rows takes more than 10 min . with ibm_db this is almost instantaneous.
Using native sql and using execute and fetchone with django does not improve performance.
Any idea? this would be very helpful since i m now stuck with only ibm_db. Move to 9.7?
Regards
On Feb 23, 2010, at 6:15 AM, Rahul Priyadarshi2 wrote:
Hi Jassin,
You got this error not because of any django adaptor bug. This is coming due to your model definition.
In your Sheetcount model you are not specifying any attribute as primary key. So that Django automatically adds an “ID” primary key attribute. And in you db2 view there is no field of name “ID”.
If you specify any one of attribute as primary key then you will not get this error.
Please let me know this will help you or not.
By next time can you please post your issues in our forum (http://groups.google.com/group/ibm_db). Because your issues will may help our other users also. If they will get same problem.
Thanks & Regards,
Rahul Priyadarshi
System Software Engineer
Open Source - Common AD
India Software Labs - IBM India Pvt Ltd.
Tel(Off) +91 80-402-55240
Tel(Mob) +91 9916422069
Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 •
Jassin MEKNASSI <jassin....@googlemail.com> 23/02/2010 03:26
ToRahul Priyadarshi2/India/IBM@IBMIN cc Subjectdjango Views bug
hi im using Django-1.1.1 and and ibm_db_django-0.1.4-py2.5.egg and get SQL0206N when i use a db2 view
from django.db import models
class Sheetcount(models.Model):
name = models.CharField(max_length=50,db_column='SHEETNAME')
total = models.IntegerField(db_column='TOTAL')
samplefile = models.CharField(max_length=50,db_column='SAMPLEFILE')
class Meta:
db_table = '"TEST"."TESTTABLE"'
managed=False
[IBM][CLI Driver][DB2/LINUXX8664] SQL0206N "TEST.TESTTABLE.ID" is not valid in the context where it is used. SQLSTATE=42703 SQLCODE=-206
thnks
<ibm_db_dbi.py>
>>
Hi Jassin,
We have a grasp of the issues (they are not related to server
version), which we suspected and are right. I am sorry that the patch
we sent was wrong (didnt cover what we wanted to cover.)
Here is the gist of the problem.
- the dbi driver is looking to implement the dbi spec which requires
that the # of rows affected by the query be immeditaely available
after cursor.execute()
To support this requirment, the dbi driver switches the cursor
type to static as well as request the server to prefetch the rowcount.
This is certainly a overhead for the dbi that can account for more
than half the extra time the dbi takes.
- there are couple of other small things that we have identified for
improving in the dbi wrapper
Rahul has a use-case local with the performance measurements for these
issues. He can fill you in with the details and plans on this and
send you a temp dbi wrapper to verify that these are the same issues
you are hitting.
The good point is we have identified the issues. We have to spend a
little time thinking how we want to remove these wrapper bottlenecks
as well as support someone who depends on this spec behaviour.
P.S. We will move this email chain to the google-groups forum and
continue the discussion there.
thanks
Mario
<<
On Mar 10, 10:40 pm, Jassin MEKNASSI <jassin.mekna...@googlemail.com>
wrote:
> Begin forwarded message:
On Wed, 2010-03-10 at 21:49 -0800, Mario Briggs wrote:
> Adding the response from the email thread
>
> >>
> Hi Jassin,
>
> We have a grasp of the issues (they are not related to server
> version), which we suspected and are right. I am sorry that the patch
> we sent was wrong (didnt cover what we wanted to cover.)
>
> Here is the gist of the problem.
> - the dbi driver is looking to implement the dbi spec which requires
> that the # of rows affected by the query be immeditaely available
> after cursor.execute()
> To support this requirment, the dbi driver switches the cursor
> type to static as well as request the server to prefetch the rowcount.
> This is certainly a overhead for the dbi that can account for more
> than half the extra time the dbi takes.
I suggest you interpret the spec a bit differently. To quote from
PEP-249:
"""
.rowcount
This read-only attribute specifies the number of rows that the
last .execute*() produced (for DQL statements like 'select') or affected
(for DML statements like 'update' or 'insert').
The attribute is -1 in case no .execute*() has been
performed on the cursor or the rowcount of the last operation
is cannot be determined by the interface. [7]
"""
The way I've seen most database drivers handle this issue is with the
following train of thought:
* Static cursors are too slow for most practical purposes. Forward only
cursors are by far the most efficient, so let's use them.
* However, forward only cursors can't determine (in advance) the number
of rows in a result set before it's been completely enumerated (which is
one of the reasons they're so fast), so just return -1 in such cases
with the justification that it "cannot be determined by the interface".
* In the case of DML statements (INSERT, UPDATE, etc.) you can still
return the rowcount because it's easily determined either by retrieving
the sqlerrd(3) value from the SQLCA after execution, or by calling
SQLRowCount().
In fact, SQLRowCount() is particularly convenient. From the docs: "when
you use SQLRowCount() on a non-scrollable SELECT-only cursor, the
function sets the contents of RowCountPtr to -1 because the number of
rows is not available until all of the data has been fetched", which is
exactly the behaviour we're after.
If you want to get fancy you could update the rowcount property after
the result set is completely enumerated (e.g. after fetchall()) by
having the cursor object count the number of rows as they're retrieved
(but obviously not "publish" this result until the retrieval is
complete). This would be supported under footnote [7] from PEP-249:
"""
[7] The rowcount attribute may be coded in a way that updates its value
dynamically. This can be useful for databases that return usable
rowcount values only after the first call to a .fetch*() method.
"""
Side note: row counts are subtle things and all too often people rely on
them without thinking. In some cases they can even be somewhat
subjective. For example, if I delete 2 rows from a table which is the
parent of a foreign key with a CASCADE rule, which then causes 5 more
rows to be deleted, should rowcount return 2 or 7? The latter figure can
be obtained from sqlerrd(5), but most systems I've come across consider
the first value to be the answer.
Anyway this is one of the reasons that I don't think drivers should try
too hard to provide row counts (and especially not at the huge cost to
performance that static cursors bring). If people actually need a row
count, they can go and explicitly query for them (which will also allow
them to determine exactly what they mean by a rowcount without any of
the ambiguity alluded to above).
Cheers,
Dave.
regards
Mario
> --
> You received this message because you are subscribed to the Google Groups "ibm_db" group.
> To post to this group, send email to ibm...@googlegroups.com.
> To unsubscribe from this group, send email to ibm_db+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/ibm_db?hl=en.
>
I executed the ibm_db_dbi test code directly from the server. This takes 3seconds.
Unfortunately it still takes forever when used from a remote client.
Stats are:
remote ibm_db : 2.9sec
remote ibm_db_dbi: forever ( >10min)
server ibm_db : 1.9sec
server ibm_db_dbi:3.3sec
remote client used ibm_db 1.0 on Macosx 10.6 (ibm_db-1.0-py2.6-macosx-10.6-universal.egg)
server is used ibm_db 1.0 on a debian box (ibm_db-1.0-py2.5-linux-x86_64.egg)
Thanks,
Rahul Priyadarshi
On Mar 25, 9:56 pm, Jassin MEKNASSI <jassin.mekna...@googlemail.com>
wrote:
|
Office Addr • D2. First Floor. • Manyata Embassy Business Park • Outer Ring Road, Nagavara • Bangalore: 560 045 • |
| From: | Rahul Priyadarshi2/India/IBM@IBMIN |
| To: | ibm_db <ibm...@googlegroups.com> |
| Date: | 30/03/2010 01:01 |
| Subject: | [ibm_db:414] Re: Fwd: django Views bug /PERFORMACE ISSUE |
| Sent by: | ibm...@googlegroups.com |
<ibm_db_dbi.py>
<ibm_db_dbi.py>