Insert into Select from query taking time.

52 views
Skip to first unread message

Siddhesh Naik

unread,
Jan 27, 2020, 2:39:33 AM1/27/20
to ibm_db
Hi,

I have a table which have 100M records in it. When I am performing Insert into NewTable Select * from mytable , ibm_db_sa (odbc driver) take 2 hours to complete this operation. But when I perform same with any java application (jdbc driver) its take hardly 10 minutes to finish this.

Could you please help me to find out why ibm_db_sa (odbc driver) taking that much time?

kad...@us.ibm.com

unread,
Jan 27, 2020, 9:13:27 AM1/27/20
to ibm_db
Which ODBC driver?

Siddhesh Naik

unread,
Jan 27, 2020, 10:09:36 AM1/27/20
to ibm_db
Checked  with "IBM i Access ODBC Driver"  , "iSeries Access ODBC Driver" and "Client Access ODBC Driver(32 bit)". 

Version of drivers: 13.64.22.00

Platform: Windows Server 2019 , Windows 10 Desktop

On Monday, January 27, 2020 at 7:43:27 PM UTC+5:30, kad...@us.ibm.com wrote:
Which ODBC driver?

Siddhesh Naik

unread,
Jan 27, 2020, 10:09:36 AM1/27/20
to ibm_db
Hi,

I have same issue with ibm_db also. Its also taking 2 hours to finish.

Siddhesh Naik

unread,
Feb 13, 2020, 9:02:32 PM2/13/20
to ibm_db
Is there any update on this issue?  I have recreated this issue on 3 i-series database , to check its database issue or driver issue and every database behaving same.It's seems to ibm_db/ibm_db_sa library issue.Please help.


On Monday, January 27, 2020 at 1:09:33 PM UTC+5:30, Siddhesh Naik wrote:

Saba Kauser

unread,
Feb 13, 2020, 9:04:35 PM2/13/20
to ibm_db
Kevin, is there any performance trace in ODBC driver to dig this further. As I understand, ibm_db is using iSeries driver and not Db2 LUW driver.

kad...@us.ibm.com

unread,
Feb 13, 2020, 9:18:22 PM2/13/20
to ibm_db
My understanding is that Siddhesh is using the ibm_db_sa with either pyodbc + IBM i Access ODBC driver or ibm_db + Db2 Connect driver and they both take 2 hours to complete while a JDBC application takes about 10 minutes instead. We have some internal tools to analyze performance on our driver. Since both drivers seem to display the same behavior, perhaps it's something in ibm_db_sa.

Siddhesh, you could look at performance keywords on this page: https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_74/rzaik/connectkeywords.htm You could also try using PyODBC directly to determine whether it's something in the driver or in ibm_db_sa that's the problem.

Siddhesh Naik

unread,
Feb 14, 2020, 1:50:24 AM2/14/20
to ibm_db
Hi Kadler,

I am using ibm_db_sa+pyodbc dialect. Also I have tried this with 3 drivers IBM i Access ODBC Driver, iSeries Access ODBC driver and Client Access ODBC Driver(32 bit).  Version of all drivers - 13.64.22.00

I have checked the connection string with 'Performance' keyword as well as other few keywords.But there is no difference in time. I got this performance issue for 5M records too. JDBC takes 10-12 seconds and ODBC driver takes 30-35 Minutes to execute the same query. But surprisingly if table row count is around 2M records , then JDBC and ODBC takes same time ( less than 2 secs).

My few Observations that might help you to diagnose the issue:(Observations done on 3 different databases, which are located in 3 different geographical regions.)
 
1)When I analyse the query stats in running jobs tool on I series navigator, I saw that when I am submitting the query using ODBC driver,that stats get updates around 6-8K records by per second but in the case of JDBC,  around 85-90K stats get updated per second which I found strange behaviour from the database. 

2) Another observation is , when I analyse the query in running jobs tool on I series navigator, I series navigator not allow me to see query execution details for the query which executed by ODBC driver but it allow to see query execution for JDBC driver queries.( this observation I have mentioned just FYI).

I will create some POCs as you mentioned above and share with you within few hours.

Siddhesh Naik

unread,
Feb 16, 2020, 9:05:09 PM2/16/20
to ibm_db
Hi Kadler,

I have checked with pyODBC directly and took same time as ibm_db_sa. My code snippet is as below:

import pyodbc
from datetime import datetime
conn = pyodbc.connect("DRIVER={iSeries Access ODBC Driver};SYSTEM=100.000.000.001;DATABASE=B10E46D4;UID=******;PWD=*****")
SELECT_QUERY
= "CREATE TABLE ARCENGINE.T_PERFORMANCE AS (SELECT * FROM JDE_SOURCE.F0011) WITH DATA"
print("Start query execution: ", datetime.now())
a = conn.execute(SELECT_QUERY)
print("End query execution: ", datetime.now())
conn
.execute("DROP TABLE ARCENGINE.T_PERFORMANCE")
conn
.close()



On Monday, January 27, 2020 at 1:09:33 PM UTC+5:30, Siddhesh Naik wrote:

kad...@us.ibm.com

unread,
Feb 17, 2020, 12:00:26 PM2/17/20
to ibm_db
That query doesn't retrieve any data, so I'm not sure how the client would impact the performance. I'm not sure how JDBC would take any less time. Seems that if there is a performance issue, it's on the server.

Siddhesh Naik

unread,
Feb 18, 2020, 1:04:13 AM2/18/20
to ibm_db
Hi Kevin,

I have tested this on 3 different iseries server(versions v7r2, v7r1). And all the servers responded same way. My observation is server treating ODBC request and JDBC request differently. There might be difference in default parameters values (or protocol) which used by both drivers for the communication with server. 
Reply all
Reply to author
Forward
0 new messages