simple query takes to long

146 views
Skip to first unread message

Trainer Go

unread,
Jun 7, 2022, 5:46:31 AM6/7/22
to sqlalchemy
Hello guys,

Im executing 2 queries in my python program with sqlalchemy using the pyodbc driver.
The database is a Adaptive SQL Anywhere Version 7 32 Bit.

When im executing the queries in a DB UI it takes 5-6 seconds for both together and when im using the same queries in my python programm it takes 5-6 minutes instead of 6 seconds. What im doing wrong? Im new at this.

would the connection string or query help?
And i only selecting some datas from the db and converting it into two dataframes so i dont inserting, updating or deleting datas.

I hope somebody can help me.

Best regards Manuel

Trainer Go

unread,
Jun 7, 2022, 7:23:46 AM6/7/22
to sqlalchemy
this is my connection string and query:

connection_string = ('DRIVER={Adaptive Server Anywhere 7.0};Commlinks=tcpip{ip=;Port=port};ENG=dbName;UID=uerID;PWD=PW')
connection_url = URL.create(
    "sybase+pyodbc",
    query={"odbc_connect": connection_string}
    )
engine = create_engine(connection_url)
conn = engine.connect()

res = conn.execute("select ").fetchall()

table_df = pd.DataFrame(res)

its an simple query..

Philip Semanchuk

unread,
Jun 7, 2022, 11:13:28 AM6/7/22
to sqlal...@googlegroups.com


> On Jun 7, 2022, at 5:46 AM, Trainer Go <maego...@gmail.com> wrote:
>
> Hello guys,
>
> Im executing 2 queries in my python program with sqlalchemy using the pyodbc driver.
> The database is a Adaptive SQL Anywhere Version 7 32 Bit.
>
> When im executing the queries in a DB UI it takes 5-6 seconds for both together and when im using the same queries in my python programm it takes 5-6 minutes instead of 6 seconds. What im doing wrong? Im new at this.

To start, debug one query at a time, not two.

Second, when you test a query in your DB UI, you’re probably already connected to the database. Your Python program has to make the connection — that’s an extra step, and it might be slow. If you step through the Python program in the debugger, you can execute one statement at a time (the connection and the query) to understand how long each step takes. That will help to isolate the problem.

Third, keep in mind that receiving results takes time too. If your DB UI is written in C or some other language that allocates memory very efficiently, it might be a lot faster than building a Pandas dataframe.

You might want to eliminate Pandas entirely so you don’t have to question whether or not that’s the source of your slowdown. You could do this instead -

for row in conn.execute(my_query).fetchall():
pass

That will force your Python program to iterate over the result set without being forced to allocate memory for all the results.

Hope this helps
Philip






>
> would the connection string or query help?
> And i only selecting some datas from the db and converting it into two dataframes so i dont inserting, updating or deleting datas.
>
> I hope somebody can help me.
>
> Best regards Manuel
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
> ---
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/b306121e-913c-4ca5-bc2d-6308d76d1b76n%40googlegroups.com.

Trainer Go

unread,
Jun 8, 2022, 3:27:04 AM6/8/22
to sqlalchemy
thank you Philip,

I will test it today.


Greetings Manuel

Trainer Go

unread,
Jun 8, 2022, 4:46:18 AM6/8/22
to sqlalchemy
So, the problem is the select itself and not the db conncection.

Im creating an interactive dashboard with a timed query every day to have the data in the dashboard up to date and irs easier to work with an pandas dataframe in combination with the data visualisation.
when im selecting only one year instead multiple the query takes 17 seconds instead of 5 minutes. So the problem must be something else. But I have no clue.

this is my query :

select Filiale, datum_auftrag as Datum, datepart(wk, datum_auftrag) as Woche, count(auftragsnummer) as Anzahl, sum(endpreis_kunde + kassenanteil) as Umsatz from view where year(datum_auftrag) >= 2010  group by Filiale, datum_auftrag order by  Filiale, datum_auftrag

Trainer Go

unread,
Jun 8, 2022, 8:29:32 AM6/8/22
to sqlalchemy
When im using pandas with pd.read_sql_query()
with chunksize to minimiza the memory usage there is no difference between both runtimes..

table_df = pd.read_sql_query('''select , engine, chunksize = 30000)

for df in table_df:
  print(df)

the runtime is nearly the same like 5 minutes



#print(table_df) result: #generator object SQLDatabase._query_iterator at 0x0DC69C30>
I dont know if the query will be triggered by using print(table_df) the result is generator object SQLDatabase._query_iterator at 0x0DC69C30>

but the runtime is 6 seconds like in the DBUI im using.

I have no clue what to do.

Greetings Manuel

Trainer Go schrieb am Mittwoch, 8. Juni 2022 um 09:27:04 UTC+2:

Philip Semanchuk

unread,
Jun 8, 2022, 9:04:08 AM6/8/22
to sqlal...@googlegroups.com


> On Jun 8, 2022, at 8:29 AM, Trainer Go <maego...@gmail.com> wrote:
>
> When im using pandas with pd.read_sql_query()
> with chunksize to minimiza the memory usage there is no difference between both runtimes..

Do you know that, or is that speculation?

>
> table_df = pd.read_sql_query('''select , engine, chunksize = 30000)
>
> for df in table_df:
> print(df)
>
> the runtime is nearly the same like 5 minutes

Printing to the screen also takes time, and your terminal probably buffers the results, which requires memory allocation. I’m not saying this is your problem (it probably isn’t), but your test still involves pandas and your terminal, both of which cloud the issue. You would benefit from simplifying your tests.

Did you try this suggestion from my previous email?


> for row in conn.execute(my_query).fetchall():
> pass

Also, are you 100% sure you’re executing the same query from SQLAlchemy that you’re pasting into your DB UI?

Cheers
Philip

Trainer Go

unread,
Jun 8, 2022, 9:28:38 AM6/8/22
to sqlalchemy
Hello Phil,

i tested both and without printing the result.

table_df = pd.read_sql_query(''SELECT, engine)
#print(table_df)
#query = "SELECT"
#for row in conn.execute(query).fetchall():
#    pass


both have nearly the same runtime. So this is not my problem. And yes, they are the same queries cause i copy pasted the select from my DBUI where is tested first the results and the runtime and i expected the same runtime in my program but no ;)

Greeting Manuel

Jonathan Vanasco

unread,
Jun 8, 2022, 11:28:01 AM6/8/22
to sqlalchemy
When you select in the database ui tool, you are just displaying raw data.

When you select within your code snippets above, Python is creating pandas' DataFrame objects for the results.

These two concepts are not comparable at all.  Converting the SQL data to Python data structures in Pandas (and SQLAlchemy's ORM) is a lot of overhead - and that grows with the result size.

You can use memory and code profiling tools to explore this and see where the issues are. The best approach is what Philip suggested above though, and not use pandas, so you can see how Python/SqlAlchemy handles the raw data.


Trainer Go

unread,
Jun 9, 2022, 3:37:25 AM6/9/22
to sqlalchemy
Hello Jonathan,

i already executed the query without using pandas in my programm

query = "SELECT"
for row in conn.execute(query).fetchall():
    pass

the result was the same runtime with pandas.

So this cant be the problem. I think so.

Greetings Manuel

Simon King

unread,
Jun 9, 2022, 9:17:04 AM6/9/22
to sqlal...@googlegroups.com
How many rows are you fetching, and how many columns in each row?

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
http://www.sqlalchemy.org/
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages