SQLAlchemy update on mysql updates less row than directly from IDE or command line

134 views
Skip to first unread message

Vincent T

unread,
Feb 14, 2021, 8:36:17 AM2/14/21
to sqlalchemy
Mysql v8.0.23 on ubuntu 20.04
SQLAlchemy 1.3.22
Python 3.8.5

I have an SQL query as shown below that updates row in database.
When run by command line or using DBeaver SQL editor, I get 583k lines updated.
When using SQLAlchemy I get only 450k lines updated. i.e. some rows get updated but not all.

I checked the user permissions, the charset and collation, everything is ok.
I managed to narrow it down to a few IDs where it fails with SQL alchemy but works in IDE or CLI. However the underlying data has absolutely nothing different.

I've also tried to copy the query from the echo log and it works fine in IDE or CLI despite failing in SQLAlchemy python script.

Please see below output logs, script with query:

import time
import sqlalchemy
from datetime import *
from sqlalchemy import create_engine, exc
from dateutil.parser import parse

connect_args = {'init_command':"SET @@collation_connection='utf8mb4_0900_ai_ci'"}
# instantiate db connection
eng = create_engine('mysql+mysqldb://xxxxxxx@localhost:3306/mydb?charset=utf8mb4', connect_args=connect_args, echo=True)


try:
    query = """UPDATE magic_users mu
        JOIN (
            SELECT
                min.user_id as user_id,
                (max.total_followers - min.total_followers) as progression
            FROM(
                select
                  user_id, measurement_date, total_followers
                from followers_totals ft
                where measurement_date = (
                    select max(measurement_date)
                    from followers_totals as f
                    where f.user_id = ft.user_id
                    and date_format(ft.measurement_date, '%%Y-%%m-%%d') >= date_format(date_sub(CURDATE(), interval 14 day), '%%Y-%%m-%%d'))
                 ) max
            JOIN (
                select
                user_id,measurement_date,total_followers
                from followers_totals ft
                where measurement_date = (
                    select min(measurement_date)
                    from followers_totals as f
                    where f.user_id = ft.user_id
                    and date_format(ft.measurement_date, '%%Y-%%m-%%d') >= date_format(date_sub(CURDATE(), interval 14 day), '%%Y-%%m-%%d'))
                 ) min
            ON max.user_id = min.user_id
            WHERE min.user_id = '43691' and max.user_id = '43691') progression
        ON progression.user_id = mu.user_id
        SET mu.followers_count_progress_14D = progression.progression
        WHERE progression.user_id is not null;"""

with eng.connect() as con:
    rs = con.execute(query)
    print(rs.rowcount)
    con.close()
except exc.SQLAlchemyError as e:
    print (e)


2021-02-14 13:27:59,593 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode'
2021-02-14 13:27:59,593 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,595 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'lower_case_table_names'
2021-02-14 13:27:59,595 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,598 INFO sqlalchemy.engine.base.Engine SELECT DATABASE()
2021-02-14 13:27:59,598 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,599 INFO sqlalchemy.engine.base.Engine show collation where `Charset` = 'utf8mb4' and `Collation` = 'utf8mb4_bin'
2021-02-14 13:27:59,599 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,601 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1
2021-02-14 13:27:59,601 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,602 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1
2021-02-14 13:27:59,602 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,602 INFO sqlalchemy.engine.base.Engine SELECT CAST('test collated returns' AS CHAR CHARACTER SET utf8mb4) COLLATE utf8mb4_bin AS anon_1
2021-02-14 13:27:59,602 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,603 INFO sqlalchemy.engine.base.Engine UPDATE magic_users mu JOIN (SELECT min.user_id as user_id, (max.total_followers - min.total_followers) as progression FROM( select user_id, measurement_date, total_followers from followers_totals ft where measurement_date = (select max(measurement_date) from followers_totals as f where f.user_id = ft.user_id and date_format(ft.measurement_date, "%%Y-%%m-%%d") >= date_format(date_sub(CURDATE(), interval 14 day), "%%Y-%%m-%%d"))) max JOIN ( select user_id, measurement_date, total_followers from followers_totals ft where measurement_date = (select min(measurement_date) from followers_totals as f where f.user_id = ft.user_id and date_format(ft.measurement_date, "%%Y-%%m-%%d") >= date_format(date_sub(CURDATE(), interval 14 day), "%%Y-%%m-%%d"))) min ON max.user_id = min.user_id WHERE min.user_id = 43691 and max.user_id = 43691) progression ON progression.user_id = mu.user_id SET mu.followers_count_progress_14D = progression.progression WHERE progression.user_id is not null;
2021-02-14 13:27:59,603 INFO sqlalchemy.engine.base.Engine ()
2021-02-14 13:27:59,617 INFO sqlalchemy.engine.base.Engine COMMIT
0

Any help or idea about how to debug this would be very much appreciated.

Mike Bayer

unread,
Feb 14, 2021, 10:22:35 AM2/14/21
to noreply-spamdigest via sqlalchemy
looks like you are matching on timestamps.   is there a timezone or date truncation issue occurring ?  check the timestamps etc. and make sure they match up.     Create small test scripts using SQLAlchemy with echo='debug' to see that the rows you expect are matching.    your SQL query does not seem to use any Python values nor anything really specific to SQLAlchemy so try writing a simple mysqlclient script also.
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

Vincent T

unread,
Feb 14, 2021, 10:48:30 PM2/14/21
to sqlalchemy
Hi Mike and thanks for taking the time to look into that.

The query itself works fine. As mentioned it updates perfectly in mysql CLI or using a sql client like DBeaver.
I followed your suggestion and tried using mysql.connector and the query works fine.
In the meantime I also tried to use the session API instead of the connection API of SQLAlchemy and it works.... funny.
I still wonder what the issue is with the connection API. I tried to force autocommit=True but same issue. And the commit message is anyway already showing in the logs.

Reply all
Reply to author
Forward
0 new messages