Inconsistent SQL Generation in 2.0

31 views
Skip to first unread message

Benjamin Taub

unread,
May 12, 2023, 4:30:29 PM5/12/23
to sqlalchemy
I have code that worked under SQLAlchemy 1.4 but recently upgraded to 2. I am using the add_columns() method to add columns to an existing SQL statement. The resultant queries sometimes, but not always, crash. I believe the issue happens when the schema/database name (I'm using MySQL) starts with a number. When the schema name starts with a letter, the result runs fine. However, when it starts with a number, the query double-quotes the schema name, causing the query to crash.

Here is an example...
My code: sql = sql.add_columns(self.tbl.c[field])

When the schema holding self.tbl.c[field] starts with a letter (c6961a19b7ed031ce902f056c725b3e3), the following SQL is generated:
SELECT NULL AS "Application Id", NULL AS "First Name", NULL AS "Last Name", NULL AS "Email", c6961a19b7ed031ce902f056c725b3e3.t_31392eb2e6980f4d5082b7861182f2b4.master_key
FROM c6961a19b7ed031ce902f056c725b3e3.t_31392eb2e6980f4d5082b7861182f2b4

However, when the schema name starts with a number (283ac7717fe770c5ed6d425c0c739cba), the following SQL results:
SELECT NULL AS "Application Id", NULL AS "First Name", NULL AS "Last Name", NULL AS "Email", "283ac7717fe770c5ed6d425c0c739cba".t_59a33cbea3617986d810e9fbae60ba19.master_key
FROM "283ac7717fe770c5ed6d425c0c739cba".t_59a33cbea3617986d810e9fbae60ba19


Note the double quotes around the schema name. This second SQL crashes as invalid. Back quotes (`) would probably work fine in this situation, and could be helpful, but double quotes (") are, I think, the cause of my problem.

Is there some parameter or assumption that I'm not understanding, or did I find a bug?

Thank you!
Ben


S Mahabl

unread,
May 12, 2023, 4:38:23 PM5/12/23
to sqlal...@googlegroups.com
Is there alias you can give for

283ac7717fe770c5ed6d425c0c739cba".t_59a33cbea3617986d810e9fbae60ba19 XYZ

?

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
http://www.sqlalchemy.org/
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/76d38390-e17f-4b90-a438-ee078944b5ffn%40googlegroups.com.

Benjamin Taub

unread,
May 12, 2023, 4:48:14 PM5/12/23
to sqlalchemy
Thanks for the suggestion. 
I thought about using an alias but came up with two problems:
1) The FROM clause is auto-generated from SELECT clause, so I'm not sure where I would add this alias
2) All the aliases in the SELECT clause are also double-quoted. I believe a single backquote would be more standard here. Perhaps there is a general use on double quotes in SQL generation when backquotes would be more appropriate?

In my case, I can solve this by making sure that schema names always start with a letter. This will prevent similar situations lurking elsewhere in my code. But, I'll have to be careful that I don't mess up any code that deals with previously-created schemas. Shouldn't be much of an issue today, however.

Mike Bayer

unread,
May 12, 2023, 4:48:45 PM5/12/23
to noreply-spamdigest via sqlalchemy
The quoting, if it were the correct quoting format, should be fine.   As to why it's the quote char and not the backtick, are you compiling these queries manually?    You would want to make sure a MySQL dialect is in use, which would be using backticks for quoting, unless that dialect were initialized against a MySQL database that has ANSI_QUOTES set.

TL;DR quoting is a new thing here but SQLAlchemy should render the correct quotes when used with the MySQL dialect.

Here's a demo:

from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import select
from sqlalchemy import String
from sqlalchemy.orm import declarative_base
Base = declarative_base()


class A(Base):
    __tablename__ = 't_59a33cbea3617986d810e9fbae60ba19'

    __table_args__ = {
        "schema": "283ac7717fe770c5ed6d425c0c739cba"
    }
    id = Column(Integer, primary_key=True)
    data = Column(String)

e = create_engine("mysql://")

stmt = select(A)

print(stmt.compile(e))

output:

SELECT `283ac7717fe770c5ed6d425c0c739cba`.t_59a33cbea3617986d810e9fbae60ba19.id, `283ac7717fe770c5ed6d425c0c739cba`.t_59a33cbea3617986d810e9fbae60ba19.data 
FROM `283ac7717fe770c5ed6d425c0c739cba`.t_59a33cbea3617986d810e9fbae60ba19

S Mahabl

unread,
May 12, 2023, 4:59:44 PM5/12/23
to sqlal...@googlegroups.com

Another idea is to create a view in another schema pointing to the same table ... 

Benjamin Taub

unread,
May 13, 2023, 5:12:11 PM5/13/23
to sqlalchemy
Thank you, Mike, but aren't I using the correct dialect with this create_engine() command?

        qry_engine = create_engine('mysql+pymysql://' + db_uid + ':' + db_pw + '@' + db_addr, connect_args=connect_args,
                                   pool_recycle=3600, echo=False, future=True)

Or, am I missing something? (BTW, I'm using core, not ORM).
Thanks again!
Ben 

Benjamin Taub

unread,
May 13, 2023, 5:58:00 PM5/13/23
to sqlalchemy
For what it's worth, I believe this double quoting issue is also happening with the .label() method of the column object (assuming that I do have the dialect appropriately set). The following code...

tbl_col = self.tbl.c[formula_col['table_column']].label(tmplt_col['name'])

Results in the following result in the Pycharm debugger (notice the double quotes)...

's_42331a8d8a8f3ca73660cc078731dc72.t_ec37b6f3271c0f97429d5301a41be3da."ApplicationId"'

and then running...
sql = sql.add_columns(tbl_col)

Results in the following...
SELECT s_42331a8d8a8f3ca73660cc078731dc72.t_ec37b6f3271c0f97429d5301a41be3da."ApplicationId" AS "Application Id"
FROM s_42331a8d8a8f3ca73660cc078731dc72.t_ec37b6f3271c0f97429d5301a41be3da

Any suggestions? Thanks for your help!
Ben

Mike Bayer

unread,
May 13, 2023, 11:33:03 PM5/13/23
to noreply-spamdigest via sqlalchemy


On Sat, May 13, 2023, at 5:12 PM, Benjamin Taub wrote:
Thank you, Mike, but aren't I using the correct dialect with this create_engine() command?

        qry_engine = create_engine('mysql+pymysql://' + db_uid + ':' + db_pw + '@' + db_addr, connect_args=connect_args,
                                   pool_recycle=3600, echo=False, future=True)

Or, am I missing something? (BTW, I'm using core, not ORM).

looks fine.  I would need to see a complete example in how you are seeing it generate quotes.  if you try the program I attached previously, you should note that it generates backticks.

Benjamin Taub

unread,
May 14, 2023, 12:39:56 PM5/14/23
to sqlalchemy
Hi, Mike!
I tried your code, and it worked after I set the dialect to mysql+pymysql.

Given this, in my case, I believe the problem stems from the fact that I am starting with a generic SELECT call that isn't moored to a dialect. I start with 
        sql = select()

Which I have now changed to (self.tbl is a sqlalchemy table object attached to MySQL)...
        sql = self.tbl.select()

The problem now is that I have a null column object with a label that isn't attached to self.tbl, and I can't figure out how to tell it to generate as MySQL...
            tbl_col = null().label(tmplt_col['name'])
 
It is tbl_col that eventually gives me the quote problem in SQL generation. Do you have any ideas for me on how to have this generate in the right dialect? Am I going about this the wrong way?

Thanks again for your help!
Ben

Mike Bayer

unread,
May 14, 2023, 11:26:27 PM5/14/23
to noreply-spamdigest via sqlalchemy


On Sun, May 14, 2023, at 12:39 PM, Benjamin Taub wrote:
Hi, Mike!
I tried your code, and it worked after I set the dialect to mysql+pymysql.

Given this, in my case, I believe the problem stems from the fact that I am starting with a generic SELECT call that isn't moored to a dialect. I start with 
        sql = select()

Modern SQLAlchemy SQL constructs like select() have no connection to any dialect.    Only if you are using 1.x and using the very dated "bound metadata" concept which is now removed, is there any truth to this.   But any SQL construct can be compiled with any dialect at all times (Assuming it contains no dialect-specific constructs).


Which I have now changed to (self.tbl is a sqlalchemy table object attached to MySQL)...
        sql = self.tbl.select()
The problem now is that I have a null column object with a label that isn't attached to self.tbl, and I can't figure out how to tell it to generate as MySQL...
            tbl_col = null().label(tmplt_col['name'])
 
It is tbl_col that eventually gives me the quote problem in SQL generation. Do you have any ideas for me on how to have this generate in the right dialect? Am I going about this the wrong way?

I would need to see an example of what you're doing as this does not really indicate why there would be any problem.   as mentioned previously, any SQL construct when compiled is always given a dialect with which to compile with, and you can always pass this in.

it's still not clear here if you are using .compile() directly or not as I dont really have an understanding of what you're doing.


Benjamin Taub

unread,
May 15, 2023, 1:28:21 PM5/15/23
to sqlalchemy
Hey, Mike!
I just wanted to let you know I figured this out. Your sample code led me to something I hadn't considered. After the routine I was stuck in, I take the SQLAlchemy-generated code, turn it into a string, and add some manually-generated SQL. I used a command like str(SQL) to get this string. I am pretty sure that this worked in v1.4. However, it turns out that I now have to use str(sql.compile(common.qry_engine)).

I'm not sure if this jives with the behavior you would expect, but it seems to be working. Please let me know if you have any questions.

Thank you so much for your help!
Ben

Mike Bayer

unread,
May 15, 2023, 2:26:02 PM5/15/23
to noreply-spamdigest via sqlalchemy


On Mon, May 15, 2023, at 1:28 PM, Benjamin Taub wrote:
Hey, Mike!
I just wanted to let you know I figured this out. Your sample code led me to something I hadn't considered. After the routine I was stuck in, I take the SQLAlchemy-generated code, turn it into a string, and add some manually-generated SQL. I used a command like str(SQL) to get this string.

yes, it was fairly clear you were likely doing something like this.





I am pretty sure that this worked in v1.4.

in only "worked" out of luck, that the generic SQL happened to match what MySQL wanted.   There are many points at which this would fail with SQL constructs that have some idiosyncratic rendering in MySQL.   

That said, it's not exactly familiar to me how identifiers that start with numbers started getting quoted for all dialects; there was definitely something there regarding the Oracle backend but I dont know offhand if/how that got generalized.





However, it turns out that I now have to use str(sql.compile(common.qry_engine)).

this is definitely the way to do this for stringifying SQL, as every backend has lots of idiosyncracies in its SQL format.

now there is a bigger issue which is that it's not the best way to add text to SQL by running str() on it, when you run str(stmt) and then just run the text, a whole lot of valuable internal data is lost when you execute the statement which helps with things like applying the correct types to result rows, fetching ORM objects, etc.     The str() process is also not that cheap as there is no caching used, running statements directly allows the engine to take advantage of SQL caching.

There's other ways to add literal strings inside of SQL, with two general categories of injection of strings as the object is constructed, and mutation of the final compiled SQL string as it is executed.  depending on what you are doing, one or the other approach is appropriate and will retain all the advantages of compiled execution and caching.




Reply all
Reply to author
Forward
0 new messages