Cannot insert strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019


Nicolas Lykke Iversen

unread,
Oct 13, 2020, 10:50:50 AM10/13/20
to sqlalchemy
Hi SQLAlchemy,

System information:
Problem:
I have an issue with inserting strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019.

I've checked the MS SQL Server and it creates the column properly, e.g. with a datatype of varchar(max) for Text and VARCHAR, which should be able to store strings with a size of up to 2 GB according to the Microsoft documentation.
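
(For reference, a quick way to check the DDL SQLAlchemy emits - a sketch using the mssql dialect compiler; a length-less VARCHAR renders as varchar(max) on this dialect:)

from sqlalchemy import Table, MetaData, Column, Integer, VARCHAR
from sqlalchemy.dialects import mssql
from sqlalchemy.schema import CreateTable

t = Table('msg', MetaData(),
          Column('id', Integer, primary_key=True),
          Column('content', VARCHAR, nullable=False))
# prints the CREATE TABLE statement as the mssql dialect would render it,
# with the content column as: content VARCHAR(max) NOT NULL
print(CreateTable(t).compile(dialect=mssql.dialect()))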

Furthermore, I've tried using other collations, but I need _SC (supplementary character) support for my applications, so I cannot drop it, and adding _UTF8 (UTF-8) doesn't solve the problem either.

Why am I not allowed to store strings with a size greater than 2000? And why is SQLAlchemy displaying that error message when trying to insert plain ASCII text ("AAAA...")?

I would really appreciate some guidance on how to solve this problem. What could be causing it?

Error:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot convert to text/ntext or collate to 'Latin1_General_100_CI_AS_SC' because these legacy LOB types do not support UTF-8 or UTF-16 encodings. Use types varchar(max), nvarchar(max) or a collation which does not have the _SC or _UTF8 flags. (4189) (SQLParamData)")
[SQL: INSERT INTO msg (content) OUTPUT inserted.id VALUES (?)]
[parameters: ('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ... (1703 characters truncated) ... AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA',)]
(Background on this error at: http://sqlalche.me/e/13/f405)


Program (POC):
import logging
import sqlalchemy
from sqlalchemy import Column, Text, Integer, NVARCHAR, VARCHAR
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

logging.root.setLevel(logging.DEBUG)

Base = declarative_base()


class Msg(Base):
  __tablename__ = 'msg'

  id = Column(Integer, primary_key=True, autoincrement=True)
  content = Column(VARCHAR, nullable=False)


user = 'sa'
pwd = 'P@ssw0rd'
host = 'localhost'
port = 1433
db = 'test'

logging.info('started.')

engine = sqlalchemy.create_engine(
  f'mssql+pyodbc://{user}:{pwd}@{host}:{port}/{db}?driver=ODBC+Driver+17+for+SQL+Server',
  pool_size=5,
  max_overflow=10,
  pool_pre_ping=True,
  isolation_level='READ_UNCOMMITTED',
  pool_recycle=900,
  echo=False,
  connect_args={'connect_timeout': 10})

Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine, autoflush=False, autocommit=False)
session = Session()

for i in range(10000):
  try:
    msg = Msg(content='A' * i)
    session.add(msg)
    session.commit()
  except Exception as exc:
    logging.exception(f'fail: {i=}: {exc}')
    break
  else:
    logging.info(f'success: {i=}')

logging.info('done.')

Mike Bayer

unread,
Oct 13, 2020, 4:22:02 PM10/13/20
to noreply-spamdigest via sqlalchemy


On Tue, Oct 13, 2020, at 10:50 AM, Nicolas Lykke Iversen wrote:
Hi SQLAlchemy,

System information:
Problem:
I have an issue with inserting strings with a length greater than 2000 into columns with a datatype of Text, VARCHAR, or NVARCHAR using MS SQL Server 2017 and 2019.

I've checked the MS SQL Server and it creates the column properly, e.g. with a datatype of varchar(max) for Text and VARCHAR, which should be able to store strings with a size of up to 2 GB according to the Microsoft documentation.

Furthermore, I've tried using other collations, but I need _SC (supplementary character) support for my applications, so I cannot drop it, and adding _UTF8 (UTF-8) doesn't solve the problem either.

Why am I not allowed to store strings with a size greater than 2000?

I'm not really sure; this has to do with a behavior of SQL Server and/or your ODBC driver. A Google search finds fairly scant results, but there is a long discussion regarding this error here: https://www.sqlservercentral.com/forums/topic/collation-error-when-adding-distributer

And why is SQLAlchemy displaying that error message when trying to insert plain ASCII text ("AAAA...")?

SQLAlchemy runs SQL statements using a method called "cursor.execute()", which is a feature of the DBAPI (database driver) in use. As your error message indicates, you're using the pyodbc driver; this method is documented here: https://github.com/mkleehammer/pyodbc/wiki/Cursor#executesql-parameters

The DBAPI execute() method, and most of the other DBAPI methods, can throw exceptions if something goes wrong.  SQLAlchemy has the policy that if it encounters such an exception when it calls cursor.execute(), it wraps it in a SQLAlchemy-specific version of that exception (named the same) and then throws it.  

In other words, you're using a driver called pyodbc that's creating this error. SQLAlchemy just propagates it for you, but otherwise has nothing to do with how it's produced.
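
For example, here's a minimal sketch against your POC - exc.orig, exc.statement and exc.params are the documented attributes of the wrapped exception:

import sqlalchemy.exc

try:
    session.add(Msg(content='A' * 2001))
    session.commit()
except sqlalchemy.exc.ProgrammingError as exc:
    # exc.orig is the original pyodbc.ProgrammingError raised by
    # cursor.execute(); exc.statement / exc.params hold the SQL string
    # and parameters that were being executed when it was raised.
    print(type(exc.orig), exc.orig)
    session.rollback()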



Nicolas Lykke Iversen

unread,
Oct 13, 2020, 4:57:37 PM10/13/20
to sqlal...@googlegroups.com
Thank you, Mike - very much appreciated!

Just to be clear, pyodbc is not a driver, it’s an ODBC-compliant DBAPI, right? I separately downloaded a driver for SQL Server from Microsoft, which pyodbc makes use of.

Do you suggest that changing pyodbc to another SQL Server DBAPI would solve the problem? If so, can you recommend another DBAPI for SQL Server? Or do you think that the problem is caused by Microsoft’s driver?

I’m pretty sure SQL Server works fine when accessed using .NET, otherwise the Internet would be full of complaints about not being able to insert more than 2000 characters into a varchar(max).

Best regards
Nicolas



Mike Bayer

unread,
Oct 13, 2020, 6:43:25 PM10/13/20
to noreply-spamdigest via sqlalchemy


On Tue, Oct 13, 2020, at 4:57 PM, Nicolas Lykke Iversen wrote:
Thank you, Mike - very much appreciated!

Just to be clear, pyodbc is not a driver, it’s an ODBC-compliant DBAPI, right? I separately downloaded a driver for SQL Server from Microsoft, which pyodbc makes use of.

Right, pyodbc is the DBAPI in this case, which we usually refer to as a "driver", but in the case of ODBC the "driver" is more specifically the separate ODBC driver component.


Do you suggest that changing pyodbc to another SQL Server DBAPI would solve the problem?

I suggest that if there is no issue with the query you're running outside of the context of pyodbc, you submit an issue to pyodbc at https://github.com/mkleehammer/pyodbc/issues . However, I was able to find a discussion thread about your error message that seemed to be independent of ODBC.

If so, can you recommend another DBAPI for SQL Server? Or do you think that the problem is caused by Microsoft’s driver?

pyodbc is the only supported driver for SQL Server that exists now for Python. You also definitely want to use Microsoft's ODBC drivers, so you're already there.

I’m pretty sure SQL Server works fine when accessed using .NET, otherwise the Internet would be full of complaints about not being able to insert more than 2000 characters into a varchar(max).

You'd want to see if the same ODBC driver and options are in use in that scenario.

Nicolas Lykke Iversen

unread,
Oct 14, 2020, 5:35:53 AM10/14/20
to sqlalchemy
Hi Mike,

I've now tested inserting strings with more than 2000 characters using Azure Data Studio (SQL Server GUI) and everything works.

Furthermore, I've tested pyodbc (the DBAPI) directly, and it only fails when inserting such strings using parameterised SQL queries (it succeeds without parameterised queries).

You can see my POC below, if you have any interest.

I guess it should be submitted as a bug to pyodbc... Do you know if I can disable parameterisation for certain SQL queries in SQLAlchemy?

Best regards (and thanks for your help and support!!!)
Nicolas

System info:
python v. 3.8.5
pyodbc v. 4.0.30
msodbcsql17 v. 17.6.1.1

POC:
import sys
import pyodbc

host = 'tcp:127.0.0.1,1433'
db = 'pyodbc_test'
user = 'sa'
pwd = 'P@ssw0rd'

print('started')

cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+host+';DATABASE='+'master'+';UID='+user+';PWD='+ pwd, autocommit=True)
cursor = cnxn.cursor()

try:
    cursor.execute(f'CREATE DATABASE {db} COLLATE Latin1_General_100_CI_AS_SC')
except pyodbc.ProgrammingError as e:
    pass # database exists

cursor.execute(f'USE {db}')

try:
    cursor.execute("""
        CREATE TABLE msg (
            id int identity(1,1) not null,
            content varchar(max) not null
        );""")
except pyodbc.ProgrammingError as exc:
    pass # table exists

content = 2000 * 'A'

cursor.execute(f"""
    INSERT INTO msg (content)
    VALUES ('{content}')""")
print(f'non-param: {len(content)=}: success')

sql = """
      INSERT INTO msg (content)
      VALUES (?)"""
cursor.execute(sql, (content,))
print(f'param: {len(content)=}: success')

content = 2001 * 'A'

cursor.execute(f"""
    INSERT INTO msg (content)
    VALUES ('{content}')""")
print(f'non-param: {len(content)=}: success')

# this fails!
sql = """
      INSERT INTO msg (content)
      VALUES (?)"""
cursor.execute(sql, (content,))
print(f'param: {len(content)=}: success')


#cursor.execute('SELECT * FROM msg')
#rows = cursor.fetchall()
#for r in rows:
#    print(r)

print('finished')

Mike Bayer

unread,
Oct 14, 2020, 9:36:11 AM10/14/20
to noreply-spamdigest via sqlalchemy


On Wed, Oct 14, 2020, at 5:35 AM, Nicolas Lykke Iversen wrote:
Hi Mike,

I've now tested inserting strings with more than 2000 characters using Azure Data Studio (SQL Server GUI) and everything works.

Furthermore, I've tested pyodbc (the DBAPI) directly, and it only fails when inserting such strings using parameterised SQL queries (it succeeds without parameterised queries).

That would be expected, because all of the datatype-related issues occur when bound parameters are passed.

You can see my POC below, if you have any interest.

I guess it should be submitted as a bug to pyodbc... Do you know if I can disable parameterisation for certain SQL queries in SQLAlchemy?

There is not, and this is definitely an issue that has to be solved at the pyodbc level - either a bug on their end or something in your configuration that has to be changed.

Nicolas Lykke Iversen

unread,
Oct 15, 2020, 3:26:58 AM10/15/20
to sqlalchemy
Hi Mike,


I've gotten really good feedback on the pyodbc GitHub issue from Microsoft, and a fix has been proposed that works:

"You can try to use setinputsizes on your parameter to tell it to send as varchar(max)cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),])"

I'm by no means an SQLAlchemy expert, but shouldn't the pyodbc dialect be updated to support varchar(max) using the proposed method? If not, how can I execute cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]) using SQLAlchemy, so that I can make use of varchar(max) in my application?

Can you recommend a hotfix for using varchar(max) in current SQLAlchemy applications that need to handle Unicode supplementary characters (_SC)?

I really appreciate your help.

Best regards
Nicolas 

Simon King

unread,
Oct 15, 2020, 5:05:32 AM10/15/20
to sqlal...@googlegroups.com
You could call 'setinputsizes' in a handler for the
'before_cursor_execute' event, something like this:


import pyodbc
from sqlalchemy import event

@event.listens_for(SomeEngine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement,
                                  parameters, context, executemany):
    cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0)])


https://docs.sqlalchemy.org/en/13/core/events.html#sqlalchemy.events.ConnectionEvents
https://docs.sqlalchemy.org/en/13/core/events.html#sqlalchemy.events.ConnectionEvents.before_cursor_execute

Hope that helps,

Simon

Nicolas Lykke Iversen

unread,
Oct 15, 2020, 5:08:41 AM10/15/20
to sqlalchemy
Thank you, Simon.

I'm curious whether this is the way to do it in the future, or whether SQLAlchemy should implement varchar(max) properly?

What would the argument be for not implementing varchar(max) in the pyodbc dialect?

Simon King

unread,
Oct 15, 2020, 7:06:29 AM10/15/20
to sqlal...@googlegroups.com
Do you know if there is a downside to calling setinputsizes like that?
To put it another way, why doesn't pyodbc configure itself like that
automatically? Why do you think this belongs in SQLAlchemy rather than
pyodbc?

I suspect the answer is that most applications don't need it and there
is a downside (perhaps a performance implication?).

I've never used SQL Server, but the fact that the error message refers
to these collations as "legacy" suggests that an alternative collation
might be better.
https://docs.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15#Supplementary_Characters
says:

SQL Server 2019 (15.x) extends supplementary character support to
the char and varchar data types with the new UTF-8 enabled collations
(_UTF8). These data types are also capable of representing the full
Unicode character range.

If you can restrict yourself to SQL Server 2019, that might be a better option.

Simon


Nicolas Lykke Iversen

unread,
Oct 15, 2020, 7:24:22 AM10/15/20
to sqlalchemy
I already tested the _UTF8 encoding using SQL Server 2019 in Docker: mcr.microsoft.com/mssql/server:2019-latest

The POC fails here too, with the same error, when the database is created with the LATIN1_GENERAL_100_CI_AS_SC_UTF8 collation:

Traceback (most recent call last):
  File "test.py", line 56, in <module>
    cursor.execute(sql, (content,))
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot convert to text/ntext or collate to 'Latin1_General_100_CI_AS_SC_UTF8' because these legacy LOB types do not support UTF-8 or UTF-16 encodings. Use types varchar(max), nvarchar(max) or a collation which does not have the _SC or _UTF8 flags. (4189) (SQLParamData)")

However, it works when using cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]).
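
Concretely, the working parameterised insert from the POC looks like this (a minimal sketch - setinputsizes() applies to the next execute() call):

content = 2001 * 'A'

# one entry per '?' parameter, telling the driver to bind the string
# as a wide varchar instead of a legacy LOB type
cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0)])
cursor.execute('INSERT INTO msg (content) VALUES (?)', (content,))
print(f'param: {len(content)=}: success')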

You are not right about the collation being legacy - LATIN1_GENERAL_100_CI_AS_SC is used extensively, due to its support for supplementary characters. It's the text datatype that's legacy:

If you store character data that reflects multiple languages in SQL Server (SQL Server 2005 (9.x) and later), use Unicode data types (nchar, nvarchar, and ntext) instead of non-Unicode data types (char, varchar, and text).

I'm not saying it's SQLAlchemy's fault or that SQLAlchemy should fix this issue. But if you read my pyodbc issue on GitHub, you will see that they argue it's SQLAlchemy that's using pyodbc incorrectly :(

Mike Bayer

unread,
Oct 15, 2020, 1:39:08 PM10/15/20
to noreply-spamdigest via sqlalchemy


On Thu, Oct 15, 2020, at 3:26 AM, Nicolas Lykke Iversen wrote:
Hi Mike,


I've gotten really good feedback on the pyodbc GitHub issue from Microsoft, and a fix has been proposed that works:

"You can try to use setinputsizes on your parameter to tell it to send as varchar(max): cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),])"
I'm by no means an SQLAlchemy expert, but shouldn't the pyodbc dialect be updated to support varchar(max) using the proposed method? If not, how can I execute cursor.setinputsizes([(pyodbc.SQL_WVARCHAR,0,0),]) using SQLAlchemy, so that I can make use of varchar(max) in my application?

SQLAlchemy has some dialects that make use of setinputsizes() out of necessity, but it's an area that is fraught with issues, as it means SQLAlchemy is second-guessing what the DBAPI is coming up with.

It's actually news to me that pyodbc supports setinputsizes(), as historically the cx_Oracle DBAPI was the only DBAPI that ever did so, and this method is usually not supported by any other DBAPI. We have a hook that calls upon setinputsizes(), but right now it's hardcoded to cx_Oracle's argument style, so the hook would need alterations to support different calling styles on different dialects.

In https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-709385941 it is suggested that there should be no need to use this "_SC" collation; then in https://github.com/mkleehammer/pyodbc/issues/835#issuecomment-709428774 you stated, "I previously experimented with non-_SC in my application, and it caused errors." Can you be more specific about these errors? At the moment, this is suggesting a major architectural rework of the pyodbc dialect to support a use case which has other workarounds. The architecture of SQLAlchemy's set_input_sizes() hook has changed, and at best this would be part of 1.4, which is not in beta release yet; a production release is not for several months.

From that point, there's an event hook at do_setinputsizes(): https://docs.sqlalchemy.org/en/13/core/events.html?highlight=do_setinputsizes#sqlalchemy.events.DialectEvents.do_setinputsizes that would be usable so that you could set up rules like these on your own, and we would eventually document the known workarounds for various unusual issues.
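
For illustration, a rough sketch of what such a rule could look like with that event - assuming the newer-style hook, where inputsizes is a dictionary mapping bound parameter objects to the DBAPI type arguments, and assuming a dialect that actually runs its setinputsizes path:

import pyodbc

from sqlalchemy import event
from sqlalchemy import String


@event.listens_for(engine, "do_setinputsizes")
def _bind_strings_as_wvarchar(inputsizes, cursor, statement, parameters, context):
    # inputsizes maps BindParameter objects to the DBAPI type arguments
    # that will be passed on to cursor.setinputsizes(); entries may be
    # modified in place before the dialect makes that call.
    for bindparam in list(inputsizes):
        if bindparam.type._type_affinity is String:
            inputsizes[bindparam] = (pyodbc.SQL_WVARCHAR, 0, 0)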

This issue is definitely unusual; it's never been reported and was difficult to find in Google searches, so I don't believe we are using pyodbc incorrectly, and it would be nice if pyodbc could someday realize that MS SQL Server is the only database anyone really uses their driver with, and perhaps add a SQL Server ruleset directly. If this were a problem that occurred frequently, then we would begin looking into turning on some of this behavior by default, but we need to be very conservative on that, as this is an area where things break quite a lot.

Below is the recipe that includes a directly vendored version of the set_input_sizes() hook to suit your immediate use case. That's what I can get you for now, and it will allow you to set the input sizes any way you'd like.

import pyodbc

from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import event
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session


Base = declarative_base()

e = create_engine(
    "mssql+pyodbc://scott:tiger^5HHH@mssql2017:1433/test?driver=ODBC+Driver+13+for+SQL+Server",
    echo=True,
)


@event.listens_for(e, "before_cursor_execute")
def before_cursor_execute(
    conn, cursor, statement, parameters, context, executemany
):

    if not hasattr(context.compiled, "bind_names"):
        return

    inputsizes = {}
    for bindparam in context.compiled.bind_names:
        # check for the specific datatype you care about here
        if bindparam.type._type_affinity is String:
            # each entry is a flat (sqltype, size, precision) tuple
            inputsizes[bindparam] = (pyodbc.SQL_WLONGVARCHAR, 0, 0)
        else:
            inputsizes[bindparam] = None

    positional_inputsizes = []
    for key in context.compiled.positiontup:
        bindparam = context.compiled.binds[key]
        dbtype = inputsizes.get(bindparam, None)
        positional_inputsizes.append(dbtype)

    cursor.setinputsizes(positional_inputsizes)


class A(Base):
    __tablename__ = "a"

    id = Column(Integer, primary_key=True)
    x = Column(Integer)
    data = Column(String)
    y = Column(Integer)

Base.metadata.drop_all(e)
Base.metadata.create_all(e)

s = Session(e)

s.add(A(data="some data", x=1, y=4))
s.commit()

Nicolas Lykke Iversen

unread,
Oct 16, 2020, 3:53:22 AM10/16/20
to sqlalchemy
Thank you, Mike.

pyODBC has the following to say about the issue:

SQLAlchemy. pyODBC is generic and does not know about special handling of varchar(max), whereas SQLAlchemy appears to have code for specific database types. It needs to call setinputsizes as you described, when the length is more than maximum for non-max types (2K wide characters or 4K bytes).

Let me be clear: I'm not exactly sure whether omitting _SC is the cause of the following error, which I have no idea how to handle:
    ...
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1256, in _execute_context
    self.dialect.do_executemany(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/dialects/mssql/pyodbc.py", line 446, in do_executemany
    super(MSDialect_pyodbc, self).do_executemany(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 590, in do_executemany
    cursor.executemany(statement, parameters)
UnicodeEncodeError: 'utf-16-le' codec can't encode character '\udce5' in position 11: surrogates not allowed
It complains about UTF-16-LE and surrogates.

However, since _SC is needed for string operations on the SQL Server backend, don't you think that any real-world application would need it down the road?

Is it really necessary to use your very subtle vendored version of the set_input_sizes() hook? Why use it compared to Simon King's simpler version?

Using Simon King's version I experience a weird issue: it works perfectly when using a single-threaded application, but when using multiprocessing it doesn't work.

In particular, if I execute:

engine = sqlalchemy.create_engine(
   f'mssql+pyodbc://{user}:{pwd}@{host}:{port}/{db}?driver=ODBC+Driver+17+for+SQL+Server',
   pool_size=5,
   max_overflow=10,
   pool_pre_ping=True,
   isolation_level='READ_UNCOMMITTED',
   pool_recycle=900,
   echo=debug,
   connect_args={'connect_timeout': 10}
)


@sqlalchemy.event.listens_for(engine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0)])


in the main application and:

def db_init():
   engine = common.db.Session.get_bind()
   engine.dispose()


in all the children, then the hook gets called in the children but somehow doesn't affect the INSERTs - the original error is produced for strings with a length longer than 2000 characters.

Best regards
Nicolas

Mike Bayer

unread,
Oct 16, 2020, 8:14:28 AM10/16/20
to noreply-spamdigest via sqlalchemy


On Fri, Oct 16, 2020, at 3:53 AM, Nicolas Lykke Iversen wrote:
Is it really necessary to use your very-subtle vendored version of the set_input_sizes() hook? Why use it compared to Simon King's simple version?

Yes, because cursor.setinputsizes() must be passed an entry for every bound parameter in your statement, in the order that they will be passed to cursor.execute(). This includes all the numerics, dates, etc., for which you certainly don't want to pass "varchar". So if the third parameter in your statement were the textual one, you'd need to pass cursor.setinputsizes([None, None, (pyodbc.SQL_WVARCHAR, None, None), ...]). Also, in my experimentation with this, you want to pass "None" for the length if not otherwise specified, and not 0.
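
In sketch form, for a hypothetical statement where only the third parameter holds the long string:

# INSERT INTO a (x, y, data) VALUES (?, ?, ?) - one entry per
# parameter, in execute() order; None means "let the driver decide"
cursor.setinputsizes([None, None, (pyodbc.SQL_WVARCHAR, None, None)])
cursor.execute(sql, (1, 4, long_text))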

Simon's version is hardcoded to pass varchar in all cases for a single bound parameter, and I would not expect that recipe to work at all.

Simon King

unread,
Oct 16, 2020, 10:25:36 AM10/16/20
to sqlal...@googlegroups.com
Yep, I misunderstood what setinputsizes was doing. I thought it told
pyodbc how it should handle a particular datatype, rather than telling
it how to handle the set of parameters it is about to receive in the next
execute() call...

Sorry for adding to the confusion,

Simon

Mike Bayer

unread,
Oct 16, 2020, 4:22:45 PM10/16/20
to noreply-spamdigest via sqlalchemy


On Fri, Oct 16, 2020, at 10:25 AM, Simon King wrote:
Yep, I misunderstood what setinputsizes was doing. I thought it told
pyodbc how it should handle a particular datatype,

that would be great if it worked that way :) however alas...

rather than telling
it how to handle the set of parameters it is about receive in the next
execute() call...

Sorry for adding to the confusion,

No worries at all. This has to be the first time I've ever seen an inaccuracy on your part - looking forward to the next one 15 years from now :)

Nicolas Lykke Iversen

unread,
Oct 17, 2020, 6:13:11 AM10/17/20
to sqlalchemy
Let's close this issue. Since pyODBC isn't going to fix anything and claims that SQLAlchemy should use it correctly, Mike's custom set_input_sizes() is the way to solve this problem.

@Mike, having read your description of set_input_sizes(), I guess the reason why Simon's version worked for my single-threaded POC application and failed for my multiprocessing application is that the POC inserts only a single value (the string), while the other inserts multiple values - does that sound right?

I will drop the _SC collation until I explicitly need it in my application for string operations, which will probably happen in the future.

Thanks Mike and Simon for your great support.

Mike Bayer

unread,
Oct 17, 2020, 1:52:50 PM10/17/20
to noreply-spamdigest via sqlalchemy
We also have a reproduction case and at least plans to document using the new hook as a workaround, up at https://github.com/sqlalchemy/sqlalchemy/issues/5651 .

However, what is quite unusual is that while we can reproduce the error connecting directly to the database with the collation, if we run an INSERT to that table from *another* database, specifying the table as otherdb.dbo.table, then the error does *not* occur - which indicates there's some client-side thing going on that can globally "fix" the problem, or perhaps the data is going in incorrectly; not sure.
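
For reference, a sketch of the two access paths being compared (hypothetical code, reusing the POC's connection details from earlier in the thread; 'test' is the database created with the _SC collation):

import pyodbc

content = 2001 * 'A'
conn_str = ('DRIVER={ODBC Driver 17 for SQL Server};'
            'SERVER=localhost;UID=sa;PWD=P@ssw0rd;DATABASE=')

# connected directly to the collated database: raises error 4189
direct = pyodbc.connect(conn_str + 'test', autocommit=True)
direct.cursor().execute('INSERT INTO msg (content) VALUES (?)', (content,))

# connected to *another* database, fully qualifying the table: no error
other = pyodbc.connect(conn_str + 'master', autocommit=True)
other.cursor().execute('INSERT INTO test.dbo.msg (content) VALUES (?)', (content,))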

Varun Madiath

unread,
Oct 17, 2020, 5:29:41 PM10/17/20
to sqlal...@googlegroups.com
I just want to comment that I have used pyodbc with Exasol before, so there is at least one other ODBC driver that is used in conjunction with pyodbc.
However, I later switched to using turbodbc, since the performance was much better than with pyodbc, so maybe Mike is right about the real-world use cases.


Nicolas Lykke Iversen

unread,
Oct 18, 2020, 2:03:55 AM10/18/20
to sqlal...@googlegroups.com
Regarding the client-side issue, people have been reporting problems with pyODBC's behavior compared to native .NET in my pyODBC issue.
Worth a look, but quite hard to understand.

Mike Bayer

unread,
Oct 18, 2020, 2:37:24 PM10/18/20
to noreply-spamdigest via sqlalchemy
Yes, Gord is with us. I can reproduce the behavior here, but it does not always occur, depending on how you access the table. Our workaround recipe for 1.4 is merged. It still seems much too esoteric for us to hardcode the workaround for now, until more is understood about this.