CHAR and trailing spaces

69 views
Skip to first unread message

Dimitry Sibiryakov

unread,
May 15, 2025, 10:27:36 AMMay 15
to firebir...@googlegroups.com
Hello All,

according to SQL standard or a common sense how important are trailing spaces
in CHAR (SQL_TEXT) data?
Can they be trimmed (silently or with a warning) if don't fit target buffer
or field?

--
WBR, SD.

Mark Rotteveel

unread,
May 15, 2025, 10:37:44 AMMay 15
to firebir...@googlegroups.com
On 15/05/2025 16:27, 'Dimitry Sibiryakov' via firebird-devel wrote:
>   according to SQL standard or a common sense how important are
> trailing spaces in CHAR (SQL_TEXT) data?
>   Can they be trimmed (silently or with a warning) if don't fit target
> buffer or field?

IIRC, they are not considered significant so I believe silent truncation
is allowed (but I currently don't have the energy to double-check and
locate the exact rules in the standard).

In fact, again IIRC, in some situations where Firebird currently throws
errors, even truncation of significant characters (i.e. not trailing
spaces) is allowed with a warning or even silently.

Mark
--
Mark Rotteveel

Alex Peshkoff

unread,
May 15, 2025, 11:20:32 AMMay 15
to firebir...@googlegroups.com
The only exception is probably octets charset, where \0, not spaces, may
be trimmed.


Jim Starkey

unread,
May 15, 2025, 11:21:17 AMMay 15
to firebir...@googlegroups.com

Whether or not the standard allows truncating of trailing blanks, I consider the idea to be exceedingly unwise for a variety of reasons:

  1. A user can reasonably expected that first storing a string then fetching it should result in the same string.
  2. An application developer may decide to encode something about a string (its provenance or reliability, as examples) by trailing blanks.
  3. Why in hell would you want to do that?  Have you not noticed the trend in storage costs?
  4. If you're going to do it, you would have to document exactly how and when a string might be truncated.
  5. It would be seen by most of the world as a bug not a feature.
--
Jim Starkey, AmorphousDB, LLC

Dimitry Sibiryakov

unread,
May 17, 2025, 5:48:37 AMMay 17
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 15.05.2025 16:37:
> IIRC, they are not considered significant so I believe silent truncation is
> allowed (but I currently don't have the energy to double-check and locate the
> exact rules in the standard).

So, what `CHAR_LENGTH('abc ')` must return? 3 or 10?

--
WBR, SD.

Mark Rotteveel

unread,
May 17, 2025, 7:19:42 AMMay 17
to firebir...@googlegroups.com
It should return 10, because 'abc ' is a literal with 10
characters, and that is by definition a CHAR(10) in 5.3 <literal> Syntax
Rules 17:

"""
The declared type of a <character string literal> is fixed-length
character string. The length of a <character string literal> is the
number of <character representation>s that it contains. Each <quote
symbol> contained in <character string literal> represents a single
<quote> in both the value and the length of the <character string
literal>. The two <quote>s contained in a <quote symbol> shall not be
separated by any <separator>.
NOTE 129 — <character string literal>s are allowed to be zero-length
strings (i.e., to contain no characters) even though it is not permitted
to declare a <data type> that is CHARACTER with <character length> 0 (zero).
"""

For CHAR_LENGTH, 6.31 <numeric value function>, General Rules 5 - sub a
- sub i applies:

"""
If the most specific type of S is character string, then the result is
the number of characters in the value of S.
"""

Related 4.3.1 Introduction to character strings

"""
A character string has a length, which is the number of characters in
the sequence.
"""

And the length of CHAR is defined in 6.1 <data type> Syntax Rules 16 -
sub a - sub i:

"""
If neither VARYING nor LARGE OBJECT is specified in <character string
type>, then the length in characters of the character string is fixed
and is the value of <length>.
"""

As an aside, your original question is answered by 4.3.1 Introduction to
character strings:

"""
If a store assignment would result in the loss of non-<truncating
whitespace> characters due to truncation, then an exception condition is
raised. If a retrieval assignment or evaluation of a <cast
specification> would result in the loss of characters due to truncation,
then a warning condition is raised.
"""

Where <truncating whitespace> is defined in Syntax Rules 5.2 <token> and
<separator>:

"""
3) <whitespace> is any consecutive sequence of characters each of which
satisfies the definition of whitespace found in Clause 3, “Terms and
definitions”.

4) <truncating whitespace> is an implementation-defined (IV074) subset
of the characters included in <whitespace>; the subset shall always
include at least <space>.
NOTE 114 — Since the Unicode definition of White_Space is subject to the
addition of new characters, this definition prevents an existing
conforming SQL-implementation from being made non-conforming by such a
change. However, SQL-implementations are expected to align themselves
with the most recent Unicode definition in a timely manner.
"""

For binary/varbinary the similar rule is in 4.4.1 Introduction to binary
strings:

"""
If a store assignment would result in the loss of non-zero octets due to
truncation, then an exception condition is raised. If a retrieval
assignment would result in the loss of octets due to truncation, then a
warning condition is raised.
"""

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 17, 2025, 7:36:12 AMMay 17
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 17.05.2025 13:19:
>
> It should return 10, because 'abc       ' is a literal with 10 characters, and
> that is by definition a CHAR(10) in 5.3 <literal> Syntax Rules 17:

Great. Many thanks, Mark.

--
WBR, SD.

Dimitry Sibiryakov

unread,
May 17, 2025, 4:04:53 PMMay 17
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 17.05.2025 13:19:
> It should return 10, because 'abc       ' is a literal with 10 characters

And here is the design problem since Borland's Interbase 3.1: API has no idea
of character length, only byte length.
Example: query "select char_length(?) from rdb$database" (I know that
currently it is impossible, but let's imagine that it is allowed) and
application code is like this:

char buffer[] = "a ";
sqltype = SQL_TEXT;
sqlsubtype = 4;
sqldata = buffer;
sqllen = strlen(buffer);

How this data should be interpreted in the engine? Value is 'a ' or 'a'?
Retuned value should be 4 or 1?
Say, we consider it to be one character. Then what is a buffer of value {
0x40 }? A user mistake that should raise error? Currently (since I removed an
intermediate buffer that padded the user-provided value to expected maximum
length) INTL_adjust_text_descriptor() recognises it as '' without any diagnostic.

--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 2:58:55 AMMay 18
to firebir...@googlegroups.com
On 17/05/2025 22:04, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 17.05.2025 13:19:
>> It should return 10, because 'abc       ' is a literal with 10 characters
>
>   And here is the design problem since Borland's Interbase 3.1: API has
> no idea of character length, only byte length.
>   Example: query "select char_length(?) from rdb$database" (I know that
> currently it is impossible, but let's imagine that it is allowed) and
> application code is like this:
>
>   char buffer[] = "a   ";
>   sqltype = SQL_TEXT;
>   sqlsubtype = 4;
>   sqldata = buffer;
>   sqllen = strlen(buffer);
>
>   How this data should be interpreted in the engine? Value is 'a   ' or
> 'a'? Retuned value should be 4 or 1?

You use the subtype & 0xFF to determine the character set, and use its
(max) bytes per character to determine the length in characters.

Given 4 is UTF8, which has max 4 bytes per character, this is actually a
CHAR(1) (sqllen / 4) with the value 'a'.

>   Say, we consider it to be one character. Then what is a buffer of
> value { 0x40 }? A user mistake that should raise error? Currently (since
> I removed an intermediate buffer that padded the user-provided value to
> expected maximum length) INTL_adjust_text_descriptor() recognises it as
> '' without any diagnostic.

With UTF8, plain { 0x40 } is not a correct value, but I don't know how
the engine actually handles it. The correct value would be { 0x40, 0x20,
0x20, 0x20 } for the CHAR(1) string '@'.

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
May 18, 2025, 3:04:18 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 08:58, 'Mark Rotteveel' via firebird-devel wrote:
> With UTF8, plain { 0x40 } is not a correct value, but I don't know how
> the engine actually handles it. The correct value would be { 0x40, 0x20,
> 0x20, 0x20 } for the CHAR(1) string '@'.

And to be clear, the CHAR(10) literal 'abc ' of your original
question in UTF8 would have sqllen 40, and 37x 0x20 in its buffer, not
7x 0x20 of the literal.

Yet another reason why you generally should choose VARCHAR instead of CHAR.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 4:40:44 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 8:58:
> With UTF8, plain { 0x40 } is not a correct value

Generally speaking UTF-8 is a variable-length encoding and an one byte
sequence is perfectly valid from its point of view. A lot of applications
(including several of mine) and Python driver used to send this value with
sqllen = 1 to the server and the server padded received value with spaces (zeros
for OCTETS).

> but I don't know how the engine actually handles it.

Currently the engine sees it as an empty string, i.e. ''.
If the call to INTL_adjust_text_descriptor() is removed then the error
"String right truncation. Expected length 0, actual length 1" is raised. You
said that this should be a correct behavior but unfortunately it broke Python
applications including our own QA framework.

> The correct value would be { 0x40, 0x20, 0x20, 0x20 } for the CHAR(1) string '@'.

Then what is a correct return from OCTET_LENGTH() for this value? Currently
it returns 1.


--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 5:28:35 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 10:40, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 18.05.2025 8:58:
>> With UTF8, plain { 0x40 } is not a correct value
>
>   Generally speaking UTF-8 is a variable-length encoding and an one
> byte sequence is perfectly valid from its point of view. A lot of
> applications (including several of mine) and Python driver used to send
> this value with sqllen = 1 to the server and the server padded received
> value with spaces (zeros for OCTETS).

I'd rather assume you were using UNICODE_FSS instead of UTF8 with
Firebird 3.0 or older (which allowed UNICODE_FSS to store more
characters than the declared length). However, it could also be that
older version did something weird in this regard, and the engine was
corrected.

And yes, UTF-8 is a variable length encoding, but the way Firebird
encodes it, especially for CHAR, gets in the way here.

>> but I don't know how the engine actually handles it.
>
>   Currently the engine sees it as an empty string, i.e. ''.
>   If the call to INTL_adjust_text_descriptor() is removed then the
> error "String right truncation. Expected length 0, actual length 1" is
> raised. You said that this should be a correct behavior but
> unfortunately it broke Python applications including our own QA framework.

Well, I can't speak for the entirety of how the Firebird engine behaves,
but given for a CHAR in UTF8, the length in characters is determined by
sqllen / 4, that means a value of N characters, must be encoded in N * 4
bytes.

You could say, then why not just use ceil(sqllen / 4.0), but that would
then result in a CHAR(2) value like 'ab' sent as { 0x61, 0x62 } to be a
`CHAR(1)`, exceeding the length and resulting in a string truncation error.

Again, a reason to use VARCHAR instead of CHAR.

>> The correct value would be { 0x40, 0x20, 0x20, 0x20 } for the CHAR(1)
>> string '@'.
>
>   Then what is a correct return from OCTET_LENGTH() for this value?
> Currently it returns 1.

And that is the correct value. You're conflating the length of the value
(length = 1) with the length of how it must be encoded to be correct for
Firebird; those are *not* the same thing.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 7:22:15 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 11:28:
> I'd rather assume you were using UNICODE_FSS instead of UTF8 with Firebird 3.0
> or older (which allowed UNICODE_FSS to store more characters than the declared
> length).

No, it worked with any charset (including UTF-8) before PR#8145.

> Again, a reason to use VARCHAR instead of CHAR.

Actually, VARCHAR has the same problem but at least with it one can let
sqllen to be much bigger than actual buffer without an unavoidable crash.

>> Then what is a correct return from OCTET_LENGTH() for this value? Currently it returns 1.
>
> And that is the correct value. You're conflating the length of the value (length = 1) with the length of how it must be encoded to be correct for Firebird; those are *not* the same thing.

Fortunately, we are in position to change some things, including rules of
encoding, to a certain extent.
Unfortunately, current metadata is not enough to tell if buffer "a " with
sqllen 4 is a standard UTF-8 CHAR value (thus 4 is a real byte length of data)
or a padded Firebird CHAR representation (thus 4 is a maximum byte length of
data). With non-ASCII data it becomes even more messy.
Because of this I dropped the checks for the input data in PR#8566 opening a
can of worms, but I hope to hit the worms according to the standard you so
nicely quoted.

--
WBR, SD.

Adriano dos Santos Fernandes

unread,
May 18, 2025, 8:37:52 AMMay 18
to firebir...@googlegroups.com
There was a problem in the original architecture of how SQL_TEXT is
represented in the API, and even internally. MBCS was a pure shit. Then,
the architecture problem was workarounded and this is no longer a
problem in the last ~20 years.

Even you were not seem it as a problem and don't was seeing my advice to
make things better changing clients:

----
Subject: Re: [Firebird-devel] String error with actual snapshot of 3.0

07.01.2016 18:00, Adriano dos Santos Fernandes wrote:
> Instead of arguing on that and wait 10 years for a Firebird change, I'd
> want to ask you and any component developer to switch SQL_TEXT described
> by the engine to a SQL_VARYING

And to do all work for adjusting every buffer length and handling
vary struct instead
of simple providing real data length in sqllen? No, thanks.

--
WBR, SD.
----

So, this was not a problem in the last ~20 years, until some weeks ago
when https://github.com/FirebirdSQL/firebird/pull/8145 was merged.

I'm not wise enough to understand this PR and I hope I don't need to, as
I think it should be reverted. It broke MBCS, it broke WHERE CURRENT OF.
And worse, you're trying to find ways to broke even more things while
fix the already reported problems.


Adriano

Mark Rotteveel

unread,
May 18, 2025, 8:44:45 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 13:22, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 18.05.2025 11:28:
>> I'd rather assume you were using UNICODE_FSS instead of UTF8 with
>> Firebird 3.0 or older (which allowed UNICODE_FSS to store more
>> characters than the declared length).
>
>   No, it worked with any charset (including UTF-8) before PR#8145.
>
>> Again, a reason to use VARCHAR instead of CHAR.
>
>   Actually, VARCHAR has the same problem but at least with it one can
> let sqllen to be much bigger than actual buffer without an unavoidable
> crash.

Again, not that familiar with the engine internals, but from the
perspective of the outside API (e.g. wire protocol), there is a pretty
big difference between using CHAR vs VARCHAR in that you can (or even:
should) send VARCHAR without caring about any padding.

I suspect that the server undoes that and materializes something that is
padded.

>>>    Then what is a correct return from OCTET_LENGTH() for this value?
>>> Currently it returns 1.
>>
>> And that is the correct value. You're conflating the length of the
>> value (length = 1) with the length of how it must be encoded to be
>> correct for Firebird; those are *not* the same thing.
>
>   Fortunately, we are in position to change some things, including
> rules of encoding, to a certain extent.
>   Unfortunately, current metadata is not enough to tell if buffer "a
> " with sqllen 4 is a standard UTF-8 CHAR value (thus 4 is a real byte
> length of data) or a padded Firebird CHAR representation (thus 4 is a
> maximum byte length of data). With non-ASCII data it becomes even more
> messy.

I've been very adamant that it must be sqllen = N * <max-byte-per-char>
always, but looking at Jaybird's sources, it seems I relied on being
able to write the row blr and row data with the actual data length in
bytes on execute as well.

And Jaybird's test involving CHAR indeed now do break (tested against
Firebird-6.0.0.783-0-3732012-windows-x64), while the version from the
5th of May (Firebird-6.0.0.770-0-82c4a08-windows-x64) works fine.

So, thanks for that /s ;)

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
May 18, 2025, 8:49:33 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 13:22, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 18.05.2025 11:28:
>> I'd rather assume you were using UNICODE_FSS instead of UTF8 with
>> Firebird 3.0 or older (which allowed UNICODE_FSS to store more
>> characters than the declared length).
>
>   No, it worked with any charset (including UTF-8) before PR#8145.

Doesn't that mean that change should be backed out (reverted) until it
can be fixed properly?

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 8:59:56 AMMay 18
to firebir...@googlegroups.com
Adriano dos Santos Fernandes wrote 18.05.2025 14:37:
> So, this was not a problem in the last ~20 years

"workarounded problem" != "solved problem".

--
WBR, SD.

Dimitry Sibiryakov

unread,
May 18, 2025, 9:17:29 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 14:44:
>
> Again, not that familiar with the engine internals, but from the perspective of
> the outside API (e.g. wire protocol), there is a pretty big difference between
> using CHAR vs VARCHAR in that you can (or even: should) send VARCHAR without
> caring about any padding.

It is not quite so.

Consider following code:

param_vary* buffer = { 2, "a " };
sqltype = SQL_VARYING;
sqlsubtype = 4; // charset UTF-8
sqldata = buffer;
sqllen = 6;

This will end up in any kind of error because server expect sqllen for this
data to be at least 10 (2 + 2*4). You still have to pad VARCHAR parameter to max
byte length, just padding is not whitespaces but unused space.

> And Jaybird's test involving CHAR indeed now do break (tested against Firebird-6.0.0.783-0-3732012-windows-x64), while the version from the 5th of May (Firebird-6.0.0.770-0-82c4a08-windows-x64) works fine.

Could you test artifacts from
https://github.com/FirebirdSQL/firebird/pull/8566, please?

> Doesn't that mean that change should be backed out (reverted) until it can be fixed properly?

Normally yes, but a) the only idea of "a proper fix" was another workaround
that would make things slower; b) this PR caused ~5% performance gain for
database restore (and any DSQL-level bulk insert in common). The latter is the
reason why I would prefer to properly fix consequences than to revert the PR.

--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 10:16:28 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 15:17, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 18.05.2025 14:44:
>>
>> Again, not that familiar with the engine internals, but from the
>> perspective of the outside API (e.g. wire protocol), there is a pretty
>> big difference between using CHAR vs VARCHAR in that you can (or even:
>> should) send VARCHAR without caring about any padding.
>
>   It is not quite so.
>
>   Consider following code:
>
>   param_vary* buffer = { 2, "a " };
>   sqltype = SQL_VARYING;
>   sqlsubtype = 4; // charset UTF-8
>   sqldata = buffer;
>   sqllen = 6;
>
>   This will end up in any kind of error because server expect sqllen
> for this data to be at least 10 (2 + 2*4). You still have to pad VARCHAR
> parameter to max byte length, just padding is not whitespaces but unused
> space.

Not in the wire protocol. There you can send the BLR for this column as
{ blr_varying2, 4, 0, 2, 0 } (length = 2, not 6!) and the row data would
be {0, 0, 0, 2, 0x61, 0x20 } + padding { 0x20, 0x20 } (or maybe { 0x00,
0x00 }) so it's a multiple of 4.

And even in native, you should be able to use:

char* buffer = { 2, 0, 0x61, 0x20, 0 };
sqltype = SQL_VARYING;
sqlsubtype = 4; // charset UTF-8
sqldata = buffer;
sqllen = 2;

I've used char* as I don't know what param_vary* is, nor could I find it
in Firebird sources. The length prefix is an Int16LE ox x86/AMD64, and
the buffer needs a nul-terminator

Also not that the sqllen is 2, not 6 (the length prefix and
nul-terminator are not part of the sqllen!).

That said, Jaybird's native implementation does use a buffer that was
allocated to the original length (sqllen + 3 for SQL_VARYING), even
though it modifies the sqllen on execute, so that could make a difference.

However, if I understand it correctly, previously Firebird would copy
the data into a buffer of the correct length with the original sqllen
value, allowing it to correctly calculate the CHAR(n) using sqllen/4,
and your change broke that because you now simply reuse the user buffer
(which, BTW, I wonder is actually secure to do in case of embedded), and
you update the sqllen to a value that breaks the assumption that the
engine can calculate character length as sqllen/4.

>> And Jaybird's test involving CHAR indeed now do break (tested against
>> Firebird-6.0.0.783-0-3732012-windows-x64), while the version from the
>> 5th of May (Firebird-6.0.0.770-0-82c4a08-windows-x64) works fine.
>
>   Could you test artifacts from https://github.com/FirebirdSQL/
> firebird/pull/8566, please?

Tests still fail, even tests that don't seem to involve CHAR, like a
test with TIME WITH TIME ZONE
(TimeWithTimeZoneSupportTest.testSelectCondition()):

Stack overflow. The resource requirements of the runtime stack have
exceeded the memory available to it. [SQLState:HY001, ISC error
code:335544781]

However, the Jaybird tests seem to have issues with recovering from some
of the test failures, and I can't easily discern if a failure is due to
a server-side issue, or due to failure to recover from a previous test
failure.

>> Doesn't that mean that change should be backed out (reverted) until it
>> can be fixed properly?
>
>   Normally yes, but a) the only idea of "a proper fix" was another
> workaround that would make things slower; b) this PR caused ~5%
> performance gain for database restore (and any DSQL-level bulk insert in
> common). The latter is the reason why I would prefer to properly fix
> consequences than to revert the PR.

A change that breaks things is not a good change, no matter the
performance improvement.

I'd say this needs to be backed out, until it can be fixed in its entirety

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 10:41:28 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 16:16:
> However, if I understand it correctly, previously Firebird would copy the data
> into a buffer of the correct length with the original sqllen value, allowing it
> to correctly calculate the CHAR(n) using sqllen/4, and your change broke that
> because you now simply reuse the user buffer (which, BTW, I wonder is actually
> secure to do in case of embedded),

Yes, this is exactly what I did. I considered that embedded case cannot be
made worse from security POV because the engine is already in user memory.

> and you update the sqllen to a value that
> breaks the assumption that the engine can calculate character length as sqllen/4.

No, this is not what I did. This is the real code in Python driver, which is
there from kinterbase time, BTW.

>> Could you test artifacts from https://github.com/FirebirdSQL/firebird/pull/8566, please?
>
> Tests still fail, even tests that don't seem to involve CHAR

Yes, I know that a number of things are broken, I already have a list of some
from Pavel Zotov. I wonder if that particular test you referred in the message
above hasn't get better.

> A change that breaks things is not a good change, no matter the performance improvement.
>
> I'd say this needs to be backed out, until it can be fixed in its entirety

I would agree if we talked about any stable branch but HEAD is expected to be
unstable. The problem is that firebirdtest.com only covers HEAD, so, being
reverted, this patch will take no testing at all. Of course, I run the QA suite
before offering this PR, but obviously it wasn't enough because I've not found
the regressions that are reported now.

--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 10:46:27 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 16:16, 'Mark Rotteveel' via firebird-devel wrote:
> On 18/05/2025 15:17, 'Dimitry Sibiryakov' via firebird-devel wrote:
>>    Could you test artifacts from https://github.com/FirebirdSQL/
>> firebird/pull/8566, please?
>
> Tests still fail, even tests that don't seem to involve CHAR, like a
> test with TIME WITH TIME ZONE
> (TimeWithTimeZoneSupportTest.testSelectCondition()):
>
> Stack overflow.  The resource requirements of the runtime stack have
> exceeded the memory available to it. [SQLState:HY001, ISC error
> code:335544781]
>
> However, the Jaybird tests seem to have issues with recovering from some
> of the test failures, and I can't easily discern if a failure is due to
> a server-side issue, or due to failure to recover from a previous test
> failure.

I needed to add isc_exception_stack_overflow to the broken connection
errors to be able to recover, and now I have 2 failing tests with that PR

The two tests that fail use isc_dpb_set_bind set to "TIME ZONE TO EXTENDED".

The TimeWithTimeZoneSupportTest.testSelectCondition() test that fails
executes the query:

select id, timetz from withtimetz where timetz = ? order by id

with the parameter set to the equivalent of Java's
OffsetTime.parse("13:25:32.1234+01:00").

on a table

create table withtimetz (
id integer,
timetz TIME WITH TIME ZONE
)

and populated with:

insert into withtimetz(id, timetz) values (1, time'13:25:32.1234+01:00')
insert into withtimetz(id, timetz) values (2, time'13:25:32.1235
Europe/Amsterdam')
insert into withtimetz(id, timetz) values (3, time'23:59:59.9999+13:59')
insert into withtimetz(id, timetz) values (3, null)

(i.e. the test intends to select the rows with id=1 and id=2)

The other test (TimestampWithTimeZoneSupportTest.testSelectCondition())
is very similar, but then for TIMESTAMP WITH TIME ZONE and selecting
with the equivalent of Java's
OffsetDateTime.parse("2019-03-09T13:25:32.1234+01:00").

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
May 18, 2025, 10:50:03 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 16:41, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 18.05.2025 16:16:
>> However, if I understand it correctly, previously Firebird would copy
>> the data into a buffer of the correct length with the original sqllen
>> value, allowing it to correctly calculate the CHAR(n) using sqllen/4,
>> and your change broke that because you now simply reuse the user
>> buffer (which, BTW, I wonder is actually secure to do in case of
>> embedded),
>
>   Yes, this is exactly what I did. I considered that embedded case
> cannot be made worse from security POV because the engine is already in
> user memory.

Well, what is the lifetime of the use of buffer in the server? If it -
potentially - exceeds the lifetime of the user buffer, you've just
created a use after free problem.

>>  and you update the sqllen to a value that breaks the assumption that
>> the engine can calculate character length as sqllen/4.
>
>   No, this is not what I did. This is the real code in Python driver,
> which is there from kinterbase time, BTW.

You're conflating what happens in the server with what happens in a
driver. I was talking about what happens in the server after your change
(and possibly making wrong assumptions there).

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 10:58:13 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 16:49:
> Well, what is the lifetime of the use of buffer in the server? If it -
> potentially - exceeds the lifetime of the user buffer, you've just created a use
> after free problem.

I don't quite understand what you mean. User buffer is provided to exactly
one API call and the engine cannot access them outside of this call.

> You're conflating what happens in the server with what happens in a driver. I was talking about what happens in the server after your change (and possibly making wrong assumptions there).

My changes are limited to the engine, they don't affect network server,
protocol, client or API.

--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 11:02:47 AMMay 18
to firebir...@googlegroups.com
Well, they did, otherwise my tests of the pure Java implementation of
Jaybird wouldn't have started failing.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 18, 2025, 11:28:52 AMMay 18
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 17:02:
> Well, they did, otherwise my tests of the pure Java implementation of Jaybird
> wouldn't have started failing.

Your tests are calling engine via network server. If only these two are
failing, then it is nothing bad, most likely I'll be able to fix them without
undo of whole patch.
BTW, how exactly do they fail? I.e. what did they expect and what did they get?

--
WBR, SD.

Mark Rotteveel

unread,
May 18, 2025, 11:41:51 AMMay 18
to firebir...@googlegroups.com
They expected to be able to execute the query and then fetch two rows,
and instead they receive isc_exception_stack_overflow (335544781) on
execute (pure Java) or fetch (native) (I guess fbclient defers reading
the response to the execute until the fetch?).

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
May 18, 2025, 11:50:21 AMMay 18
to firebir...@googlegroups.com
On 18/05/2025 17:28, 'Dimitry Sibiryakov' via firebird-devel wrote:
>   Your tests are calling engine via network server. If only these two
> are failing, then it is nothing bad, most likely I'll be able to fix
> them without undo of whole patch.

The build from master (Firebird-6.0.0.783-0-3732012-windows-x64) is
significantly worse: 56 failures, most of them isc_string_truncation
(335544914) or expectation mismatches.

So, I guess you're right.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 21, 2025, 11:40:54 AMMay 21
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 18.05.2025 17:50:
>>    Your tests are calling engine via network server. If only these two are
>> failing, then it is nothing bad, most likely I'll be able to fix them without
>> undo of whole patch.
>
> The build from master (Firebird-6.0.0.783-0-3732012-windows-x64) is
> significantly worse: 56 failures, most of them isc_string_truncation (335544914)
> or expectation mismatches.
>
> So, I guess you're right.

Fixed in
https://github.com/FirebirdSQL/firebird/pull/8571/commits/f1d69400e056a8ae0defc44952a5a6d613da9a16
though I'm not sure if this fix is done in a right way. May be an adjustment of
types' priorities should be done instead.

--
WBR, SD.

Mark Rotteveel

unread,
May 21, 2025, 11:44:18 AMMay 21
to firebir...@googlegroups.com
Shouldn't those have already been converted to the "normal" tz types at
that point (not sure, just asking)?

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 21, 2025, 11:49:00 AMMay 21
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 21.05.2025 17:44:
>>    Fixed in https://github.com/FirebirdSQL/firebird/pull/8571/commits/
>> f1d69400e056a8ae0defc44952a5a6d613da9a16 though I'm not sure if this fix is
>> done in a right way. May be an adjustment of types' priorities should be done
>> instead.
>
> Shouldn't those have already been converted to the "normal" tz types at that
> point (not sure, just asking)?

They are still in user's buffer, there is no place to convert them into, AFAIK.

--
WBR, SD.

Mark Rotteveel

unread,
May 30, 2025, 5:25:31 AMMay 30
to firebir...@googlegroups.com
When are these changes going to land in master?

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 30, 2025, 5:36:39 AMMay 30
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 30.05.2025 11:25:
>> I'm not sure if this fix is done in a right way. May be an adjustment of
>> types' priorities should be done instead.
>
> When are these changes going to land in master?

I've just committed a fix which I hope to be the last one but confirmation
needs more test results from Pavel Zotov.
Dmitrii Yemanov prefers to have all fixes in a single PR.

--
WBR, SD.

Mark Rotteveel

unread,
Jun 14, 2025, 4:39:37 AMJun 14
to firebir...@googlegroups.com
I think these changes are still causing big problems. In Jaybird, I now
get java.sql.SQLException: arithmetic exception, numeric overflow, or
string truncation; string right truncation; expected length 1, actual 4
[SQLState:22001, ISC error code:335544914]

when executing

```
statement.prepare("INSERT INTO keyvalue (thekey, theUTFVarcharValue)
VALUES (?, ?)");
var differentRowValue = RowValue.of(
db.getDatatypeCoder().encodeInt(4097),
db.getEncoding().encodeToCharset("test"));
statement.execute(differentRowValue);
```

On a table defined as

```
CREATE TABLE keyvalue (
thekey INTEGER PRIMARY KEY,
thevalue VARCHAR(5),
theUTFVarcharValue VARCHAR(5) CHARACTER SET UTF8,
theUTFCharValue CHAR(5) CHARACTER SET UTF8
)
```

When sending the parameter value `"test"`, Jaybird will send a VARCHAR
buffer sized at 4, it seems Firebird now expects a VARCHAR buffer sized
at 16 with 12 trailing spaces (as I can make the test pass by using
`"test "` (test + 12 spaces). That is really not an
acceptable change.

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
Jun 14, 2025, 4:48:32 AMJun 14
to firebir...@googlegroups.com
BTW: Maybe these issues were already there in May (I am not in the mood
to retrace my steps here and pull out an older version to double check),
but I think it is unacceptable that it takes a month to fix it on
master. I rely on snapshots to do my work on Jaybird (e.g., say
implement support for schemas), and it is damn annoying to get
side-tracked by issues that don't get fixed in a timely manner.

These changes should be reverted until it can be completely fixed.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Jun 14, 2025, 5:08:17 AMJun 14
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 14.06.2025 10:39:
> I think these changes are still causing big problems. In Jaybird, I now get
> java.sql.SQLException: arithmetic exception, numeric overflow, or string
> truncation; string right truncation; expected length 1, actual 4
> [SQLState:22001, ISC error code:335544914]

Do you test it on master branch or posfix8145 branch?

> When sending the parameter value `"test"`, Jaybird will send a VARCHAR buffer sized at 4, it seems Firebird now expects a VARCHAR buffer sized at 16 with 12 trailing spaces (as I can make the test pass by using `"test "` (test + 12 spaces). That is really not an acceptable change.

This is not a change, Firebird always expected buffer size 22 for varchar(5)
character set UTF-8.

> I think it is unacceptable that it takes a month to fix it on master. I rely on snapshots to do my work on Jaybird (e.g., say implement support for schemas), and it is damn annoying to get side-tracked by issues that don't get fixed in a timely manner.

You have write access to the repo. Just press "merge" button for PR#8571
because obviously nobody else care to do it.

--
WBR, SD.

Mark Rotteveel

unread,
Jun 14, 2025, 6:01:19 AMJun 14
to firebir...@googlegroups.com
On 14/06/2025 11:08, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 14.06.2025 10:39:
>> I think these changes are still causing big problems. In Jaybird, I
>> now get java.sql.SQLException: arithmetic exception, numeric overflow,
>> or string truncation; string right truncation; expected length 1,
>> actual 4 [SQLState:22001, ISC error code:335544914]
>
>   Do you test it on master branch or posfix8145 branch?

I tested it on master, or at least, with a snapshot build, which is from
master. I can't keep track of partial fixes on PR branches.

>> When sending the parameter value `"test"`, Jaybird will send a VARCHAR
>> buffer sized at 4, it seems Firebird now expects a VARCHAR buffer
>> sized at 16 with 12 trailing spaces (as I can make the test pass by
>> using `"test            "` (test + 12 spaces). That is really not an
>> acceptable change.
>
>   This is not a change, Firebird always expected buffer size 22 for
> varchar(5) character set UTF-8.

Not in the wire protocol. There it is a buffer (int32 length + data +
0-3 bytes padding), and that buffer could declare the actual bytes needed.

>> I think it is unacceptable that it takes a month to fix it on master.
>> I rely on snapshots to do my work on Jaybird (e.g., say implement
>> support for schemas), and it is damn annoying to get side-tracked by
>> issues that don't get fixed in a timely manner.
>
>   You have write access to the repo. Just press "merge" button for
> PR#8571 because obviously nobody else care to do it.

Given I'm not a core developer, I'm not generally willing to merge
other's changes to master.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Jun 14, 2025, 6:15:19 AMJun 14
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 14.06.2025 12:01:
>
> I tested it on master, or at least, with a snapshot build, which is from master.
> I can't keep track of partial fixes on PR branches.

The issue is fixed in the posfixes branch.

> Given I'm not a core developer, I'm not generally willing to merge other's changes to master.

In this case this PR will wait another year to get merged.

--
WBR, SD.

Mark Rotteveel

unread,
Jun 14, 2025, 6:20:03 AMJun 14
to firebir...@googlegroups.com
Have you *asked* one of the core developers to review it again and merge
it? If not, you should do that. If you did, you should ping them again,
or ask someone else.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
Jun 14, 2025, 6:23:36 AMJun 14
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 14.06.2025 12:19:
> Have you *asked* one of the core developers to review it again and merge it?

Yes.

> If you did, you should ping them again, or ask someone else.

I avoid to be annoying so I asked you because you are interested in merging
of this PR.

--
WBR, SD.

Mark Rotteveel

unread,
Jun 14, 2025, 6:27:07 AMJun 14
to firebir...@googlegroups.com
I see no indication on the PR[1] that you did, so I asked Vlad and
Dmitry (although Dmitry is on vacation right now).

Mark

[1]: https://github.com/FirebirdSQL/firebird/pull/8571
--
Mark Rotteveel
Reply all
Reply to author
Forward
0 new messages