I was just writing a test for disabled escape processing in Jaybird,
which causes certain JDBC escapes to be passed to Firebird without
transformation.
Specifically, the test sent the statement
```
select {fn exp(2)} from RDB$DATABASE
```
I had expected this to fail with "Token unknown - line 1, column 8; {",
but instead it failed with "Token unknown - line 1, column 15; (". In
other words, the parser accepted everything up to the opening
parenthesis. Some experimentation shows that regular identifiers accept
{ and } (even as the first character), which I neither knew nor
expected.
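To make the two column numbers concrete, here is a small Python sketch
that maps them back to characters of the statement. It assumes the
reported columns are 1-based character offsets (which is consistent
with both messages); the helper name column_of is purely illustrative.

```python
statement = "select {fn exp(2)} from RDB$DATABASE"


def column_of(text: str, token: str) -> int:
    """Return the 1-based column of the first occurrence of token."""
    return text.index(token) + 1


# Column I expected in the error message: the opening brace.
print(column_of(statement, "{"))  # 8
# Column actually reported: the opening parenthesis after "exp".
print(column_of(statement, "("))  # 15
```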
As far as is documented, Firebird defines regular identifiers as
```
<name> ::=
<letter> | <name><letter> | <name><digit> | <name>_ | <name>$
<letter> ::= <upper letter> | <lower letter>
<upper letter> ::= A | B | C | D | E | F | G | H | I | J | K | L | M |
N | O | P | Q | R | S | T | U | V | W | X | Y | Z
<lower letter> ::= a | b | c | d | e | f | g | h | i | j | k | l | m |
n | o | p | q | r | s | t | u | v | w | x | y | z
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
```
(see
https://firebirdsql.org/file/documentation/chunk/en/refdocs/fblangref50/fblangref50-structure-identifiers.html)
The above syntax is closer to the definition of <regular identifier> in
SQL-86 than to its definition in SQL:2023; compared to SQL-86, it only
adds the $. It's also the syntax defined in the InterBase 6.0 Language
Reference (page 16).
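As a minimal sketch, the documented grammar can be written as a regular
expression in Python. This only models the <name> production quoted
above; it ignores length limits and reserved words, and the function
name is my own, not something from Firebird or Jaybird.

```python
import re

# Documented grammar: a letter, followed by letters, digits, _ or $.
# Inside a character class, $ is a literal dollar sign.
DOCUMENTED_REGULAR_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$]*")


def is_documented_regular_identifier(name: str) -> bool:
    """Return True if name matches the documented <name> production."""
    return DOCUMENTED_REGULAR_IDENTIFIER.fullmatch(name) is not None


for candidate in ("RDB$DATABASE", "fn", "{fn", "a}", "_x"):
    print(candidate, is_documented_regular_identifier(candidate))
```

By this definition, neither "{fn" nor "a}" is a regular identifier, so
the documented grammar does not predict the behaviour I observed.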
This makes me wonder:
1) Is this a Firebird bug or a documentation bug?
2) What are the actual characters supported in a regular identifier in
Firebird 5?
For comparison, SQL:2023 defines:
"""
```
<regular identifier> ::=
<identifier body>
<identifier body> ::=
<identifier start> [ <identifier part>... ]
<identifier part> ::=
<identifier start>
| <identifier extend>
<identifier start> ::=
!! See the Syntax Rules.
<identifier extend> ::=
!! See the Syntax Rules.
```
Syntax Rules
1) An <identifier start> is any character in the Unicode General
Category classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
NOTE 112 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
“Lo”, and “Nl” are assigned to Unicode characters that are,
respectively, upper-case letters, lower-case letters, title-case
letters, modifier letters, other letters, and letter numbers.
2) An <identifier extend> is U+00B7, “Middle Dot”, or any character in
the Unicode General Category classes “Mn”, “Mc”, “Nd”, or “Pc”.
NOTE 113 — The Unicode General Category classes “Mn”, “Mc”, “Nd”, and
“Pc”, are assigned to Unicode characters that are, respectively,
non-spacing marks, spacing combining marks, decimal numbers, and
connector punctuations.
"""
Mark
--
Mark Rotteveel