Dear Colleagues,
I am pleased to inform you that I am finalizing the development of a SQL-standard compliant ROW data type. As we approach the final stages, I would like to present a proposal outlining the syntax and logic for this feature. Once we reach a consensus and address any remaining concerns, I will prepare and submit a pull request for review. I look forward to your feedback and any suggestions you might have.
Goals
Implement the ROW data type, a structured PSQL object which contains internal fields
Create declaration syntax for ROW type objects based on existing tables, cursors and user-defined types
Nested field values of ROW type objects should be accessible via the qualified name of the object plus the field name
Add the ability to define local ROW types inside procedures, functions, etc., as well as custom ROW types inside packages
Add <row value constructor> as defined by the SQL standard to create ROW type values on the fly
Make it possible to receive and conveniently process ROW type values on the client side
Adhere to the SQL standard as closely as possible
Make the implementation open for future expansions like nested composite fields
The syntax description is divided into different use cases for ROW types. New features and changes are highlighted in bold. Not all use cases of ROW types are covered in this section, only those that use a previously non-existent syntax. More variants of using ROW type objects can be found in the examples section.
While the basic declaration syntax has not changed, there are new options for the type definition:
<domain_or_non_array_type> ::=
<scalar_datatype>
| <blob_datatype>
| [TYPE OF] domain
| TYPE OF COLUMN rel.col
| TYPE OF TABLE <fields_definition_source>
| package_name.user_defined_packaged_type_name
| local_row_type_name
<fields_definition_source> ::= relation_name | cursor_name
As well as a default initialization:
<inparam> ::= <param_decl> [{= | DEFAULT} <value>]
<value> ::= {<literal> | NULL | <context_var> | <row_type_nested_field> | <row_value_constructor>}
DECLARE [VARIABLE] varname
<domain_or_non_array_type> [NOT NULL] [COLLATE collation]
[{DEFAULT | = } <initvalue>];
<initvalue> ::= <literal> | <context_var> | <row_type_nested_field> | <row_value_constructor>
The ability to define local ROW types inside routines has been introduced:
<psql-module-body> ::=
AS
[<forward-declarations>]
[<declarations>]
BEGIN
[<PSQL_statements>]
END
<declarations> ::=
<declare-item> [<declare-item> ...]
<declare-item> ::=
<declare-var>
| <declare-cursor>
| <subfunc-def>
| <subproc-def>
| <row_type_definition>
Furthermore, ROW types can be defined in packages either in the header or body and can be used globally or only within the same package, respectively:
<package_item> ::=
<function_decl>;
| <procedure_decl>;
| <row_type_definition>;
<row_type_definition> ::=
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] )
<nested_field> ::= <field_declaration> [{= | DEFAULT} <value>]
<field_declaration> ::=
field_name <domain_or_non_array_type>
[NOT NULL]
[CHECK (<condition>)]
[COLLATE collation]
Access to a nested field value via ROW type object qualified name:
<row_type_nested_field> ::= <row_type_object_name>.nested_field_name
<row_type_object_name> ::=
row_type_variable_name
| row_type_parameter_name
A <row value constructor> specifies a value or list of values to be assembled into a row value, which can be used to initialize variables, in WHERE conditions, as the result of SELECT statements, or as full-record INSERT/UPDATE values.
<row_value_constructor> ::= [ROW] ( <value-expression> [ , <value-expression> … ] )
<value-expression> ::=
[table-alias.]col_name
| [table-alias.]selectable_SP_outparm
| <literal>
| <context-variable>
| <function-call>
| <single-value-subselect>
| <CASE-construct>
| <row_type_nested_field>
| any other expression returning a single
value of a Firebird data type or NULL
A new way of updating a whole record using a ROW type value:
UPDATE target [[AS] alias]
SET <set_list>
[WHERE {<search-conditions> | CURRENT OF cursorname}]
[PLAN <plan_items>]
[ORDER BY <sort_items>]
[ROWS m [TO n]]
[SKIP LOCKED]
[RETURNING <returning_list> [INTO <variables>]]
<set_list> ::=
col_name = <upd_value> [, col_name = <upd_value> ...]
| ROW = <row_type_upd_value>
<row_type_upd_value> ::= <row_type_value_expression> | DEFAULT
<row_type_value_expression> ::=
<row_value_constructor>
| <context-variable>
| any other expression returning a ROW type value or NULL
Let's assume that we have prepared two tables:
CREATE TABLE test_table (id int, num smallint, name char(10));
INSERT INTO test_table VALUES (1, 2, 'John Doe');
CREATE TABLE another_table (text varchar(30) default 'some text');
INSERT INTO another_table VALUES ('inserted text');
EXECUTE BLOCK
RETURNS (out VARCHAR(30))
AS
DECLARE VARIABLE row_var TYPE OF TABLE test_table;
BEGIN
row_var.name = 'Terence';
out = row_var.name;
SUSPEND;
END;
OUT
==============================
Terence
EXECUTE BLOCK
RETURNS (out_1 int, out_2 char(10))
AS
DECLARE test_cursor CURSOR FOR (SELECT id, name FROM test_table);
DECLARE local_row_var TYPE OF TABLE test_cursor;
BEGIN
local_row_var.id = 42;
local_row_var.name = 'Adams';
out_1 = local_row_var.id;
out_2 = local_row_var.name;
SUSPEND;
END;
OUT_1 OUT_2
============ ==========
42 Adams
CREATE PROCEDURE row_returning_proc
RETURNS (out_1 TYPE OF TABLE test_table, out_2 TYPE OF TABLE test_table)
AS
BEGIN
out_1.id = 1;
out_1.num = 2;
out_1.name = 'David';
out_2 = (3, 4, 'John');
SUSPEND;
END;
SELECT * FROM row_returning_proc;
OUT_1 OUT_2
======================================== ========================================
(1, 2, David) (3, 4, John)
CREATE FUNCTION func_with_row_types(in_param TYPE OF TABLE test_table)
RETURNS TYPE OF TABLE test_table
AS
BEGIN
RETURN in_param;
END;
SELECT func_with_row_types(ROW(1, 2, 'Pratchett')) FROM rdb$database;
FUNC_WITH_ROW_TYPES
========================================
(1, 2, Pratchett)
CREATE PROCEDURE proc_with_row_types (in_param TYPE OF TABLE test_table = (1, 2, 'Dan'))
RETURNS (out_1 TYPE OF TABLE test_table)
AS
DECLARE VARIABLE local_row_var TYPE OF TABLE test_table DEFAULT (3, NULL, 'Abnett');
BEGIN
in_param.id = local_row_var.id;
in_param.num = local_row_var.num;
out_1 = in_param;
SUSPEND;
END;
SELECT * FROM proc_with_row_types;
OUT_1
========================================
(3, <null>, Dan)
EXECUTE BLOCK
RETURNS (out_1 TYPE OF TABLE test_table)
AS
BEGIN
SELECT * FROM test_table INTO out_1;
SUSPEND;
end;
OUT_1
========================================
(1, 2, John Doe)
UPDATE test_table SET ROW = (3,4,'William');
SELECT * FROM test_table;
ID NUM NAME
============ ======= ==========
3 4 William
UPDATE another_table SET ROW = ROW(DEFAULT);
SELECT * FROM another_table;
TEXT
==============================
some text
UPDATE another_table SET ROW = DEFAULT;
SELECT * FROM another_table;
TEXT
==============================
some text
EXECUTE BLOCK
RETURNS (out_1 int, out_2 smallint)
AS
DECLARE TYPE LocalRowType (f1 int, f2 smallint default 5);
DECLARE VARIABLE row_var LocalRowType;
BEGIN
out_1 = row_var.f1;
out_2 = row_var.f2;
SUSPEND;
END;
OUT_1 OUT_2
============ =======
<null> 5
SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3) > (1, 2, 3);
SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3+1) > (1, 2, 3);
CONSTANT
============
1
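To make the two queries above easier to read: the first returns no rows, the second returns the row shown, because the standard's row comparison is lexicographic, comparing elements left to right until the first inequality decides. Python tuples compare by the same rule, so here is a quick model (illustration only; SQL NULL handling, which yields UNKNOWN, is not modeled):

```python
# Row value comparison in the SQL standard is lexicographic: elements are
# compared left to right, and the first inequality decides the result.
# Python tuples follow the same rule, so they serve as a quick model
# (SQL NULL / UNKNOWN semantics are deliberately left out).
def row_gt(a: tuple, b: tuple) -> bool:
    """Lexicographic '>' over two rows of the same degree."""
    if len(a) != len(b):
        raise ValueError('row degree mismatch')
    return a > b  # Python tuple comparison is already lexicographic

# Mirrors the two queries above:
#   (1, 2, 3)     > (1, 2, 3)  -- no row qualifies
#   (1, 2, 3 + 1) > (1, 2, 3)  -- the row qualifies
```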
CREATE PROCEDURE p1 (inpar TYPE OF TABLE test_table)
RETURNS (outpar TYPE OF TABLE test_table)
AS
BEGIN
outpar = inpar;
SUSPEND;
END;
SELECT * FROM p1((1, 3, null)) WHERE (outpar > (1, 2, 3)) IS NOT null;
OUTPAR
========================================
(1, 3, <null>)
CREATE PACKAGE test_pack AS
BEGIN
DECLARE TYPE PackagedType (f1 INT CHECK (VALUE > 10), f2 SMALLINT);
END;
EXECUTE BLOCK
RETURNS (out_1 test_pack.PackagedType)
AS
DECLARE VARIABLE local_var test_pack.PackagedType = (341, 561);
BEGIN
out_1 = local_var;
SUSPEND;
END;
OUT_1
====================
(341, 561)
CREATE PACKAGE private_test_pack AS
BEGIN
DECLARE TYPE PublicType (f0 INT, f1 VARCHAR(10));
PROCEDURE p1 RETURNS (out_1 PublicType);
END;
CREATE PACKAGE BODY private_test_pack AS
BEGIN
DECLARE TYPE PrivateType (f0 INT NOT NULL DEFAULT 10, f1 VARCHAR(10) DEFAULT 'Ford');
PROCEDURE p1 RETURNS (out_1 PublicType)
AS
DECLARE VARIABLE local_var PrivateType;
BEGIN
out_1 = local_var;
SUSPEND;
END
END;
OUT_1
========================================
(10, Ford)
There are other ways to utilize ROW type objects and <row value constructor>, but the examples provided cover all possible syntax scenarios. Any other use cases would simply be combinations of these examples.
Additionally, the examples do not include error cases, such as mismatched number of internal fields or incompatible types when assigning ROW objects. These situations are governed by the SQL standard and result in either compile-time or run-time errors. The section on comparisons (e.g., greater than, less than) is also omitted here, as it is clearly covered by the standard's rules. I have not included these details to keep the description concise, but I can add them if necessary.
As you know, information about the types of table fields and of procedure and function parameters is stored in special system tables. Ultimately, the description of these types is linked to the system table RDB$FIELDS via the RDB$FIELD_SOURCE field.
To support the nested structure of field type descriptions within ROW types, a new field, RDB$RELATION_NAME, has been added to RDB$FIELDS. This field points to RDB$RELATION_FIELDS.RDB$RELATION_NAME, where information about, and references to, the internal fields of ROW types can be retrieved according to the structure of the base table. For ROW type parameters of procedures and functions declared using the TYPE OF TABLE clause, an additional record is created in RDB$FIELDS with the name of the corresponding table in RDB$RELATION_NAME, which is also duplicated in RDB$PROCEDURE_PARAMETERS or RDB$FUNCTION_ARGUMENTS. This record points to RDB$RELATION_FIELDS.RDB$RELATION_NAME, from which subfield information can be retrieved.
Additionally, a new system table has been introduced for types defined within packages:
RDB$PACKAGED_TYPES
RDB$FIELD_SOURCE
RDB$TYPE_NAME
RDB$PACKAGE_NAME
RDB$PRIVATE_FLAG
For each new type, a record will be created in RDB$FIELDS with a unique name generated in the RDB$FIELD_NAME field. The “real” name of the packaged type will be recorded in the new RDB$PACKAGED_TYPES table in the RDB$TYPE_NAME field, along with the package name (RDB$PACKAGE_NAME) where the type is declared. Additionally, RDB$PACKAGED_TYPES will store a backreference to RDB$FIELDS.RDB$FIELD_NAME in RDB$FIELD_SOURCE, enabling the identification of the corresponding RDB$FIELDS entry using the “real” name (RDB$TYPE_NAME) and package name during query compilation.
For packaged ROW types, in addition to the entry in RDB$FIELDS, corresponding entries will be added to RDB$RELATION_FIELDS for each field of the type, similar to creating a new table, but without adding an entry to RDB$RELATIONS. The RDB$RELATION_FIELDS.RDB$RELATION_NAME value will be generated automatically. Using this unique name, composite types can reference their fields, supporting an almost unlimited level of nesting.
In PSQL, if a package type is used without the package name qualifier and a domain name coincides with the package type name, the domain will be selected instead of the package type, even when that type is declared within the same package.
I hope this scheme will help to explain more clearly the structure and relations between these fields in system tables (new fields and table are bolded):
A new dependency type, obj_packaged_type, has been added. It is used when routines contain variables or parameters with a type declared in a package. Such dependencies prevent the removal of types from packages but do not restrict their modification. This dependency also prohibits deleting the package itself but allows altering it via ALTER PACKAGE.
If variables or parameters in a routine are declared with TYPE OF TABLE <relation name> as their type, the routine gains a dependency on the table. However, this does not prevent changes to the structure of its fields. When the types of the table's fields, upon which the routine depends, are modified, the cache is invalidated, and the procedure or function is recompiled to update the formats used in the corresponding variables or parameters.
To make it possible to retrieve composite types on the client, a serialized ROW type descriptor was added to the message metadata, accessible via the new IMessageMetadata::getCompositeDescriptor() function of the corresponding interface.
This descriptor is serialized in such a way that it is convenient to parse and supports almost unlimited nesting of described fields.
The structure of the descriptor passed to the client:
2 bytes - ordinal number of the described field inside the descriptor starting from 1
1 byte - length of the field name
0..255 bytes - field name
2 bytes - field type
4 bytes - field data length
2 bytes - subtype
2 bytes - scale
2 bytes - charset
1 byte - 0 value or 2 bytes - ordinal number of the subfield
Each field descriptor must end with a 0 value when all of its subfields have already been described. For example:
SQL> create table base_table (id int, num numeric(10,2), name char(10));
SQL> create procedure proc1 returns (outpar type of table base_table) as begin outpar = (1, 3.14, 'Gibson'); suspend; end;
SQL> select * from proc1;
OUTPAR
====================================================
(1, 3.14, Gibson)
The serialized descriptor in the message metadata of this query result will be as follows (each line gives the bytes and their meaning):
01 00 - ordinal number of the top level ROW type object (OUTPAR procedure parameter)
06 - OUTPAR parameter name length
4f 55 54 50 41 52 - OUTPAR name
EA 7F - SQL_ROWTYPE (32746) field type
16 00 00 00 - length of the OUTPAR data including all subfields data and offsets
00 00 - no subtype
00 00 - no scale
00 00 - no charset
02 00 - ordinal number of the ID subfield
02 - ID subfield name length
49 44 - ID name
F0 01 - SQL_LONG (496) field type
04 00 00 00 - length of the ID subfield data
00 00 - no subtype
00 00 - no scale
00 00 - no charset
00 - 0 value subfield descriptor terminator
03 00 - ordinal number of the NUM subfield
03 - NUM subfield name length
4E 55 4D - NUM name
44 02 - SQL_INT64 (580) field type
08 00 00 00 - length of the NUM subfield data
10 00 - BIGINT subtype
FE FF - -2 scale
00 00 - no charset
00 - 0 value subfield descriptor terminator
04 00 - ordinal number of the NAME subfield
04 - NAME subfield name length
4E 41 4D 45 - NAME name
C4 01 - SQL_TEXT (452) field type
0A 00 00 00 - length of the NAME subfield data
00 00 - no subtype
00 00 - no scale
00 00 - no charset
00 - 0 value subfield descriptor terminator
00 - 0 value OUTPAR parameter descriptor terminator
Using this descriptor we can parse the data buffer and get the values for each subfield.
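To make the layout concrete, here is a minimal Python sketch of a client-side parser for the format described above, assuming little-endian integers (as the hex dump suggests) and a signed 2-byte scale. The FieldDesc structure and function names are my own illustration, not part of the proposed API; the sketch is checked against the OUTPAR descriptor bytes listed above.

```python
# Minimal sketch of parsing the serialized composite descriptor described
# above.  Assumes little-endian integers and a signed 2-byte scale.
import struct
from dataclasses import dataclass

@dataclass
class FieldDesc:
    ordinal: int
    name: str
    sql_type: int
    data_length: int
    sub_type: int
    scale: int
    charset: int
    subfields: list

def parse_field(buf: bytes, pos: int):
    """Parse one field descriptor starting at pos; return (FieldDesc, new_pos)."""
    ordinal, = struct.unpack_from('<H', buf, pos); pos += 2
    name_len = buf[pos]; pos += 1
    name = buf[pos:pos + name_len].decode('ascii'); pos += name_len
    sql_type, = struct.unpack_from('<H', buf, pos); pos += 2
    data_length, = struct.unpack_from('<I', buf, pos); pos += 4
    sub_type, scale, charset = struct.unpack_from('<HhH', buf, pos); pos += 6
    desc = FieldDesc(ordinal, name, sql_type, data_length,
                     sub_type, scale, charset, [])
    # Subfield descriptors follow until this field's single 0 terminator byte.
    while buf[pos] != 0:
        sub, pos = parse_field(buf, pos)
        desc.subfields.append(sub)
    pos += 1  # consume the 0 terminator
    return desc, pos

# The OUTPAR descriptor bytes from the example above.
blob = bytes.fromhex(
    '010006' '4f5554504152' 'ea7f' '16000000' '000000000000'    # OUTPAR
    '020002' '4944'     'f001' '04000000' '000000000000' '00'   # ID
    '030003' '4e554d'   '4402' '08000000' '1000feff0000' '00'   # NUM
    '040004' '4e414d45' 'c401' '0a000000' '000000000000' '00'   # NAME
    '00')                                                       # terminator

outpar, end = parse_field(blob, 0)
```

Running the parser over those bytes yields the OUTPAR field of type 32746 with subfields ID, NUM (scale -2) and NAME, matching the annotated dump.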
Two new BLR codes were needed for the implementation:
blr_row_value_expression - related to a <row value constructor> and implemented as the RowValueExpressionNode
blr_dcl_composite_type - declaration of local composite type implemented as the DeclareLocalTypeNode
In addition, I introduced four new type codes for ROW types:
blr_rowtype - to declare ROW type objects with description of all internal fields. It is used when generating input and output messages of the requests
blr_rowtype2 - used if ROW type is declared in the package
blr_rowtype3 - used if the object is declared with TYPE OF TABLE
blr_rowtype4 - used if the object type is declared locally inside a PSQL block
In this proposal I haven't yet touched on the specific implementation in the code, as I believe this should be discussed in detail later. I think it’s best to break the discussion into several parts. Once we’ve confirmed that the main proposal is clear and without major concerns, I’ll move on to discussing the implementation details.
If you think any important details are missing (which I'm pretty sure they are), I’m happy to provide additional information.
Hi,
I have serious concerns about the client interface.
First, only some extension to IMessageMetadata is mentioned. However,
there are drivers that use the legacy API and those that use network
protocol directly, and some that support multiple methods
(native+network or legacy+oo). How would these handle it? While I
support the idea of gradually abandoning the legacy API in favor of an
OO API, it increasingly looks like Firebird 6 will be another big
upgrade barrier of version 3 proportions, just of a different kind. There
are very popular and widely used connection libraries like FireDAC in
Delphi. If they do not adapt, or take a long time to do so, Firebird
users will be effectively locked out of such v6 features for
years or even forever (until they change the access provider).
Second, getCompositeDescriptor returning a descriptor to be parsed is a
completely bad idea. The biggest advantage of the OO API is that it
frees developers from buffer parsing and construction with all those
Builder interfaces. And now this. Phrases like "convenient parse" in
relation to the OO API are nonsense, and along with "supports almost
unlimited nesting" plainly scare me. Driver developers (like me) rely on
either directly accessing data items through the interface, or using
auxiliary data structures with native types for the language to avoid
expensive calls across language boundaries. So I would at least expect
getCompositeDescriptor to return a specialized interface that would save
us from writing and maintaining a parser on potentially evolving
recursive structure.
regards
Pavel Cisar
1. How are dependencies tracked? For example, you declared the type
DECLARE VARIABLE local_row_var TYPE OF TABLE test_table;
And then did
ALTER TABLE test_table
ADD NewField BIGINT;
What will happen to the procedure that uses such a type?
If ALTER TABLE is blocked, it is not very good. Maybe think about INVALID statuses for procedures/functions?
2. Are NEW and OLD context variables in DML triggers of the ROW type? Can the same operations be performed with them?
3. Is there support for destructuring ROW types?
(:A, :B) = local_row_var;
Will the ROW TYPE be decomposed into these variables?
4. Is there a signature of the IMessageMetadata::getCompositeDescriptor() function and others related to it, or is it still hypothetical?
Please note that in your format you may have to add a schema after the schemas are merged with the main branch.
--
WBR, SD.
I agree with you. Considering that row types might become a table field type in the future, blocking their modification seems like a fairly logical approach. Unfortunately, however, it seems that this approach does not have much support among users. That's why I chose to allow modifications to base tables as the initial approach in the implementation.
On Dec 10, 2024 at 10:50, 'liviuslivius' via firebird-devel <firebir...@googlegroups.com> wrote:
You are correct; the old API is deprecated and does not support ROW types. The main reason is that it would require extending the XSQLVAR structure, which would automatically break all drivers that use it.
You are absolutely right that users of compiled client libraries need a good interface for sequential access to the internal fields of composite types. The description of the serialized descriptor is necessary for implementing the network protocol.
Hi,
I would say yes, but I'm not entirely sure what exactly you mean in your question.
To discuss such a possibility, we first need to define what "unnested" means in this context. Does it mean accessing the internal fields as simple types?
If so, additional questions arise:
I'm not rejecting the idea; I just need a bit of clarification from your side.
Is this standard? "TYPE OF CURSOR" would look better to me.
Is the order (of input and output) based on column IDs or positions?
Looks weird considering that this was valid with a single column and
had different behavior/semantics.
This looks different from the standard scope rules. Why?
This is implemented this way because a package type can be explicitly called using the package name qualifier, whereas domains do not have such a capability.
There was no direct dependency for packaged items before, and
verification of possible deletes was done in the package commands. Why
is this different?
This looks not extensible at all. There must be tags.
Look at blr_invoke_function. Please don't create such a high number of
top-level codes.
Thank you, I’ll definitely consider your example. Initially, I settled on using unambiguous type values, no matter how odd that may seem, for the sake of clarity.
Can a row type be used in a table column?
Can circular dependencies exist?
Are two similar TABLE row types from different tables compatible?
In fact, the initial implementation approach was exactly that—nothing more than syntactic sugar. Groups of simple values emulated the behavior of ROW types. Composite values were decomposed into sequences of fields and endlessly recalculated to match both sides of assignment and comparison operations. The real "fun" was stuffing such decomposed sequences into system tables as the return type for functions.
> If so, additional questions arise:
> * What should be done with the description of the top-level composite type field?
It just disappears. The composite type field is transformed into separate simple
types during parsing and compilation.
> * How can we determine the start and end of simple type fields that belong to
> a single composite field?
By position as usual.
> * If we are considering (and we are) the future implementation of nested
> composite fields, where do they start and end? How can we determine how many
> fields are actually inside a composite type?
By count and position on declaration.
I.e. your query "SELECT * FROM row_returning_proc" would instead produce the following
result:
> OUT_1.ID OUT_1.NUM OUT_1.NAME OUT_2.ID OUT_2.NUM OUT_2.NAME
> ============ ============== ============== ========== ============ =============
> 1 2 David 3 4 John
With such an approach, the user would not be able to understand where different composite fields begin and end. To illustrate, consider the query:
SELECT 1, 2, ROW(3, 4, 5), 6, 7, ROW(8, 9), 10 FROM RDB$DATABASE;
Without additional metadata, the user has no way to determine where the grouped sets of values are, even though the SQL standard allows them to reference these groups.
Both the OO API and the ISC API would have no problem with it, and most envelope
libraries would not notice anything.
But if you accept my idea to unnest the row data type at the BLR level as well, this
procedure will have an input message containing three plain fields, and no new BLR
verbs will be needed. As for the descriptor format, exchange the following:
2 bytes - ordinal number of the described field inside the descriptor starting from 1
1 byte - length of the field name
0..255 bytes - field name
2 bytes - field type
4 bytes - field data length
2 bytes - subtype
2 bytes - scale
2 bytes - charset
1 byte - 0 value or 2 bytes - ordinal number of the subfield
for something like that
isc_rec_order <2 bytes> - ordinal number of the described field inside the descriptor starting from 1
isc_rec_field_length <1 byte> - length of the field name
isc_rec_field_name <0..255 bytes> - field name
isc_rec_field_type <2 bytes> - field type
isc_rec_data_length <4 bytes> - field data length
isc_rec_sub_type <2 bytes> - subtype
isc_rec_field_scale <2 bytes> - scale
isc_rec_charset <2 bytes> - charset
isc_rec_subfield_number <1 byte - 0 value or 2 bytes> - ordinal number of the subfield
isc_rec_description_end
In this case, you can expand this description wherever you want and ignore anything unknown.
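The tag suggestion is essentially a tag-length-value encoding in the spirit of Firebird's DPB-style parameter buffers, where each item carries a tag and its own length so that a reader can step over anything it does not recognize. A minimal sketch; the isc_rec_* tag values below are hypothetical, chosen only for illustration:

```python
# Sketch of the tagged-descriptor idea as tag-length-value items.
# The tag constants are hypothetical, for illustration only.
ISC_REC_FIELD_NAME = 3
ISC_REC_FIELD_TYPE = 4
ISC_REC_DESCRIPTION_END = 0

def write_item(tag: int, data: bytes) -> bytes:
    """Encode one item as: tag byte, length byte, payload."""
    assert 0 < len(data) < 256
    return bytes([tag, len(data)]) + data

def read_items(buf: bytes) -> dict:
    """Decode items into {tag: payload}.  Because every item carries its
    own length, an unknown tag can be stepped over instead of breaking
    the whole parse."""
    items, pos = {}, 0
    while pos < len(buf) and buf[pos] != ISC_REC_DESCRIPTION_END:
        tag, length = buf[pos], buf[pos + 1]
        items[tag] = buf[pos + 2:pos + 2 + length]
        pos += 2 + length
    return items

# A future writer emits an unknown tag (99); a reader that predates it
# can still walk past the item and read everything else.
stream = (write_item(ISC_REC_FIELD_NAME, b'OUTPAR')
          + write_item(99, b'\x01\x02')
          + write_item(ISC_REC_FIELD_TYPE, (32746).to_bytes(2, 'little'))
          + bytes([ISC_REC_DESCRIPTION_END]))
items = read_items(stream)
```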
On 10/12/2024 15:02, Alexey Mochalov wrote:
> Is it order (of input and output) based in column IDs or positions?
>
>
> Positions
>
Are cached compiled routines invalidated correctly when new columns are
added somewhere other than at the end of the table?
If they depend on tables that change, the cache is cleared and updated on the next routine call, regardless of what was changed and how. Or am I missing something in your question?
>
> This looks different from the standard scope rules. Why?
>
> This is implemented this way because a package type can be explicitly
> called using the package name qualifier, whereas domains do not have
> such a capability.
>
The same for packaged procedure vs table, no? So I still think a new
rule should not be created.
Yes, I think it’s worth making it consistent if everyone agrees. I’ll add this to the discussion summary.
>
> This looks no extensible at all. There must be tags.
>
>
> Honestly, I can't say I support this idea, but I also can't argue
> against it. If examples or analogies could be provided to help me better
> understand what is meant, that would be great.
>
Just imagine you have that structure and someone is adding schemas.
That's easy to imagine now, but something like that can appear again in
the future.
I have another question. We have so-called cursor variables.
FOR SELECT
...
AS CURSOR C
DO BEGIN
A = :C.Field;
END
How does this relate to ROW TYPE? Are there any mechanisms to turn C into ROW TYPE?
Currently, there is no such mechanism. However, it is possible to specify a ROW type variable derived from a table in the INTO list and use the fields of that variable.
Can the ROW constructor work as ROW (table_alias.*) or ROW (C.*)?
This syntax is not implemented.
Thank you for the explanation; I think this is an excellent idea and worth implementing.
Denis Simonov wrote 11.12.2024 13:09:
> Firstly, the code for parsing a ROW type descriptor can be executed once, and
> then put into a structure and worked with, as is done for a flat field buffer.
If it can, it must already be done behind the scenes in the Y-valve. The current
proposal, in its API part, is a complete nightmare.
I agree, if you take a closer look, the interface does indeed seem clunky. This discussion is aimed precisely at highlighting the questionable aspects.
We can steer the discussion in a more productive direction, for example, by suggesting interface design options. Something like this:
interface MessageMetadata : ReferenceCounted
So, your suggestion is to completely hide the composite type descriptor from users in the OO API. I think that might make sense. I have absolutely no objections, especially once we define an extension to the OO API in this area that the majority will find satisfactory.
1. Just noticed that the definition of <row_type_definition> only covers the
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] ) variant but does not cover the DECLARE TYPE typename TYPE OF TABLE <table_name> variant used in the examples.
We need to decide whether `ROW TYPES` can be nested. I did not see this in the current proposal, but could this be possible in the future? If not, then it is quite sufficient to simply extend IMessageMetadata.
2. The API problem revisited. It all boils down to whether we think that in the future this feature could somehow be used as part of composite data types stored in the database or not.
Denis Simonov wrote on 11.12.2024 at 6:09 PM:
> There is no need to get a description of the ROW type for each record. This
> description cannot change.
Agree. In this case, it brings us back to a slightly modified initial suggestion:
IMessageMetadata* IMessageMetadata::getRowMetadata(IStatus* status, unsigned index);
which returns nullptr for any other type.
Hi,
1. Just noticed that the definition of <row_type_definition> covers only the
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] ) variant,
but does not cover the DECLARE TYPE typename TYPE OF TABLE <table_name>
variant used in the examples.
2. The API problem revisited. It all boils down to whether we think that in
the future this feature could somehow be used as part of composite data
types stored in the database or not.
a) If there is such a possibility, then it's certainly possible that these
structures could be nested, and some more general API solution would be
better. For example, something like Jim suggested.
b) If not, then it's more syntactic sugar than anything else. It
couldn't be nested, and there is no real reason to complicate the
API for one sub-level. We can flatten it at IMessageMetadata and
just add one item there - GroupId - that will link the field with a
particular ROW group. Those who want to work with whole groups can
easily do that with this information, and those who don't care will not
need to do anything. The API just needs to handle the case where input
parameters do not have GroupId filled in, but that's not a big deal
either, as the mapping could be done by sequence order. It will also solve
the problem with the legacy API, as only the grouping will be lost / unused.
regards
Pavel Cisar
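The GroupId variant is easy to picture on the driver side: metadata stays flat, and a client that cares about ROW groups reassembles them by collecting consecutive fields with the same GroupId. A Python sketch with invented field names and group ids (GroupId itself is only a proposal here, not an existing IMessageMetadata item):

```python
from itertools import groupby

# Flat metadata as (field_name, group_id) pairs, e.g. for a query like
# SELECT C1, C2, ROW(F1, F2, F3), C3 ...; group_id None marks an
# ordinary ungrouped field.  Names and ids are invented for illustration.
flat_fields = [('C1', None), ('C2', None),
               ('F1', 1), ('F2', 1), ('F3', 1),
               ('C3', None)]

def regroup(flat):
    """Turn consecutive fields sharing a group id into one tuple; leave
    ungrouped fields as plain names."""
    out = []
    for gid, run in groupby(flat, key=lambda item: item[1]):
        names = [name for name, _ in run]
        if gid is None:
            out.extend(names)         # ungrouped fields pass through
        else:
            out.append(tuple(names))  # one ROW group
    return out
```

A driver that ignores GroupId simply sees the flat field list unchanged, which is the point of the proposal.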
--
WBR, SD.
If columns could be rows, then there is an interesting recursion of definition.
Anyway, if Firebird supports such a composite type with unlimited nesting for storage, it enters the
same nightmare league of "unbound" data as Array, JSON and XML (just
with a fixed structure), and should probably be handled differently, in a
BLOB-like fashion?
Actually, the original Interbase implementation of arrays
supported fetching slices of arbitrary size and dimension. Even
got a patent on it (long since expired).
--
Goals
Implement the ROW data type, a structured PSQL object which contains internal fields
Create declaration syntax of ROW type objects based on existing tables, cursors and user defined types
Nested fields values of ROW type objects should be accessible via qualified name of object plus field name
Add the ability to define local ROW types inside procedures, functions, etc., as well as custom ROW types inside packages
Add <row value constructor> as defined by the SQL standard to create ROW type values on the fly
Make it possible to receive and conveniently process ROW type values on client side
Adhere to the SQL standard as closely as possible
Make the implementation open for future expansions like nested composite fields
What exactly is meant by point 6? Do you want to expose ROW
type values as a composite type to the client? If so, I think that
goes too far and significantly raises complexity for drivers.
Syntax
[..]
The syntax description is divided into different use cases for ROW types. New features and changes are highlighted in bold. Not all use cases of ROW types are covered in this section, only those that use a previously non-existent syntax. More variants of using ROW type objects can be found in the examples section.
1. ROW type variables, parameters and return values declaration
2. Local and packaged ROW type definition[..]
3. Accessing fields in ROW type objects[..]
4. <row value constructor>
Specify a value or list of values to be constructed into a row value that could be used to initialize variables, in WHERE conditions, as a result of select statements or full record insert/update values.
<row_value_constructor> ::= [ROW] ( <value-expression> [ , <value-expression> … ] )
This syntax is ambiguous and does not match the syntax defined in SQL:2023. In SQL:2023, 7.1, <row value constructor> is:
<row value constructor> ::=
    <common value expression>
  | <boolean value expression>
  | <explicit row value constructor>
<explicit row value constructor> ::=
    <left paren> <row value constructor element> <comma> <row value constructor element list> <right paren>
  | ROW <left paren> <row value constructor element list> <right paren>
  | <row subquery>
<row value constructor element list> ::=
  <row value constructor element> [ { <comma> <row value constructor element> }... ]
<row value constructor element> ::=
  <value expression>
That is, ROW is only optional if there are at least *two* elements in the list; for a list of one element, ROW is required, since otherwise there is an ambiguity with <value expression> through:
<value expression primary> ::=
    <parenthesized value expression>
  | <non-parenthesized value expression primary>

<parenthesized value expression> ::=
    <left paren> <value expression> <right paren>
Also, the row subquery derivation is missing, as are <common
value expression> and <boolean value expression>.
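Python has the identical grammar ambiguity and resolves it the same way: a parenthesized single expression stays a scalar, and an explicit marker (the trailing comma, playing the role of the mandatory ROW keyword) is required for a one-element tuple. A quick illustration of why the one-element case needs the keyword:

```python
# (1) is a parenthesized scalar, not a one-element row/tuple;
# the trailing comma is Python's counterpart of SQL's mandatory ROW keyword.
assert (1) == 1 and type((1)) is int
assert (1,) != 1 and type((1,)) is tuple
# With two or more elements the parentheses alone are unambiguous,
# mirroring the standard's rule that ROW is optional from two elements on.
assert (1, 2) == tuple([1, 2])
```
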
5. ROW clause in UPDATE statement
A new way of updating a whole record using a ROW type value:
UPDATE target [[AS] alias]
SET <set_list>
[WHERE {<search-conditions> | CURRENT OF cursorname}]
[PLAN <plan_items>]
[ORDER BY <sort_items>]
[ROWS m [TO n]]
[SKIP LOCKED]
[RETURNING <returning_list> [INTO <variables>]]
<set_list> ::=
col_name = <upd_value> [, col_name = <upd_value> ...]
| ROW = <row_type_upd_value>
<row_type_upd_value> ::= <row_type_value_expression> | DEFAULT
<row_type_value_expression> ::=
<row_value_constructor>
| <context-variable>
| any other expression returning a ROW type value or NULL
This is not the syntax described in the SQL:2023 standard; please let's not invent our own syntax, and follow the standard instead. The syntax in the standard is defined in 14.15 <set clause list>, and is specified as:
<set clause list> ::=
    <set clause> [ { <comma> <set clause> }... ]

<set clause> ::=
    <multiple column assignment>
  | <set target> <equals operator> <update source>

<set target> ::=
    <update target>
  | <mutated set clause>

<multiple column assignment> ::=
    <set target list> <equals operator> <assigned row>

<set target list> ::=
    <left paren> <set target> [ { <comma> <set target> }... ] <right paren>

<assigned row> ::=
    <contextually typed row value expression>
In other words, row types used in an update are like
UPDATE ... SET (column1, column2, column3) = <contextually typed row value expression>
or even (... stands for <contextually typed row value expression>)
UPDATE ... SET (column1, column2) = ..., (column3, column4) = ...
Where <contextually typed row value expression> is defined in 7.2 <row value expression>:

<contextually typed row value expression> ::=
    <row value special case>
  | <contextually typed row value constructor>

<row value special case> ::=
    <non-parenthesized value expression primary>

And <contextually typed row value constructor> in 7.1 <row value constructor>:

<contextually typed row value constructor> ::=
    <common value expression>
  | <boolean value expression>
  | <contextually typed value specification>
  | <left paren> <contextually typed value specification> <right paren>
  | <left paren> <contextually typed row value constructor element> <comma>
      <contextually typed row value constructor element list> <right paren>
  | ROW <left paren> <contextually typed row value constructor element list> <right paren>
etc, etc.
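As a mental model (not Firebird code), the standard's <multiple column assignment> behaves like tuple unpacking: the elements of the assigned row are matched positionally against the listed set targets, and the degrees must agree. A Python sketch of that positional matching, with the record represented as a plain dict for illustration:

```python
def apply_set_clause(record, targets, assigned_row):
    """Positionally assign the elements of assigned_row to the named
    targets, as in UPDATE ... SET (a, b) = (1, 2).
    `record` is a plain dict standing in for the row being updated."""
    if len(targets) != len(assigned_row):
        raise ValueError("degree of <set target list> and <assigned row> differ")
    for name, value in zip(targets, assigned_row):
        record[name] = value
    return record

row = {"id": 1, "num": 2, "name": "old"}
apply_set_clause(row, ("num", "name"), (4, "William"))
# row is now {"id": 1, "num": 4, "name": "William"}
```

The key property, unlike SET ROW, is that the target columns are named explicitly, so nothing depends on the table's column order.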
Examples[..]
6) updating records using a ROW type variable
UPDATE test_table SET ROW = (3,4,'William');
SELECT * FROM test_table;
ID NUM NAME
============ ======= ==========
3 4 William
UPDATE another_table SET ROW = ROW(DEFAULT);
SELECT * FROM another_table;
TEXT
==============================
some text
UPDATE another_table SET ROW = DEFAULT;
SELECT * FROM another_table;
TEXT
==============================
some text
As I said before under section 5, I don't like this invented, non-standard syntax.
[..]
8) comparing ROW-type objects with each other
IIRC, the comparison rules for < and > are rather complex; this will probably need to get a lot of attention during testing.
SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3) > (1, 2, 3);
SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3+1) > (1, 2, 3);
CONSTANT
============
1
CREATE PROCEDURE p1 (inpar TYPE OF TABLE test_table)
RETURNS (outpar TYPE OF TABLE test_table)
AS
BEGIN
outpar = inpar;
SUSPEND;
END;
SELECT * FROM p1((1, 3, null)) WHERE (outpar > (1, 2, 3)) IS NOT null;
OUTPAR
========================================
(1, 3, <null>)
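The comparison rules can be sketched as lexicographic comparison under three-valued logic: the first non-equal pair of elements decides, and a NULL encountered before a deciding pair makes the result UNKNOWN. A rough Python model of the SQL:2023 behaviour, with None standing for NULL/UNKNOWN (a sketch for discussion, not engine code):

```python
def row_gt(rx, ry):
    """Model of SQL row comparison rx > ry under three-valued logic.
    Returns True, False, or None (UNKNOWN)."""
    if len(rx) != len(ry):
        raise ValueError("rows must have the same degree")
    for x, y in zip(rx, ry):
        if x is None or y is None:
            return None        # NULL before a deciding pair -> UNKNOWN
        if x != y:
            return x > y       # first non-equal pair decides
    return False               # rows are equal, so not greater

assert row_gt((1, 2, 4), (1, 2, 3)) is True
assert row_gt((1, 2, 3), (1, 2, 3)) is False
# matches the example above: (1, 3, NULL) > (1, 2, 3) is decided
# by the second pair before the NULL is ever reached
assert row_gt((1, 3, None), (1, 2, 3)) is True
assert row_gt((1, None, 3), (1, 2, 3)) is None
```
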
9) packaged ROW types declaration
9.1) declaration of a public packaged ROW type (can be used outside packages via the package name qualifier)[..]
9.2) declaration of a private packaged ROW type (can be used in the same package where the type is declared)[..]
[..]
Key points underpinning implementation
ODS changes
This image was absent or not rendered for me.
I hope this scheme will help to explain more clearly the structure and relations between these fields in the system tables (new fields and table are bolded):
Dependencie[..]
Message metadata
As I said, I don't like this because it raises the complexity for drivers. At minimum, I think there must be an option to enable/disable this in firebird.conf/databases.conf and in the DPB, which when "disabled" will unpack row types to "normal" top-level columns in the messages, so that older drivers can still work, or drivers that don't need or don't want the complexity of row-types can simply disable it and handle them as normal columns.
To make it possible to retrieve composite types on the client, a serialized ROW type descriptor was added to the message metadata with the new IMessageMetadata::getCompositeDescriptor() function of the corresponding interface.
This descriptor is serialized in such a way that it is convenient to parse and supports almost unlimited nesting of described fields.
The structure of the descriptor passed to the client:
2 bytes - ordinal number of the described field inside the descriptor starting from 1
1 byte - length of the field name
0..255 bytes - field name
2 bytes - field type
4 bytes - field data length
2 bytes - subtype
2 bytes - scale
2 bytes - charset
1 byte with value 0 (no more subfields), or 2 bytes - ordinal number of the subfield whose descriptor follows
This image is missing or not rendered for me.
Each field descriptor must end with a 0 value when all of its subfields have already been described. For example:
SQL> create table base_table (id int, num numeric(10,2), name char(10));
SQL> create procedure proc1 returns (outpar type of table base_table) as begin outpar = (1, 3.14, 'Gibson'); suspend; end;
SQL> select * from proc1;
OUTPAR
====================================================
(1, 3.14, Gibson)
Message metadata serialized descriptor of the query result will be:
[..]
Where:
Please provide an example with multiple columns, combining normal columns and row-type columns.
Using this descriptor we can parse the data buffer and get the values for each subfield.
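To make the layout above concrete, here is a sketch of a client-side parser for the serialized descriptor. The proposal does not specify byte order or signedness, so little-endian and signed subtype/scale are assumed here; this only illustrates the recursive, 0-terminated structure, not actual driver code (and the terminator test below would need refinement for ordinals whose low byte happens to be 0):

```python
import struct

def parse_field(buf, pos=0):
    """Parse one field descriptor starting at buf[pos]; return (field, next_pos).
    Layout per the proposal: 2-byte ordinal, 1-byte name length, name,
    2-byte type, 4-byte data length, 2-byte subtype, 2-byte scale,
    2-byte charset, then nested subfield descriptors until a 0 byte."""
    ordinal, name_len = struct.unpack_from('<HB', buf, pos)
    pos += 3
    name = buf[pos:pos + name_len].decode('ascii')
    pos += name_len
    ftype, length, subtype, scale, charset = struct.unpack_from('<HIhhH', buf, pos)
    pos += 12
    field = {'ordinal': ordinal, 'name': name, 'type': ftype,
             'length': length, 'subtype': subtype, 'scale': scale,
             'charset': charset, 'subfields': []}
    # A 0 byte ends this field; anything else starts the 2-byte ordinal
    # of the next nested subfield descriptor.
    while buf[pos] != 0:
        sub, pos = parse_field(buf, pos)
        field['subfields'].append(sub)
    return field, pos + 1  # consume the terminating 0 byte
```

For the proc1 example above, the descriptor for OUTPAR would contain three nested descriptors for ID, NUM and NAME, each 0-terminated, followed by the 0 that closes OUTPAR itself.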
BLR codes
[..]
Mark
-- Mark Rotteveel
On 09/12/2024 16:13, Alexey Mochalov wrote:[..]
On 16/12/2024 10:05, Alexey Mochalov wrote:
> пятница, 13 декабря 2024 г. в 13:11:37 UTC+3, Mark Rotteveel:
> This is not the syntax described in the SQL:2023 standard, please
> lets not invent our own syntax and follow the standard. The syntax
> in the standard is defined in 14.15 <set clause list>, and is
> specified as
>
> [..]
>
>
> You are absolutely right. Unfortunately, I overlooked this particular
> UPDATE query syntax, and it definitely needs to be added to the
> implementation.
> Regarding the SET ROW syntax: it was implemented as an analogue of a
> feature in another popular DBMS. When manipulating composite type
> objects derived from a specific table, it can be quite convenient to
> specify UPDATE without repeating all the fields of the same table.
I think `set ROW` is a bad idea on principle, for the same reasons why
using INSERT without an explicit column list is generally regarded as a
bad idea. It is very ambiguous, and can easily lead to assigning the
wrong values to the wrong columns.
> [..]
>
> As I said, I don't like this because it raises the complexity for
> drivers. At minimum, I think there must be an option to enable/
> disable this in firebird.conf/databases.conf and in the DPB, which
> when "disabled" will unpack row types to "normal" top-level columns
> in the messages, so that older drivers can still work, or drivers
> that don't need or don't want the complexity of row-types can simply
> disable it and handle them as normal columns.
>
>
> This approach creates significant challenges in the internal
> implementation for SELECT ... INTO queries, RETURNING, and routine
> parameters.
That depends on *where* the unpacking is done. I'm only talking about
conversion on the interface between client and server, similar to how
DataTypeCompatibility works (in fact, as others suggested, it probably
needs to be implemented as part of the DataTypeCompatibility option).
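The compatibility conversion described here could, conceptually, be a simple recursive flattening of row-type columns into qualified top-level columns at the client/server interface. A toy sketch of that idea (the dict shape and the dotted naming scheme are hypothetical, not part of the proposal):

```python
def flatten_columns(fields, prefix=''):
    """Expand row-type columns into flat, qualified column names, so a
    driver with row-type support disabled sees only ordinary columns."""
    flat = []
    for f in fields:
        name = prefix + f['name']
        if f.get('subfields'):               # a row-type column: recurse
            flat.extend(flatten_columns(f['subfields'], name + '.'))
        else:                                # an ordinary column: keep as-is
            flat.append(name)
    return flat

cols = [{'name': 'ID'},
        {'name': 'OUTPAR', 'subfields': [{'name': 'NUM'}, {'name': 'NAME'}]}]
# flatten_columns(cols) -> ['ID', 'OUTPAR.NUM', 'OUTPAR.NAME']
```

The same pass would also need to flatten the row values in the data buffer, which is why tying it to the existing DataTypeCompatibility machinery, as suggested, seems like a natural fit.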