Row types proposal


Alexey Mochalov

Dec 9, 2024, 12:33:21 PM
to firebird-devel

Dear Colleagues,

I am pleased to inform you that I am finalizing the development of a SQL-standard compliant ROW data type. As we approach the final stages, I would like to present a proposal outlining the syntax and logic for this feature. Once we reach a consensus and address any remaining concerns, I will prepare and submit a pull request for review. I look forward to your feedback and any suggestions you might have.

Goals
  1. Implement the ROW data type, a structured PSQL object that contains internal fields

  2. Provide declaration syntax for ROW type objects based on existing tables, cursors and user-defined types

  3. Nested field values of ROW type objects should be accessible via the qualified name of the object plus the field name

  4. Add the ability to define local ROW types inside procedures, functions, etc., as well as custom ROW types inside packages

  5. Add <row value constructor> as defined by the SQL standard to create ROW type values on the fly

  6. Make it possible to receive and conveniently process ROW type values on the client side

  7. Adhere to the SQL standard as closely as possible

  8. Keep the implementation open for future extensions such as nested composite fields


Syntax

The syntax description is divided into different use cases for ROW types. Not all use cases of ROW types are covered in this section, only those that introduce previously non-existent syntax. More variants of using ROW type objects can be found in the examples section.


1. ROW type variables, parameters and return values declaration

The basic declaration syntax has not changed; what is new are the options for the type definition:


<domain_or_non_array_type> ::=

    <scalar_datatype>

  | <blob_datatype>

  | [TYPE OF] domain

  | TYPE OF COLUMN rel.col

  | TYPE OF TABLE <fields_definition_source>

  | package_name.user_defined_packaged_type_name

  | local_row_type_name


<fields_definition_source> ::= relation_name | cursor_name


As well as a default initialization:


<inparam> ::= <param_decl> [{= | DEFAULT} <value>]

<value> ::= {<literal> | NULL | <context_var> | <row_type_nested_field> | <row_value_constructor>}


DECLARE [VARIABLE] varname

  <domain_or_non_array_type> [NOT NULL] [COLLATE collation]

  [{DEFAULT | = } <initvalue>];


<initvalue> ::= <literal> | <context_var> | <row_type_nested_field> | <row_value_constructor> 


2. Local and packaged ROW type definition

Introduced ability to define local ROW types inside routines:


<psql-module-body> ::=

  AS

    [<forward-declarations>]

    [<declarations>]

  BEGIN

    [<PSQL_statements>]

  END


<declarations> ::=

  <declare-item> [<declare-item> ...]


<declare-item> ::=

    <declare-var>

  | <declare-cursor>

  | <subfunc-def>

  | <subproc-def>

  | <row_type_definition>


Furthermore, ROW types can be defined in packages either in the header or body and can be used globally or only within the same package, respectively:


<package_item> ::=

    <function_decl>;

  | <procedure_decl>;

  | <row_type_definition>;


<row_type_definition> ::=

  DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] )


<nested_field> ::= <field_declaration> [{= | DEFAULT} <value>]


<field_declaration> ::=

  field_name <domain_or_non_array_type>

  [NOT NULL]

  [CHECK (<condition>)]

  [COLLATE collation]

3. Accessing fields in ROW type objects

Access to a nested field value via ROW type object qualified name:


<row_type_nested_field> ::= <row_type_object_name>.nested_field_name


<row_type_object_name> ::=

    row_type_variable_name

  | row_type_parameter_name


4. <row value constructor>

Specifies a value or list of values to be assembled into a row value, which can be used to initialize variables, in WHERE conditions, as the result of SELECT statements, or as full-record insert/update values.


<row_value_constructor> ::= [ROW] ( <value-expression> [ , <value-expression> … ] )


<value-expression> ::=

    [table-alias.]col_name

  | [table-alias.]selectable_SP_outparm

  | <literal>

  | <context-variable>

  | <function-call>

  | <single-value-subselect>

  | <CASE-construct>

  | <row_type_nested_field>

  | any other expression returning a single

    value of a Firebird data type or NULL


5. ROW clause in UPDATE statement

A new way of updating a whole record using a ROW type value:


UPDATE target [[AS] alias]

  SET <set_list>

  [WHERE {<search-conditions> | CURRENT OF cursorname}]

  [PLAN <plan_items>]

  [ORDER BY <sort_items>]

  [ROWS m [TO n]]

  [SKIP LOCKED]

  [RETURNING <returning_list> [INTO <variables>]]


<set_list> ::=

  col_name = <upd_value> [, col_name = <upd_value> ...]

  | ROW = <row_type_upd_value>


<row_type_upd_value> ::= <row_type_value_expression> | DEFAULT


<row_type_value_expression> ::=

    <row_value_constructor>

  | <context-variable>

  | any other expression returning a ROW type value or NULL


Examples

Let's assume that we have prepared two tables:


CREATE TABLE test_table (id int, num smallint, name char(10));

INSERT INTO test_table VALUES (1, 2, 'John Doe');


CREATE TABLE another_table (text varchar(30) default 'some text');

INSERT INTO another_table VALUES ('inserted text');


1) declaring a variable of a composite type by specifying the base table. The variable will contain the same fields as the base table, which can be accessed through the qualified name of the variable:

EXECUTE BLOCK

RETURNS (out VARCHAR(30))

AS

DECLARE VARIABLE row_var TYPE OF TABLE test_table;

BEGIN

row_var.name = 'Terence';

out = row_var.name;

SUSPEND;

END;


OUT                        

==============================

Terence


2) declaring a ROW type variable specifying a cursor as a base:

EXECUTE BLOCK

RETURNS (out_1 int, out_2 char(10))

AS

DECLARE test_cursor CURSOR FOR (SELECT id, name FROM test_table);

DECLARE local_row_var TYPE OF TABLE test_cursor;

BEGIN

local_row_var.id = 42;

local_row_var.name = 'Adams';

out_1 = local_row_var.id;

out_2 = local_row_var.name;

SUSPEND;

END;


       OUT_1 OUT_2  

============ ==========

          42 Adams


3) declaring PSQL block arguments and return values as ROW types by specifying a base table. Arguments and return values will contain the same fields as the base table:

CREATE PROCEDURE row_returning_proc

RETURNS (out_1 TYPE OF TABLE test_table, out_2 TYPE OF TABLE test_table)

AS

BEGIN

out_1.id = 1;

out_1.num = 2;

out_1.name = 'David';

out_2 = (3, 4, 'John');

SUSPEND;

END;


SELECT * FROM row_returning_proc;


                                   OUT_1                                    OUT_2

======================================== ========================================

                           (1, 2, David)                             (3, 4, John)



CREATE FUNCTION func_with_row_types(in_param TYPE OF TABLE test_table)

RETURNS TYPE OF TABLE test_table

AS

BEGIN

RETURN in_param;

END;


SELECT func_with_row_types(ROW(1, 2, 'Pratchett')) FROM rdb$database;


                  FUNC_WITH_ROW_TYPES

========================================

                       (1, 2, Pratchett)




4) ROW type local variables and arguments can be initialized with default values using <row value constructor>:

CREATE PROCEDURE proc_with_row_types (in_param TYPE OF TABLE test_table = (1, 2, 'Dan'))

RETURNS (out_1 TYPE OF TABLE test_table)

AS

DECLARE VARIABLE local_row_var TYPE OF TABLE test_table DEFAULT (3, NULL, 'Abnett');

BEGIN

in_param.id = local_row_var.id;

in_param.num = local_row_var.num;

out_1 = in_param;

SUSPEND;

END;


SELECT * FROM proc_with_row_types;


                                   OUT_1

========================================

                        (3, <null>, Dan)


5) you can "select into" a ROW type object:

EXECUTE BLOCK

RETURNS (out_1 TYPE OF TABLE test_table)

AS

BEGIN

SELECT * FROM test_table INTO out_1;

SUSPEND;

END;


                                   OUT_1

========================================

                        (1, 2, John Doe)


6) updating records using ROW type values:

UPDATE test_table SET ROW = (3,4,'William');


SELECT * FROM test_table;


          ID     NUM NAME   

============ ======= ==========

           3       4 William  


UPDATE another_table SET ROW = ROW(DEFAULT);


SELECT * FROM another_table;


TEXT                       

==============================

some text


UPDATE another_table SET ROW = DEFAULT;


SELECT * FROM another_table;


TEXT                       

==============================

some text


7) local declaration of a ROW type and its use in PSQL:

EXECUTE BLOCK

RETURNS (out_1 int, out_2 smallint)

AS

DECLARE TYPE LocalRowType (f1 int, f2 smallint default 5);

DECLARE VARIABLE row_var LocalRowType;

BEGIN

out_1 = row_var.f1;

out_2 = row_var.f2;

SUSPEND;

END;


       OUT_1   OUT_2

============ =======

      <null>       5


8) comparing ROW type objects with each other:

SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3) > (1, 2, 3); -- returns no rows, since the values are equal


SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3+1) > (1, 2, 3);


    CONSTANT

============

           1


CREATE PROCEDURE p1 (inpar TYPE OF TABLE test_table)

RETURNS (outpar TYPE OF TABLE test_table)

AS

BEGIN

outpar = inpar;

SUSPEND;

END;


SELECT * FROM p1((1, 3, null)) WHERE (outpar > (1, 2, 3)) IS NOT null;


                                  OUTPAR

========================================

                          (1, 3, <null>)


9) packaged ROW types declaration

9.1) declaration of a public packaged ROW type (can be used outside packages via the package name qualifier)

CREATE PACKAGE test_pack AS

BEGIN

DECLARE TYPE PackagedType (f1 INT CHECK (VALUE > 10), f2 SMALLINT);

END;


EXECUTE BLOCK

RETURNS (out_1 test_pack.PackagedType)

AS

DECLARE VARIABLE local_var test_pack.PackagedType = (341, 561);

BEGIN

out_1 = local_var;

SUSPEND;

END;


               OUT_1

====================

          (341, 561)


9.2) declaration of a private packaged ROW type (can be used in the same package where the type is declared)

CREATE PACKAGE private_test_pack AS

BEGIN

DECLARE TYPE PublicType (f0 INT, f1 VARCHAR(10));

PROCEDURE p1 RETURNS (out_1 PublicType);

END;


CREATE PACKAGE BODY private_test_pack AS

BEGIN

DECLARE TYPE PrivateType (f0 INT NOT NULL DEFAULT 10, f1 VARCHAR(10) DEFAULT 'Ford');

PROCEDURE p1 RETURNS (out_1 PublicType)

AS

DECLARE VARIABLE local_var PrivateType;

BEGIN

out_1 = local_var;

SUSPEND;

END

END;


                                   OUT_1

========================================

                              (10, Ford)



There are other ways to utilize ROW type objects and <row value constructor>, but the examples provided cover all possible syntax scenarios. Any other use cases would simply be combinations of these examples.

Additionally, the examples do not include error cases, such as mismatched number of internal fields or incompatible types when assigning ROW objects. These situations are governed by the SQL standard and result in either compile-time or run-time errors. The section on comparisons (e.g., greater than, less than) is also omitted here, as it is clearly covered by the standard's rules. I have not included these details to keep the description concise, but I can add them if necessary.

Key points underpinning implementation
ODS changes

As you know, information about the types of table fields and of procedure and function parameters is stored in special system tables. Ultimately, the description of these types is linked to the system table RDB$FIELDS via the RDB$FIELD_SOURCE field.

To support the nested structure of field type descriptions within ROW types, a new field, RDB$RELATION_NAME, has been added to RDB$FIELDS. This field points to RDB$RELATION_FIELDS.RDB$RELATION_NAME, where information and references to the internal fields of ROW types can be retrieved according to the structure of the base table. For ROW type parameters of procedures and functions declared using the TYPE OF TABLE clause, an additional record is created in RDB$FIELDS with the name of the corresponding table in RDB$RELATION_NAME, which is also duplicated in RDB$PROCEDURE_PARAMETERS or RDB$FUNCTION_ARGUMENTS. This record points to RDB$RELATION_FIELDS.RDB$RELATION_NAME, so the subfield information can be retrieved.
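
For illustration, the subfields of a ROW type parameter could be listed with a query along these lines (a sketch only, relying on the new RDB$FIELDS.RDB$RELATION_NAME link described above; ROW_RETURNING_PROC is the procedure from the examples section):

-- sketch: resolve the subfields of a ROW type parameter via the new link
SELECT rf.RDB$FIELD_NAME, rf.RDB$FIELD_POSITION, rf.RDB$FIELD_SOURCE
FROM RDB$PROCEDURE_PARAMETERS pp
JOIN RDB$FIELDS f ON f.RDB$FIELD_NAME = pp.RDB$FIELD_SOURCE
JOIN RDB$RELATION_FIELDS rf ON rf.RDB$RELATION_NAME = f.RDB$RELATION_NAME
WHERE pp.RDB$PROCEDURE_NAME = 'ROW_RETURNING_PROC'
ORDER BY rf.RDB$FIELD_POSITION;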


Additionally, a new system table has been introduced for types defined within packages:

  • RDB$PACKAGED_TYPES

    • RDB$FIELD_SOURCE

    • RDB$TYPE_NAME

    • RDB$PACKAGE_NAME

    • RDB$PRIVATE_FLAG


For each new type, a record will be created in RDB$FIELDS with a unique name generated in the RDB$FIELD_NAME field. The “real” name of the packaged type will be recorded in the new RDB$PACKAGED_TYPES table in the RDB$TYPE_NAME field, along with the package name (RDB$PACKAGE_NAME) where the type is declared. Additionally, RDB$PACKAGED_TYPES will store a backreference to RDB$FIELDS.RDB$FIELD_NAME in RDB$FIELD_SOURCE, enabling the identification of the corresponding RDB$FIELDS entry using the “real” name (RDB$TYPE_NAME) and package name during query compilation.
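
As an illustration of this layout (a hypothetical lookup sketch; the engine performs this resolution internally during query compilation):

-- find the RDB$FIELDS entry behind packaged type TEST_PACK.PACKAGEDTYPE
SELECT f.RDB$FIELD_NAME
FROM RDB$PACKAGED_TYPES pt
JOIN RDB$FIELDS f ON f.RDB$FIELD_NAME = pt.RDB$FIELD_SOURCE
WHERE pt.RDB$TYPE_NAME = 'PACKAGEDTYPE'
AND pt.RDB$PACKAGE_NAME = 'TEST_PACK';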


For packaged ROW types, in addition to the entry in RDB$FIELDS, corresponding entries will be added to RDB$RELATION_FIELDS for each field of the type, similar to creating a new table, but without adding an entry to RDB$RELATIONS. The RDB$RELATION_FIELDS.RDB$RELATION_NAME value will be generated automatically. Using this unique name, composite types can reference their fields, supporting an almost unlimited level of nesting.


In PSQL, if a package type is used without specifying the package name qualifier, and the domain name coincides with the package type name, the domain will be selected instead of the package type declared within the same package.
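
For illustration of this rule (hypothetical names):

CREATE DOMAIN d_conflict AS INT;

CREATE PACKAGE pack_conflict AS
BEGIN
DECLARE TYPE d_conflict (f1 INT);
END;

-- inside PSQL within pack_conflict, "DECLARE VARIABLE v d_conflict;" resolves
-- to the domain; the packaged type must be referenced as pack_conflict.d_conflict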


I hope this scheme will help to explain more clearly the structure and relations between these fields in system tables (new fields and table are bolded):



Dependencies

A new dependency type, obj_packaged_type, has been added. It is used when routines contain variables or parameters with a type declared in a package. Such dependencies prevent the removal of types from packages but do not restrict their modification. This dependency also prohibits deleting the package itself but allows altering it via ALTER PACKAGE.


If variables or parameters in a routine are declared with TYPE OF TABLE <relation name> as their type, the routine gains a dependency on the table. However, this does not prevent changes to the structure of its fields. When the types of the table's fields, upon which the routine depends, are modified, the cache is invalidated, and the procedure or function is recompiled to update the formats used in the corresponding variables or parameters.
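
For illustration of this behavior (using test_table from the examples section):

ALTER TABLE test_table ALTER COLUMN name TYPE CHAR(20);
-- not blocked; routines declared with TYPE OF TABLE test_table are
-- recompiled on their next invocation with the new field formats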


Message metadata

To make it possible to retrieve composite types on the client, a serialized ROW type descriptor was added to the message metadata, available through the new IMessageMetadata::getCompositeDescriptor() function.

This descriptor is serialized in such a way that it is convenient to parse and supports almost unlimited nesting of described fields.


The structure of the descriptor passed to the client:

2 bytes - ordinal number of the described field inside the descriptor starting from 1

1 byte - length of the field name

0..255 bytes - field name

2 bytes - field type

4 bytes - field data length

2 bytes - subtype

2 bytes - scale

2 bytes - charset

1 byte - 0 value or 2 bytes - ordinal number of the subfield


Each field descriptor must end with a 0 value when all of its subfields have already been described. For example:


SQL> create table base_table (id int, num numeric(10,2), name char(10));

SQL> create procedure proc1 returns (outpar type of table base_table) as begin outpar = (1, 3.14, 'Gibson'); suspend; end;

SQL> select * from proc1;


                                              OUTPAR

====================================================

                                   (1, 3.14, Gibson)


The serialized message metadata descriptor of the query result will be:
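
01 00 06 4F 55 54 50 41 52 EA 7F 16 00 00 00 00 00 00 00 00 00
02 00 02 49 44 F0 01 04 00 00 00 00 00 00 00 00 00 00
03 00 03 4E 55 4D 44 02 08 00 00 00 10 00 FE FF 00 00 00
04 00 04 4E 41 4D 45 C4 01 0A 00 00 00 00 00 00 00 00 00 00
00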



Where:


01 00 - ordinal number of the top level ROW type object (OUTPAR procedure parameter)

06 - OUTPAR parameter name length

4f 55 54 50 41 52 - OUTPAR name

EA 7F - SQL_ROWTYPE (32746) field type

16 00 00 00 - length of the OUTPAR data including all subfields data and offsets

00 00 - no subtype

00 00 - no scale

00 00 - no charset


02 00 - ordinal number of the ID subfield

02 - ID subfield name length

49 44 - ID name

F0 01 - SQL_LONG (496) field type

04 00 00 00 - length of the ID subfield data

00 00 - no subtype

00 00 - no scale

00 00 - no charset

00 - 0 value subfield descriptor terminator


03 00 - ordinal number of the NUM subfield

03 - NUM subfield name length

4E 55 4D - NUM name

44 02 - SQL_INT64 (580) field type

08 00 00 00 - length of the NUM subfield data

10 00 - BIGINT subtype

FE FF - -2 scale

00 00 - no charset

00 - 0 value subfield descriptor terminator


04 00 - ordinal number of the NAME subfield

04 - NAME subfield name length

4E 41 4D 45 - NAME name

C4 01 - SQL_TEXT (452) field type

0A 00 00 00 - length of the NAME subfield data

00 00 - no subtype

00 00 - no scale

00 00 - no charset

00 - 0 value subfield descriptor terminator


00 - 0 value OUTPAR parameter descriptor terminator


Using this descriptor we can parse the data buffer and get the values for each subfield.


BLR codes

Two BLR codes were needed for implementation:

blr_row_value_expression - related to a <row value constructor> and implemented as the RowValueExpressionNode

blr_dcl_composite_type - declaration of local composite type implemented as the DeclareLocalTypeNode


In addition, I used four type codes for ROW types:

blr_rowtype - to declare ROW type objects with description of all internal fields. It is used when generating input and output messages of the requests

blr_rowtype2 - used if ROW type is declared in the package

blr_rowtype3 - used if the object is declared with TYPE OF TABLE

blr_rowtype4 - used if the object type is declared locally inside a PSQL block


Conclusion

In this proposal I haven't yet touched on the specific implementation in the code, as I believe this should be discussed in detail later. I think it’s best to break the discussion into several parts. Once we’ve confirmed that the main proposal is clear and without major concerns, I’ll move on to discussing the implementation details.

If you think any important details are missing (which I'm pretty sure they are), I’m happy to provide additional information.

Denis Simonov

Dec 10, 2024, 1:37:56 AM
to firebird-devel
Good job.

I have a few questions:
1. How are dependencies tracked? For example, you declared the type

DECLARE VARIABLE local_row_var TYPE OF TABLE test_table;

And then did

ALTER TABLE test_table
ADD NewField BIGINT;

What will happen to the procedure that uses such a type?

If ALTER TABLE is blocked, it is not very good. Maybe think about INVALID statuses for procedures/functions?

2. Are NEW and OLD context variables in DML triggers of the ROW type? Can the same operations be performed with them?

3. Is there support for destructuring ROW types?

(:A, :B) = local_row_var ;

Will the ROW TYPE be decomposed into these variables?

4. Is there a signature of the IMessageMetadata::getCompositeDescriptor() function and others related to it, or is it still hypothetical? 
Please note that in your format you may have to add a schema after the schemas are merged with the main branch.

Monday, December 9, 2024 at 20:33:21 UTC+3, Alexey Mochalov:

liviuslivius

Dec 10, 2024, 2:50:14 AM
to firebir...@googlegroups.com
>>What will happen to the procedure that uses such a type?

>>If ALTER TABLE is blocked, it is not very good. 
>>Maybe think about INVALID statuses for procedures/functions?

It should be blocked; this is the same as when you change a field's type and the field is used in a stored procedure.



Regards,
Karol Bieniaszewski


-------- Original message --------
From: Denis Simonov <sim....@gmail.com>
Date: 10.12.2024 07:37 (GMT+01:00)
To: firebird-devel <firebir...@googlegroups.com>
Subject: [firebird-devel] Re: Row types proposal

[Email was truncated]

Pavel Cisar

Dec 10, 2024, 4:29:14 AM
to firebir...@googlegroups.com
Hi,

I have serious concerns about the client interface.

First, only some extension to IMessageMetadata is mentioned. However,
there are drivers that use the legacy API and those that use the network
protocol directly, and some that support multiple methods
(native+network or legacy+OO). How would these handle it? While I
support the idea of gradually abandoning the legacy API in favor of an
OO API, it increasingly looks like Firebird 6 will be another big
upgrade barrier of version 3 proportions, just of a different kind. There
are very popular and widely used connection libraries like FireDAC in
Delphi. If they do not adapt, or take a long time to do so, Firebird
users will be effectively locked out from using such v6 features for
years or even forever (till they change the access provider).

Second, getCompositeDescriptor returning a descriptor to be parsed is a
completely bad idea. The biggest advantage of the OO API is that it
frees developers from buffer parsing and construction with all those
Builder interfaces. And now this. Phrases like "convenient parse" in
relation to the OO API are nonsense, and along with "supports almost
unlimited nesting" plainly scare me. Driver developers (like me) rely on
either directly accessing data items through the interface, or using
auxiliary data structures with native types for the language to avoid
expensive calls across language boundaries. So I would at least expect
getCompositeDescriptor to return a specialized interface that would save
us from writing and maintaining a parser on potentially evolving
recursive structure.

regards
Pavel Cisar

Denis Simonov

Dec 10, 2024, 5:06:29 AM
to firebird-devel


Tuesday, December 10, 2024 at 12:29:14 UTC+3, Pavel Cisar:
Hi,

I have serious concerns about the client interface.

First, only some extension to IMessageMetadata is mentioned. However,
there are drivers that use the legacy API and those that use the network
protocol directly, and some that support multiple methods
(native+network or legacy+OO). How would these handle it? While I
support the idea of gradually abandoning the legacy API in favor of an
OO API, it increasingly looks like Firebird 6 will be another big
upgrade barrier of version 3 proportions, just of a different kind. There
are very popular and widely used connection libraries like FireDAC in
Delphi. If they do not adapt, or take a long time to do so, Firebird
users will be effectively locked out from using such v6 features for
years or even forever (till they change the access provider).

No one is obligated to use the new data type for transmission to the client. 
This is not at all the same as it was in 4.0, when the old SQL code began returning INT128 or types with TIME ZONE. 
FireDAC uses a mixed approach (Legacy + OO API) and adapts to innovations quite quickly.
 
Second, getCompositeDescriptor returning a descriptor to be parsed is a
completely bad idea. The biggest advantage of the OO API is that it
frees developers from buffer parsing and construction with all those
Builder interfaces. And now this. Phrases like "convenient parse" in
relation to the OO API are nonsense, and along with "supports almost
unlimited nesting" plainly scare me. Driver developers (like me) rely on
either directly accessing data items through the interface, or using
auxiliary data structures with native types for the language to avoid
expensive calls across language boundaries. So I would at least expect
getCompositeDescriptor to return a specialized interface that would save
us from writing and maintaining a parser on potentially evolving
recursive structure.

I agree about the specialized interface. It should be there, but there should also be a low-level description of the type so that driver 
writers can implement it in the network protocol themselves. The second is there, but I unfortunately didn't see the first.
 


regards
Pavel Cisar

Dimitry Sibiryakov

Dec 10, 2024, 5:42:26 AM
to firebir...@googlegroups.com
Pavel Cisar wrote 10.12.2024 10:28:
> First, only some extension to IMessageMetadata is mentioned. However, there are
> drivers that use the legacy API and those that use the network protocol directly,
> and some that support multiple methods (native+network or legacy+OO). How would
> these handle it?

Can't the ROW type be automatically unnested in the API? I.e. a set of plain
fields is created in IMessageMetadata for such a field.

--
WBR, SD.

Adriano dos Santos Fernandes

Dec 10, 2024, 6:50:15 AM
to firebir...@googlegroups.com
On 09/12/2024 12:13, Alexey Mochalov wrote:
>
> 2) declaring a ROW type variable specifying a cursor as a base:
>
> EXECUTE BLOCK
>
> RETURNS (out_1 int, out_2 char(10))
>
> AS
>
> DECLARE test_cursor CURSOR FOR (SELECT id, name FROM test_table);
>
> DECLARE local_row_var TYPE OF TABLE test_cursor;
>

Is this standard? "TYPE OF CURSOR" would look better for me.


>
> 3) declaring as PSQL block arguments and return values by specifying a
> base table. Arguments and return values will contain the same fields as
> the base table:
>
> CREATE PROCEDURE row_returning_proc
>
> RETURNS (out_1 TYPE OF TABLE test_table, out_2 TYPE OF TABLE test_table)
>
> AS
>
> BEGIN
>
> out_1.id = 1;
>
> out_1.num = 2;
>
> out_1.name = 'David';
>
> out_2 = (3, 4, 'John');
>
> SUSPEND;
>
> END;
>
>
> SELECT * FROM row_returning_proc;
>
>
>                                    OUT_1                                    OUT_2
> ======================================== ========================================
>                            (1, 2, David)                             (3, 4, John)
>
>

Is the order (of input and output) based on column IDs or positions?


>
> 5) you can "select into" to a ROW type object
>
> EXECUTE BLOCK
>
> RETURNS (out_1 TYPE OF TABLE test_table)
>
> AS
>
> BEGIN
>
> SELECT * FROM test_table INTO out_1;
>

Looks weird when considering that this was valid with a single column and
had different behavior/semantics.


>
> In PSQL, if a package type is used without specifying the package name
> qualifier, and the domain name coincides with the package type name, the
> domain will be selected instead of the package type declared within the
> same package.
>

This looks different from standard scope rules. Why?


>
> I hope this scheme will help to explain more clearly the structure and
> relations between these fields in system tables (new fields and table
> are bolded):
>
>
>
> Dependencies
>
> A new dependency type, obj_packaged_type, has been added. It is used
> when routines contain variables or parameters with a type declared in a
> package. Such dependencies prevent the removal of types from packages
> but do not restrict their modification. This dependency also prohibits
> deleting the package itself but allows altering it via ALTER PACKAGE.
>

There was no direct dependency for packaged items before, and
verification of possible deletes was done in the package commands. Why
is this different?


>
> The structure of the descriptor passed to the client:
>
> 2 bytes - ordinal number of the described field inside the descriptor
> starting from 1
>
> 1 byte - length of the field name
>
> 0..255 bytes - field name
>
> 2 bytes - field type
>
> 4 bytes - field data length
>
> 2 bytes - subtype
>
> 2 bytes - scale
>
> 2 bytes - charset
>
> 1 byte - 0 value or 2 bytes - ordinal number of the subfield
>

This looks not extensible at all. There must be tags.


>
> In addition, I used 4 additional type codes for ROW types:
>
> blr_rowtype- to declare ROW type objects with description of all
> internal fields. It is used when generating input and output messages of
> the requests
>
> blr_rowtype2- used if ROW type is declared in the package
>
> blr_rowtype3- used if the object is declared with TYPE OF TABLE
>
> blr_rowtype4- used if the object type is declared locally inside a PSQL
> block
>

Look at blr_invoke_function. Please, don't create such a high number of top
level codes.


>
> Conclusion
>
> In this proposal I haven't yet touched on the specific implementation in
> the code, as I believe this should be discussed in detail later. I think
> it’s best to break the discussion into several parts. Once we’ve
> confirmed that the main proposal is clear and without major concerns,
> I’ll move on to discussing the implementation details.
>
> If you think any important details are missing (which I'm pretty sure
> they are), I’m happy to provide additional information.
>

Can a row type be used in a table column?

Can circular dependencies exist?

Are two similar TABLE row types from different tables compatible?


Adriano

Alexey Mochalov

Dec 10, 2024, 9:25:17 AM
to firebir...@googlegroups.com
On 10.12.2024 09:37, Denis Simonov wrote:
1. How are dependencies tracked? For example, you declared the type
DECLARE VARIABLE local_row_var TYPE OF TABLE test_table;
And then did
ALTER TABLE test_table
ADD NewField BIGINT;
What will happen to the procedure that uses such a type?
If ALTER TABLE is blocked, it is not very good. Maybe think about INVALID statuses for procedures/functions?
Currently, in procedures and functions where variables dependent on tables are explicitly assigned (e.g., <row value constructor>), a runtime error will occur if the structure or type of the base table fields changes. This happens due to a mismatch in the number of fields between the objects involved in the assignment. This is a drawback.

If there is no explicit assignment with potential conflicts, the procedure or function will continue to work correctly after the cache is cleared and reloaded, considering any changes to dependent parameters, if applicable. This is an advantage.

I decided that prohibiting changes to the format of tables on which procedures depend would be somewhat excessive.

If we add invalidation of procedures based on changes to table formats, we would also invalidate procedures which could work properly. The recompilation mechanism cannot distinguish between potentially functional and conflicting procedures based solely on BLR without running another dsqlPass. This is a drawback.

2. Are NEW and OLD context variables in DML triggers of the ROW type? Can the same operations be performed with them?
The NEW and OLD contexts have not been converted to the ROW type and remain unchanged.

3. Is there support for destructuring ROW types?
(:A, :B) = local_row_var ;
Will the ROW TYPE be decomposed into these variables?
No, such functionality is not currently available, but it seems like a great direction for future development.

4. Is there a signature of the IMessageMetadata::getCompositeDescriptor() function and others related to it, or is it still hypothetical?
Please note that in your format you may have to add a schema after the schemas are merged with the main branch.
The signature of this function is:
const char* getCompositeDescriptor(CheckStatusWrapper* status, unsigned index);

To make things more convenient, an auxiliary function has been added:
Array<MsgUtil::SubfieldData> MsgUtil::getSubfieldsData(CheckStatusWrapper* status, IMessageMetadata* msgMetadata, const UCHAR* msgDataBuffer, unsigned fieldIndex);

The structure MsgUtil::SubfieldData is almost identical to IsqlVar from isql.h.

And yes, the presence of schemas has not yet been considered in this proposal. I am confident that we can come up with an elegant solution for delivering metadata of internal fields, taking into account all suggestions and the functionality of schemas.

I welcome any suggestions for a more idiomatic and convenient design and structure for this part of the API. 

Dimitry Sibiryakov

Dec 10, 2024, 9:29:03 AM
to firebir...@googlegroups.com
Alexey Mochalov wrote 10.12.2024 15:25:
> 4. Is there a signature of the IMessageMetadata::getCompositeDescriptor()
> function and others related to it, or is it still hypothetical?
>
> Please note that in your format you may have to add a schema after the
> schemas are merged with the main branch.
>
> The signature of this function is:
> const char* getCompositeDescriptor(CheckStatusWrapper* status, unsigned ind

Why doesn't it return MessageMetadata instead? This way you can avoid further
parsing and changes in the API.

--
WBR, SD.

Denis Simonov

Dec 10, 2024, 9:50:53 AM
to firebird-devel


Tuesday, December 10, 2024 at 17:29:03 UTC+3, Dimitry Sibiryakov:
+1

The message structure for RECORD TYPE is not much different from the original MessageMetadata message. 
In this case, you can simply get your own IMessageMetadata for each nesting level.
 


--
WBR, SD.

Pavel Cisar

Dec 10, 2024, 9:54:33 AM
to firebir...@googlegroups.com
On 10. 12. 24 at 11:06 Denis Simonov wrote:
>
> No one is obligated to use the new data type for transmission to the
> client. This is not at all the same as it was in 4.0, when the old
> SQL code began returning INT128 or types with TIME ZONE.

Yes, certainly. But my comment wasn't about upgrade problems, but about the
inability to use the feature in applications outside the server side.

Also, the ROW data type is a precursor for table data type (list of
rows), so it would be nice if the ROW's design would take that possible
future extension into account.

> FireDAC uses a mixed approach (Legacy + OO API) and adapts to
> innovations quite quickly.

Well, in the Python driver I will have no problems at all presenting such
values as lists (even nested when necessary) on both input and output.
Transformation to dict wouldn't be a problem either. However, I'll have
to create a flattening workaround for DESCRIPTOR, as its format defined
in Python DB API 2.0 does not support nesting, but that's a trivial task
for such a simple data structure. I will probably add some supplementary
nested structure as an alternative without field name changes.

For FireDAC, IBX etc. it could be more problematic, as they would need
to find a way to deal with the DataSet model that powers data-aware
components. It's likely that some flattening would be needed. As Dimitry
suggested, there could be some support for such flattening in the
(IMessageMetadata or other) interface that could help with that;
otherwise everyone would have to reinvent their own method.

regards
Pavel

Dimitry Sibiryakov

Dec 10, 2024, 10:05:13 AM
to firebir...@googlegroups.com
Pavel Cisar wrote 10.12.2024 15:53:
> It's likely that some flattening would be needed. As Dimitry suggested, there
> could be some support for such flattening in the (IMessageMetadata or other)
> interface that could help with that, otherwise everyone would have to reinvent
> their own method.

Actually it can be "flattened" at the BLR level as well, solving all problems
with new BLR verbs and dependencies. In this case the type would be just PSQL
syntax sugar and changes in the engine would be limited to the parser only.

--
WBR, SD.

Alexey Mochalov

Dec 10, 2024, 10:47:15 AM
to firebir...@googlegroups.com

I agree with you. Considering that row types might become a table field type in the future, blocking their modification seems like a fairly logical approach. Unfortunately, however, it seems that this approach does not have much support among users. That's why I chose to allow modifications to base tables as the initial approach in the implementation.


10 дек. 2024 г., в 10:50, 'liviuslivius' via firebird-devel <firebir...@googlegroups.com> написал(а):

Alexey Mochalov

Dec 10, 2024, 12:03:04 PM
to firebird-devel

You are correct; the old API is deprecated and does not support ROW types. The main reason is that it would require extending the XSQLVAR structure, which would automatically break all drivers that use it.

You are absolutely right that users of compiled client libraries need a good interface for sequential access to the internal fields of composite types. The description of the serialized descriptor is necessary for implementing the network protocol.


Tuesday, December 10, 2024 at 12:29:14 UTC+3, Pavel Cisar:
Hi,

Alexey Mochalov

Dec 10, 2024, 12:14:14 PM
to firebird-devel

I would say yes, but I'm not entirely sure what exactly you mean in your question.

To discuss such a possibility, we first need to define what "unnested" means in this context. Does it mean accessing the internal fields as simple types?

If so, additional questions arise:

  • What should be done with the description of the top-level composite type field?
  • How can we determine the start and end of simple type fields that belong to a single composite field?
  • If we are considering (and we are) the future implementation of nested composite fields, where do they start and end? How can we determine how many fields are actually inside a composite type?

I'm not rejecting the idea; I just need a bit of clarification from your side.


Tuesday, December 10, 2024 at 13:42:26 UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

Dec 10, 2024, 12:31:35 PM
to firebir...@googlegroups.com
Alexey Mochalov wrote 10.12.2024 18:14:
> I would say yes, but I'm not entirely sure what exactly you mean in your question.
>
> To discuss such a possibility, we first need to define what "unnested" means in
> this context. Does it mean accessing the internal fields as simple types?

Yes.

> If so, additional questions arise:
>
> * What should be done with the description of the top-level composite type field?

It just disappears. The composite type field is transformed into separate simple
types during parsing and compilation.

> * How can we determine the start and end of simple type fields that belong to
> a single composite field?

By position as usual.

> * If we are considering (and we are) the future implementation of nested
> composite fields, where do they start and end? How can we determine how many
> fields are actually inside a composite type?

By count and position on declaration.

I.e. your query "SELECT * FROM row_returning_proc" would instead produce the
following result:

> OUT_1.ID OUT_1.NUM OUT_1.NAME OUT_2.ID OUT_2.NUM OUT_2.NAME
> ============ ============== ============== ========== ============ =============
> 1 2 David 3 4 John

Both OO API and ISC API would have no problem with it and most envelope
libraries would not notice anything.

In the input direction, the query 'execute procedure proc_with_row_types(?)' produces
MessageMetadata with three fields and packs three values from the message buffer into
the row.

But if you accept my idea to unnest the row data type at the BLR level as well, this
procedure will have an input message containing three plain fields and no new BLR
verbs will be needed, because for the following example

> CREATE PROCEDURE p1 (inpar TYPE OF TABLE test_table)
> RETURNS (outpar TYPE OF TABLE test_table)
> AS
> BEGIN
> outpar = inpar;
> SUSPEND;
> END;

generated BLR and signature in RDB$PROCEDURE_PARAMETERS could be exactly the
same as for the following procedure:

> CREATE PROCEDURE p1 (inpar.id int, inpar.num smallint, inpar.name char(10))
> RETURNS (outpar.id int, outpar.num smallint, outpar.name char(10))
> AS
> BEGIN
> outpar.id = inpar.id;
> outpar.num = inpar.num;
> outpar.name = inpar.name;
> SUSPEND;
> END;

--
WBR, SD.

Dimitry Sibiryakov

Dec 10, 2024, 12:36:06 PM
to firebir...@googlegroups.com
'Dimitry Sibiryakov' via firebird-devel wrote 10.12.2024 18:31:
> generated BLR and signature in RDB$PROCEDURE_PARAMETERS could be exactly the same

In short it is the same as for "create view v as select * from t": "*" is
expanded at the moment of compilation and following changes in t don't affect v.

--
WBR, SD.

Alexey Mochalov

Dec 10, 2024, 1:02:25 PM
to firebird-devel
Tuesday, December 10, 2024 at 14:50:15 UTC+3, Adriano dos Santos Fernandes:
Is this standard? "TYPE OF CURSOR" would look better for me.

No, the standard does not include the TYPE OF TABLE syntax, so we are free to implement it as we see fit. I avoided adding multiple syntaxes, but on reflection, TYPE OF CURSOR seems like a good option.

Is the order (of input and output) based on column IDs or positions?

Positions
 
Looks weird when considering that this was valid with a single column and
had different behavior/semantics.

This part follows the standard, which defines some very unintuitive parsing rules for certain situations. In short:
  • When a SELECT returns a single field, the INTO list must contain a field of the same or a convertible type.
  • When a SELECT returns multiple fields, the INTO list must contain the same number of fields of the same or convertible types, or a ROW object with the corresponding number of internal fields of the same or convertible types.
This looks different from standard scope rules. Why?

This is implemented this way because a package type can be explicitly called using the package name qualifier, whereas domains do not have such a capability.

There was no direct dependency for packaged items before, and
verification of possible deletes was done in the package commands. Why
is this different?

This might be due to my insufficient understanding of the mechanism and all the side effects of package recompilation. I'll analyze that.

This looks not extensible at all. There must be tags.
 
Honestly, I can't say I support this idea, but I also can't argue against it. If examples or analogies could be provided to help me better understand what is meant, that would be great.
 
Look at blr_invoke_function. Please, don't create such a high number of top
level codes.

Thank you, I’ll definitely consider your example. Initially, I settled on using unambiguous type values, no matter how odd that may seem, for the sake of clarity. 

Can a row type be used in a table column?

 Not yet, but in the future, I would like to add such functionality.

Can circular dependencies exist?

For now, I don’t see reasons or scenarios where they might arise. I’ll make a note to double-check this. Thank you for pointing it out.

Are two similar TABLE row types from different tables compatible?

Yes, ROW types are compatible if they contain the same number of fields of compatible types.
 

Denis Simonov

Dec 10, 2024, 1:03:54 PM
to firebird-devel
I am against ROW TYPE decomposition into flat fields. This can be done but only in compatibility mode (a special parameter is needed).

As for legacy clients, you can do something like

SET BIND OF ROW TO BLOB

And the corresponding DataTypeCompatibility behavior. BLOBs are not so slow for network transmission if this is applied: https://github.com/FirebirdSQL/firebird/pull/8318

BLOB can contain the internal structure of the ROW type in binary form.

Tuesday, December 10, 2024 at 21:02:25 UTC+3, Alexey Mochalov:

Alexey Mochalov

Dec 10, 2024, 2:35:28 PM
to firebird-devel

In fact, the initial implementation approach was exactly that—nothing more than syntactic sugar. Groups of simple values emulated the behavior of ROW types. Composite values were decomposed into sequences of fields and endlessly recalculated to match both sides of assignment and comparison operations. The real "fun" was stuffing such decomposed sequences into system tables as the return type for functions.


Tuesday, December 10, 2024 at 18:05:13 UTC+3, Dimitry Sibiryakov:

Alexey Mochalov

Dec 10, 2024, 3:21:05 PM
to firebird-devel

Tuesday, December 10, 2024 at 20:31:35 UTC+3, Dimitry Sibiryakov:
> If so, additional questions arise:
> * What should be done with the description of the top-level composite type field?

It just disappears. The composite type field is transformed into separate simple
types during parsing and compilation.

This way, we deprive the user of some information about the result of the query they are executing.

> * How can we determine the start and end of simple type fields that belong to
> a single composite field?

By position as usual.

> * If we are considering (and we are) the future implementation of nested
> composite fields, where do they start and end? How can we determine how many
> fields are actually inside a composite type?

By count and position on declaration.

I.e. your query "SELECT * FROM row_returning_proc" instead produce following
result:

> OUT_1.ID OUT_1.NUM OUT_1.NAME OUT_2.ID OUT_2.NUM OUT_2.NAME
> ============ ============== ============== ========== ============ =============
> 1 2 David 3 4 John

With such an approach, the user would not be able to understand where different composite fields begin and end. To illustrate, consider the query:

SELECT 1, 2, ROW(3, 4, 5), 6, 7, ROW(8, 9), 10 FROM RDB$DATABASE;

Without additional metadata, the user has no way to determine where the grouped sets of values are, even though the SQL standard allows them to reference these groups.

Both OO API and ISC API would have no problem with it and most envelope
libraries would not notice anything.
 
That's absolutely correct—drivers wouldn’t notice. And users wouldn’t notice either because they wouldn’t have any way to differentiate the results between the old and new syntax.
 
But if you accept my idea to unnest the row data type at the BLR level as well, this
procedure will have an input message containing three plain fields and no new BLR
verbs will be needed, because for the following example

It is a good idea and I don’t just accept it—I even tried it. It doesn’t work and is absolutely mind-breaking to implement when you start dealing with ROW types comparison rules, function and procedure parameters, NOT NULL constraints, sorting, grouping, etc.

Dimitry Sibiryakov

Dec 10, 2024, 5:53:14 PM
to firebir...@googlegroups.com
Alexey Mochalov wrote 10.12.2024 21:21:
> It just disappear. Composite type field is transformed into separate simple
> types during parse and compilation.
>
> This way, we deprive the user of some information about the result of the query they are executing.

Exactly what information, and what use can it have for a user?

> It is a good idea and I don’t just accept it—I even tried it. It doesn’t work
> and is absolutely mind-breaking to implement when you start dealing with ROW
> types comparison rules, function and procedure parameters, NOT NULL constraints,
> sorting, grouping, etc.

But how have you managed to work around all these problems in the new composite
type? Don't say "simple memcpy() for the whole buffer", because it cannot work with
character data.

--
WBR, SD.

Adriano dos Santos Fernandes

Dec 10, 2024, 8:44:56 PM
to firebir...@googlegroups.com
Certainly a form of unnest would be good, but what you propose makes no
sense at all.


Adriano

Adriano dos Santos Fernandes

Dec 10, 2024, 8:51:27 PM
to firebir...@googlegroups.com
On 10/12/2024 15:02, Alexey Mochalov wrote:


> Is the order (of input and output) based on column IDs or positions?
>
>
> Positions
>

Are cached compiled routines invalidated correctly when new columns are
added not in the tail of the table?


>
> This looks different from standard scope rules. Why?
>
> This is implemented this way because a package type can be explicitly
> called using the package name qualifier, whereas domains do not have
> such a capability.
>

The same for packaged procedure vs table, no? So I still think a new
rule should not be created.


>
> This looks not extensible at all. There must be tags.
>
>  
> Honestly, I can't say I support this idea, but I also can't argue
> against it. If examples or analogies could be provided to help me better
> understand what is meant, that would be great.
>  

Just imagine you have that structure and someone is adding schemas.

That's easy to imagine now, but something like that can appear in
the future again.



Denis Simonov

Dec 11, 2024, 3:26:23 AM
to firebird-devel
I have another question. We have so-called cursor variables.

FOR SELECT
      ...
    AS CURSOR C
DO BEGIN
     A = :C.Field;
END

How does this relate to ROW TYPE? Are there any mechanisms to turn C into ROW TYPE?

Can the ROW constructor work as ROW (table_alias.*) or ROW (C.*)?

A bit of a digression from the main topic, but still. Is there a TABLE VALUE CONSTRUCTOR planned?

VALUES [ROW] (...), [ [ROW] (...) ...]

Wednesday, December 11, 2024 at 04:51:27 UTC+3, Adriano dos Santos Fernandes:

Dmitry Yemanov

Dec 11, 2024, 3:33:04 AM
to firebir...@googlegroups.com
11.12.2024 11:26, Denis Simonov wrote:

> A bit of a digression from the main topic, but still. Is there a TABLE
> VALUE CONSTRUCTOR planned?
>
> VALUES [ROW] (...), [ [ROW] (...) ...]

It's implemented and will be proposed separately.


Dmitry

Denis Simonov

Dec 11, 2024, 4:09:41 AM
to firebird-devel
Alexey Mochalov,

If I understood Adriano correctly, he suggests replacing your descriptor in such a way that it can be expanded without worrying too much about the positions of your bytes. Roughly the same way as it is done with isc_dpb/isc_info tags, by preceding each group of bytes with a tag constant. That is, replace this

2 bytes - ordinal number of the described field inside the descriptor starting from 1

1 byte - length of the field name

0..255 bytes - field name

2 bytes - field type

4 bytes - field data length

2 bytes - subtype

2 bytes - scale

2 bytes - charset

1 byte - 0 value or 2 bytes - ordinal number of the subfield


for something like that


isc_rec_description

isc_rec_order <2 bytes> - ordinal number of the described field inside the descriptor starting from 1

isc_rec_field_length <1 byte> - length of the field name

isc_rec_field_name <0..255 bytes> - field name

isc_rec_field_type <2 bytes> - field type

isc_rec_data_length <4 bytes> - field data length

isc_rec_sub_type <2 bytes> - subtype

isc_rec_field_scale <2 bytes> - scale

isc_rec_charset <2 bytes> - charset

isc_rec_subfield_number <1 byte - 0 value or 2 bytes> - ordinal number of the subfield

isc_rec_description_end


In this case, you can expand this description wherever you want and ignore anything unknown.


Wednesday, December 11, 2024 at 11:33:04 UTC+3, Dmitry Yemanov:

Pavel Cisar

Dec 11, 2024, 6:18:33 AM
to firebir...@googlegroups.com
On 10. 12. 24 at 23:53 'Dimitry Sibiryakov' via firebird-devel wrote:
>
>   What exactly information and what usage it can have for a user?

You missed the primary point of ROW values: they can be referenced and
used as a whole. It has many applications and allows great code
simplifications. Flattening solves the problem when some interface (like
the Delphi dataset) can't handle such nested structures to access individual
encapsulated values, but at the price that you lose the ability to use
the composite value as a whole.

It's a typical "eat the cake and also have it" kind of problem. Certainly,
drivers not constrained to flat structures (and there are many) will not
have such a problem at all. Hence it's a bad idea to force them to use
flattening. Some support for those who would need it would be great, but
not essential.

regards
Pavel Cisar

Dimitry Sibiryakov

Dec 11, 2024, 6:21:53 AM
to firebir...@googlegroups.com
Pavel Cisar wrote 11.12.2024 12:17:
> On 10. 12. 24 at 23:53 'Dimitry Sibiryakov' via firebird-devel wrote:
>>
>>    What exactly information and what usage it can have for a user?
>
> You missed the primary point of ROW values, they could be referenced and used as
> a whole. It has many applications, and allows great code simplifications.

Yes, you are right, I see no point in this feature; that's why I asked the
question above.
I don't see any code simplification in the proposal.
Also I don't think that violating 1NF is a good idea.

--
WBR, SD.

Pavel Cisar

Dec 11, 2024, 6:42:55 AM
to firebir...@googlegroups.com
On 11. 12. 24 at 12:21 'Dimitry Sibiryakov' via firebird-devel wrote:
>
>   Yes, you are right I see no point in this feature that's why I asked
> the question above.

Maybe at the SQL level, because you never participated in the development of a
complex business application, but you have certainly used STRUCT types in
C/C++ with members that are also STRUCT types? I saw those in the Firebird
source. Do you think that unnesting them all would make your life easier?

regards
Pavel Cisar


Dimitry Sibiryakov

Dec 11, 2024, 6:58:22 AM
to firebir...@googlegroups.com
Pavel Cisar wrote 11.12.2024 12:41:
> Maybe at the SQL level because you never participated in development of complex
> business application, but you certainly used STRUCT types in C/C++ with members
> that are also STRUCT types? I saw those in Firebird source.

Say you have a query "select a.*, b.* from a,b". Currently the result set of
this query is unnested and you access fields individually. The code can be
seen in the examples.
What will the application code look like for the query "select row(a.*),
row(b.*) from a,b"?

> You think that unnesting them all will make your life easier?

Actually, yes, if instead of accessing nested data as "std::cout << a.b" I
had to write something like this:

auto desc = a.getDescriptor(b);
for (const unsigned char* p = desc.getBuffer(); *p;)
{
    auto order = isc_portable_integer(p, 2);
    p += 2;
    auto nameLength = *p++;
    std::string fieldName(reinterpret_cast<const char*>(p), nameLength);
    p += nameLength;
    auto type = isc_portable_integer(p, 2);
    p += 2;
    auto length = isc_portable_integer(p, 4);
    p += 4;
    p += 2; // subtype
    auto scale = isc_portable_integer(p, 2);
    p += 2;
    p += 2; // charset
    switch (type)
    {
        case SQL_LONG:
        {
            ISC_LONG buffer;
            memcpy(&buffer, messagebuffer + messageMetadata.getOffset(status,
                index), sizeof(ISC_LONG));
            std::cout << (buffer / pow10(-scale));
            break;
        }
    }
    p++; // skip the 0 terminator of this field's subfield list
}

And this is the code for accessing the row data that I derived from the
proposal. So, show me the simplification.

--
WBR, SD.

Denis Simonov

Dec 11, 2024, 7:09:15 AM
to firebird-devel
Dmitry, you are completely distorting the essence of ROW types, which, as they said, are a single whole.

Firstly, the code for parsing a ROW type descriptor can be executed once, and then put into a structure and worked with, as is done for a flat field buffer. 
That is, such complex manipulations do not need to be performed for each cursor record. In addition, for most users, these manipulations will generally be hidden in the interfaces.

Secondly, ROW types are fundamentally important; they are one of the building blocks on the basis of which GEO types, for example Point, can appear. And this does not violate normal forms at all.

Wednesday, December 11, 2024 at 14:58:22 UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

Dec 11, 2024, 7:20:01 AM
to firebir...@googlegroups.com
Denis Simonov wrote 11.12.2024 13:09:
> Dmitry, you are completely distorting the essence of ROW types, which, as they
> said, are a single whole.

At PSQL level this type is fine. That's why I don't mind it there.

> Firstly, the code for parsing a ROW type descriptor can be executed once, and
> then put into a structure and worked with, as is done for a flat field buffer.

If it can, it must already be done behind the scenes in the Y-valve. The current
proposal, in its API part, is a complete nightmare.

> That is, such complex manipulations do not need to be performed for each cursor record.

Maybe. I would like to see the proposed code from ISQL that prints this:

> SELECT * FROM row_returning_proc;
>
> OUT_1 OUT_2
> ======================================== ========================================
> (1, 2, David) (3, 4, John)

--
WBR, SD.

Alexey Mochalov

Dec 11, 2024, 9:24:33 AM
to firebird-devel
Wednesday, December 11, 2024 at 04:51:27 UTC+3, Adriano dos Santos Fernandes:
On 10/12/2024 15:02, Alexey Mochalov wrote:

> Is the order (of input and output) based on column IDs or positions?
>
>
> Positions
>

Are cached compiled routines invalidated correctly when new columns are
added not in the tail of the table?

If they depend on tables that change, the cache is cleared and updated on the next routine call, regardless of what was changed and how. Or am I missing something in your question?

>
> This looks different from standard scope rules. Why?
>
> This is implemented this way because a package type can be explicitly
> called using the package name qualifier, whereas domains do not have
> such a capability.
>

The same for packaged procedure vs table, no? So I still think a new
rule should not be created.

Yes, I think it’s worth making it consistent if everyone agrees. I’ll add this to the discussion summary.

>
> This looks no extensible at all. There must be tags.
>
>  
> Honestly, I can't say I support this idea, but I also can't argue
> against it. If examples or analogies could be provided to help me better
> understand what is meant, that would be great.
>  

Just imagine you have that structure and someone is adding schemas.

That's easy to imagine now, but something like that can appear again in
the future.

Thank you, the scenario where an extension might be needed is clear, but I was more asking how you think it should look structurally.

Dimitry Sibiryakov

unread,
Dec 11, 2024, 9:33:04 AM12/11/24
to firebir...@googlegroups.com
Alexey Mochalov wrote 11.12.2024 15:24:
> That's easy to imagine now, but think something like that can appear in
> the future again.
>
> Thank you, the scenario where an extension might be needed is clear, but I was
> more asking how you think it should look structurally.

As was said, it is better handled inside the Y-valve, completely
invisible to the API user. In that case you get the freedom to change the
implementation whenever you like.
What the user should get is (at least) a kind of handle, similar to a BLOB ID,
which can be used to obtain metadata and data for such nested fields.

--
WBR, SD.

Alexey Mochalov

unread,
Dec 11, 2024, 9:37:57 AM12/11/24
to firebird-devel
Wednesday, December 11, 2024, 11:26:23 AM UTC+3, Denis Simonov:
I have another question. We have so-called cursor variables.

FOR SELECT
      ...
    AS CURSOR C
DO BEGIN
     A = :C.Field;
END

How does this relate to ROW TYPE? Are there any mechanisms to turn C into ROW TYPE?

Currently, there is no such mechanism. However, it is possible to specify a ROW type variable derived from a table in the INTO list and use the fields of that variable.

Can the ROW constructor work as ROW (table_alias.*) or ROW (C.*)?

This syntax is not implemented.

Alexey Mochalov

unread,
Dec 11, 2024, 9:53:07 AM12/11/24
to firebird-devel

Thank you for the explanation; I think this is an excellent idea and worth implementing.

Wednesday, December 11, 2024, 12:09:41 PM UTC+3, Denis Simonov:

Dimitry Sibiryakov

unread,
Dec 11, 2024, 9:57:48 AM12/11/24
to firebir...@googlegroups.com
Alexey Mochalov wrote 11.12.2024 15:53:
> Thank you for the explanation; I think this is an excellent idea and worth
> implementing.

Unfortunately, in exactly the suggested form it doesn't allow one to "expand this
description wherever you want and ignore anything unknown". For that ability the
structure must be strictly TLV, <tag><length><value>, as in a DPB or an info result
buffer.
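A minimal sketch of why TLV gives that freedom (the tag values here are invented):

#include <cstdint>
#include <cstring>

// Invented tags; a real set would be defined by the API.
constexpr uint8_t TAG_FIELD_NAME = 1;
constexpr uint8_t TAG_FIELD_TYPE = 2;

// Every item carries its own length, so a reader built for today's
// tags silently skips anything added tomorrow.
void walkTlv(const unsigned char* p, const unsigned char* end)
{
    while (p + 3 <= end)            // 1-byte tag + 2-byte length minimum
    {
        const uint8_t tag = *p++;
        uint16_t len;
        std::memcpy(&len, p, 2);
        p += 2;
        switch (tag)
        {
            case TAG_FIELD_NAME: /* consume the name */ break;
            case TAG_FIELD_TYPE: /* consume the type */ break;
            default: break;         // unknown tag: the length lets us skip it
        }
        p += len;
    }
}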

--
WBR, SD.

Alexey Mochalov

unread,
Dec 11, 2024, 10:00:24 AM12/11/24
to firebir...@googlegroups.com
Wed, Dec 11, 2024 at 3:20 PM, 'Dimitry Sibiryakov' via firebird-devel <firebir...@googlegroups.com>:
Denis Simonov wrote 11.12.2024 13:09:
> Firstly, the code for parsing a ROW type descriptor can be executed once, and
> then put into a structure and worked with, as is done for a flat field buffer.

   If it can - it must be already done behind of scene in Y-valve. Current
proposal in API part is a complete nightmare.

I agree, if you take a closer look, the interface does indeed seem clunky. This discussion is aimed precisely at highlighting the questionable aspects.

We can steer the discussion in a more productive direction, for example, by suggesting interface design options. Something like this:

interface MessageMetadata : ReferenceCounted
{
    ...
version: // 5.0 => 6.0
    const string getCompositeDescriptor(Status status, uint index);
    uint getSubfieldsCount(Status status, uint index);
    MessageMetadata getSubField(Status status, uint index); // ???
    MessageMetadata getNextSubField(Status status, uint index); // ???
}

Alexey Mochalov

unread,
Dec 11, 2024, 10:05:45 AM12/11/24
to firebird-devel

So, your suggestion is to completely hide the composite type descriptor from users in the OO API. I think that might make sense. I have absolutely no objections, especially once we define an extension to the OO API in this area that the majority will find satisfactory.

Wednesday, December 11, 2024, 5:33:04 PM UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

unread,
Dec 11, 2024, 11:07:50 AM12/11/24
to firebir...@googlegroups.com
Alexey Mochalov wrote 11.12.2024 16:00:
> We can steer the discussion in a more productive direction, for example, by
> suggesting interface design options. Something like this:

Before that several "more basic" things must be considered.

Let's suppose we agreed that the composite data type must be represented in
the API at all.
Before adding new methods, it must be decided what the already existing methods
shall return for it:

getType(): I suppose something like SQL_ROW.
getSubType(): For a while, only one subtype: SUBTYPE_ROW. Later maybe
SUBTYPE_TUPLE, SUBTYPE_COLLECTION, etc.
getLength(): Definitely the length of the data in the message buffer. It may be
fixed (if handled as a BLOB ID) or variable (if handled as BINARY). A related
question is the required alignment for this data; it has to be fixed because of
the way the data buffer is formed now.
getOffset(): Naturally returns the offset in the data buffer, but the resulting
pointer must be cast to some type for usage. For now let's use an opaque ISC_ROW.

Then we can decide how to work with the data itself. Unfortunately, a direct
cast of the data pointer to a user-defined struct can hardly work (though it
would be the best way) because in this case the structure of the returned record
has to be known at compile time. To handle data of unknown structure we can
reuse the message mechanics:

MessageMetadata IUtil::getRowMetadata(Status status, ISC_ROW* handle) -
returns metadata which can be used to prepare a data buffer for the following call of:
void getRowData(Status status, ISC_ROW* handle, void* data) - fills the data
buffer with data.

Example code:

> for (int line = 0; curs->fetchNext(&status, buffer.get()) == IStatus::RESULT_OK; ++line)
> {
>     IMessageMetadata* subRow = utils->getRowMetadata(&status, buffer.get() + meta->getOffset(&status, 0));
>     std::unique_ptr<unsigned char[]> rowBuffer(new unsigned char[subRow->getMessageLength(&status)]);
>     utils->getRowData(&status, buffer.get() + meta->getOffset(&status, 0), rowBuffer.get());
>
>     // Print data from the row type field
>     for (unsigned j = 0; j < subRow->getCount(&status); ++j)
>     {
>         printf("Field %u: type = %u, length = %u, offset = %u, nullOffset = %u\n",
>             j, subRow->getType(&status, j), subRow->getLength(&status, j),
>             subRow->getOffset(&status, j), subRow->getNullOffset(&status, j));
>         switch (subRow->getType(&status, j))
>         {
>         case SQL_LONG:
>             printf("\t%f\n", *(ISC_LONG*)(rowBuffer.get() + subRow->getOffset(&status, j)) * pow10(-subRow->getScale(&status, j)));
>             break;
>         default:
>             printf("Sub row contains unknown type\n");
>         }
>     }
> }

There may be a simpler option if the data length is variable and ISC_ROW
represents the ROW's data buffer itself. In this case IUtils::getRowData() is not
needed and this piece of the example

> std::unique_ptr<unsigned char[]> rowBuffer(new unsigned char[subRow->getMessageLength(&status)]);
> utils->getRowData(&status, buffer.get() + meta->getOffset(&status, 0), rowBuffer.get());

is reduced to

> unsigned char* rowBuffer = buffer.get() + meta->getOffset(&status, 0);

But in that form the fetched record's data buffer actually contains the unnested
result, which makes this whole bunch of code superfluous.

--
WBR, SD.

Jim Starkey

unread,
Dec 11, 2024, 11:30:41 AM12/11/24
to firebir...@googlegroups.com
May I suggest that it be syntactic sugar for a blob of a specific blob
type? Define and publish a self-describing, platform-independent
serialization format. Implement a library of functions for serialize,
deserialize, create, access, update, and extend operations. Then use a
combination of functions and syntactic DML sugar to hide all that.

As long as the serialization/deserialization formats are respected, a client
can do anything it wants.

I don't see why the remote protocol(s) and BLR need to know anything
about them.  The semantics are applied at higher levels.
Jim Starkey, AmorphousDB, LLC

Denis Simonov

unread,
Dec 11, 2024, 12:09:07 PM12/11/24
to firebird-devel
I completely disagree with Dimitry Sibiryakov. There is no need to get a description of the ROW type for each record. This description cannot change.

We need to decide whether ROW TYPEs can be nested. I did not see this in the current proposal, but could it become possible in the future? If not, then it is quite sufficient to simply extend IMessageMetadata:


interface MessageMetadata : ReferenceCounted
{
    ...
version: // 5.0 => 6.0
    uint getSubFieldsCount(Status status, uint index);
    char* getSubFieldName(Status status, uint index, uint subIndex);
    unsigned getSubFieldType(Status status, uint index, uint subIndex);
    int getSubFieldSubType(Status status, uint index, uint subIndex);
    unsigned getSubFieldLength(Status status, uint index, uint subIndex);
    int getSubFieldScale(Status status, uint index, uint subIndex);
    unsigned getSubFieldCharSet(Status status, uint index, uint subIndex);
    unsigned getSubFieldOffset(Status status, uint index, uint subIndex);
    unsigned getSubFieldNullOffset(Status status, uint index, uint subIndex);
}

Here offset and nullOffset are relative offsets within the ROW type.

The getSubField* methods only work if getType returned the SQL_ROW type. Otherwise, an error is written to the status.

It should be noted that we need symmetric set* methods in IMetadataBuilder.

If there is a potential to create nested RowTypes, then a separate IRowTypeMetadata interface should be preferred.


interface MessageMetadata : ReferenceCounted
{
    ...
version: // 5.0 => 6.0
    IRowTypeMetadata getRowTypeMetadata(Status status, uint index);
}

interface IRowTypeMetadata : ReferenceCounted
{
    uint getCount(Status status);
    unsigned getParentFieldIndex(Status status);
    char* getName(Status status, uint index);
    unsigned getType(Status status, uint index);
    int getSubType(Status status, uint index);
    unsigned getLength(Status status, uint index);
    int getScale(Status status, uint index);
    unsigned getCharSet(Status status, uint index);
    unsigned getOffset(Status status, uint index);
    unsigned getNullOffset(Status status, uint index);
    IRowTypeMetadata getRowTypeMetadata(Status status, uint index);
}

In IRowTypeMetadata, the index refers to the subfields.

Wednesday, December 11, 2024, 7:30:41 PM UTC+3, Jim Starkey:

Dimitry Sibiryakov

unread,
Dec 11, 2024, 12:13:23 PM12/11/24
to firebir...@googlegroups.com
Denis Simonov wrote 11.12.2024 18:09:
> The getSubField* methods only work if getType returned the SQL_ROW type.
> Otherwise, an error is written to the status

Currently getCharset() doesn't do it for numeric fields AFAIK.

--
WBR, SD.

Dimitry Sibiryakov

unread,
Dec 11, 2024, 12:23:55 PM12/11/24
to firebir...@googlegroups.com
Denis Simonov wrote 11.12.2024 18:09:
> There is no need to get a description of the ROW type for each record. This
> description cannot change.

Agree. In this case it returns us to a slightly modified initial suggestion:

IMessageMetadata* IMessageMetadata::getRowMetadata(IStatus* status, unsigned index);

which returns nullptr for any other type.

For my example in the previous message it only changes "utils->" to "meta->"; the
rest is the same.

--
WBR, SD.

Denis Simonov

unread,
Dec 11, 2024, 12:44:07 PM12/11/24
to firebird-devel

In your example, the getRowData method is absolutely unnecessary.
We can get the data for the ROW type with a simple *(buffer + offset), and the data for any subfield of the ROW type as *(buffer + offset + subFieldOffset).
Or even better: you can simply create a corresponding nested struct and everything will decompose by itself; the only problem may be with alignment.
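For illustration, reading an INTEGER subfield would reduce to something like this (offset and subFieldOffset stand in for values from the proposed metadata calls; the direct cast assumes the engine aligns ROW data suitably, which is the caveat above):

#include <cstdint>

int32_t readIntSubfield(const unsigned char* buffer,
                        unsigned offset, unsigned subFieldOffset)
{
    // One pointer addition per level of nesting, nothing more.
    return *reinterpret_cast<const int32_t*>(buffer + offset + subFieldOffset);
}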
Wednesday, December 11, 2024, 8:23:55 PM UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

unread,
Dec 11, 2024, 3:17:50 PM12/11/24
to firebir...@googlegroups.com
Denis Simonov wrote 11.12.2024 18:44:
> In your example, the getRowData method is absolutely unnecessary.

Yes, if the nested data is in the record's message buffer itself, exactly as I
said.

> We can get data for the ROW type with a simple *(buffer + offset), and data for
> any subfield of the ROW type as *(buffer + offset + subFieldOffset).

...or we can skip all this mess and unnest the data before it reaches the API
layer. That will give us exactly the same message data buffer, with merged metadata.

> Or even better. You can simply create a corresponding nested struct and
> everything will decompose itself, the only problem may be with alignment.

Yes, it is exactly alignment that makes static message binding complex, but
that is a separate topic. And as I also said, it requires knowledge of the data
structure ahead of time (either from a fixed database structure or by forcing
message metadata, which is possible for record metadata, but I see no way to do
the same for a ROW field).

--
WBR, SD.

Pavel Cisar

unread,
Dec 12, 2024, 5:25:49 AM12/12/24
to firebir...@googlegroups.com
Hi,

1. Just noticed that the definition of <row_type_definition> covers only the
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] ) variant,
but does not cover the DECLARE TYPE typename TYPE OF TABLE <table_name>
variant used in examples.

2. The API problem revisited. It all boils down to whether we think that in
the future this feature could somehow be used as part of composite data
types stored in the database or not.

a) If there is such a possibility, then it's certainly possible that these
structures could be nested, and some more general API solution would be
better, for example something like Jim suggested.

b) If not, then it's more syntactic sugar than anything else. It
couldn't be nested, and there is no real reason to complicate the
API much for one sub-level. We can flatten it at IMessageMetadata and
just add one item there, GroupId, that links a field with a
particular ROW group. Those who want to work with whole groups can
easily do that with this information, and those who don't care will not
need to do anything. The API just needs to handle the case where input
parameters do not have GroupId filled, but that's not a big deal
either, as the mapping could be done by sequence order. It will also solve
the problem with the legacy API, as only the grouping will be lost / unused.
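A sketch of what group reassembly on the client could look like (getGroupId is the hypothetical new item; 0 means the field belongs to no ROW group, and the stand-in struct below is sample-only):

#include <map>
#include <vector>

// Stand-in for the flattened metadata with the one proposed addition.
// Sample data: three flat fields, the last two forming ROW group 1.
struct FlatMetadata
{
    unsigned getCount() const { return 3; }
    unsigned getGroupId(unsigned index) const { return index == 0 ? 0 : 1; }
};

// Clients that care about ROW groups rebuild them from GroupId;
// everybody else just iterates the flat fields as before.
std::map<unsigned, std::vector<unsigned>> collectGroups(const FlatMetadata& meta)
{
    std::map<unsigned, std::vector<unsigned>> groups;
    for (unsigned i = 0; i < meta.getCount(); ++i)
        if (const unsigned gid = meta.getGroupId(i))
            groups[gid].push_back(i);
    return groups;
}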

regards
Pavel Cisar


Alexey Mochalov

unread,
Dec 12, 2024, 7:47:52 AM12/12/24
to firebir...@googlegroups.com

1.  

Thursday, December 12, 2024, 1:25:49 PM UTC+3, Pavel Cisar:
1. Just noticed that the definition of <row_type_definition> only covers the  
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] ) variant but does not cover the DECLARE TYPE typename TYPE OF TABLE <table_name> variant used in the examples.  

Such syntax is not implemented, and it is not used in the examples I provided. Perhaps I made a mistake somewhere, and I will gladly correct it if you point out where exactly.  

2.  

Wednesday, December 11, 2024, 8:09:07 PM UTC+3, Denis Simonov:
We need to decide whether `ROW TYPES` can be nested. I did not see this in the current proposal, but could this be possible in the future? If not, then it is quite sufficient to simply extend IMessageMetadata.  

Thursday, December 12, 2024, 1:25:49 PM UTC+3, Pavel Cisar:
2. The API problem revisited. It all boils down to whether we think that in the future this feature could somehow be used as part of composite data types stored in the database or not.  

The SQL standard states that ROW types can be fields of tables, meaning that, in my opinion, this should be implemented in the future. Moreover, the standard describes an unlimited level of nesting for composite fields. Currently, there was no point in implementing this, as questions arise even regarding the basic aspects.  

3.  

Wednesday, December 11, 2024, 8:23:55 PM UTC+3, Dimitry Sibiryakov:
Denis Simonov wrote on 11.12.2024 at 6:09 PM:  
> There is no need to get a description of the ROW type for each record. This  
> description cannot change.  
Agree. In this case, it brings us back to a slightly modified initial suggestion:  

IMessageMetadata* IMessageMetadata::getRowMetadata(IStatus* status, unsigned index);
which returns nullptr for any other type.  

In my opinion, this looks good:
interface MessageMetadata : ReferenceCounted
{
    ...
    version: // 5.0 => 6.0
    MessageMetadata getCompositeMetadata(Status status, uint index);
}

We can access the data of an internal field as *(buffer + offset + subFieldOffset). Accordingly, the index in the lower-level metadata will define the position of the internal field.  

This creates certain complications, but they are not too great to overcome. Currently, the value of a ROW type is represented in memory like a record on a data page: a null mask + (field data + field alignment). That's why the data buffer does not contain separate smallint null indicators for the internal fields, which means that MessageMetadata::getNullOffset() has nothing to reference.
The null mask could be replaced with an array of smallints for each internal field at every sublevel, but this is rather wasteful, especially when it comes to storing this in database files.  
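For illustration only, testing the null bit of internal field i in such a mask boils down to the following (the bit order within a byte is an implementation detail; LSB-first is assumed here):

// rowData points at the ROW value; the null mask occupies its
// leading bytes, one bit per internal field.
bool isSubfieldNull(const unsigned char* rowData, unsigned i)
{
    return (rowData[i / 8] >> (i % 8)) & 1;
}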

These considerations are what led me to delegate parsing of the serialized descriptor and the creation of a helper function to the client-side OO API. This function knows how to transform the data:  
Array<MsgUtil::SubfieldData> MsgUtil::getSubfieldsData(CheckStatusWrapper* status, IMessageMetadata* msgMetadata, const UCHAR* msgDataBuffer, unsigned fieldIndex);

Denis Simonov

unread,
Dec 12, 2024, 7:53:08 AM12/12/24
to firebird-devel


Thursday, December 12, 2024, 1:25:49 PM UTC+3, Pavel Cisar:
Hi,

1. Just noticed that definition of <row_type_definition> cover only
DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] ) variant,
but does not cover DECLARE TYPE typename TYPE OF TABLE <table_name>
variant used in examples.

2. The API problem revisited. It all boils down whether we think that in
future this feature could be somehow used as part of composite data
types stored in the database or not.

a) If there is such possibility, then it's certainly possible that these
structures could be nested, and some more general API solution would be
better. For example something like Jim suggested.

What Jim suggests is inefficient and has a lot of overhead. Why would we want to transmit a ROW TYPE over the network as a BLOB if we can just return it in a flat buffer?
It should be noted that one of the fields of a ROW TYPE could also be a BLOB. As I said above, passing a ROW TYPE as a BLOB only makes sense if you have done
SET BIND OF ROW TYPE TO BLOB
 


b) If not, then it's more a syntactic sugar than anything else. It
couldn't be nested, and there is no real reason why to complicate the
API much by one sub-level. We can flatten it at IMessageMetadata, and
just add one item there - GroupId, that will link the field with
particular ROW group. Those who would want to work with whole groups can
easily do that with this information, and those who don't care will not
need to do anything. The API just need to handle the case when input
parameters will not have GroupId filled, but that's not a big deal
either as mapping could be done by sequence order. It will also solve
the problem with legacy API, as only grouping will be lost / unused.


No need to worry about old clients/applications. If an old client does not understand the new type, it will get the error "Datatype unknown", just like it does for DecFloat/INT128 etc.
Nobody forces applications that cannot use the new data types to return them; in that case, keep their use inside the PSQL code.

 
regards
Pavel Cisar


Dimitry Sibiryakov

unread,
Dec 12, 2024, 8:00:54 AM12/12/24
to firebir...@googlegroups.com
Alexey Mochalov wrote 12.12.2024 13:47:
> We can access the data of an internal field as *(buffer + offset +
> subFieldOffset). Accordingly, the index in the lower-level metadata will define
> the position of the internal field.

We can, but you must establish rules for the alignment of a composite field to do
it safely: the usual and fastest way is to cast void* to ISC_LONG*, for example, but
that may crash on ARM platforms if the pointer value is not aligned properly.
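A short sketch of the two ways, for the record:

#include <cstdint>
#include <cstring>

// Fast but unsafe: may trap on ARM if p is not 4-byte aligned.
int32_t unsafeRead(const void* p)
{
    return *static_cast<const int32_t*>(p);
}

// Portable: memcpy compiles down to a plain load when the compiler
// can prove alignment, and stays correct when it cannot.
int32_t safeRead(const void* p)
{
    int32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}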

> Currently, the value of a ROW type is represented in memory as a record on a data page: a null mask + (field data + field alignment).

That's not bad, but you'll have the same problem with interworking between
platforms of different endianness that IReplicator currently has (it uses the
record format for the data in the data packet as well).

--
WBR, SD.

Denis Simonov

unread,
Dec 12, 2024, 8:03:13 AM12/12/24
to firebird-devel
> This creates certain complications, but they are not too great to overcome. Currently, the value of a ROW type is represented in memory as a record on a data page: a null mask + (field data + field alignment). That's why the data buffer does not contain separate smallint values for nullifying internal fields, which means that MessageMetadata::getNullOffset() has nothing to reference.  
> The null mask could be replaced with an array of smallints for each internal field at every sublevel, but this is rather wasteful, especially when it comes to storing this in database files.  

The way it is transmitted over the network does not necessarily need to be reflected one-to-one in the buffer. 
As far as I remember, null indicators for regular fields are also not transmitted directly, but encoded in the null mask.

Thursday, December 12, 2024, 3:53:08 PM UTC+3, Denis Simonov:

Dimitry Sibiryakov

unread,
Dec 12, 2024, 8:05:17 AM12/12/24
to firebir...@googlegroups.com
Denis Simonov wrote 12.12.2024 14:03:
> As far as I remember, null indicators for regular fields are also not
> transmitted directly, but encoded in the null mask.

No, they aren't. Data is transmitted over the network in BLR messages, and these
messages make no distinction between data and null indicators.

--
WBR, SD.

Pavel Cisar

unread,
Dec 12, 2024, 9:19:37 AM12/12/24
to firebir...@googlegroups.com
>
> No need to worry about old clients/applications. If the old client does
> not understand the new type, it will get the error "Datatype in unknown"
> just like it gets for DecFloat/INT128 etc.
> No one forces applications that cannot use new data types to return them
> literally. In this case, limit yourself inside the PSQL code.

Again, it all depends on whether ROW is an actual type or just a
pseudo-type. An actual type means that you can create domains and columns
of this type. Otherwise it's just a pseudo-type that exists only under
specific conditions (aka syntactic sugar). One main distinguishing
factor is that actual types always have names and must be referenced by
name.

So far it seems that ROW is NOT a new data type, but just a collection
of fields defined ad hoc that could also be named. Sure, you can define
it as a "type" via DECLARE TYPE, but also as an UNTYPED collection of values
via a ROW constructor or variable definition (there is no reference to a
base type name in the provided specification and examples).

So, if it is a real type, then proper declaration and reference to its
name should be mandatory in all constructs. Also, it should be defined
as a new field type in the API. If this can become a storage type in the future,
or expand to a table type (a collection of rows), then some more flexible API
interface that would accommodate future changes should be defined from
the beginning.

If it's *not* a type but syntactic sugar, then there is no need to
declare a new field type in the API at all, and it's easier to flatten it
there and add a group id to identify "row" collections. Those who
care will use it, those who don't or can't will not, and everybody will
be happy.

regards
Pavel

Denis Simonov

unread,
Dec 12, 2024, 9:40:39 AM12/12/24
to firebird-devel

Please note: the author said that in the future it is planned to use ROW TYPE as table fields, and even to provide the ability to nest composite types within each other.
In addition, it was said that it is now necessary to decide how this should work so that there is the possibility of expansion in the future.
Therefore, I believe that a good API should be invented right now, even if it causes some difficulties, so as not to suffer later.
ROW TYPE can be a base for GEO types such as Point2D, Point3D, and also for Line. GEO types are also planned for implementation in future versions.
There is no need to make a bad API now, only to have to redo it later.
Thursday, December 12, 2024, 5:19:37 PM UTC+3, Pavel Cisar:

Dimitry Sibiryakov

unread,
Dec 12, 2024, 9:48:48 AM12/12/24
to firebir...@googlegroups.com
Denis Simonov wrote 12.12.2024 15:40:
> Therefore, I believe that a good API should be invented right now, even if it
> causes some difficulties, so as not to suffer later.
> ROW TYPE can be a base for GEO types such as Point2D, Point3D, and also for
> Line. GEO types are also planned for implementation in future versions.
> No need to make some bad API now, so as not to redo it later.

Exactly. But you cannot tell whether an API is good or bad before implementing it. So
the best strategy right now is not to implement any API and see how this type
is accepted by users at the PSQL-only level.

--
WBR, SD.

Pavel Cisar

unread,
Dec 12, 2024, 10:22:41 AM12/12/24
to firebir...@googlegroups.com
On 12.12.24 at 13:47, Alexey Mochalov wrote:
>
> 1.
>
> Thursday, December 12, 2024, 1:25:49 PM UTC+3, Pavel Cisar:
>
> 1. Just noticed that the definition of <row_type_definition> only
> covers the
> DECLARE TYPE typename ( <nested_field> [ , <nested_field> … ] )
> variant but does not cover the DECLARE TYPE typename TYPE OF TABLE
> <table_name> variant used in the examples.
>
>
> Such syntax is not implemented, and it is not used in the examples I
> provided. Perhaps I made a mistake somewhere, and I will gladly correct
> it if you point out where exactly.

You are right, I see it now. DECLARE TYPE uses only the (..,..) form. I was
fooled by the TYPE OF TABLE constructor used for row variables / parameters.

In fact, it would be clearer if one always had to declare a row
type via DECLARE TYPE but could use both methods:

DECLARE TYPE type_name (fspec, fspec...)
or
DECLARE TYPE type_name TYPE OF TABLE table_name

Variables / parameters / values would then require a reference to the row type
instead of using anonymous or table constructors at will. That would simplify
dependency tracking and checks, and the maintenance of SQL in applications.
Now it's all fuzzy and error-prone, and runtime errors especially are awful
surprises in production.

> 2.
>
> The SQL standard states that ROW types can be fields of tables, meaning
> that, in my opinion, this should be implemented in the future. Moreover,
> the standard describes an unlimited level of nesting for composite
> fields. Currently, there was no point in implementing this, as questions
> arise even regarding the basic aspects.

Not in my copy of the SQL 2023 standard, 9075-1 - Framework. Section
4.8.5 (Base Tables and their components), in 4.8.5.2, says that "A column
is a field of the row type of the table.". And section 4.5 says
that rows are an "instance of the row type of the table". Also that: "The
rows of a table have a type, called “the row type”; every row of a table
has the same row type, which is also the row type of the table.". If
columns could be rows, then there would be an interesting recursion of definitions.

Yes, Row types (section 4.5.6.2) are listed among SQL types, so there
can be values of row type, and as relations are just collections of
values of row type, we can have row columns in relations. But it doesn't
mean that this must hold for columns of base relations; it could be just for
product relations.

However, I can't find anything related to unlimited nesting of rows. Maybe
it's in a different part of the standard? Anyway, if Firebird supports
such a composite type with unlimited nesting for storage, it enters the
same nightmare league of "unbounded" data like ARRAY, JSON and XML (just
with a fixed structure), and should probably be handled differently, in a
BLOB-like fashion?

regards
Pavel Cisar


Dmitry Yemanov

unread,
Dec 12, 2024, 11:11:36 AM12/12/24
to firebir...@googlegroups.com
12.12.2024 17:40, Denis Simonov wrote:
>
> Please note. The author said that in the future it is planned to use ROW
> TYPE as table fields, and even to provide the ability to nest composite
> types within each other.

I'd rather say this possibility was considered, but I suppose we
need to decide together whether we need it or not. To the best of my
knowledge, PostgreSQL is the only one that supports stored ROW fields;
others have it purely as a PSQL feature. It would be pleasant to be on the
cutting edge, but the question is whether it's really needed by our users.

> ROW TYPE can be a base for GEO types such as Point2D, Point3D, and also
> for Line. GEO types are also planned for implementation in future versions.

I'm not so sure here. While POINT or LINE could be represented as a ROW,
more complex GEO types are more like collections, as they may contain
an arbitrary number of elements inside.

We have OpenGIS support implemented in draft form; it will be discussed
separately.


Dmitry

Dmitry Yemanov

unread,
Dec 12, 2024, 11:19:06 AM12/12/24
to firebir...@googlegroups.com
12.12.2024 18:21, Pavel Cisar wrote:
>
> In fact, it would be more clear if one will always have to declare a row
> type via DECLARE TYPE, but could use both methods:
>
> DECLARE TYPE type_name (fspec, fspec...)
> or
> DECLARE TYPE type_name TYPE OF TABLE table_name

I'm not sure about that. We don't support CREATE DOMAIN <my domain> TYPE
OF <another domain>. And we allow procedure/function parameters or
variables to be declared via TYPE OF <domain>, not only by using a
previously defined domain name. So the proposed syntax matches what we
already have.


Dmitry

Dmitry Yemanov

unread,
Dec 12, 2024, 11:44:52 AM12/12/24
to firebir...@googlegroups.com
"PSQL level-only" includes input/output parameters, so it's already
surfaced to DSQL users. So "not implementing API" actually means "raise
an error".


Dmitry

Denis Simonov

unread,
Dec 12, 2024, 11:47:32 AM12/12/24
to firebird-devel
I won't be too upset if it isn't possible to have ROW TYPE columns in a table. But I think that support at the API level is a must.
If an application can't handle ROW TYPE, just don't write queries using ROW TYPE. Also, don't forget that IMessageMetadata is used not only in applications but also in external UDRs.
It would be strange to have ROW TYPE support in PSQL but not in external UDRs.

четверг, 12 декабря 2024 г. в 19:44:52 UTC+3, Dmitry Yemanov:

Dimitry Sibiryakov

unread,
Dec 12, 2024, 11:57:10 AM12/12/24
to firebir...@googlegroups.com
Dmitry Yemanov wrote 12.12.2024 17:44:
>> Exactly. But you cannot tell if API is good or bad before implementing it.
>> So the best strategy right now is not to implement any API and see how this
>> type will be accepted by users at PSQL level-only.
>
> "PSQL level-only" includes input/output parameters, so it's already surfaced to
> DSQL users. So "not implementing API" actually means "raise an error".

It depends on the implementation.
Automatic nesting/unnesting of the ROW type in input/output messages is still an option.
Another option is the already working manual unnest in the form "alias.*" for
output parameters.

Denis Simonov wrote 12.12.2024 17:47:
> It would be strange to have ROW TYPE support in PSQL, but not in external UDRs.

Nothing strange. We already have PSQL type CURSOR which is not available in
API or UDRs.

--
WBR, SD.

Denis Simonov

unread,
Dec 12, 2024, 12:07:21 PM12/12/24
to firebird-devel


Thursday, December 12, 2024, 7:57:10 PM UTC+3, Dimitry Sibiryakov:
We don't have cursors in input and output parameters, so the example is not good. And we can open a cursor inside the external UDR itself.
 



Alexey Mochalov

unread,
Dec 12, 2024, 12:24:00 PM12/12/24
to firebird-devel
Thursday, December 12, 2024, 6:22:41 PM UTC+3, Pavel Cisar:
If columns could be rows, then there is interesting recursion of definition.

And there is one. In the SQL 2023 standard, 9075 Part 2, starting from section 11.3 <table definition>, we go from
<table definition> to <table contents source> to <table element list> to <table element> to <column definition> to <data type or domain name> to <data type>, where
<data type> ::=
    <predefined type>
  | <row type>
  | <path-resolved user-defined type name>
  | <reference type>
  | <collection type>
where
<row type> ::=
  ROW <row type body>
<row type body> ::=
  <left paren> <field definition> [ { <comma> <field definition> }... ] <right paren>
<field definition> ::=
  <field name> <data type>

And from here we end up looping within the definition of <data type>.

 
Anyway, if Firebird will support such composite type with unlimited nesting for storage, it enters the
same nightmare league of "unbound" data like Array, JSON and XML (just
with fixed structure), and should be probably handled differently in
BLOB-like fashion?

Why see this as a nightmare? It’s achievable and even partially implemented.  
I see nothing wrong with being able to do more than others.

Denis Simonov

unread,
Dec 12, 2024, 12:29:16 PM12/12/24
to firebird-devel

Just don't think of ROW TYPE in the API as a BLOB; that would be a real nightmare. A ROW type has a very specific length, unlike a BLOB.
And a ROW TYPE itself can contain a BLOB (a BLOB identifier).
Thursday, December 12, 2024, 8:24:00 PM UTC+3, Alexey Mochalov:

Pavel Cisar

unread,
Dec 12, 2024, 2:15:37 PM12/12/24
to firebir...@googlegroups.com
On 12.12.24 at 18:29, Denis Simonov wrote:
>
> Just don't think of ROW TYPE in the API as a BLOB. That would be a real
> nightmare. ROW type has a very specific length, unlike BLOB.
> And ROW TYPE itself can contain a BLOB (BLOB identifier).

Well, the ARRAY type has a well-defined fixed length as well, and it's
implemented similarly to BLOB. BLOB-like means suitable for large
or even huge data. If ROW becomes a storage type someday, it will be
interesting to provide ways to solve the record length issues. Long
records and the resulting low data density are a serious performance killer.
Rows could be tiny, but they could also be big. Handling big like tiny or
tiny like big is equally bad, although for different reasons.

The engine can select the best storage for a record according to the declared
size, either in the record or in a BLOB-like supplemental structure. But what
about the API? Would we always fetch ROWs in total, regardless of their size? Or
would we always handle them like BLOBs/ARRAYs? Or provide a unified interface
that could handle both cases, where the engine and/or developer can say which
method to use? That's the question here.

BTW, the same problem will apply to ARRAYs if they are reintroduced/reworked
in a better way. Their handling at the API and storage level should be more
elastic, to handle the big vs. small cases as well, and thus whatever we define
to solve rows could then be used for them too. The sole difference is that rows
can have fields of different types, and arrays can have multiple dimensions.

regards
Pavel


Denis Simonov

unread,
Dec 12, 2024, 2:49:56 PM12/12/24
to firebird-devel
Arrays are just an example of an extremely unsuccessful implementation. The fact that an array is actually a BLOB does not allow working with them normally. Arrays can only be read in their entirety or as a single element, or written, but only in their entirety. Neither in SQL nor in PSQL can we perform any operations on arrays. In theory, you can write anything into a blob; for that you do not need to introduce any new types at all. But will we be able to work effectively with this? I think not. We should not make the same mistakes again.

Thursday, December 12, 2024, 10:15:37 PM UTC+3, Pavel Cisar:

Jim Starkey

unread,
Dec 12, 2024, 3:14:03 PM12/12/24
to firebir...@googlegroups.com

Actually, the original InterBase implementation of arrays supported fetching slices of arbitrary size and dimension. Even got a patent on it (long since expired).


Denis Simonov

unread,
Dec 12, 2024, 3:29:15 PM12/12/24
to firebird-devel
In fact, I have nothing against storing arrays as a BLOB if the array is really large, for example 100x100 varchar(100). But if you need to store an array of 10 integers, then a blob is overhead. Such a small array could well be stored as 10 consecutive integers. Such a record can be compressed by RLE, you can change a single element of the array without producing temporary objects, and in PSQL you can work with it in memory. So probably yes, it is necessary to separate the implementations of large and small arrays.

Thursday, December 12, 2024, 11:14:03 PM UTC+3, Jim Starkey:

Jim Starkey

unread,
Dec 12, 2024, 3:30:09 PM12/12/24
to firebir...@googlegroups.com
That's why defining a high-density serialization scheme is a win: the
actual size depends only on the significant data size rather than the
declared length. It also solves nesting, extensions, and backward
compatibility if (ha!) extended.

Pavel Cisar

unread,
Dec 13, 2024, 4:09:42 AM12/13/24
to firebir...@googlegroups.com
Dne 12. 12. 24 v 21:29 Denis Simonov napsal(a):
> In fact, I have nothing against storing arrays as a Blob if the array is
> really large, for example 100x100 varchar(100). But if you need to store
> an array of 10 integers, then the blob is an overhead. Such a small
> array could well be stored as 10 consecutive integers. Such a record can
> be compressed by RLE. You can change only one element of the array
> without producing temporary objects. In psql, you can work with it in
> memory. So probably yes, it is necessary to separate the implementation
> of large and small arrays.

Exactly. The BLOB-like multi-level storage is just an implementation method
for handling arbitrarily long data, or data too big to be stored in the primary
record. It's possible to modify it to store the first level in the
primary record itself for fixed-size structures, and shift storage in the
auxiliary record to the other levels, so there would be 4 instead of 3 levels.
Reading any part of the value regardless of level could be implemented
as well (either for operations with parts/slices at the SQL level or in the API).
Sure, the performance of this will depend on the actual storage level, but
the API for it could be uniform.

Rows, arrays, tables (arrays of rows) and blobs have a lot in common.
They can be small enough to be embedded in the primary record or big
enough to be stored in auxiliary records and pages. They could use
common storage and API plumbing. At the API level it means metadata is
necessary to identify whether the value is part of the buffer, or whether
there is only a "handle" managed by a separate interface. In fact, it
should be possible to always access values of these types via a specific
type interface (based on some common one), regardless of whether the value
is part of the record or not.
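Just to make the idea concrete, a sketch of such a common roof (all names here are hypothetical):

// A common accessor, regardless of whether the value is embedded in
// the record or stored in auxiliary records and pages.
struct IValueAccessor
{
    virtual ~IValueAccessor() = default;
    virtual unsigned totalLength() const = 0;
    virtual void read(unsigned offset, unsigned length, void* into) const = 0;
};

// Per-type interfaces refine it; nesting falls out naturally because
// a field or element is itself just another accessor.
struct IRowAccessor : IValueAccessor
{
    virtual unsigned fieldCount() const = 0;
    virtual IValueAccessor* field(unsigned index) const = 0;
};

struct IArrayAccessor : IValueAccessor
{
    virtual unsigned elementCount() const = 0;
    virtual IValueAccessor* element(unsigned index) const = 0;
};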

The remaining problem is the row, table and array representation in memory
(for expressions, sorting etc.) and on the wire. But the common small-vs-big
considerations apply here as well.

Yes, it wouldn't be simple to design, as these structures could be nested
(blobs/arrays in rows in rows etc.), but I think it's worth looking
at ROW handling in this wider context. It could simplify things a lot
when table and array types are (re)introduced. Getting blobs under
the same roof would be a bonus, but wouldn't be strictly necessary, as
blobs have variable length (with block or stream access), while all the
others have fixed length with sub-elements.

regards
Pavel Cisar

Mark Rotteveel

unread,
Dec 13, 2024, 5:11:37 AM12/13/24
to firebir...@googlegroups.com
On 09/12/2024 16:13, Alexey Mochalov wrote:
[..]
Goals
  1. Implement the ROW data type, a structured PSQL object which contains internal fields

  2. Create declaration syntax of ROW type objects based on existing tables, cursors and user defined types

  3. Nested fields values of ROW type objects should be accessible via qualified name of object plus field name

  4. Add the ability to define local ROW types inside procedures, functions, etc., as well as custom ROW types inside packages

  5. Add <row value constructor> as defined by the SQL standard to create ROW type values on the fly

  6. Make it possible to receive and conveniently process ROW type values on client side

  7. Adhere to the SQL standard as closely as possible

  8. Make the implementation open for future expansions like nested composite fields


What exactly is meant by point 6? Do you want to expose ROW type values as a composite type to the client? If so, I think that goes too far and significantly raises complexity for drivers.

Syntax

The syntax description is divided into different use cases for ROW types. New features and changes are highlighted in bold. Not all use cases of ROW types are covered in this section, only those that use a previously non-existent syntax. More variants of using ROW type objects can be found in the examples section.

1. ROW type variables, parameters and return values declaration
[..]
2. Local and packaged ROW type definition
[..]
3. Accessing fields in ROW type objects
[..]
4. <row value constructor>

Specify a value or list of values to be constructed into a row value that could be used to initialize variables, in WHERE conditions, as a result of select statements or full record insert/update values.


<row_value_constructor> ::= [ROW] ( <value-expression> [ , <value-expression> … ] )

This syntax is ambiguous and does not match the syntax defined in SQL:2023, 7.1 <row value constructor> is:

<row value constructor> ::=
    <common value expression>
  | <boolean value expression>
  | <explicit row value constructor>

<explicit row value constructor> ::=
    <left paren> <row value constructor element> <comma>
      <row value constructor element list> <right paren>
  | ROW <left paren> <row value constructor element list> <right paren>
  | <row subquery>

<row value constructor element list> ::=
  <row value constructor element> [ { <comma> <row value constructor element> }... ]

<row value constructor element> ::=
  <value expression>

That is, ROW is only optional if there are at least *two* elements in the list; for a list of one element, ROW is required, as otherwise there would be an ambiguity with <value expression> through:

<value expression primary> ::=
    <parenthesized value expression>
  | <non-parenthesized value expression primary>

<parenthesized value expression> ::=
  <left paren> <value expression> <right paren>

Also, the row subquery derivation is missing, as are <common value expression> and <boolean value expression>.

5. ROW clause in UPDATE statement

New way of updating whole record using ROW type value:


UPDATE target [[AS] alias]
  SET <set_list>
  [WHERE {<search-conditions> | CURRENT OF cursorname}]
  [PLAN <plan_items>]
  [ORDER BY <sort_items>]
  [ROWS m [TO n]]
  [SKIP LOCKED]
  [RETURNING <returning_list> [INTO <variables>]]

<set_list> ::=
    col_name = <upd_value> [, col_name = <upd_value> ...]
  | ROW = <row_type_upd_value>

<row_type_upd_value> ::= <row_type_value_expression> | DEFAULT

<row_type_value_expression> ::=
    <row_value_constructor>
  | <context-variable>
  | any other expression returning a ROW type value or NULL

This is not the syntax described in the SQL:2023 standard; please let's not invent our own syntax, but follow the standard. The syntax in the standard is defined in 14.15 <set clause list> and is specified as

<set clause list> ::=
  <set clause> [ { <comma> <set clause> }... ]

<set clause> ::=
    <multiple column assignment>
  | <set target> <equals operator> <update source>

<set target> ::=
    <update target>
  | <mutated set clause>

<multiple column assignment> ::=
  <set target list> <equals operator> <assigned row>

<set target list> ::=
  <left paren> <set target> [ { <comma> <set target> }... ] <right paren>

<assigned row> ::=
  <contextually typed row value expression>

In other words, row types used in an update are like

UPDATE ... SET (column1, column2, column3) = <contextually typed row value expression>

or even (... stands for <contextually typed row value expression>)

UPDATE ... SET (column1, column2) = ..., (column3, column4) = ...

Where <contextually typed row value expression> is defined in 7.2 <row value expression>:

<contextually typed row value expression> ::=
    <row value special case>
  | <contextually typed row value constructor>

<row value special case> ::=
  <non-parenthesized value expression primary>

And <contextually typed row value constructor> in 7.1 <row value constructor>

<contextually typed row value constructor> ::=
    <common value expression>
  | <boolean value expression>
  | <contextually typed value specification>
  | <left paren> <contextually typed value specification> <right paren>
  | <left paren> <contextually typed row value constructor element> <comma>
        <contextually typed row value constructor element list> <right paren>
  | ROW <left paren> <contextually typed row value constructor element list> <right paren>

etc, etc.


Examples
[..]
6) updating records using a ROW type variable

UPDATE test_table SET ROW = (3,4,'William');

SELECT * FROM test_table;

          ID     NUM NAME
============ ======= ==========
           3       4 William

UPDATE another_table SET ROW = ROW(DEFAULT);

SELECT * FROM another_table;

TEXT
==============================
some text

UPDATE another_table SET ROW = DEFAULT;

SELECT * FROM another_table;

TEXT
==============================
some text

As I said before under section 5, I don't like this invented, non-standard syntax.

[..]

8) comparing ROW-type objects with each other

SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3) > (1, 2, 3);

SELECT 1 FROM RDB$DATABASE WHERE (1, 2, 3+1) > (1, 2, 3);

    CONSTANT
============
           1

CREATE PROCEDURE p1 (inpar TYPE OF TABLE test_table)
RETURNS (outpar TYPE OF TABLE test_table)
AS
BEGIN
  outpar = inpar;
  SUSPEND;
END;

SELECT * FROM p1((1, 3, null)) WHERE (outpar > (1, 2, 3)) IS NOT null;

                                  OUTPAR
========================================
                          (1, 3, <null>)

IIRC, the comparison rules for < and > are rather complex; this will probably need to get a lot of attention during testing.
9) packaged ROW types declaration
9.1) declaration of a public packaged ROW type (can be used outside packages via the package name qualifier)
[..]
9.2) declaration of a private packaged ROW type (can be used in the same package where the type is declared)
[..]
Key points underpinning implementation
ODS changes
[..]

I hope this scheme will help to explain more clearly the structure and relations between these fields in system tables (new fields and table are bolded):


This image was absent or not rendered for me.
Dependencies
[..]
Message metadata

To make it possible to retrieve composite types on the client, a serialized ROW type descriptor was added to the message metadata with the new IMessageMetadata::getCompositeDescriptor() function of the corresponding interface.

This descriptor is serialized in such a way that it is convenient to parse and supports almost unlimited nesting of described fields.


The structure of the descriptor passed to the client:

2 bytes - ordinal number of the described field inside the descriptor, starting from 1
1 byte - length of the field name
0..255 bytes - field name
2 bytes - field type
4 bytes - field data length
2 bytes - subtype
2 bytes - scale
2 bytes - charset
1 byte - 0 value, or 2 bytes - ordinal number of the subfield

As I said, I don't like this because it raises the complexity for drivers. At minimum, I think there must be an option to enable/disable this in firebird.conf/databases.conf and in the DPB, which when "disabled" will unpack row types to "normal" top-level columns in the messages, so that older drivers can still work, or drivers that don't need or don't want the complexity of row-types can simply disable it and handle them as normal columns.

Each field descriptor must end with a 0 value when all of its subfields have already been described. For example:


SQL> create table base_table (id int, num numeric(10,2), name char(10));
SQL> create procedure proc1 returns (outpar type of table base_table) as begin outpar = (1, 3.14, 'Gibson'); suspend; end;
SQL> select * from proc1;

                                              OUTPAR
====================================================
                                   (1, 3.14, Gibson)

Message metadata serialized descriptor of the query result will be:

This image is missing or not rendered for me.

Where:

[..]

Using this descriptor we can parse the data buffer and get the values for each subfield.

Please provide an example with multiple columns, combining normal columns and row-type columns.
BLR codes

[..]

Mark

-- 
Mark Rotteveel

Mark Rotteveel

unread,
Dec 13, 2024, 5:13:57 AM12/13/24
to firebir...@googlegroups.com
On 10/12/2024 15:25, Alexey Mochalov wrote:
> On 10.12.2024 09:37, Denis Simonov wrote:
> 2. Are NEW and OLD context variables in DML triggers of the ROW
> type? Can the same operations be performed with them?
>
> The NEW and OLD contexts have not been converted to the ROW type and
> remain unchanged.

I think they should be retrofitted to be a row type, because
effectively, they already are (or at least, seem to behave as one)!

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
Dec 13, 2024, 5:40:32 AM12/13/24
to firebir...@googlegroups.com
On 10/12/2024 11:06, Denis Simonov wrote:
> No one is obligated to use the new data type for transmission to the
> client.

Sorry, but that is a naive statement. I can already foresee the
complaints coming in from people who write a statement like `select
(column1, column2) from some_table`, or `EXECUTE PROCEDURE x` where X
returns a row-type, and not being able to get the values in a driver
that was not updated to support row values.

> This is not at all the same as it was in 4.0, when the old SQL code
> began returning INT128 or types with TIME ZONE.
> FireDAC uses a mixed approach (Legacy + OO API) and adapts to
> innovations quite quickly.

It is the same, just on a different level, and even more complex to
solve at the driver level.

Right now I'm looking at the JDBC specification to see how I would need to map
a row type, and I guess that formally I would need to map it to a
java.sql.Struct, which is annoying and probably unusable for a lot of
cases. Decomposing the row type to individual columns is much simpler
and more usable, and it would be good if that were directly supported by
Firebird (e.g. a setting in firebird.conf/databases.conf *and* a DPB item).

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
Dec 13, 2024, 5:47:14 AM12/13/24
to firebir...@googlegroups.com
On 11/12/2024 12:58, 'Dimitry Sibiryakov' via firebird-devel wrote:
> Pavel Cisar wrote 11.12.2024 12:41:
>> Maybe at the SQL level because you never participated in development
>> of complex business application, but you certainly used STRUCT types
>> in C/C++ with members that are also STRUCT types? I saw those in
>> Firebird source.
>
>   Say you have a query "select a.*, b.* from a,b". Currently result set
> of this query is unnested and you access fields individually. The code
> can be seen in examples.
>   How will application code look like for a query "select row(a.*),
> row(b.*) from a,b"?

Formally, according to the SQL standard, select a.*, b.* from a, b
produces two columns, of a row type, so that is already an
incompatibility right there ;)

This is also why I'm not convinced that the SQL standard doesn't already
expect you to decompose a row type in a top-level select list, but I
can't specifically find a rule one way or the other.

Mark
--
Mark Rotteveel

Denis Simonov

unread,
Dec 13, 2024, 5:55:51 AM12/13/24
to firebird-devel
> Right now, I'm looking at the JDBC specification how I would need to map
> a row-type, and I guess that formally, I would need to map it to a
> java.sql.Struct, which is annoying and probably unusable for a lot of cases.

And what is done about this in the MS SQL/Oracle/Postgres JDBC drivers?

I am not against decomposing ROW into a flat set of fields, as long as it is an additional feature to make the work of driver writers easier.
But even in this case, you at least need the name of the top-level column in the metadata. And this will only work if there are no nested ROWs.

Just don't do it by completely replacing the ROW type at the metadata level. As I already said, there is such a thing as a UDR, and there can be ROW types there too.

Friday, December 13, 2024, 1:47:14 PM UTC+3, Mark Rotteveel:

Mark Rotteveel

unread,
Dec 13, 2024, 6:26:06 AM12/13/24
to firebir...@googlegroups.com
On 13/12/2024 11:55, Denis Simonov wrote:
> > Right now, I'm looking at the JDBC specification how I would need to map
> > a row-type, and I guess that formally, I would need to map it to a
> > java.sql.Struct, which is annoying and probably unusable for a lot of
> cases.
>
> And what is done about this in the MS SQL/Oracle/Postgres JDBC drivers?

I'll check what they do.

> I am not against decomposing ROW into a flat set of fields, if only this
> is an additional feature to make the work of driver writers easier.

Yes, I'm fine if it is something optional, configurable at DPB and
firebird/database.conf level.

> But even in this case, you at least need the name of the top-level
> column in the metadata. And this will only work if there are no nested ROWs.

I don't see the need, to be honest; or you could always communicate that
through isc_info_sql_relation_alias or a new information item.

> Just don't do it by completely replacing the ROW type at the metadata
> level. As I already said, there is such a thing as UDR and there can be
> ROW types there too.

For one, because that will make things backwards compatible for existing
drivers, and second because it makes things a hell of a lot easier for
drivers that don't want to or simply can't present a row type as a
single column.

Mark
--
Mark Rotteveel

Denis Simonov

unread,
Dec 13, 2024, 6:36:55 AM12/13/24
to firebird-devel

I don't quite understand what backward compatibility you are talking about. No valid Firebird 5.0 SQL query can return a ROW type in 6.0.
For a ROW type to be returned, the query must be rewritten to either use a ROW constructor or a function returning a ROW type.
The same applies to input parameters.
Friday, December 13, 2024, 2:26:06 PM UTC+3, Mark Rotteveel:

Mark Rotteveel

unread,
Dec 13, 2024, 6:37:31 AM12/13/24
to firebir...@googlegroups.com
Or you can flatten it by decomposing it to its constituent columns.

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
Dec 13, 2024, 6:42:56 AM12/13/24
to firebir...@googlegroups.com
On 13/12/2024 12:36, Denis Simonov wrote:
>
> I don't quite understand what backward compatibility you are talking
> about. No valid Firebird 5.0 SQL query can return ROW type in 6.0.
> In order for ROW type to be returned, this query must be rewritten.
> Either use ROW constructor, or function returning ROW type.
> The same applies to input parameters.

I mean backwards compatibility for older clients talking to a Firebird
6.0 server. You seem to think drivers are only used in carefully crafted
applications, where each query is tailored to be compatible
with the driver used; they are also used in tools like DBeaver, where
users will input ad hoc queries, and they won't like it if their
syntactically valid query doesn't work.

Mark
--
Mark Rotteveel

Mark Rotteveel

unread,
Dec 13, 2024, 8:12:28 AM12/13/24
to firebir...@googlegroups.com
On 13/12/2024 11:55, Denis Simonov wrote:
> > Right now, I'm looking at the JDBC specification how I would need to map
> > a row-type, and I guess that formally, I would need to map it to a
> > java.sql.Struct, which is annoying and probably unusable for a lot of
> cases.
>
> And what is done about this in the MS SQL/Oracle/Postgres JDBC drivers?

PostgreSQL describes it as JDBC type OTHER and returns it as a
class org.postgresql.util.PGobject, which is basically a very thin value
wrapper around a string (e.g. ROW ('a', 'b') is rendered as (a,b),
ROW('a,b', 'c') as ("a,b",c), and ROW ('a,b', 'c', ROW ('d', 'e')) as
("a,b",c,"(d,e)")) and a type name ("record"), which is IMHO even worse
than describing it as JDBC type STRUCT and returning java.sql.Struct.

As far as I can tell, it is already serialized as such server-side, but
that might depend on how the driver asks for it. I didn't dive into the
implementation.

Neither Oracle nor SQL Server supports row value constructors outside of
a table value constructor.
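For illustration, a sketch of that limitation in SQL Server-style syntax (the table and column names are made up):

-- Row value constructors are accepted inside a table value constructor:
SELECT * FROM (VALUES (1, 'a'), (2, 'b')) AS t (id, name);

-- ...but a standalone row value in the select list, as proposed here, is rejected:
-- SELECT (1, 'a') FROM t;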

I may look at other drivers when I have the time.

Mark
--
Mark Rotteveel

Alexey Mochalov

Dec 16, 2024, 4:05:34 AM
to firebird-devel
On 13/12/2024 13:11, Mark Rotteveel wrote:
On 09/12/2024 16:13, Alexey Mochalov wrote:
[..]

What exactly is meant by point 6? Do you want to expose ROW type values as a composite type to the client? If so, I think that goes too far and significantly raises complexity for drivers.

Yes, in my opinion, it’s important to support passing composite type values to the client. 

[..]

This syntax is ambiguous and does not match the syntax defined in SQL:2023; 7.1 <row value constructor> is:

[..]

I tried to be as precise as possible in describing how the implemented syntax for row types works and adapted it to the official syntax from the Firebird documentation. <value-expression> covers all the options for creating a <row value constructor> that were planned for discussion within this proposal.

That is, ROW is only optional if there are at least *two* elements in the list; for a list of one element, ROW is required, otherwise there is an ambiguity with <value expression> through:

[..]

That’s absolutely correct; this is exactly how it works now. I should have noted this in the description.
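Spelled out as a sketch, in the style of the example further below:

SELECT (1, 2) FROM rdb$database;  -- two elements: ROW may be omitted
SELECT ROW(1) FROM rdb$database;  -- one element: ROW is required
-- SELECT (1) FROM rdb$database;  -- without ROW, (1) is just the parenthesized scalar 1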

Also, the row subquery derivation is missing, as are  <common value expression> and <boolean value expression>.

[..]

This proposal only implements the <explicit row value constructor>.

This is not the syntax described in the SQL:2023 standard; please let's not invent our own syntax, but follow the standard. The syntax in the standard is defined in 14.15 <set clause list>, and is specified as

[..]

You are absolutely right. Unfortunately, I overlooked this particular UPDATE query syntax, and it definitely needs to be added to the implementation.
Regarding the SET ROW syntax: it was implemented as an analog of a feature from another popular DBMS. When manipulating composite type objects derived from a specific table, it can be quite convenient to write an UPDATE without repeating all the fields of that table.
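For comparison, a sketch of both forms (the table, columns and the :emp_row variable are illustrative):

-- SQL-standard multiple-column assignment per 14.15 <set clause list>:
UPDATE employees SET (first_name, last_name) = ROW('John', 'Doe') WHERE id = 1;

-- The SET ROW shorthand discussed here, assigning a whole ROW value at once:
UPDATE employees SET ROW = :emp_row WHERE id = 1;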

 [..]
As I said, I don't like this because it raises the complexity for drivers. At minimum, I think there must be an option to enable/disable this in firebird.conf/databases.conf and in the DPB, which when "disabled" will unpack row types to "normal" top-level columns in the messages, so that older drivers can still work, or drivers that don't need or don't want the complexity of row-types can simply disable it and handle them as normal columns.

This approach creates significant challenges in the internal implementation for SELECT ... INTO queries, RETURNING, and routine parameters.

 [..]
Please provide an example with multiple columns, combining normal columns and row-type columns.

Ordinary fields do not appear in the serialized descriptor because they have their own index in the metadata. The serialized descriptor does not replace the regular metadata for simple fields but complements it for composite fields.  

If the request was to demonstrate the result and not the descriptor, here it is:

SQL> SELECT 1, ROW(2), 3, (5, 6) FROM rdb$database;

    CONSTANT              ROW     CONSTANT                      ROW
============ ================ ============ ========================
           1              (2)            3                   (5, 6)

Alexey Mochalov

Dec 16, 2024, 4:14:25 AM
to firebird-devel
On 13/12/2024 13:11, Mark Rotteveel wrote:
On 09/12/2024 16:13, Alexey Mochalov wrote:
[..]
This image was absent or not rendered for me.
sheme.png
[..]
This image is missing or not rendered for me.
debugger.png

I attached the images in a different way; hopefully they can be seen now.

Alexey Mochalov

Dec 16, 2024, 4:20:08 AM
to firebird-devel
Most likely, it can be done; it just wasn’t a priority.
On 13/12/2024 13:13, Mark Rotteveel wrote:

Mark Rotteveel

Dec 16, 2024, 5:36:15 AM
to firebir...@googlegroups.com
On 16/12/2024 10:05, Alexey Mochalov wrote:
> On 13/12/2024 13:11, Mark Rotteveel wrote:
> This is not the syntax described in the SQL:2023 standard; please
> let's not invent our own syntax, but follow the standard. The syntax
> in the standard is defined in 14.15 <set clause list>, and is
> specified as
>
> [..]
>
>
> You are absolutely right. Unfortunately, I overlooked this particular
> UPDATE query syntax, and it definitely needs to be added to the
> implementation.
> Regarding the SET ROW syntax: it was implemented as an analog of a
> feature from another popular DBMS. When manipulating composite type
> objects derived from a specific table, it can be quite convenient to
> write an UPDATE without repeating all the fields of that table.

I think `set ROW` is a bad idea on principle, for the same reasons why
using INSERT without an explicit column list is generally regarded as a
bad idea. It is very ambiguous, and can easily lead to assigning the
wrong values to the wrong columns.
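The analogy, sketched with a hypothetical employees table:

-- INSERT without a column list silently depends on column order:
INSERT INTO employees VALUES (1, 'Doe', 'John');

-- SET ROW has the same failure mode: reorder or add a column, and the
-- assignment quietly targets the wrong columns:
-- UPDATE employees SET ROW = ROW(1, 'Doe', 'John') WHERE id = 1;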

>  [..]
>
> As I said, I don't like this because it raises the complexity for
> drivers. At minimum, I think there must be an option to enable/
> disable this in firebird.conf/databases.conf and in the DPB, which
> when "disabled" will unpack row types to "normal" top-level columns
> in the messages, so that older drivers can still work, or drivers
> that don't need or don't want the complexity of row-types can simply
> disable it and handle them as normal columns.
>
>
> This approach creates significant challenges in the internal
> implementation for SELECT ... INTO queries, RETURNING, and routine
> parameters.

That depends on *where* the unpacking is done. I'm only talking about
conversion on the interface between client and server, similar to how
DataTypeCompatibility works (in fact, as others suggested, it probably
needs to be implemented as part of the DataTypeCompatibility option).

>  [..]
>
> Please provide an example with multiple columns, combining normal
> columns and row-type columns.
>
>
> Ordinary fields do not appear in the serialized descriptor because they
> have their own index in the metadata. The serialized descriptor does not
> replace the regular metadata for simple fields but complements it for
> composite fields.

OK, I thought the descriptor was somehow nested as elements of the
normal descriptor.

Mark
--
Mark Rotteveel

Denis Simonov

Dec 16, 2024, 7:47:12 AM
to firebird-devel


On 16/12/2024 13:36, Mark Rotteveel wrote:
On 16/12/2024 10:05, Alexey Mochalov wrote:
> On 13/12/2024 13:11, Mark Rotteveel wrote:
> This is not the syntax described in the SQL:2023 standard; please
> let's not invent our own syntax, but follow the standard. The syntax
> in the standard is defined in 14.15 <set clause list>, and is
> specified as
>
> [..]
>
>
> You are absolutely right. Unfortunately, I overlooked this particular
> UPDATE query syntax, and it definitely needs to be added to the
> implementation.
> Regarding the SET ROW syntax: it was implemented as an analog of a
> feature from another popular DBMS. When manipulating composite type
> objects derived from a specific table, it can be quite convenient to
> write an UPDATE without repeating all the fields of that table.

I think `set ROW` is a bad idea on principle, for the same reasons why
using INSERT without an explicit column list is generally regarded as a
bad idea. It is very ambiguous, and can easily lead to assigning the
wrong values to the wrong columns.


Agreed. I don't think it's a very good idea. It's always better to use an explicit ROW constructor on both sides.


>  [..]
>
> As I said, I don't like this because it raises the complexity for
> drivers. At minimum, I think there must be an option to enable/
> disable this in firebird.conf/databases.conf and in the DPB, which
> when "disabled" will unpack row types to "normal" top-level columns
> in the messages, so that older drivers can still work, or drivers
> that don't need or don't want the complexity of row-types can simply
> disable it and handle them as normal columns.
>
>
> This approach creates significant challenges in the internal
> implementation for SELECT ... INTO queries, RETURNING, and routine
> parameters.

That depends on *where* the unpacking is done. I'm only talking about
conversion on the interface between client and server, similar to how
DataTypeCompatibility works (in fact, as others suggested, it probably
needs to be implemented as part of the DataTypeCompatibility option).

Don't make it part of DataTypeCompatibility. It is better to make a
separate option in the config and/or DPB, like UnwrapRowType = true|false
or isc_dpb_unwrap_row_type.
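For example, a databases.conf sketch with the option name suggested above (illustrative only, nothing here is agreed):

# databases.conf
employee = /data/employee.fdb
{
    # decompose ROW columns into flat top-level columns for this database
    UnwrapRowType = true
}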

Alex Peshkoff

Dec 23, 2024, 5:11:56 AM
to firebir...@googlegroups.com
I'm sorry if I'm late with this suggestion.

As far as I understand, the main problems come from old drivers using the
old API. Maybe fields could be flattened only for the ISC API, i.e. in the
client's yvalve, at the very last step of delivering them to the client?


Mark Rotteveel

Dec 24, 2024, 9:12:05 AM
to firebir...@googlegroups.com
On 23/12/2024 11:11, Alex Peshkoff wrote:
> I'm sorry if I'm late with this suggestion.
>
> As far as I understand, the main problems come from old drivers using
> the old API. Maybe fields could be flattened only for the ISC API, i.e.
> in the client's yvalve, at the very last step of delivering them to the
> client?

That doesn't work for clients that use the wire protocol.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

Dec 24, 2024, 9:16:02 AM
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 24.12.2024 15:11:
>>
>> As far as I understand, the main problems come from old drivers using the old
>> API. Maybe fields could be flattened only for the ISC API, i.e. in the
>> client's yvalve, at the very last step of delivering them to the client?
>
> That doesn't work for clients that use the wire protocol.

That's why the OO API should be affected too: in this case flattening will
happen in the server's Y-valve and no changes to the network protocol are needed.

--
WBR, SD.

Denis Simonov

Dec 24, 2024, 9:27:39 AM
to firebird-devel
I consider passing the ROW type flat one of the tools for backward compatibility for old applications or clients, but not the default implementation.
As I said, it is very inconvenient in many cases, for example when developing UDRs. Also, if a client application wants to understand ROW as a separate entity,
it should be able to do so. Don't prohibit it just because someone might have difficulties; there are backward compatibility tools for them.

On 24/12/2024 17:16, Dimitry Sibiryakov wrote:

Dimitry Sibiryakov

Dec 24, 2024, 9:31:37 AM
to firebir...@googlegroups.com
Denis Simonov wrote 24.12.2024 15:27:
> Also, if a client application wants to understand ROW as a separate entity,
> it should be able to do so.

Such a client can consider part of the message buffer to be a row value.
According to the standard, the whole message buffer actually IS a row value.

--
WBR, SD.

Denis Simonov

Dec 24, 2024, 9:42:07 AM
to firebird-devel
Apparently we are talking about different things. I agree that the buffer itself is still flat. We are only talking about the metadata from which you can understand that this is a ROW TYPE, and about the ability to read the structure of this ROW TYPE at the IMessageMetadata level. If compatibility mode is enabled, then at the metadata level the ROW TYPE will simply be expanded as if its fields were regular columns, but if compatibility mode is disabled, you can find out the details. In this case, you can report the ROW column as a separate field and then detail its contents, that is, return the metadata of the ROW fields.



On 24/12/2024 17:31, Dimitry Sibiryakov wrote:

Denis Simonov

Dec 24, 2024, 9:56:24 AM
to firebird-devel
Let me explain how I see it.

IMessageMetadata ...
{
....
// version 6.0 (sketch)
IMessageMetadata* getSubMetadata(Status* status, unsigned index);  // metadata of the fields inside a ROW column
const char* getParentField(Status* status, unsigned index);        // name of the enclosing ROW column, if any
}

In compatibility mode, the parent field name (the ROW TYPE field name) is returned for the top-level metadata, and the other methods work for each ROW field as if they were regular columns. This is exactly the case of ROW unfolding into a flat form.

In normal mode, if getType() returns SQL_ROW, we can request getSubMetadata() and work with the ROW TYPE internals as if we were working with a regular row. In this case, the getParentField() method of the nested IMessageMetadata will also return the parent field name.

On 24/12/2024 17:42, Denis Simonov wrote:

Alex Peshkoff

Dec 24, 2024, 9:59:39 AM
to firebir...@googlegroups.com
Why not? We already have a thin layer in yvalve implementing the ISC API over
the OO one. When it finds a ROW type in the message metadata, it flattens its
fields into the XSQLDA. How is that related to the wire protocol?


Denis Simonov

Dec 24, 2024, 10:00:12 AM
to firebird-devel
The network protocol will change in 6.0 anyway. At the very least, the schema information may appear in the metadata. There is also a branch with an inline blob.

On 24/12/2024 17:59, Alex Peshkoff wrote:

Mark Rotteveel

Dec 25, 2024, 5:07:45 AM
to firebir...@googlegroups.com
On 24/12/2024 15:42, Denis Simonov wrote:
> Apparently we are talking about different things. I agree that the
> buffer itself is still flat. We are only talking about the metadata
> from which you can understand that this is a ROW TYPE, and about the
> ability to read the structure of this ROW TYPE at the IMessageMetadata
> level. If compatibility mode is enabled, then at the metadata level the
> ROW TYPE will simply be expanded as if its fields were regular columns,
> but if compatibility mode is disabled, you can find out the details. In
> this case, you can report the ROW column as a separate field and then
> detail its contents, that is, return the metadata of the ROW fields.

Talking about it in terms of IMessageMetadata is the wrong level. Not
all clients use IMessageMetadata; some actually implement the low-level
wire protocol, and that is where it must be addressed.

If there is a compatibility mode (e.g. using
DataTypeCompatibility/isc_dpb_set_bind/SET BIND or some mechanism), that
should simply decide *how* the row is described in op_sql_info: retain
row values as columns (default), or decompose row values into individual
columns (compatibility mode).

As a result, that will also decide how IMessageMetadata will describe a row.

Mark
--
Mark Rotteveel

Mark Rotteveel

Dec 25, 2024, 5:08:47 AM
to firebir...@googlegroups.com
You said "client's yvalve"; for wire protocol implementations, there is
no "client's yvalve", only a "server yvalve".

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

Dec 25, 2024, 5:11:08 AM
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 25.12.2024 11:07:
> If there is a compatibility mode (e.g. using DataTypeCompatibility/
> isc_dpb_set_bind/SET BIND or some mechanism), that should simply decide *how*
> the row is described in op_sql_info: retain row values as columns (default), or
> decompose row values into individual columns (compatibility mode).

Or vice versa: decompose by default, and describe as a single row for those
who set the flag in the DPB because they are aware of the ROW type.

> As a result, that will also decide how IMessageMetadata will describe a row.

Haven't you forgotten that on the server side the engine and Y-valve use
exactly the same API as on the client side?

--
WBR, SD.

Mark Rotteveel

Dec 25, 2024, 5:26:09 AM
to firebir...@googlegroups.com
On 25/12/2024 11:11, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 25.12.2024 11:07:
>> If there is a compatibility mode (e.g. using DataTypeCompatibility/
>> isc_dpb_set_bind/SET BIND or some mechanism), that should simply
>> decide *how* the row is described in op_sql_info: retain row values as
>> columns (default), or decompose row values into individual columns
>> (compatibility mode).
>
>   Or vice versa: decompose by default, and describe as a single row for
> those who set the flag in the DPB because they are aware of the ROW type.

But that wouldn't be the desired behaviour. Although the row (or
parameter set) can be considered a row value, we're talking about a
situation where a row has a number of singular (normal) columns *and*
columns that are row values, so describing it as a single row doesn't make
sense, because then you'd still decompose the row values and then
recompose them into another row value.

We're talking about describing a row:

a, (b, c, d), e, (f, g)

as either 4 columns (with columns 2 and 4 being row values of
respectively 3 and 2 columns),

or - compatibility mode - as 7 individual columns.
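Written as a query, the shape under discussion would be (column names as in the example above, some_table is hypothetical):

SELECT a, (b, c, d), e, (f, g) FROM some_table;
-- default: 4 described columns, the 2nd and 4th being row values of degree 3 and 2
-- compatibility mode: 7 flat columns a, b, c, d, e, f, g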

For a single layer of row values as is proposed now, I think it should
even be possible to keep describing individual columns, but add an info
item to identify the row value. However, that breaks down once you want
to support nested row values, like the standard supports.

>> As a result, that will also decide how IMessageMetadata will describe
>> a row.
>
>   Haven't you forgotten that on the server side the engine and Y-valve
> use exactly the same API as on the client side?

No, I'm arguing from the perspective of the *wire protocol* being the API,
and that the discussion needs to consider the wire protocol, not one of
its implementations (fbclient).

The wire protocol is the API between client and server; the fact that
fbclient presents the information using IMessageMetadata as well is
totally irrelevant for other implementations of the wire protocol.

How things are described in the wire protocol is important *to me*,
someone who has to implement this stuff and has to deal with IMHO odd
designs made because - I think - the primary focus always seems to have
been the C API before, and now the OO API, without a lot of consideration
for the design of the protocol itself, as long as it worked for fbclient.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

Dec 25, 2024, 5:35:51 AM
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 25.12.2024 11:25:
> But that wouldn't be the desired behaviour. Although the row (or parameter set)
> can be considered a row value, we're talking about a situation where a row has
> a number of singular (normal) columns *and* columns that are row values

Yes, this is the case I'm talking about. I didn't suggest reporting the
whole result set as a single row value.

> We're talking about describing a row:
>
> a, (b, c, d), e, (f, g)
>
> as either 4 columns (with columns 2 and 4 being row values of respectively 3 and
> 2 columns),
>
> or - compatibility mode - as 7 individual columns.

All I suggested is to do it the other way around: describe the row as 7
columns by default, and as 4 columns for those who explicitly ask for it.

--
WBR, SD.