Hi!
Here I'm sharing the initial draft of schemas documentation.
Comments, improvements, additions, questions are welcome.
----
# Schemas (FB 6.0)
Firebird 6.0 introduces support for schemas in the database. Schemas are
not an optional feature, so every Firebird 6
database has at least a `SYSTEM` schema, reserved for Firebird system
objects (`RDB$*` and `MON$*`).
User's objects live in different schemas, which may be the automatically
created `PUBLIC` schema or user-defined
ones. It's not allowed (with an exception about indexes) to create or
change objects in the `SYSTEM` schema.
This documentation explains how schemas work in Firebird, how to use
them and what may be different when migrating
a database from previous versions to Firebird 6.
## Why schemas
Schemas allow the logical grouping of database objects (such as tables,
views and indexes), providing a clear structure
to the database. Mostly frequently, they are used for two different
purposes.
### Schemas for database object organization
This usage of schemas helps in organizing database objects in a modular
way, making the database easier to manage
and maintain. By dividing the database into different schemas,
developers and administrators can focus on specific
areas of the database, improving teams scalability and reducing complexity.
For example, the `SYSTEM` schema is used with this purpose, separating
objects created by two different groups
(the Firebird DBMS core team and the Firebird users).
Firebird users may also want to organize objects in different schemas,
for example, creating schemas like
`FINANCE` and `MARKETPLACE` in the same database.
### Schemas for data isolation
In multi-tenant applications, schemas can be leveraged to provide data
isolation for different clients or tenants.
By assigning a unique schema to each tenant, tables and others objects
can be created with the same names in different
schemas, making it more difficult to leak data and sometimes increasing
the performance. Applications can then set
the schema search path for the current selected customer.
This also simplifies database management and scaling, as each tenant's
data is isolated within its schema,
making maintenance, updates, and backups more straightforward.
Example schemas names could be `CUSTOMER_1` and `CUSTOMER_2`.
## Schema-less and schema-bound objects
There are two categories of database objects: schema-less and
schema-bound objects.
Some database objects live outside of schemas, and for them, everything
works as before. They are:
- Users
- Roles
- Blob filters
- Schemas
The others are now always contained in a schema:
- Tables
- Views
- Triggers
- Procedures
- Exceptions
- Domains
- Indexes
- Character sets
- Sequences / generators
- Functions
- Collations
- Packages
Some objects are highly dependent on their parents, like table-based
triggers and indexes depending on the table.
In this case the child object always lives in the same schema of its parent.
## Search path
Every Firebird session has a search path, which is a list of schemas
that is used to resolve not-qualified object names.
By default, the initial search path is `PUBLIC, SYSTEM`.
This default can be changed using `isc_dpb_search_path` in the API or
later changed using the `SET SEARCH_PATH TO`
statement.
`ALTER SESSION RESET` resets the search path to the initial search path,
i.e. the one passed to `isc_dpb_search_path`
or the default (`PUBLIC, SYSTEM`).
Non-existing schemas may be present in the search path, and in this
case, they are not considered.
The first existing schema in the search path is called the `current
schema` and is exclusively used in some operations.
Bind of unqualified objects to a schema generally happens at statement
prepare time. An exception to this rule is
`MAKE_DBKEY` function with expression (not simple literal) as first
argument, making it resolve the table at execution
time.
Object names may be now qualified with the schema name, for example
`SCHEMA_NAME.TABLE_NAME`, `SCHEMA_NAME.TABLE_NAME.COLUMN_NAME`,
`SCHEMA_NAME.PACKAGE_NAME.PROCEDURE_NAME`. But the
schema qualifier is optional. And here is where the search path is used,
and it is used in different ways depending on
where the unqualified name appears.
With `CREATE`, `CREATE OR ALTER` and `RECREATE` statements, an existing
object is searched only in the `current schema`
(the first valid schema present in the search path) and the new object
is created in this same schema. That same rule is
also used with `GRANT` and `REVOKE` statements related to DDL operations
without using `ON SCHEMA` sub clause.
If there is no `current schema` (no valid schema in the search path), an
error will be raised.
Examples using this rule:
```sql
create table TABLE1 (ID integer);
recreate table TABLE1 (ID integer);
create or alter function F1 returns integer as begin end;
grant create table to user USER1;
```
With `ALTER`, `DROP` and others statements, an existing object is
searched in all schemas present in the search path,
and the reference is bound to the first one found, or an error is raised.
Examples using this rule:
```sql
alter table TABLE1 add X integer;
alter function FUNCTION1 returns integer as begin end;
select * from TABLE1;
```
There is also a difference in relation to search paths in DML vs DDL
statements.
With DML statements, the search path is used to find all the referenced
unqualified objects. For example:
```sql
insert into TABLE1 values (1);
execute block returns (out DOMAIN1)
as
begin
select val from TABLE2 into out;
end;
```
In this case, the search path is used to find `TABLE1`, `DOMAIN1` and
`TABLE2`.
With DDL statements, it is actually used to search in the same way, but
the search path is implicitly temporarily
changed just after the object being created/changed is bound to a schema
when preparing the statement.
The change makes the search path equal to the schema of the object
followed by the `SYSTEM` schema.
For example: TODO: What about PLUGINS schema?
```sql
create schema SCHEMA1;
create schema SCHEMA2;
create domain SCHEMA1.DOMAIN1 integer;
-- DOMAIN1 is bound to SCHEMA1 even without it being in the search path,
as the table being created is bound to SCHEMA1
create table SCHEMA1.TABLE1 (id DOMAIN1);
set search_path to SCHEMA2, SCHEMA1;
-- Error: even if SCHEMA1 is in the search path, TABLE2 is bound to SCHEMA2,
-- so DOMAIN1 is searched in SCHEMA2 and SYSTEM schemas
create table TABLE2 (id DOMAIN1);
set search_path to SYSTEM;
create procedure SCHEMA1.PROC1
as
begin
-- TABLE1 is bound to SCHEMA1 as PROC1
insert into TABLE1 values (1);
end;
```
### Resolving between PACKAGE.OBJECT and SCHEMA.OBJECT
There is now an ambiguity in the syntax `<name>.<name>` between
`<package>.<object>` and `<schema>.<object>`
when referring to procedures and functions.
In this case it first looks for a package using the search path, and if
it exists, bound its name treating the
expression as `<package>.<object>` in the found `<schema>`.
If the package is not found, then the name is treated as an already
qualified name (`<schema>.<object>`).
## Permissions
Permissions to control and use schema-bound objects are now influenced
by the schema permissions.
A schema, like other objects, has an owner. Its owner can manipulate and
use any object in the schema, even the
objects created by others users in that schema.
To manipulate objects in a schema from another user, a user needs DDL
permissions. DDL permissions already existed in
previous versions, but now they are more fine-grained, like in these
examples:
```sql
grant create table on schema SCHEMA1 to user USER1;
grant alter any procedure on schema SCHEMA1 to PUBLIC;
```
`ON SCHEMA <name>` clause is optional, and if not present, it is
implicitly assumed to be the `current schema`.
To use objects it was already necessary to have permissions like
`EXECUTE` or `USAGE` granted for the object.
Now, in addition to that, it is necessary to have granted the `USAGE`
permission for the schema where the object is
contained, like in this example:
```sql
-- Connected as USER1
create schema SCHEMA1;
create table SCHEMA1.TABLE1 (ID integer);
grant usage on schema SCHEMA1 to user USER2;
grant select on table SCHEMA1.TABLE1 to user USER2;
```
## The SYSTEM schema
All system schema-bound objects (`RDB$*` and `MON$*`) are now created in
a special schema called `SYSTEM`.
As the `SYSTEM` schema has a default `USAGE` permission granted to
`PUBLIC` and by default is present in the
search path, its usage is backward compatible with previous versions.
With the exception of index creation and manipulation of these created
indexes, the `SYSTEM` schema is locked
for changes. However, it is not recommended to manipulate objects there.
## The PUBLIC schema
A schema called `PUBLIC` is automatically created in new databases, with
a default `USAGE` permission granted to
`PUBLIC`. Only the database/schema owner has default permissions to
manipulate objects in that schema.
The `PUBLIC` schema is not a system object and it can even be dropped by
the database owner or by a user with
`DROP ANY SCHEMA` permission. When restoring a backup of Firebird >= 6
with `gbak` where the `PUBLIC` schema was not
present, the restored database will be created without it.
## New statements and expressions
### CREATE SCHEMA
```sql
{CREATE [IF NOT EXISTS] | CREATE OR ALTER | RECREATE} SCHEMA <schema name>
[DEFAULT CHARACTER SET <character set name>]
```
If `DEFAULT CHARACTER SET` is not specified, new schemas are created
with the default character set equal to the
database default character set.
The default character set is fine-grained, so each schema can have a
different default character set. When previous
Firebird versions used the default database character set, now it uses
the default schema character set of the
contained object.
Different than the automatically created `PUBLIC` schema, the newly
created schema has `USAGE` permission granted
only to its owner and not to `PUBLIC`.
Schema names `INFORMATION_SCHEMA` and `DEFINITION_SCHEMA` are reserved
and cannot be created.
### ALTER SCHEMA
```sql
ALTER SCHEMA <schema name>
[SET DEFAULT CHARACTER SET <character set name>]
```
### DROP SCHEMA
```sql
DROP SCHEMA [IF EXISTS] <schema name>
```
It is currently allowed to drop only empty schemas. In the future the
`CASCADE` sub clause will be added allowing to
drop schema and all its contained objects.
### CURRENT_SCHEMA
`CURRENT_SCHEMA` returns the first valid schema name present in the
search path of the current session.
If there is none, it returns `NULL`.
### SET SEARCH_PATH TO
```sql
SET SEARCH_PATH TO <schema name> [, <schema name>]...
```
### RDB$GET_CONTEXT
#### CURRENT_SCHEMA (SYSTEM)
`RDB$GET_CONTEXT('SYSTEM', 'CURRENT_SCHEMA')` returns the same value as
the `CURRENT_SCHEMA` expression.
#### SEARCH_PATH (SYSTEM)
`RDB$GET_CONTEXT('SYSTEM', 'SEARCH_PATH')` returns the current session
search path, including invalid schemas present
in the list.
#### SCHEMA_NAME (DDL_TRIGGER)
`RDB$GET_CONTEXT('DDL_TRIGGER', 'SCHEMA_NAME')` returns the schema name
of the affected object in a DDL TRIGGER.
## Monitoring
Monitoring tables has now these informations related to schemas:
`MON$ATTACHMENTS`
- `MON$SEARCH_PATH`: search path of the attachment
`MON$TABLE_STATS`
- `MON$SCHEMA_NAME`: table schema
`MON$CALL_STACK`
- `MON$SCHEMA_NAME`: routine schema
`MON$COMPILED_STATEMENTS`
- `MON$SCHEMA_NAME`: routine schema
## Queries
Field aliases can now be qualified not only with the table name, but
with the schema too, even in the case of implicit
deduced schema name from the search path. It is also possible to qualify
table name with schema and use the column
only with the table name. For example:
```sql
create schema SCHEMA1;
create table SCHEMA1.TABLE1 (ID integer);
set search_path to SCHEMA1;
select
TABLE1.ID from SCHEMA1.TABLE1;
select
SCHEMA1.TABLE1.ID from TABLE1;
select
SCHEMA1.TABLE1.ID from SCHEMA1.TABLE1;
```
If the same table name is used from different schemas, fields should be
qualified with the schema names or aliases:
```sql
create schema SCHEMA1;
create schema SCHEMA2;
create table SCHEMA1.TABLE1 (ID integer);
create table SCHEMA2.TABLE1 (ID integer);
select
SCHEMA1.TABLE1.ID,
SCHEMA2.TABLE1.ID from SCHEMA1.TABLE1,
SCHEMA2.TABLE1;
select
S1.ID,
S2.ID from SCHEMA1.TABLE1 S1, SCHEMA2.TABLE1 S2;
```
### Plans
TODO:
## New DPB items
### isc_dpb_search_path
`isc_dpb_search_path` is a string DPB (like `isc_dpb_user_name`) that
sets the initial schema search path.
## Array support
### isc_sdl_schema
When dealing with arrays using SDL (Array Slice Description Language),
there is now `isc_sdl_schema` to explicitly
qualify the schema. Its format is the same as used with `isc_sdl_relation`.
## Utilities
### isql
#### Option -(SE)ARCH_PATH
This option makes ISQL pass the search path (as received by the OS) as
`isc_dpb_search_path` in every attachment stablished.
```
isql -search_path x,y t1.fdb
select RDB$GET_CONTEXT('SYSTEM', 'SEARCH_PATH') from system.rdb$database;
-- Result: "X", "Y"
set search_path to y;
select RDB$GET_CONTEXT('SYSTEM', 'SEARCH_PATH') from system.rdb$database;
-- Result: "Y"
connect 't2.fdb';
select RDB$GET_CONTEXT('SYSTEM', 'SEARCH_PATH') from system.rdb$database;
-- Result: "X", "Y"
```
```
isql -search_path '"x", "y"' t1.fdb
select RDB$GET_CONTEXT('SYSTEM', 'SEARCH_PATH') from system.rdb$database;
-- Result: "x", "y"
```
### gbak
To use databases created in previous Firebird versions with Firebird 6,
it is necessary to restore a backup in the new
version using the Firebird 6 `gbak`. The restored database will have all
users objects in the `PUBLIC` schema.
TODO: Should we add an option to give customized name to the `PUBLIC`
schema?
## System metadata changes
The following fields were added to system tables. It is important that
applications and tools that read metadata starts
to use them when appropriate, for example considering `RDB$SCHEMA_NAME`
when joining the tables.
| Table | Column |
|--------------------------|---------------------------------|
| MON$ATTACHMENTS | MON$SEARCH_PATH |
| MON$CALL_STACK | MON$SCHEMA_NAME |
| MON$COMPILED_STATEMENTS | MON$SCHEMA_NAME |
| MON$TABLE_STATS | MON$SCHEMA_NAME |
| RDB$CHARACTER_SETS | RDB$DEFAULT_COLLATE_SCHEMA_NAME |
| RDB$CHARACTER_SETS | RDB$SCHEMA_NAME |
| RDB$CHECK_CONSTRAINTS | RDB$SCHEMA_NAME |
| RDB$COLLATIONS | RDB$SCHEMA_NAME |
| RDB$DATABASE | RDB$CHARACTER_SET_SCHEMA_NAME |
| RDB$DEPENDENCIES | RDB$DEPENDED_ON_SCHEMA_NAME |
| RDB$DEPENDENCIES | RDB$DEPENDENT_SCHEMA_NAME |
| RDB$EXCEPTIONS | RDB$SCHEMA_NAME |
| RDB$FIELDS | RDB$SCHEMA_NAME |
| RDB$FIELD_DIMENSIONS | RDB$SCHEMA_NAME |
| RDB$FUNCTIONS | RDB$SCHEMA_NAME |
| RDB$FUNCTION_ARGUMENTS | RDB$FIELD_SOURCE_SCHEMA_NAME |
| RDB$FUNCTION_ARGUMENTS | RDB$RELATION_SCHEMA_NAME |
| RDB$FUNCTION_ARGUMENTS | RDB$SCHEMA_NAME |
| RDB$GENERATORS | RDB$SCHEMA_NAME |
| RDB$INDEX_SEGMENTS | RDB$SCHEMA_NAME |
| RDB$INDICES | RDB$FOREIGN_KEY_SCHEMA_NAME |
| RDB$INDICES | RDB$SCHEMA_NAME |
| RDB$PACKAGES | RDB$SCHEMA_NAME |
| RDB$PROCEDURES | RDB$SCHEMA_NAME |
| RDB$PROCEDURE_PARAMETERS | RDB$FIELD_SOURCE_SCHEMA_NAME |
| RDB$PROCEDURE_PARAMETERS | RDB$RELATION_SCHEMA_NAME |
| RDB$PROCEDURE_PARAMETERS | RDB$SCHEMA_NAME |
| RDB$PUBLICATION_TABLES | RDB$TABLE_SCHEMA_NAME |
| RDB$REF_CONSTRAINTS | RDB$CONST_SCHEMA_NAME_UQ |
| RDB$REF_CONSTRAINTS | RDB$SCHEMA_NAME |
| RDB$RELATIONS | RDB$SCHEMA_NAME |
| RDB$RELATION_CONSTRAINTS | RDB$SCHEMA_NAME |
| RDB$RELATION_FIELDS | RDB$FIELD_SOURCE_SCHEMA_NAME |
| RDB$RELATION_FIELDS | RDB$SCHEMA_NAME |
| RDB$SCHEMAS | RDB$CHARACTER_SET_NAME |
| RDB$SCHEMAS | RDB$CHARACTER_SET_SCHEMA_NAME |
| RDB$SCHEMAS | RDB$DESCRIPTION |
| RDB$SCHEMAS | RDB$OWNER_NAME |
| RDB$SCHEMAS | RDB$SCHEMA_NAME |
| RDB$SCHEMAS | RDB$SECURITY_CLASS |
| RDB$SCHEMAS | RDB$SYSTEM_FLAG |
| RDB$TRIGGERS | RDB$SCHEMA_NAME |
| RDB$TRIGGER_MESSAGES | RDB$SCHEMA_NAME |
| RDB$USER_PRIVILEGES | RDB$RELATION_SCHEMA_NAME |
| RDB$USER_PRIVILEGES | RDB$USER_SCHEMA_NAME |
| RDB$VIEW_RELATIONS | RDB$RELATION_SCHEMA_NAME |
| RDB$VIEW_RELATIONS | RDB$SCHEMA_NAME |
## Differences with previous versions
### CREATE SCHEMA in IAttachment::executeCreateDatabase and
isc_create_database
It was possible to use `CREATE SCHEMA` using the APIs functions
`IAttachment::executeCreateDatabase` and
`isc_create_database` to create databases. Not this is not allowed. The
only valid syntax is `CREATE DATABASE`.
### Object names in error messages
Object names present in error or informative messages are now qualified
and quoted in the parameters of the messages,
even for DIALECT 1 databases. For example:
```sql
SQL> create table TABLE1 (ID integer);
SQL> create table TABLE1 (ID integer);
Statement failed, SQLSTATE = 42S01
unsuccessful metadata update
-CREATE TABLE "PUBLIC"."TABLE1" failed
-Table "PUBLIC"."TABLE1" already exists
SQL> create schema "Weird ""Schema""";
SQL> create schema "Weird ""Schema""";
Statement failed, SQLSTATE = 42000
unsuccessful metadata update
-CREATE SCHEMA "Weird ""Schema""" failed
-Schema "Weird ""Schema""" already exists
```
### Object name parsing outside SQL
When dealing with object names in `isc_dpb_search_path`,
`isc_sdl_schema` and `MAKE_DBKEY`, object names follow the
same rules as when using in SQL, requiring quotes for names with special
or lower case characters.
For `MAKE_DBKEY`, unqualified names use the search path.
Previously, `MAKE_DBKEY` used exact table name inside its first
parameter and do not allowed usage of double-quotes for
special characters.
### Minimum page size
The minimal database page size was increased from 4096 to 8192 because
the old minimum was not enough to
fit changes in system indexes.
### Bult-in plugins
TODO:
## Downgrade compatibility
It's expected that Firebird 6 databases not using multiple users schemas
(for example, a Firebird 5 database just
migrated to Firebird 6 by `gbak`) would not be always downgradable to
previous Firebird versions using `gbak`.
The Firebird team will backport to Firebird 5 essential internal changes
to make that possible. This documentation
will be updated when this is ready.
----
Adriano