What is actually the storage difference between stream and segmented blobs?


Mark Rotteveel

May 16, 2025, 4:00:22 AM
to firebir...@googlegroups.com
From what I can tell from some experimenting, stream blobs are also
stored in segments. So what is the real difference?

And given both are (or at least, seem to be) segmented on storage, why
can I seek on a stream blob, and not on a segmented?

Mark
--
Mark Rotteveel

Vlad Khorsun

May 16, 2025, 4:25:42 AM
to firebir...@googlegroups.com
Fri, May 16, 2025 at 11:00, 'Mark Rotteveel' via firebird-devel <firebir...@googlegroups.com> wrote:
> From what I can tell from some experimenting, stream blobs are also
> stored in segments. So what is the real difference?

  Stream blobs are not stored in segments. They are represented at the client side as if they
were stored in segments, though.
 
> And given both are (or at least, seem to be) segmented on storage, why
> can I seek on a stream blob, and not on a segmented?

  Because stream blobs are not stored in segments ;) That allows a "direct" seek, without
traversing segments one by one.
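
  For illustration only, a minimal sketch with the legacy ISC C API (attachment/transaction
handles and the blob id are assumed to exist, error handling omitted) of opening a stream
blob and seeking in it:

#include <ibase.h>

/* BPB requesting stream access; a segmented blob would use isc_bpb_type_segmented */
static const char stream_bpb[] = { isc_bpb_version1, isc_bpb_type, 1, isc_bpb_type_stream };

void seek_demo(isc_db_handle* db, isc_tr_handle* trans, ISC_QUAD* blob_id)
{
    ISC_STATUS status[20];
    isc_blob_handle blob = 0;
    ISC_LONG pos = 0;
    char buffer[1024];
    unsigned short len = 0;

    isc_open_blob2(status, db, trans, &blob, blob_id,
                   sizeof(stream_bpb), (const ISC_UCHAR*) stream_bpb);

    /* mode 0 = from blob start; succeeds for stream blobs, fails for segmented ones */
    isc_seek_blob(status, &blob, 0, 4096, &pos);
    isc_get_segment(status, &blob, &len, sizeof(buffer), buffer);

    isc_close_blob(status, &blob);
}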

Regards,
Vlad

Mark Rotteveel

May 16, 2025, 5:06:24 AM
to firebir...@googlegroups.com
On 16/05/2025 10:25, Vlad Khorsun wrote:
> Fri, May 16, 2025 at 11:00, 'Mark Rotteveel' via firebird-devel
> <firebir...@googlegroups.com> wrote:
>
>  From what I can tell from some experimenting, stream blobs are also
> stored in segments. So what is the real difference?
>
>
>   Stream blobs are not stored in segments. They are represented at the client
> side as if they were stored in segments, though.

That is not what I observe. If I write a stream blob in 3 puts of 16384
bytes, and then select it, I receive an inline blob with 3 segments of
16384 bytes. If I write it in puts of 16383 bytes, I receive one with 3
segments of 16383 bytes, and so on. If it was stored as a continuous
byte stream, I would expect an inline blob with 1 segment of 3 * <put size>.

Similar when I access it as a normal blob.
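
For reference, this is roughly the kind of test I mean, sketched with the legacy ISC C
API (attachment/transaction handles are assumed, storing the blob id in a row and
committing is elided, and error handling is omitted):

#include <stdio.h>
#include <string.h>
#include <ibase.h>

static const char stream_bpb[] = { isc_bpb_version1, isc_bpb_type, 1, isc_bpb_type_stream };

void blob_segment_test(isc_db_handle* db, isc_tr_handle* trans)
{
    ISC_STATUS status[20];
    isc_blob_handle blob = 0;
    ISC_QUAD blob_id;
    static char chunk[16384];
    char buffer[32767];
    unsigned short len = 0;

    memset(chunk, 'x', sizeof(chunk));

    /* write a stream blob in 3 puts of 16384 bytes */
    isc_create_blob2(status, db, trans, &blob, &blob_id, sizeof(stream_bpb), stream_bpb);
    for (int i = 0; i < 3; i++)
        isc_put_segment(status, &blob, sizeof(chunk), chunk);
    isc_close_blob(status, &blob);

    /* ... store blob_id in a row, commit, then re-open the blob ... */

    blob = 0;
    isc_open_blob2(status, db, trans, &blob, &blob_id, 0, NULL);
    while (isc_get_segment(status, &blob, &len, sizeof(buffer), buffer) == 0
           || status[1] == isc_segment)
    {
        /* what I observe: three "segments" of 16384 bytes each */
        printf("got %u bytes\n", len);
    }
    isc_close_blob(status, &blob);
}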

Mark
--
Mark Rotteveel

Vlad Khorsun

May 16, 2025, 5:13:19 AM
to firebir...@googlegroups.com
Fri, May 16, 2025 at 12:06, 'Mark Rotteveel' via firebird-devel <firebir...@googlegroups.com> wrote:
  The engine tracks the maximum "segment" size used when the application writes blob data,
and then returns the blob data using that maximum segment size. Can't say I like it
or see it as necessary. Since it does not affect network buffering, I consider it
harmless. While useless :)

Regards,
Vlad

Mark Rotteveel

May 16, 2025, 5:29:39 AM
to firebir...@googlegroups.com
On 16/05/2025 11:12, Vlad Khorsun wrote:
>   The engine tracks the maximum "segment" size used when the application writes
> blob data, and then returns the blob data using that maximum segment size. Can't say I
> like it or see it as necessary. Since it does not affect network buffering, I consider it
> harmless. While useless :)

Ah, OK. That explains it. Thanks.

It does somewhat affect network efficiency though, because if a large
stream blob was created/written inefficiently in small increments, then
retrieval will also be inefficient due to the overhead of segment
lengths inflating the transfer size.

For a test I was first trying to create a blob with N characters from an
EXECUTE BLOCK using RPAD, but that resulted in receiving a blob with N
segments of 1 character (though I can't recall if I managed to create a
stream blob that way or only a segmented blob).

Similarly, if I create it by using RDB$BLOB_UTIL to create a stream blob
and use BLOB_APPEND to construct a larger blob, and the value I append
is X characters, then you'll receive N/X segments of (up to) X bytes. If
X is small compared to N, then you'll also have a high overhead due to
segment lengths.

Mark
--
Mark Rotteveel

Dmitry Yemanov

May 16, 2025, 6:53:44 AM
to firebir...@googlegroups.com
16.05.2025 12:12, Vlad Khorsun wrote:
>
>   The engine tracks the maximum "segment" size used when the application writes
> blob data, and then returns the blob data using that maximum segment size. Can't say I
> like it or see it as necessary. Since it does not affect network buffering, I consider it
> harmless. While useless :)

API entries (even at the engine level) are not so cheap, so fewer
getSegment() calls with bigger resulting buffers would be faster, at
least in embedded mode.


Dmitry

Vlad Khorsun

May 17, 2025, 4:31:05 AM
to firebir...@googlegroups.com
16.05.2025 13:53, Dmitry Yemanov:
  So, is it time to unconditionally set blb::blb_max_segment to MAX_USHORT for stream blobs?

Regards,
Vlad

Vlad Khorsun

May 17, 2025, 4:35:08 AM
to firebir...@googlegroups.com
16.05.2025 12:29, 'Mark Rotteveel' via firebird-devel:
> On 16/05/2025 11:12, Vlad Khorsun wrote:
>>    The engine tracks the maximum "segment" size used when the application writes blob data,
>> and then returns the blob data using that maximum segment size. Can't say I like it
>> or see it as necessary. Since it does not affect network buffering, I consider it
>> harmless. While useless :)
>
> Ah, OK. That explains it. Thanks.
>
> It does somewhat affect network efficiency though, because if a large stream blob was created/written inefficiently in small
> increments, then retrieval will also be inefficient due to the overhead of segment lengths inflating the transfer size.
>
> For a test I was first trying to create a blob with N characters from an EXECUTE BLOCK using RPAD, but that resulted in receiving a
> blob with N segments of 1 character (though I can't recall if I managed to create a stream blob that way or only a segmented blob).

  RPAD/LPAD creates segmented blobs. I see no problem if it could be changed
to produce stream blobs instead. Opinions?

> Similarly, if I create it by using RDB$BLOB_UTIL to create a stream blob and use BLOB_APPEND to construct a larger blob, and the
> value I append is X characters, then you'll receive N/X segments of (up to) X bytes. If X is small compared to N, then you'll also
> have a high overhead due to segment lengths.

  Agreed. I don't see much sense in keeping the "maximum segment size" value for stream blobs.


Regards,
Vlad

Mark Rotteveel

May 17, 2025, 7:25:45 AM
to firebir...@googlegroups.com
On 17/05/2025 10:35, Vlad Khorsun wrote:
> 16.05.2025 12:29, 'Mark Rotteveel' via firebird-devel:
>> For a test I was first trying to create a blob with N characters from
>> an EXECUTE BLOCK using RPAD, but that resulted in receiving a blob
>> with N segments of 1 character (though I can't recall if I managed to
>> create a stream blob that way or only a segmented blob).
>
>   RPAD/LPAD creates segmented blobs. I see no problem if it could be changed
> to produce stream blobs instead. Opinions?

Yes, but when I experimented with this, I also did some additional
things to try and create a stream blob, but I'm not sure if I also tried
that with RPAD or not.

In any case, making all internal functions produce stream blobs instead
of segmented blobs sounds like a good idea to me (or at least, I can't
think of any issues that could cause for clients).

>> Similarly, if I create it by using RDB$BLOB_UTIL to create a stream
>> blob and use BLOB_APPEND to construct a larger blob, and the value I
>> append is X characters, then you'll receive N/X segments of (up to) X
>> bytes. If X is small compared to N, then you'll also have a high
>> overhead due to segment lengths.
>
>   Agreed. I don't see much sense in keeping the "maximum segment size" value for
> stream blobs.

Or, for backwards compatibility (?), maybe just use min(65533,
total-length) for stream blobs?

Mark
--
Mark Rotteveel

Mark Rotteveel

May 17, 2025, 7:27:23 AM
to firebir...@googlegroups.com
On 17/05/2025 10:31, Vlad Khorsun wrote:
> 16.05.2025 13:53, Dmitry Yemanov:
>> API entries (even at the engine level) are not so cheap, so fewer
>> getSegment() calls with bigger resulting buffers would be faster, at
>> least in embedded mode.
>
>   So, is it time to unconditionally set blb::blb_max_segment to
> MAX_USHORT for stream blobs?

Technically, MAX_USHORT - 2, given that a get of MAX_USHORT will return a
2-byte length + 65533 bytes of data.

Mark
--
Mark Rotteveel

Vlad Khorsun

May 18, 2025, 2:17:09 PM
to firebir...@googlegroups.com
17.05.2025 14:27, 'Mark Rotteveel' via firebird-devel:
AFAIK, Firebird works with blob segments of MAX_USHORT size without a problem.

Regards,
Vlad

Vlad Khorsun

May 18, 2025, 2:19:03 PM
to firebir...@googlegroups.com
17.05.2025 14:25, 'Mark Rotteveel' via firebird-devel:
> On 17/05/2025 10:35, Vlad Khorsun wrote:
>> 16.05.2025 12:29, 'Mark Rotteveel' via firebird-devel:
>>> For a test I was first trying to create a blob with N characters from an EXECUTE BLOCK using RPAD, but that resulted in receiving
>>> a blob with N segments of 1 character (though I can't recall if I managed to create a stream blob that way or only a segmented
>>> blob).
>>
>>    RPAD/LPAD creates segmented blobs. I see no problem if it could be changed
>> to produce stream blobs instead. Opinions?
>
> Yes, but when I experimented with this, I also did some additional things to try and create a stream blob, but I'm not sure if I
> also tried that with RPAD or not.
>
> In any case, making all internal functions produce stream blobs instead of segmented blobs sounds like a good idea to me (or at
> least, I can't think of any issues that could cause for clients).

Me too.

>>> Similarly, if I create it by using RDB$BLOB_UTIL to create a stream blob and use BLOB_APPEND to construct a larger blob, and the
>>> value I append is X characters, then you'll receive N/X segments of (up to) X bytes. If X is small compared to N, then you'll
>>> also have a high overhead due to segment lengths.
>>
>>    Agreed. I don't see much sense in keeping the "maximum segment size" value for stream blobs.
>
> Or, for backwards compatibility (?), maybe just use min(65533, total-length) for stream blobs?

  Could you explain - what kind of backward compatibility are you worried about?

Regards,
Vlad

Vlad Khorsun

May 18, 2025, 2:22:05 PM
to firebir...@googlegroups.com
>>>> Similarly, if I create it by using RDB$BLOB_UTIL to create a stream blob and use BLOB_APPEND to construct a larger blob, and the
>>>> value I append is X characters, then you'll receive N/X segments of (up to) X bytes. If X is small compared to N, then you'll
>>>> also have a high overhead due to segment lengths.
>>>
>>>    Agreed. I don't see much sense in keeping the "maximum segment size" value for stream blobs.
>>
>> Or, for backwards compatibility (?), maybe just use min(65533, total-length) for stream blobs?
>
>   Could you explain - what kind of backward compatibility are you worried about?

I agree, of course, that "maximum segment size" should not be greater than "total size".


Regards,
Vlad

Adriano dos Santos Fernandes

May 18, 2025, 5:50:35 PM
to firebir...@googlegroups.com
On 17/05/2025 08:25, 'Mark Rotteveel' via firebird-devel wrote:
>
> In any case, making all internal functions produce stream blobs instead
> of segmented blobs sounds like a good idea to me (or at least, I can't
> think of any issues that could cause for clients).
>

I'm not sure, but this may cause problems with MBCS text blobs.

If something produces a blob in segments and each segment holds a set of
complete characters, readers will read complete sets of characters. If
it's a stream blob, that may not be true; they may get buffers with
incomplete characters.


Adriano

Mark Rotteveel

May 19, 2025, 3:27:04 AM
to firebir...@googlegroups.com
Say a driver implementation or client application requests the info item
`isc_info_blob_max_segment` and for some reason uses that to size its
get segment call. If you would now no longer send that item in the info
response, or send a value of 0 or -1, or one that is unreasonable (e.g. very
small) or otherwise not usable for get segment, you'd suddenly
break those client applications.

If instead you'd respond with a value that is still in a valid range
((0, 65533] but preferably closer to 65533 than to 0) and "correct" (<=
`isc_info_blob_total_length`), you'd maintain compatibility for clients
that use it in such a way (or another way we can't foresee).
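
To make that concrete, here is a sketch (legacy ISC C API, error handling omitted;
query_max_segment is just an invented helper name) of the kind of client code I have in mind:

#include <ibase.h>

unsigned short query_max_segment(isc_blob_handle* blob)
{
    ISC_STATUS status[20];
    const char items[] = { isc_info_blob_max_segment, isc_info_end };
    char res[32];

    isc_blob_info(status, blob, sizeof(items), items, sizeof(res), res);

    if (res[0] == isc_info_blob_max_segment)
    {
        /* clumplet format: <item><int16 little-endian length><value> */
        short len = (short) isc_vax_integer(&res[1], 2);
        return (unsigned short) isc_vax_integer(&res[3], len);
    }
    return 0;  /* item not answered; the caller would have to guess */
}

A client that sizes its get segment buffer with this value keeps working for anything in
the valid range, and breaks if the item disappears or becomes unusable.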

Mark
--
Mark Rotteveel

Mark Rotteveel

May 19, 2025, 3:37:08 AM
to firebir...@googlegroups.com
That could happen now as well, because a get segment call can and does
split up segments if needed to fit in the requested size, and such a
split could happen in the middle of a multibyte character.

On a related note, I also find it odd that the get segment response even
includes segment lengths, it makes reading blobs more complicated client
side than should be needed.

Basically the p_resp_data of a get segment response is now

<int32-buffer-size><int16LE-segment-size><segment><int16LE-segment-size><segment>...

while - in a new protocol or maybe a new operation code - it could just
as well be:

<int32-buffer-size><data>

You'd save N*2 bytes, and not require a client to extract segments from
the buffer.
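
For comparison, unpacking the current format client side looks roughly like this (a
sketch, not actual fbclient code; the callback type is made up):

#include <stddef.h>
#include <stdint.h>

typedef void (*segment_cb)(const uint8_t* data, uint16_t len, void* arg);

void unpack_segments(const uint8_t* buf, size_t total, segment_cb cb, void* arg)
{
    size_t pos = 0;
    while (pos + 2 <= total)
    {
        /* each segment is prefixed by a 2-byte little-endian length */
        uint16_t len = (uint16_t) (buf[pos] | (buf[pos + 1] << 8));
        pos += 2;
        if (pos + len > total)
            break;              /* truncated buffer; treat as malformed */
        cb(buf + pos, len, arg);
        pos += len;
    }
}

With the second layout that loop disappears: the buffer is just the data.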

Mark
--
Mark Rotteveel

Mark Rotteveel

May 19, 2025, 3:39:23 AM
to firebir...@googlegroups.com
Yes, on put, but not on get. On get, the maximum size you can request is
MAX_USHORT, but that is used as the response size, and that response has
N segments, and thus the actual data size is N*2 bytes shorter.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

May 19, 2025, 4:09:01 AM
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 19.05.2025 9:26:
> If instead you'd respond with a value that is still in a valid range ((0, 65533]
> but preferably closer to 65533 than to 0) and "correct" (<=
> `isc_info_blob_total_length`), you'd maintain compatibility for clients that use
> it in such a way (or another way we can't foresee).

  This value doesn't have to be "correct"; delivering segments smaller than the
maximum is a perfectly valid case, so a hardcoded MAX_USHORT as the maximum segment
size should be fine for stream BLOBs.

> Yes, on put, but not on get. On get, the maximum size you can request is MAX_USHORT, but that is used as the response size, and that response has N segments, and thus the actual data size is N*2 bytes shorter.

  From the application POV, segments of size MAX_USHORT are delivered fine, so I
guess at the network level they are transferred in two responses.

--
WBR, SD.

Mark Rotteveel

May 19, 2025, 4:13:33 AM
to firebir...@googlegroups.com
On 19/05/2025 10:08, 'Dimitry Sibiryakov' via firebird-devel wrote:
> 'Mark Rotteveel' via firebird-devel wrote 19.05.2025 9:26:
>> If instead you'd respond with a value that is still in a valid range
>> ((0, 65533] but preferably closer to 65533 than to 0) and
>> "correct" (<= `isc_info_blob_total_length`), you'd maintain
>> compatibility for clients that use it in such a way (or another way we
>> can't foresee).
>
>   This value doesn't have to be "correct"; delivering segments smaller than the
> maximum is a perfectly valid case, so a hardcoded MAX_USHORT as the
> maximum segment size should be fine for stream BLOBs.

Whether it needs some semblance of correctness highly depends on what a
client application or driver implementation actually uses
`isc_info_blob_max_segment` for. And given we don't *know* what our
users do with it, it's better to be safe than sorry.

>> Yes, on put, but not on get. On get, the maximum size you can request
>> is MAX_USHORT, but that is used as the response size, and that
>> response has N segments, and thus the actual data size is N*2 bytes
>> shorter.
>
>   From the application POV, segments of size MAX_USHORT are delivered fine,
> so I guess at the network level they are transferred in two responses.

No they are not, that would require two get requests, which is not
necessarily efficient.

Maybe fbclient does some of its own magic there to do it in two
requests, but it's not how the protocol itself works.

Mark
--
Mark Rotteveel

Vlad Khorsun

May 19, 2025, 12:30:14 PM
to firebir...@googlegroups.com
19.05.2025 10:26, 'Mark Rotteveel' via firebird-devel:
> On 18/05/2025 20:18, Vlad Khorsun wrote:
>> 17.05.2025 14:25, 'Mark Rotteveel' via firebird-devel:
>>> On 17/05/2025 10:35, Vlad Khorsun wrote:
>>>>    Agreed. I don't see much sense in keeping the "maximum segment size" value for stream blobs.
>>>
>>> Or, for backwards compatibility (?), maybe just use min(65533, total-length) for stream blobs?
>>
>>    Could you explain - what kind of backward compatibility are you worried about?
>
> Say a driver implementation or client application requests the info item `isc_info_blob_max_segment` and for some reason uses that to
> size its get segment call. If you would now no longer send that item in the info response, or send a value of 0 or -1, or one that is
> unreasonable (e.g. very small) or otherwise not usable for get segment, you'd suddenly break those client applications.

  Of course, I never proposed returning 0 or -1 as max_segment :)

> If instead you'd respond with a value that is still in a valid range ((0, 65533] but preferably closer to 65533 than to 0) and
> "correct" (<= `isc_info_blob_total_length`), you'd maintain compatibility for clients that use it in such a way (or another way we
> can't foresee).

  MAX_USHORT is also a perfectly valid value. Thus, I consider MIN(MAX_USHORT, total_length) a correct value
for the "max_segment" of a stream blob.

Regards,
Vlad

Vlad Khorsun

May 19, 2025, 12:42:24 PM
to firebir...@googlegroups.com
19.05.2025 11:13, 'Mark Rotteveel' via firebird-devel:
  Efficiency depends on the total size of the blob. And there is no way to find an ideal "max segment size"
that avoids a last op_get_segment that does not utilize the whole receive buffer ;)

For example, a user stored a stream blob of size 65535. According to your proposal, on read the
server returns max_segment_size = 65533, and the application will issue two get_segment() calls
to read this blob. Same as with max_segment_size = 65535.

> Maybe fbclient does some of its own magic there to do it in two requests, but it's not how the protocol itself works.

Yes, fbclient does it transparently for user apps.

Regards,
Vlad

Vlad Khorsun

May 19, 2025, 12:49:26 PM
to firebir...@googlegroups.com
19.05.2025 10:37, 'Mark Rotteveel' via firebird-devel:
> On 18/05/2025 23:50, Adriano dos Santos Fernandes wrote:
>> On 17/05/2025 08:25, 'Mark Rotteveel' via firebird-devel wrote:
>>>
>>> In any case, making all internal functions produce stream blobs instead
>>> of segmented blobs sounds like a good idea to me (or at least, I can't
>>> think of any issues that could cause for clients).
>>>
>>
>> I'm not sure, but this may cause problems with MBCS text blobs.
>>
>> If something produces a blob in segments and each segment holds a set of
>> complete characters, readers will read complete sets of characters. If
>> it's a stream blob, that may not be true; they may get buffers with
>> incomplete characters.
>
> That could happen now as well, because a get segment call can and does split up segments if needed to fit in the requested size, and
> such a split could happen in the middle of a multibyte character.
>
> On a related note, I also find it odd that the get segment response even includes segment lengths, it makes reading blobs more
> complicated client side than should be needed.

  It is absolutely necessary for segmented blobs if you don't want to fetch each
segment in a separate roundtrip. The current implementation receives a buffer of up
to 64KB of blob data that can contain many blob segments. Yes, it makes things
a bit more complex, but for a good reason.

> Basically the p_resp_data of a get segment response is now
>
> <int32-buffer-size><int16LE-segment-size><segment><int16LE-segment-size><segment>...
>
> while - in a new protocol or maybe a new operation code - it could just as well be:
>
> <int32-buffer-size><data>
>
> You'd save N*2 bytes, and not require a client to extract segments from the buffer.

  This is possible for stream blobs only and would break backward compatibility, as
all existing fbclients expect to get a data buffer with segment lengths inside.

Regards,
Vlad

Vlad Khorsun

May 19, 2025, 12:51:28 PM
to firebir...@googlegroups.com
19.05.2025 10:39, 'Mark Rotteveel' via firebird-devel:
  You are speaking about the internals of the wire protocol, while I'm speaking about the API :)

Regards,
Vlad

Mark Rotteveel

May 19, 2025, 12:58:39 PM
to firebir...@googlegroups.com
On 19/05/2025 18:49, Vlad Khorsun wrote:
> 19.05.2025 10:37, 'Mark Rotteveel' via firebird-devel:
>> On a related note, I also find it odd that the get segment response
>> even includes segment lengths, it makes reading blobs more complicated
>> client side than should be needed.
>
>   It is absolutely necessary for segmented blobs if you don't want to fetch each
> segment in a separate roundtrip. The current implementation receives a buffer of up
> to 64KB of blob data that can contain many blob segments. Yes, it makes things
> a bit more complex, but for a good reason.

I still don't see the reason for this. The server needs to stuff the
segments into a single buffer with segment lengths anyway, so it can
just as well do the same thing _without_ including those segment lengths.

In other words, the fact that you receive multiple segments is not a
reason why the response needs to include the segment lengths. And
contrary to your claim, leaving out the segment lengths would not
necessitate only sending a single segment.

And given the server already chops up segments if needed, it doesn't
even (always) preserve the same boundaries as used on write, so getting
something a lot bigger is perfectly fine.

>> Basically the p_resp_data of a get segment response is now
[..]
>> You'd save N*2 bytes, and not require a client to extract segments
>> from the buffer.
>
>   This is possible for stream blobs only and would break backward compatibility, as
> all existing fbclients expect to get a data buffer with segment lengths inside.

Which is why I said "while - in a new protocol or maybe a new operation
code - it could just as well be [...]"

Mark
--
Mark Rotteveel

Mark Rotteveel

May 19, 2025, 1:00:11 PM
to firebir...@googlegroups.com
On 19/05/2025 18:51, Vlad Khorsun wrote:
> 19.05.2025 10:39, 'Mark Rotteveel' via firebird-devel:
>> Yes, on put, but not on get. On get, the maximum size you can request
>> is MAX_USHORT, but that is used as the response size, and that
>> response has N segments, and thus the actual data size is N*2 bytes
>> shorter.
>
>   You are speaking about the internals of the wire protocol, while I'm speaking about the API :)

As I've said before, to me, the wire protocol *is* the API.

Mark
--
Mark Rotteveel

Vlad Khorsun

May 19, 2025, 1:09:22 PM
to firebir...@googlegroups.com
19.05.2025 19:58, 'Mark Rotteveel' via firebird-devel:
> On 19/05/2025 18:49, Vlad Khorsun wrote:
>> 19.05.2025 10:37, 'Mark Rotteveel' via firebird-devel:
>>> On a related note, I also find it odd that the get segment response even includes segment lengths, it makes reading blobs more
>>> complicated client side than should be needed.
>>
>>    It is absolutely necessary for segmented blobs if you don't want to fetch each
>> segment in a separate roundtrip. The current implementation receives a buffer of up
>> to 64KB of blob data that can contain many blob segments. Yes, it makes things
>> a bit more complex, but for a good reason.
>
> I still don't see the reason for this. The server needs to stuff the segments into a single buffer with segment lengths anyway, so
> it can just as well do the same thing _without_ including those segment lengths.

  Are we both speaking about segmented blobs here? The client should get segments of exactly
the length that was put.

> In other words, the fact that you receive multiple segments is not a reason why the response needs to include the segment lengths.
> And contrary to your claim, leaving out the segment lengths would not necessitate only sending a single segment.
>
> And given the server already chops up segments if needed, it doesn't even (always) preserve the same boundaries as used on write, so
> getting something a lot bigger is perfectly fine.

Still about segmented blobs ?

>>> Basically the p_resp_data of a get segment response is now
> [..]
>>> You'd save N*2 bytes, and not require a client to extract segments from the buffer.
>>
>>    This is possible for stream blobs only and would break backward compatibility, as
>> all existing fbclients expect to get a data buffer with segment lengths inside.
>
> Which is why I said "while - in a new protocol or maybe a new operation code - it could just as well be [...]"

Ah, ok. Missed it somewhere ;)

  With a new protocol/op code we can (should?) relax the limit on the response size.

Regards,
Vlad

Mark Rotteveel

May 19, 2025, 1:27:10 PM
to firebir...@googlegroups.com
On 19/05/2025 19:09, Vlad Khorsun wrote:
> 19.05.2025 19:58, 'Mark Rotteveel' via firebird-devel:
>> I still don't see the reason for this. The server needs to stuff the
>> segments into a single buffer with segment lengths anyway, so it can
>> just as well do the same thing _without_ including those segment lengths.
>
>   Are we both speaking about segmented blobs here? The client should get
> segments of exactly the length that was put.

And that doesn't happen. If a blob was put with a segment size of 32k,
and a client does gets with a size of 1k: you'll receive segments of 1k.
Similar with my favourite example of putting in segments of 65535 bytes,
you'll get a response with a segment of 65533 bytes, and the next get
response has two segments, one of 2 bytes and one of 65529 bytes, and so on.

>> In other words, the fact that you receive multiple segments is not a
>> reason why the response needs to include the segment lengths. And
>> contrary to your claim, leaving out the segment lengths would not
>> necessitate only sending a single segment.
>>
>> And given the server already chops up segments if needed, it doesn't
>> even (always) preserve the same boundaries as used on write, so
>> getting something a lot bigger is perfectly fine.
>
>   Still about segmented blobs ?

Yes.

Mark
--
Mark Rotteveel

Vlad Khorsun

May 19, 2025, 1:37:01 PM
to firebir...@googlegroups.com
19.05.2025 20:27, 'Mark Rotteveel' via firebird-devel:
> On 19/05/2025 19:09, Vlad Khorsun wrote:
>> 19.05.2025 19:58, 'Mark Rotteveel' via firebird-devel:
>>> I still don't see the reason for this. The server needs to stuff the segments into a single buffer with segment lengths anyway,
>>> so it can just as well do the same thing _without_ including those segment lengths.
>>
>>    Are we both speaking about segmented blobs here? The client should get segments of exactly
>> the length that was put.
>
> And that doesn't happen. If a blob was put with a segment size of 32k, and a client does gets with a size of 1k: you'll receive
> segments of 1k.

  And with a return code indicating an incomplete segment - IStatus::RESULT_SEGMENT (isc_segment). The client
will receive parts of the segment with this code until it gets it completely. And if the last part of a blob
segment is 1 byte, get_segment() will return 1 byte, regardless of the user buffer size.
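
  For illustration, the classic read loop over the ISC API then looks roughly like this
(blob handle assumed to be open, error handling kept to a bare minimum):

#include <ibase.h>

void read_blob(isc_blob_handle* blob)
{
    ISC_STATUS status[20];
    char buffer[1024];           /* deliberately smaller than the stored segments */
    unsigned short len = 0;

    for (;;)
    {
        ISC_STATUS rc = isc_get_segment(status, blob, &len, sizeof(buffer), buffer);
        if (rc == 0 || status[1] == isc_segment)
        {
            /* consume 'len' bytes; isc_segment means the same segment continues in the
               next call, and its final part may be as small as 1 byte */
        }
        else if (status[1] == isc_segstr_eof)
            break;               /* whole blob has been read */
        else
            break;               /* a real error; check the status vector */
    }
}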

> Similar with my favourite example of putting in segments of 65535 bytes, you'll get a response with a segment of
> 65533 bytes, and the next get response has two segments, one of 2 bytes and one of 65529 bytes, and so on.

Yes. And I see no problem with it.

>>> In other words, the fact that you receive multiple segments is not a reason why the response needs to include the segment
>>> lengths. And contrary to your claim, leaving out the segment lengths would not necessitate only sending a single segment.
>>>
>>> And given the server already chops up segments if needed, it doesn't even (always) preserve the same boundaries as used on write,
>>> so getting something a lot bigger is perfectly fine.
>>
>>    Still about segmented blobs ?
>
> Yes.

  How do you return 3 segments of 5, 7, and 11 bytes to the user if the response buffer contains
no segment lengths? Here we're speaking about the API, not the protocol.

Regards,
Vlad

Mark Rotteveel

May 19, 2025, 1:52:54 PM
to firebir...@googlegroups.com
On 19/05/2025 19:36, Vlad Khorsun wrote:
>   How do you return 3 segments of 5, 7, and 11 bytes to the user if the response buffer contains
> no segment lengths? Here we're speaking about the API, not the protocol.

I don't. I absolutely don't care about individual segments, except for
the pain in the ass they are, nor have I ever heard from any Jaybird
user being interested in them.

Mark
--
Mark Rotteveel

Vlad Khorsun

May 19, 2025, 1:56:09 PM
to firebir...@googlegroups.com
19.05.2025 20:52, 'Mark Rotteveel' via firebird-devel:
:)

Regards,
Vlad

Alex Peshkoff

May 20, 2025, 7:06:32 AM
to firebir...@googlegroups.com
Mark, one simple example where the use of a segmented BLOB is very useful:
there is the field RDB$RUNTIME in RDB$RELATIONS. It's a BLOB containing
compressed runtime info about the fields of that relation (from names to
various BLR like defaults & validation). Each info item is a separate
segment. Very fast and easy to read.


Mark Rotteveel

May 20, 2025, 7:10:26 AM
to firebir...@googlegroups.com
On 20/05/2025 13:06, Alex Peshkoff wrote:
> Mark, one simple example where the use of a segmented BLOB is very useful:
> there is the field RDB$RUNTIME in RDB$RELATIONS. It's a BLOB containing
> compressed runtime info about the fields of that relation (from names to
> various BLR like defaults & validation). Each info item is a separate
> segment. Very fast and easy to read.

OK, that is an interesting thing, but I'd still prefer such delineation
to exist in the actual data itself (maybe it has that as well? I didn't
look at it specifically), and not as an artifact of it being stored in a
certain way and the API extracting it that way, but that's just me.

Mark
--
Mark Rotteveel

Dmitry Yemanov

May 20, 2025, 9:52:10 AM
to firebir...@googlegroups.com
20.05.2025 14:06, Alex Peshkoff wrote:
>
> Mark, one simple example where the use of a segmented BLOB is very useful:
> there is the field RDB$RUNTIME in RDB$RELATIONS. It's a BLOB containing
> compressed runtime info about the fields of that relation (from names to
> various BLR like defaults & validation). Each info item is a separate
> segment. Very fast and easy to read.

I suppose something similar is also handy for arrays (reading them by
slices) that are known to be implemented over blobs.


Dmitry

Jim Starkey

May 20, 2025, 12:29:07 PM
to firebir...@googlegroups.com
Question:  Has anybody ever used the slice mechanism?  Is it actually
useful?

When I designed the array feature, a good-sized array was larger than
most workstation RAM, so slicing intuitively felt like a requirement.
--
Jim Starkey, AmorphousDB, LLC

Pavel Cisar

May 20, 2025, 12:37:44 PM
to firebir...@googlegroups.com
On 20.05.2025 at 18:29, Jim Starkey wrote:
> Question:  Has anybody ever used the slice mechanism?  Is it actually
> useful?

Well, as ARRAY support in connectivity libraries was not a common thing,
probably nobody uses ARRAYs (and thus slices). I had to deal with ARRAYs
because I wanted to implement support in Python driver(s), but never
used it. And driver unit tests do not use slices either.

regards
Pavel Cisar

Dmitry Yemanov

May 20, 2025, 12:55:30 PM
to firebir...@googlegroups.com
20.05.2025 19:29, Jim Starkey wrote:

> Question:  Has anybody ever used the slice mechanism?  Is it actually
> useful?

Nowadays developers seem to need arrays mostly inside stored
procedures/functions, so they tend to conclude that we don't support
arrays ;-) I'm afraid it would be hard to find anyone using arrays
beyond basic tests.


Dmitry

Jim Starkey

May 20, 2025, 1:09:10 PM
to firebir...@googlegroups.com
The place where arrays have become critical is in supporting HNSW indexes
for anything approaching AI functionality.  But in this context, slices
are utterly useless.

Arrays were originally implemented to support the Boeing noise group,
which had acoustic sensors laid out the length of a runway, capturing a
time series from each sensor and resulting in a large two-dimensional
array.  Each part of the analysis required only a slice, hence the
feature.  Boeing Commercial Aircraft was one of Interbase's best
friends.  Boeing Computing Services, on the other hand, was Interbase's
worst enemy, though only because a) we weren't Oracle, and b) Boeing
Commercial Aircraft loved us.  Among the best things that ever happened
to Interbase was the cancellation of the 7J7 (ducted fan) project, which
spread Interbase-loving engineers all over the company.

But I digress.

Adriano dos Santos Fernandes

Jun 9, 2025, 8:23:54 PM
to firebir...@googlegroups.com
On 16/05/2025 05:00, 'Mark Rotteveel' via firebird-devel wrote:
> From what I can tell from some experimenting, stream blobs are also
> stored in segments. So what is the real difference?
>
> And given both are (or at least, seem to be) segmented on storage, why
> can I seek on a stream blob, and not on a segmented?
>

This case may interest you:

https://github.com/FirebirdSQL/firebird/pull/8591


Adriano
