[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Initial
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Fri Jun 27, 2025 12:23 PM UTC
Owner: Neil Hodgson
Some ideas to optimize selection serialization and deserialization, may not worth it.
sv.find(tag)
by change serialization to following form: [selType] [# mainRange ,] [anchor [v virtualSpace] [- caret [v virtualSpace]]] [, [anchor [v virtualSpace] [- caret [v virtualSpace]]]]
std::from_chars()
returns pointer after all digits, it can be used to advance std::string_view sv
to next tag or separator character.
serialization can be implemented with local buffer + std::to_chars()
(which returns pointer after last written byte), when local buffer does not have enough space to write a SelectionRange
, copy local buffer to result string.
number (only non-negative size_t
and Sci::Position
) can be serialized in more compact form, e.g. with following ULEB128 like encoding:
char *to_string(char *p, size_t value) noexcept {
do {
*p++ = static_cast<char>(0x80 | (value & 0x7f));
value >>= 7;
} while (value);
return p;
}
Sent from sourceforge.net because scintill...@googlegroups.com is subscribed to https://sourceforge.net/p/scintilla/feature-requests/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/scintilla/admin/feature-requests/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.
Putting the main range first may be OK but it adds an extra ','. It went at the end as it's not needed for some uses so could be optional.
Using from_chars_result::ptr
for advancement is good if it can be done without adding much more code. Its unlikely string_view::find
is expensive in any realistic scenarios.
Most of the detail level SelectionPosition::ToString
, SelectionRange::ToString
are going to fit inside the small string optimization limit and avoid allocation. Without some strong problem cases, complicating the code with explicit buffer management isn't worth it.
Readable, easily debugged text seems reasonable for the intensity of use of this feature.
to_string
above should include masking the last byte inside the function as its confusing otherwise. Unless it's relying on low-byte values always being structuring punctuation.
The main alternative I've considered is using a single string for the whole history with '\n' between items to make it easy to serialize the whole history to/from files. Either with explicit <index>:
prefixes on lines or many empty lines for multi-change steps. Since its mostly treated as a stack, almost all work occurs just on the end so there wouldn't be much scanning.
Putting the main range first may be OK but it adds an extra ','. It went at the end as it's not needed for some uses so could be optional.
Here is at least one SelectionRange
, the comma is needed to separate numbers. mainRange
is decoded first and used for both branches inside Selection(std::string_view sv)
, so it looks good to encode it first.
using sv.find()
(memchr
like, should has not performance problem) to separate numbers / ranges looks to me just decode from random position and discard any non-digits, so not strict.
Unless it's relying on low-byte values always being structuring punctuation.
All number has prefixed with a tag or separator, the binary result with above to_string
can't be saved to external file.
[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Initial
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Sat Jun 28, 2025 10:51 PM UTC
Owner: Neil Hodgson
Patch without forward lookup, it makes Scintilla.dll smaller.
Attachments:
[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Initial
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Sat Jun 28, 2025 11:26 PM UTC
Owner: Neil Hodgson
There's some sv.front()
that may need protection with !sv.empty()
.
The unit tests in scintilla\test\unit\testSelection.cxx
need updating.
[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Initial
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Sun Jun 29, 2025 12:36 AM UTC
Owner: Neil Hodgson
Updated test code.
sv.empty()
check for mainRange
block can be omitted since here should be at least one digit after selType
or mainRange
.
Attachments:
[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Initial
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Tue Jul 01, 2025 12:23 PM UTC
Owner: Neil Hodgson
Committed with [b19406].
[feature-requests:#1563] Optimize selection serialization
Status: open
Group: Committed
Labels: Scintilla selection optimization
Created: Fri Jun 27, 2025 12:23 PM UTC by Zufu Liu
Last Updated: Tue Jul 01, 2025 12:56 PM UTC
Owner: Neil Hodgson