Save and restore undo history

141 views
Skip to first unread message

Neil Hodgson

unread,
Feb 10, 2024, 1:10:29 AMFeb 10
to Scintilla mailing list
APIs have been implemented to allow access to the undo history. This may be used to save and restore the undo state between sessions so the user may more transparently continue from their previous work.

This feature has been long requested but I didn't want to add it before the memory reduction also announced today as the implementation affects the efficiency of different access patterns.

There are 3 important points within an undo history that should be saved and restored: the current point (where the document currently is); the save point where the document was last saved or loaded; and the tentative point which marks the start of tentative actions. Tentative actions are used for short-term language (like Japanese or Chinese) input mode states and must be synchronized with the IME (input method editor). IME states are difficult to reproduce programmatically so it's uncertain what to do with tentative actions.

There may not be a reachable save point - when a file is saved; undo is performed; then new actions taken that break the connection to the saved text. This state is termed 'detached' within Scintilla and is represented by a save point of -1.

There are two main scenarios supported although they are similar and both may be offered together: file-associated undo histories; and session persistence.

For file-associated undo histories, when a file is saved, an undo session file is also saved with the state of the undo history at that point so save-point==current-point in the history. This is currently the best supported technique.

For session persistence, a session may be saved either periodically or on user command. Saving a session does not necessarily save the files in that session to their home location. instead there may be shadow copies of the file and its undo history somewhere else, such as a session directory. In this case the save point may not be the same as the current point in the undo history. This is currently less supported but may receive further work.

There is some demonstration code that can be added to SciTE on Windows, replacing the Load and Save Session behaviour, that can be used to experiment with this feature. This uses a text format with some global settings then a list of insertions and deletions. There is no support for container-defined undo actions as these are not implemented by SciTE. An example file:

# SciTE undo session file
file=C:\u\Patches\Features\New Undo\UndoStuff.txt
checksum=22669
actions=11
save=11
current=11
tentative=-1
+ 124,M
+ 125,o
+ 126,r
+ 127,e
+. 128,
-. 329,ScaledVector
-. 65,history
+ 30,
+ 31,A
+ 32,B
+. 33,C

The file setting is the path to the original file that this is a history for. The checksum is to ensure that the undo history is applied only to the file that it was saved with as even a slight change to the file will make it fail to match the undo history. A mismatch will make undo behave unexpectedly or fail. The checksum is currently a simple addition of all the bytes in the file but should be something better like CRC32. Then comes the total number of undo actions and the save point, current point and tentative point. Then there is a list of insertions '+' and deletions '-'. Following the '+' or '-' is a space for an incomplete user operation or '.' for a completed user operation. Then comes a position, ',' and the text of the change (escaped to prevent control characters) terminated by a newline.

In this example, the user typed "More ", deleted instances of "ScaledVector" and "history", then typed " ABC" and saved the file. There are 11 undo actions but only 4 user operations so pressing the undo key 4 times will undo all the changes.

The format is just an example and applications may prefer binary or compressed formats. The type could also be saved as its numeric value (likely more reliant to changes) instead of separating it into '+' or '-' and ' ' or '.'. I just found the current format easier to understand.

There are two sets of APIs: retrieval and restoration.

Retrieval

The APIs are SCI_GETUNDOACTIONS, SCI_GETUNDOSAVEPOINT, SCI_GETUNDOCURRENT, SCI_GETUNDOTENTATIVE, SCI_GETUNDOACTIONTYPE, SCI_GETUNDOACTIONPOSITION, and SCI_GETUNDOACTIONTEXT. The 3 SCI_GETUNDOACTION* APIs for retrieving the state of each action must be called with an ascending series of action numbers as shown in the code example and discussed further on.

Example code for saving the undo history into this format:


//SaveSessionFile(saveName);
std::ofstream ofs(saveName, std::ios::binary);
if (!ofs) {
return;
}
ofs << "# SciTE undo session file\n";
const int actions = wEditor.UndoActions();
ofs << "file=" << filePath.AsUTF8() << "\n";
ofs << "checksum=" << CheckSumDocument(wEditor) << "\n";
ofs << "actions=" << actions << "\n";
ofs << "save=" << wEditor.UndoSavePoint() << "\n";
ofs << "current=" << wEditor.UndoCurrent() << "\n";
ofs << "tentative=" << wEditor.UndoTentative() << "\n";
for (int act = 0; act < actions; act++) {
const int type = wEditor.UndoActionType(act);
const int actType = type & 0xff;
if (actType == 0 || actType == 1) {
// Only insertions and deletions recorded, not container actions
const char open = ((type & 0x100) != 0) ? ' ' : '.';
const SA::Position pos = wEditor.UndoActionPosition(act);
const std::string text = wEditor.UndoActionText(act);
const std::string escaped = Slash(text, false);
ofs << (actType ? '-' : '+') << open << " " << pos << "," << escaped << "\n";
}
}


Restoral

The APIs are SCI_SETUNDOSAVEPOINT, SCI_SETUNDOCURRENT, SCI_SETUNDOTENTATIVE, SCI_PUSHUNDOACTIONTYPE, and SCI_CHANGELASTUNDOACTIONTEXT.

Example code for restoring the undo history from this format:


//LoadSessionFile(openName);
//RestoreSession();
std::ifstream ifs(openName, std::ios::binary);
if (!ifs) {
return;
}
std::map<std::string, std::string> settings;
while (!ifs.eof()) {
std::string line;
std::getline(ifs, line);
if (line.empty()) {
continue;
}
constexpr size_t lengthTypeField = 3;
const char first = line[0];
if (IsAlphabetic(first)) {
const size_t eq = line.find('=');
if (eq != std::string::npos) {
const std::string key = line.substr(0, eq);
const std::string value = line.substr(eq + 1);
settings[key] = value;
if (key == "checksum") {
const size_t checkSumNow = CheckSumDocument(wEditor);
const size_t checkSumFromFile = IntPtrFromString(value, 0);
if (checkSumNow != checkSumFromFile) {
WindowMessageBox(wSciTE, L"Undo session file does not match contents: checksum differs.");
return;
}
}
}
} else if ((line.length() > lengthTypeField) && (first == '-' || first == '+')) {
const int open = (line[1] == '.') ? 0 : 0x100;
const int type = open | ((first == '-') ? 1 : 0);
const size_t comma = line.find(',', lengthTypeField);
if (comma == std::string::npos) {
continue;
}
const intptr_t position = IntPtrFromString(line.substr(lengthTypeField, comma - lengthTypeField), 0);
const std::string text = line.substr(comma + 1);
const std::string unescaped = UnSlashString(text);
wEditor.PushUndoActionType(type, position);
wEditor.ChangeLastUndoActionText(unescaped.length(), unescaped.c_str());
}
}
if (const int savePoint = IntegerFromString(settings["save"], -2); savePoint != -2) {
wEditor.SetUndoSavePoint(savePoint);
}
if (const int tentative = IntegerFromString(settings["tentative"], -2); tentative != -2) {
wEditor.SetUndoTentative(tentative);
}
if (const int current = IntegerFromString(settings["current"], -2); current != -2) {
wEditor.SetUndoCurrent(current);
}
// No handling of 'actions' as same as number of PushUndoActionType


The CheckSumDocument function used in the above code examples is:


size_t CheckSumDocument(Scintilla::ScintillaCall &sci) {
size_t sum = 0;
const size_t length = sci.Length();
for (size_t i = 0; i < length; i++) {
const unsigned char byte = sci.CharacterAt(i);
sum += byte;
}
return sum;
}


Change history is currently incompatible with restoring undo history and should be turned off before restoration. This may be implemented in the future as it is likely possible to replay the actions into an empty change history. There are potential problems as this is not exactly the same as the initial session: for example the initial session may have been saved multiple times but only one save point is restored. The detached state with no save point also needs to be considered.

The changes that reduce memory use also make access to each undo action's data less direct and more complex. That means that random access may be slow so access should be performed sequentially. Currently, access to the text and length of actions should be performed starting from the first action and iterating forward. While reading action type and position could currently be performed randomly, that data may be further compressed with (for example) run-length encoding of types or using positions relative to the previous action.

There may be other features that applications would like to add using undo history that are not well handled by the current implementation. For example, an application may want to display a list of recent undo actions in a side bar. This would be best implemented with backwards iteration so that may need to be efficiently implemented.

It may be an idea to implement a reasonably good check sum API in Scintilla so applications don't need to do this work.

Neil

Austin Green

unread,
Feb 11, 2024, 6:18:42 PMFeb 11
to scintilla...@googlegroups.com

On Sat, 10 Feb 2024 17:10:13 +1100
"'Neil Hodgson' via scintilla-interest" <scintilla...@googlegroups.com> wrote:

> ...
> There may be other features that applications would like to add using undo history that are not well handled by the current implementation.

I would like to see a feature that would (optionally) restore the caret position along with the text changes. I understand that it would not be possible universally because there may be multiple carets. I am unfamiliar with that concept, but in the single caret case, which I suspect would be a high percentage of the time, it would be very useful. Any chance that this could be incorporated with the current changes regarding undo history?

Neil Hodgson

unread,
Feb 12, 2024, 5:48:06 PMFeb 12
to scintilla...@googlegroups.com
Austin Green:

> I would like to see a feature that would (optionally) restore the caret position along with the text changes.

For a single view scenario, the caret is currently moved to the
location of the change when that change is undone. For deletions the
caret goes to the end of the deleted text, since it's more common to
select from start to end.

For deletion, there could be a 'start' or 'end' flag but I suspect
that isn't your actual issue.

> I understand that it would not be possible universally because there may be multiple carets.

Multiple carets could be saved and restored. The complexity comes from
selection being view state and undo currently being only concerned
with document state.

Scintilla supports multiple views on one document with each view
having independent selections and this complicates restoring
selections as that should only be done to the originating view or its
selected successor.

For restore selection, even if multiple views were not initially
supported, the design should be chosen so that it does not make it
more difficult to implement multiple views later.

Neil

John Ehresman

unread,
Feb 16, 2024, 6:26:21 AMFeb 16
to scintilla...@googlegroups.com
Wing has implemented saving and restoring of selection state along with the first visible line for some time. The implementation is in Python, though so it can’t be easily contributed to scintilla itself. State is saved on a per view basis and then restored on the view that the undo / redo is invoked for. The number of selection states saved is capped so memory use shouldn’t grow without bound.

One wrinkle is that when doing a redo, the selection state after the modification is made to the buffer should be restored. The after selection state is not readily available when a new action is started so it is recorded at the next update-ui notification. This has some flaws, but seems to work well enough.

Cheers,

John
> --
> You received this message because you are subscribed to the Google Groups "scintilla-interest" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scintilla-inter...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/scintilla-interest/CACWkrTirbrE-tSQxNNQac00123pqTwRfWV9H_iU8CG68N_ku4w%40mail.gmail.com.

Neil Hodgson

unread,
Feb 16, 2024, 6:50:47 PMFeb 16
to scintilla...@googlegroups.com
Hi John,

> Wing has implemented saving and restoring of selection state along with the first visible line for some time.

Restoring the top line is an addition to selection state that I hadn't
been considering. There could be issues in wrap mode with restoring
into a window that had been narrowed or widened.

> The implementation is in Python, though so it can’t be easily contributed to scintilla itself.

A built-in save selection may not have all your features (like top
line) so shouldn't stop a container-side implementation.

> State is saved on a per view basis and then restored on the view that the undo / redo is invoked for.

One common scenario is where a primary view can be split vertically
(or horizontally), perhaps by dragging a splitter down. The user can
edit two portions of the file easily and then collapse back to one
view by dragging the splitter to the top or bottom. The surviving view
is the successor and should restore selections as the corresponding
actions are undone. The surviving view may be the first or second
view, depending on the direction of collapse. The collapsed view may
later be restored with another splitter drag.

This could be implemented by associating each view with a primary view
ID that marks the selection states that it sends to undo history, and
a set of receiving view IDs that it is the current successor to and
responds to. A simpler alternative is to just support simpler
scenarios well with a view either responding just to its own ID or to
every ID. This handles the common case where there are up to 2 true
editing views along with some more static views like document maps or
thumbnails that do not permit selection and so do not contribute
selection states to undo.

While Document could vend view ID cookies to each view that subscribes
to selection state, it may be better for the application to set view
IDs so they are more stable and can be saved into undo history files.

> The number of selection states saved is capped so memory use shouldn’t grow without bound.

Multiple selection and rectangular selection can increase the cost of
this feature.

For rectangular selection, just the rectangular type tag, anchor, and
caret should be saved with the expansion into a vector of per-line
ranges omitted then recalculated after restore. If each per-line range
was saved then the result could be huge - some users perform
whole-file rectangular operations on large data files. This reduction
of rectangular selections may occasionally produce anomalous results
when view parameters like font are changed.

> One wrinkle is that when doing a redo, the selection state after the modification is made to the buffer should be restored. The after selection state is not readily available when a new action is started so it is recorded at the next update-ui notification.

Since it is the view maintaining complex selections like rectangular
or multiple selection, there is no place within the document code that
can ask for the after change selection state. Each piece of view (and
container?) code that changes document text could add a call to save
the after selection state once that is set completely but that would
be a lot of added calls.

If an 'after' selection state is saved then another change made that
is considered part of that user operation then the first selection
state will never be used so should be deleted.

Intense editing operations like replace all will expose the costs of
producing selection states so selection state production should be
avoided where possible. Inside a BeginUndoAction .. EndUndoAction
group, selection state should only be created and saved for the final
top-level EndUndoAction.

> This has some flaws, but seems to work well enough.

There'll be some breakage when multiple user operations occur together
and it would be better to avoid that. Inside Scintilla, there can be a
handler at the end of processing each message (perhaps an object
destructor that is called on return from ScintillaBase::WndProc) that
saves selection state if the document changed and is not currently
inside an undo group.

Neil

John Ehresman

unread,
Feb 17, 2024, 5:10:55 AMFeb 17
to 'seasoned_geek' via scintilla-interest
A few comments --

> On Feb 17, 2024, at 6:50 AM, Neil Hodgson <scintil...@gmail.com> wrote:
>
> Hi John,
>
>> Wing has implemented saving and restoring of selection state along with the first visible line for some time.
>
> Restoring the top line is an addition to selection state that I hadn't
> been considering. There could be issues in wrap mode with restoring
> into a window that had been narrowed or widened.

The rationale for restoring the top line is so the restored selection is visible if it had been visible; alternatively scroll caret could be used, though using the top line has the advantage of scrolling back to where things were. I just double checked and we use scroll caret if the main caret isn’t visible after restoring the top line.

>> The implementation is in Python, though so it can’t be easily contributed to scintilla itself.
>
> A built-in save selection may not have all your features (like top
> line) so shouldn't stop a container-side implementation.

Yes, one reason I wrote about our implementation is to report that it is possible to implement save & restore selection without changing the scintilla core — I recall that it was easier to implement than I thought it would be. Our implementation may have more limitations than one added to the core, but it seems to work well enough.

>> State is saved on a per view basis and then restored on the view that the undo / redo is invoked for.
>
> One common scenario is where a primary view can be split vertically
> (or horizontally), perhaps by dragging a splitter down...

One way to do this is to put the logic about what to restore in the undo / redo implementation. Note that almost everything about selection state is saved in per-view objects; I think the only data added to the per-document object is a simple counter for undo indices. None of this is persisted across sessions

>> One wrinkle is that when doing a redo, the selection state after the modification is made to the buffer should be restored. The after selection state is not readily available when a new action is started so it is recorded at the next update-ui notification.
>
> Since it is the view maintaining complex selections like rectangular
> or multiple selection, there is no place within the document code that
> can ask for the after change selection state. Each piece of view (and
> container?) code that changes document text could add a call to save
> the after selection state once that is set completely but that would
> be a lot of added calls.

Using the state at the next update-ui seems to work well enough, though it may have more flaws than I realize — this is only used in the redo implementation, which isn’t used as often as undo. The reason for capturing the after state is so that in the case where a letter is added the cursor ends up after the letter after redo is used.

> Intense editing operations like replace all will expose the costs of
> producing selection states so selection state production should be
> avoided where possible. Inside a BeginUndoAction .. EndUndoAction
> group, selection state should only be created and saved for the final
> top-level EndUndoAction.

The state is only saved when SC_STARTACTION is set in modificationType of a SCN_MODIFIED notification. This should only happen once in a replace all operation.

John

Neil Hodgson

unread,
Feb 20, 2024, 8:20:12 PMFeb 20
to scintilla-interest
It looks like there is insufficient information in the undo history to reproduce the full change history.

For example, when there are changes made to the file that are saved and then undone then more changes are made that branch from that point (detaching) then the actions on the initial saved branch past the detach point are not in the undo history. When performed initially, these are stored in change history and indicated to the user so they can see where the visible state differs from the on-disk saved state.

There are two potential directions here:
1) Save the change history with the undo history.
2) Accept a simplified version of change history. This does reduce the accuracy of this feature for continuing a session.

Neil

Neil Hodgson

unread,
Feb 27, 2024, 3:22:56 AMFeb 27
to scintilla-interest
Some more code has been committed that now restores as much of change history as appears possible by replaying the undo history into the change history. There are some more fixes and tests to this feature.

Explicit saving of change history state may be tackled in the future. It may also be possible to hold onto dead undo actions from the save point back to the detach point and allow these to be saved and restored to restore more or all of the previous sessions change history state.

It's likely that the current functionality will be delivered in the next release. It's unlikely I will implement undo save and restore as a visible feature in SciTE in the near future but it may be added as a hidden experimental feature after the next release. There are a couple of unrelated issues that may be addressed before a release is made.

Neil
Reply all
Reply to author
Forward
0 new messages