UUID gnx support

93 views
Skip to first unread message

spike

unread,
Sep 12, 2022, 4:39:40 PM9/12/22
to leo-e...@googlegroups.com
A longstanding pet peeve of mine is that Leo includes the user name in
the gnx string, so I patched it to use UUIDs instead. I don't know if
anyone else cares enough to warrant including this upstream, but I've
created a PR (#2843) in case anyone is interested.

I've been using this for a couple of months without difficulty, but YMMV.

Edward K. Ream

unread,
Sep 12, 2022, 5:40:10 PM9/12/22
to leo-editor
On Mon, Sep 12, 2022 at 3:39 PM spike <spiketheh...@runbox.com> wrote:
A longstanding pet peeve of mine is that Leo includes the user name in
the gnx string, so I patched it to use UUIDs instead. I don't know if
anyone else cares enough to warrant including this upstream, but I've
created a PR (#2843) in case anyone is interested.

I care, and I want to retain the user ids.

Edward

spike

unread,
Sep 13, 2022, 12:46:11 AM9/13/22
to leo-e...@googlegroups.com
Fair enough.

Edward K. Ream

unread,
Sep 13, 2022, 6:14:19 AM9/13/22
to leo-editor
On Monday, September 12, 2022 at 11:46:11 PM UTC-5 spike wrote:
On 2022-09-12 15:39, Edward K. Ream wrote:
 
>>> A longstanding pet peeve of mine is that Leo includes the user name in the gnx string, so I patched it to use UUIDs instead.
...
>>>I don't know if anyone else cares enough to warrant including this upstream, but I've created a PR (#2843) in case anyone is interested.

>>  I care, and I want to retain the user ids.

> Fair enough.

My response was inadequate and misleading. Using UUIDs would not work! References:

- Most recent post on this topic.

All devs should remember the following:

- Unique user ids will eliminate most (but not all!) gnx clashes
- Creating nodes from several copies of Leo's bridge has created gnx clashes resulting from race conditions!
- Leo's read code contains  complex, non-intuitive, to resolve the resulting gnx clashes.

Conclusions

Changing the format of gnxs would introduce the worst kind of bug: subtle and rare.

Being wary of UUIDs should be part of Leo's institutional memory. gnxs must stay as they are! As a result, I have just closed PR #2843.

Edward

P.S. A point of preference. The user ids and timestamps in gnxs are documentation and well-deserved acknowledgments.  Imo, those data should remain!

EKR

Edward K. Ream

unread,
Sep 13, 2022, 6:37:20 AM9/13/22
to leo-editor
On Tuesday, September 13, 2022 at 5:14:19 AM UTC-5 Edward K. Ream wrote:

> Being wary of UUIDs should be part of Leo's institutional memory.

PR #2827 adds a warning to NodeIndices.computeNewIndex:

def computeNewIndex(self) -> str:
    """Return a new gnx."""

    # Warning! Warning! Warning!

    # Don't even *think* about changing the format of gnxs!
    # Doing so could introduce the worst kind of bugs: subtle and rare.

    # See this post: https://groups.google.com/g/leo-editor/c/Lldywoievn4/m/RUMMzB7fBgAJ

    t_s = self.update()  # Updates self.lastTime and self.lastIndex.
    gnx = g.toUnicode(f"{self.userId}.{t_s}.{self.lastIndex:d}")
    return gnx

Edward

spike

unread,
Sep 13, 2022, 5:36:05 PM9/13/22
to leo-e...@googlegroups.com
On 2022-09-13 04:37, Edward K. Ream wrote:
> On Tuesday, September 13, 2022 at 5:14:19 AM UTC-5 Edward K. Ream wrote:
>
>> Being wary of UUIDs should be part of Leo's institutional memory.
>
> PR #2827 <https://github.com/leo-editor/leo-editor/issues/2827> adds a
> warning to NodeIndices.computeNewIndex:
>
> def computeNewIndex(self) -> str:
> """Return a new gnx."""
>
> # Warning! Warning! Warning!
>
> # Don't even *think* about changing the format of gnxs!
> # Doing so could introduce the worst kind of bugs: subtle and rare.
>
> # See this post:
> https://groups.google.com/g/leo-editor/c/Lldywoievn4/m/RUMMzB7fBgAJ
>
> t_s = self.update() # Updates self.lastTime and self.lastIndex.
> gnx = g.toUnicode(f"{self.userId}.{t_s}.{self.lastIndex:d}")
> return gnx
>
> Edward
>
Having read the discussion on #1348 (thank you for the link), I must
strongly disagree with this assessment.

There were two problems involved in that bug:
1) Leo created a spurious vnode when reading an outline.
2) This exposed a timing race due to using timestamp based IDs.

Introducing UUIDs was a proposed solution. Sequence numbers were the
chosen alternative. Both were reasonable.

Using username and timestamp information in your workflow is both fair
and reasonable. Blaming UUIDs for a bug caused by improper use of
timestamps is neither.

This seems to be a hot button for you, so I'll drop the issue.

Edward K. Ream

unread,
Sep 13, 2022, 6:17:54 PM9/13/22
to leo-editor
On Tuesday, September 13, 2022 at 4:36:05 PM UTC-5 spike wrote:

Having read the discussion on #1348 (thank you for the link), I must
strongly disagree with this assessment.

There were two problems involved in that bug:
1) Leo created a spurious vnode when reading an outline.
2) This exposed a timing race due to using timestamp based IDs.

Introducing UUIDs was a proposed solution. Sequence numbers were the
chosen alternative. Both were reasonable.

Interesting. I don't remember the details, so you may well be correct.

Using username and timestamp information in your workflow is both fair
and reasonable. Blaming UUIDs for a bug caused by improper use of
timestamps is neither.

Thanks for these remarks. All I remember was the pain of the bugs, so perhaps you are correct that UUIDs were not the real culprit. This is not a hot button issue for me. I'm often wrong about details.

It seems we both agree that the present scheme is reasonable. Let's leave it at that.

Anyway, it's way too late to consider significant changes to Leo for 6.7.0, and any change to Leo's file format qualifies as a highly significant change :-)

Edward

spike

unread,
Sep 13, 2022, 7:22:59 PM9/13/22
to leo-e...@googlegroups.com
> and any change to Leo's file format qualifies as a *highly* significant
> change :-)
>
> Edward
>
I understand the pain and frustration of hard-to-trace bugs. We've all
been there. I also apologize for being harsh. I get attached to things,
too. ;)

We both have reasonable objectives. That's why I asked for an option
setting, so both use cases are covered, but I don't mind keeping a
private fork.

I agree that file format changes are a big deal. That's why I was hoping
to get this in for 6.7.0, but I guess I rushed it. Sorry about that.

On that note I have some ideas for cleaning up and extending Leo's file
format and undo helpers -- user attribute handling in particular is
pretty cludgy -- but that would be for 7.0 at minimum. Such changes
would run far and deep and I've been trying to avoid that with my
submissions.

Cheers.

Thomas Passin

unread,
Sep 13, 2022, 7:51:11 PM9/13/22
to leo-editor
It's far from clear to me that any code that uses gnx's needs to know anything except their string value.  You want to avoid inadvertent duplications of gnx's (so copies can be distinguished from clones, for one thing), that's clear.  Not that I've studied the code or anything useful like that!  it ought to be possible to combine a UUID with the user name, so people to whom it matters (like @edward) could still see that.

Don't forget, there are several kinds of UUIDs.  Some are more collision-proof than others, or harder to generate, or the scope of uniqueness varies (e.g., local computer vs the world).  ISTR that TheOldNewThing had a post or series of posts on the differences between different kinds of UUIDs.

File format changes should be backwards compatible - Please Please Please!

Edward K. Ream

unread,
Sep 14, 2022, 4:38:40 AM9/14/22
to leo-editor
On Tue, Sep 13, 2022 at 6:23 PM spike <spiketheh...@runbox.com> wrote:

We both have reasonable objectives. That's why I asked for an option
setting, so both use cases are covered, but I don't mind keeping a
private fork.

Leo is not going to change its file format. Not now. Not ever.  There are two problems:

1. [Minor] Supporting two formats is messy within Leo.
2. [Fatal] Creating a new format now means that old versions of Leo won't be able to read .leo files created in the new format.

Use a fork if you must. Leo isn't going to use UUIDs.

Adding a command-line option is too horrible to contemplate. Adding a user setting won't work because user options are read too late.

On that note I have some ideas for cleaning up and extending Leo's file
format and undo helpers -- user attribute handling in particular is
pretty cludgy -- but that would be for 7.0 at minimum. Such changes
would run far and deep and I've been trying to avoid that with my
submissions.

I will not consider such changes. Leo is a mature program, constrained by its past. Stability is much more important than behind-the-scenes improvements, or supposed improvements.

Edward

P.S. I did not sufficiently consider compatibility issues when approving PR #2818, which says:

"Older versions of Leo will not be able to read files that use this feature. Files not using this feature are not affected."

I'm considering whether to back out of this PR.

Edward

Edward K. Ream

unread,
Sep 14, 2022, 4:53:20 AM9/14/22
to leo-editor
On Wednesday, September 14, 2022 at 3:38:40 AM UTC-5 Edward K. Ream wrote:

P.S. I did not sufficiently consider compatibility issues when approving PR #2818, which says:

"Older versions of Leo will not be able to read files that use this feature. Files not using this feature are not affected."

 I've just taken a look at the relevant read code: FastRead.resolveUa. In fact, older version of Leo will be able to read files with _json uAs.

As I read the code, Leo will issue the warning: f"can not unhexlify {attr}={val}" for _json uAs, but almost-never-issued warnings shouldn't disqualify the PR.

Edward

Thomas Passin

unread,
Sep 14, 2022, 8:28:54 AM9/14/22
to leo-editor
I wondered about that message.  I saw it after pulling the latest changeset.  I haven't set any uAs of any variety, myself.  Maybe some plugin?

spike

unread,
Sep 14, 2022, 11:13:48 AM9/14/22
to leo-e...@googlegroups.com
On 2022-09-14 02:38, Edward K. Ream wrote:
> On Tue, Sep 13, 2022 at 6:23 PM spike <spiketheh...@runbox.com> wrote:
>
> We both have reasonable objectives. That's why I asked for an option
>> setting, so both use cases are covered, but I don't mind keeping a
>> private fork.
>>
>
> Leo is not going to change its file format. Not now. Not ever. There are
> two problems:
>
> 1. [Minor] Supporting two formats is messy within Leo.
> 2. [Fatal] Creating a new format now means that old versions of Leo won't
> be able to read .leo files created in the new format.
>
This does not apply to UUID gnx. Older versions of Leo all the way back
to 4.1 will accept them without problems. The only affected code is
'sort by creation date', which already fails gracefully because user
names aren't guaranteed to match Leo's expected format, and that breaks
decoding the timestamp.

> Use a fork if you must. Leo isn't going to use UUIDs.
>
> Adding a command-line option is too horrible to contemplate. Adding a user
> setting won't work because user options are read too late.
>
The user settings works as expected since new ID's aren't created until
after the settings are read. This was tested both ways: Disabled in
global settings and enabled in the file, and enabled in global and
disabled in the file. Both worked as expected.

> On that note I have some ideas for cleaning up and extending Leo's file
>> format and undo helpers -- user attribute handling in particular is
>> pretty cludgy -- but that would be for 7.0 at minimum. Such changes
>> would run far and deep and I've been trying to avoid that with my
>> submissions.
>>
>
> I will not consider such changes. Leo is a mature program, constrained by
> its past. Stability is much more important than behind-the-scenes
> improvements, or supposed improvements.
>
>
Edward
So be it.

>
> P.S. I did not sufficiently consider compatibility issues when approving PR

spike

unread,
Sep 14, 2022, 11:21:44 AM9/14/22
to leo-e...@googlegroups.com
On 2022-09-14 06:28, Thomas Passin wrote:
> I wondered about that message. I saw it after pulling the latest
> changeset. I haven't set any uAs of any variety, myself. Maybe some
> plugin?
>
> On Wednesday, September 14, 2022 at 4:53:20 AM UTC-4 Edward K. Ream wrote:
>
>> On Wednesday, September 14, 2022 at 3:38:40 AM UTC-5 Edward K. Ream wrote:
>>
>> P.S. I did not sufficiently consider compatibility issues when approving
>>> PR #2818 <https://github.com/leo-editor/leo-editor/pull/2818>, which
>>> says:
>>>
>>> "Older versions of Leo will not be able to read files that use this
>>> feature. Files not using this feature are not affected."
>>>
>>
>> I've just taken a look at the relevant read code: FastRead.resolveUa*.*
>> In fact, older version of Leo *will* be able to read files with _json uAs.
>>
>> As I read the code, Leo will issue the warning: f"can not unhexlify
>> {attr}={val}" for _json uAs, but almost-never-issued warnings shouldn't
>> disqualify the PR.
>>
>> Edward
>>
>
User attributes are used by several plugins in Leo. I asked for JSON
support because the only options were 'plain string' and 'pickle'. The
former is too limiting and the latter isn't text, which can be a problem
when you need to examine or edit a .leo file by hand. Debugging user
icons was needlessly troublesome because of it.

spike

unread,
Sep 14, 2022, 12:31:02 PM9/14/22
to leo-e...@googlegroups.com
>> 6.7.0, and any change to Leo's file format qualifies as a *highly*
>> significant change :-)
>>
>> Edward
>>
>
IDs should always be treated as blind tokens. Attempting to decode them
is asking for trouble precisely because it causes compatibility problems
and hard to trace bugs. The token generator needs to check against
existing tokens to verify uniqueness (Leo does this) and may perform
heuristics if it needs to ensure sequencing, but care must be taken to
safely handle unexpected input.

Leo decodes gnx in exactly two places:

The unit test for the token generator, which should never see tokens
created by a different version and thus is safe.

The ToDo plugin, in todo.py on line 1303. It takes proper precautions to
fail gracefully when it hits an ID it can't read because failures were a
known and expected issue when it was first created. Even something as
simple as a '.' in a user name is enough to trip it up.

There is a more general problem in Leo with not escaping output, which
can create unreadable files if unexpected characters are present. Raw
gnxes are written to both .leo files and in @file annotations,
preventing the use of the characters '<', '>', '"', '&', ':', and '\n'.
None of those characters occur in UUIDs, which are limited to [0-9A-F]
and '-'.

My proposed patch used UUID4, which are randomly generated and provide
strong guarantees of uniqueness. It was also gated by a user setting
that defaulted to 'off'.

UUID gnx is verified compatible with any recent Leo and should be
compatible all the way back to 4.1. That's why I felt comfortable
submitting it for 6.7.0.

Edward K. Ream

unread,
Sep 14, 2022, 1:03:45 PM9/14/22
to leo-editor
On Wed, Sep 14, 2022 at 10:13 AM spike <spiketheh...@runbox.com> wrote:

> 2. [Fatal] Creating a new format now means that old versions of Leo won't
> be able to read .leo files created in the new format.
>
This does not apply to UUID gnx. Older versions of Leo all the way back
to 4.1 will accept them without problems.

That's good to know, but I'm in no mood for such changes.

I suggest you create a plugin to do exactly what you want. There will be no need to convince me, and no impact on existing Leonistas.

Edward

Edward K. Ream

unread,
Sep 14, 2022, 3:02:54 PM9/14/22
to leo-editor
On Wednesday, September 14, 2022 at 12:03:45 PM UTC-5 Edward K. Ream wrote:
On Wed, Sep 14, 2022 at 10:13 AM spike <spiketheh...@runbox.com> wrote:

> 2. [Fatal] Creating a new format now means that old versions of Leo won't
> be able to read .leo files created in the new format.
>
This does not apply to UUID gnx. Older versions of Leo all the way back
to 4.1 will accept them without problems.

That's good to know, but I'm in no mood for such changes.

Heh. At last I understand:

- All Leo versions can read UUID gnxs, so there is no compatibility impact.
- A user setting can be used to choose gnx format.

My apologies for being so slow to understand.  Please file a PR. If it looks good we can make it part of 6.7.0.

Edward
Reply all
Reply to author
Forward
0 new messages