text-widget: confusion about marks...

Andreas Leitgeb

Jun 26, 2015, 9:42:18 AM

See this sample script:

package req Tk; pack [text .t]
proc x {} { puts "foo: [.t index foo] end: [.t index end]" }
.t mark set foo end; .t mark gravity foo left; # LEFT gravity!
x ;# -> 2.0 2.0 ====> foo is after the final \n
.t insert end "line1\n"
x ;# -> 3.0 3.0 ====> despite gravity, mark foo stuck to right
# insert at end seems to be really an insert at end-1char
.t mark set foo end-1char
x ;# -> 2.0 3.0 => seems ok: right after line1\n and before final \n
.t insert end "line2\n"
x ;# -> 2.0 4.0 => seems ok, mark foo held its position.
.t delete foo end ;# intended: delete any lines after line1
x ;# -> 1.5 2.0 ====> now mark foo warped back over a \n

Summary:
- if I set the mark at "end", then on next insert it behaves
like it had right-gravity
- if I set the mark at "end-1char", which is supposed to be *after*
all my own characters and *before* the internal final \n, then
deleting between mark and end shifts the mark backwards and thus
also deletes the \n that was originally *before* the mark.

I believe that this is all a consequence of the text widget trying hard
to maintain its own final \n at end, but how can I work around that?

Is this even the intended behaviour? Or is it actually a bug to be
reported for the text widget?

Francois Vogel

Jun 28, 2015, 12:24:46 PM

Andreas Leitgeb wrote on 26/06/2015 at 15:40:
> Summary:
> - if I set the mark at "end", then on next insert it behaves
> like it had right-gravity

The text widget doesn't allow insertions on the last (dummy) line of
the text.
Since "end" indicates the end of the text (the character just after
the last newline), when trying to insert text at "end" position,
you're trying to do exactly that.

In this situation, the text widget actually inserts at the last
allowable index for insertion, which is the largest index at end of
the previous line. There is intentional code for doing this (see
InsertChars in tkText.c).

In the insertion process, the mark "foo" is unchanged and still at
"end" position (with left gravity). Gravity only matters when text is
inserted exactly at the mark's position; here the foo mark is after
the last allowable index for insertion, so the new text goes in before
the mark and the mark shifts to the right.

As you mention, this is a consequence of the text widget trying hard
to maintain its own final \n at its end.

Workaround: instead of setting the mark at end, set it at end-1char,
which is what you tried below:

> - if I set the mark at "end-1char", which is supposed to be *after*
> all my own characters and *before* the internal final \n, then
> deleting between mark and end shifts the mark backwards and thus
> also deletes the \n that was originally *before* the mark.

Actually, independently from any mark setting, it is the deletion
operation from 2.0 to end that removes one unwanted \n. This is bug
[2886436fff]:

http://core.tcl.tk/tk/tktview?name=2886436fff

> I believe that this is all a consequence of the text widget trying hard
> to maintain its own final \n at end, but how can I work around that?
>
> Is this even the intended behaviour? Or is it actually a bug to be
> reported for the text widget?

We could perhaps discuss whether the user should be allowed to set a
mark at really the "end" position. If the text widget would silently
move it to the last allowable index instead (as it does for
insertion), the first issue you saw could probably be avoided. But
then this would remove the equivalence between the mark name and the
"end" index:

.t mark set foo end ; # would in fact set it to the last allowable index, not "end"
.t compare [.t index end] == [.t index foo] ; # would return 0 (currently: 1)

Would this be a problem or not...?

François

Andreas Leitgeb

Jun 28, 2015, 4:46:00 PM

Thanks for answering and providing further information!

Francois Vogel <fvogelne...@free.fr> wrote:
> Andreas Leitgeb wrote on 26/06/2015 at 15:40:
>> Summary:
>> - if I set the mark at "end", then on next insert it behaves
>> like it had right-gravity
>
> The text widget doesn't allow insertions on the last (dummy) line of
> the text.
> Since "end" indicates the end of the text (the character just after
> the last newline), when trying to insert text at "end" position,
> you're trying to do exactly that.

> In this situation, the text widget actually inserts at the last
> allowable index for insertion, which is the largest index at end of
> the previous line. There is intentional code for doing this (see
> InsertChars in tkText.c).

So, while "end" marks the very last position (after the final \n),
insertion is coded to correct such a position, and instead insert
just before the final \n.

Other operations like retrieving text actually can deal with this
off-the-end "end" in that they just include the final \n if the
given position range includes it.

> Actually, independently from any mark setting, it is the deletion
> operation from 2.0 to end that removes one unwanted \n. This is bug
> [2886436fff]:
> http://core.tcl.tk/tk/tktview?name=2886436fff

The patch contains (on the "-" side) what was the reason for this
irregularity in the first place: it was the idea that a deletion
including the final \n is an attempt to delete a whole line, so
the \n before the deleted block shall become the new final \n and
some tags shifted accordingly. With that background it even makes
sense to me. Not sure, though, if it is my fault that I didn't
realize (and expect) it just from reading the docs. I tend to think
not.

>> I believe that this is all a consequence of the text widget trying hard
>> to maintain its own final \n at end, but how can I work around that?

The workaround turned out to be not all that complicated: I changed the
delete operation's second argument to "end-1char", removed some of my
previous attempts at workarounds, and it seems to work well enough now.
Alternatively, I might just as well have worked with a tag rather than
marks: the tag's tag.last would have been just before the final \n,
and thus the delete would have worked without surprise.
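
A minimal sketch of both approaches, assuming a fresh widget named .t
as in the script at the top of the thread. The tag name "extra" is made
up for illustration and is not anything the widget defines:

```tcl
package require Tk
pack [text .t]

.t insert end "line1\n"

# Variant 1: a mark placed just before the widget's own final newline.
.t mark set foo end-1char
.t mark gravity foo left
.t insert end "line2\nline3\n"
# Stop the deletion short of the final newline, so the widget has no
# reason to give up the newline that sits before the mark.
.t delete foo end-1char
puts "foo: [.t index foo] end: [.t index end]"
# -> foo: 2.0 end: 3.0

# Variant 2: tag the appended text instead of remembering a position.
.t insert end "line2\n" extra
# extra.last sits before the final newline, so this range never
# includes it. (extra.first raises an error if nothing carries the tag.)
.t delete extra.first extra.last
puts "end: [.t index end]"
# -> end: 3.0
```

In both variants the range handed to "delete" ends before the final \n,
which is what keeps the mark (or the text before the tag) where it was.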

>> Is this even the intended behaviour? Or is it actually a bug to be
>> reported for the text widget?
> We could perhaps discuss whether the user should be allowed to set a
> mark at really the "end" position.

> If the text widget would silently move it to the last allowable index...
That wasn't even any of the fixes I pondered.

The problem is not really with the mapping of marks to positions, but
with some design-decision (of keeping all lines \n-terminated) causing
some surprise (shifting marks) in unexpected places.

So, what remains is hope that perhaps a comment be added to the text-docs,
hinting users that end-1char is more likely the mark they want for "deletion
to the end" than end itself.

Andreas Leitgeb

Jun 28, 2015, 5:16:57 PM

Andreas Leitgeb <a...@auth.logic.tuwien.ac.at> wrote:
> Not sure, though, if it is my fault that I didn't realize (and expect)
> it just from reading the docs. I tend to think not.
>

> So, what remains is hope that perhaps a comment be added to the text-docs,
> hinting users that end-1char is more likely the mark they want for "deletion
> to the end" than end itself.

Re-checked the docs and found this line:
  "It is not allowable to delete characters in a way that would
  leave the text without a newline as the last character."

This means that something like ".t delete 1.0 end", for example,
isn't allowable in the first place...

Maybe it just ought to throw an error "illegal deletion of last
newline" -- just kidding.

Yet, obviously (at some point in the long past) code was added
to change the behaviour of this illegal operation to delete some
randomly bystanding newline instead of the final one. ;-)

Perhaps that line should be changed to a description of what actually
happens when the last newline happens to be included in the range for
deletion.
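
For reference, a small sketch (same widget name .t as elsewhere in the
thread) that checks what is left after such a call. No error is raised,
and one newline is still there afterwards:

```tcl
package require Tk
pack [text .t]

.t insert end "line1\nline2\n"
.t delete 1.0 end            ;# the range includes the final newline
puts [list [.t index end] [string length [.t get 1.0 end]]]
# -> 2.0 1
```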

Francois Vogel

Jun 29, 2015, 4:34:18 PM

Andreas Leitgeb wrote on 28/06/2015 at 22:44:
> So, while "end" marks the very last position (after the final \n),
> insertion is coded to correct such a position, and instead insert
> just before the final \n.

Yes and (reading the man page again) it is even documented in the
"insert" command section: "[...] If /index/ refers to the end of the
text (the character after the last newline) then the new text is
inserted just before the last newline instead.[...]"

> Other operations like retrieving text actually can deal with this
> off-the-end "end" in that they just include the final \n if the
> given position range includes it.

Yes, correct. For instance:

package req Tk; pack [text .t] ; .t get 1.0 end

returns a \n character whereas one would naturally expect the empty
string. The empty (nothing was inserted) text widget always has a \n
in it...!
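
A short sketch of the usual way around this, assuming the caller only
wants the text that was actually inserted: end the range at end-1char
so that it stops just before the widget's own newline.

```tcl
package require Tk
pack [text .t]

set all  [.t get 1.0 end]         ;# includes the widget's final newline
set user [.t get 1.0 end-1char]   ;# stops just before it
puts [list [string length $all] [string length $user]]
# -> 1 0
```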

I have always thought that this internal historical requirement for a
final \n in the text widget should have exactly zero exposure to the
user at all. Even to the programmer in Tcl/Tk language this should not
be exposed. If the underlying B-tree implementation of the text widget
requires this trailing \n, then fine, but it should be completely
hidden at the Tk level. It's not the case, which leads to unexpected
quirks and hair-pulling corner cases.

Maintainers escape the real problem by *very* carefully reading the
man pages and claiming that these cases are documented. And this
statement is true, they are!

Oh well, I have just used such an escape door myself :-) ! But it does
not really satisfy me because it's a lazy man's solution: the real
man's solution would be to hunt down this \n and hide it from the Tk
programming level completely. No, forget it, I'm not ready to launch
such a revolution...! This would be huge and complex work and would
certainly break existing user code in the wild (starting with mine!).
So let's live with this a while longer.

> The patch contains (on the "-" side) what was the reason for this
> irregularity in the first place: it was the idea that a deletion
> including the final \n is an attempt to delete a whole line, so the
> \n before the deleted block shall become the new final \n and some
> tags shifted accordingly.

Indeed. You express this original intention so clearly that I think I
will:
- add this as a comment in the source code
- add a note in the "delete" section of the text widget man page
- and as a consequence reject the patch proposed in bug [2886436fff],
by relying once again on the fact that this will now be documented

So far I did not apply this patch because I couldn't figure out who
was right: the patch submitter or the existing source code. Now the
way ahead is clearer to me.

> The problem is not really with the mapping of marks to positions,
> but with some design-decision (of keeping all lines \n-terminated)
> causing some surprise (shifting marks) in unexpected places.

Absolutely, as fully agreed above.

> So, what remains is hope that perhaps a comment be added to the
> text-docs, hinting users that end-1char is more likely the mark they
> want for "deletion to the end" than end itself.

Wilco, following your proposal in your other reply.

I suggest this discussion be followed up in the bug report:

http://core.tcl.tk/tk/tktview?name=2886436fff

along with the branch [bug-2886436fff] that I will soon create in the
fossil repository to deal with this (documentation) issue.

Thanks!
François

Andreas Leitgeb

Jul 5, 2015, 10:13:49 AM

Francois Vogel <fvogelne...@free.fr> wrote:
> Indeed. You express this original intention so clearly that I think I
> will:
> - add this as a comment in the source code
> - add a note in the "delete" section of the text widget man page
> - and as a consequence reject the patch proposed in bug [2886436fff],
> by relying once again on the fact that this will now be documented

Only now (playing around with fossil) did I notice that you have done it in the meantime.

Thanks!