[dev] [st] Strange behaviour of backspace under csh under st

Jinsong Zhao

Nov 5, 2024, 4:15:00 AM
to d...@suckless.org
Hi there,

I was trying to use st on a FreeBSD workstation, and my shell is csh.
When I use backspace to delete the Chinese character, I observe strange
behavior.

At first,
zjs@freebsd:~ % 中文|

After pressing a backspace key,
zjs@freebsd:~ % |

After pressing ctrl + l to refresh st,
zjs@freebsd:~ % 中|

The vertical bar indicates the cursor position.

This behavior is observed under bash, but not under sh.

Any hint would be greatly appreciated.

Best,
Jinsong


Steffen Nurpmeso

Nov 6, 2024, 8:38:13 PM
to dev mail list
Jinsong Zhao wrote in
<09350f56-59c1-4a2f...@yeah.net>:
Which locale is it that you are using? I think (pretty sure) st
supports only UTF-8 locales, so that is one thing. Cannot be any
of those
zh_CN.GB18030.src
zh_CN.GB2312.src
zh_CN.GBK.src
zh_CN.eucCN.src
zh_TW.Big5.src
that i see in origin/main:share/colldef; check locale -a and try
zh_CN.utf8 or, better, the BSD-style zh_CN.UTF-8.

Also .. now looking .. st simply assumes it can hand-decode UTF-8
into UTF-32 and pass the result as a wchar_t to the wcwidth(3)
function that is part of FreeBSD. The Citrus library FreeBSD used
when i last looked closely had some special cases which violate(d)
that assumption. That was many years ago; afterwards Daroussin
implemented something to generate up-to-date character mapping
tables and more from Unicode aka ICU releases, *if* i get that
right. I cannot tell whether (a) this is still true or (b) this
has ever been true for zh_, especially with UTF-8. (Likely not.)

The symptom as such looks pretty familiar to me: backspace does
not seem to erase all bytes for real, so that a repaint brings
back garbage.
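
To make the "hand-join UTF-8 to UTF-32" step concrete, here is a minimal sketch of such a decoder. It is not st's actual code; the name utf8_decode_one is made up for illustration, and the sketch skips overlong-form and surrogate checks that a careful decoder would add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from s (n bytes available) into *cp.
 * Returns the number of bytes consumed, or 0 if the input is
 * malformed or truncated.
 * Hypothetical helper for illustration, not taken from st. */
size_t
utf8_decode_one(const unsigned char *s, size_t n, uint32_t *cp)
{
	size_t len, i;
	uint32_t c;

	assert(s != NULL && cp != NULL);
	if (n == 0)
		return 0;

	if (s[0] < 0x80) {                  /* 0xxxxxxx: ASCII */
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) { /* 110xxxxx: 2 bytes */
		len = 2;
		c = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) { /* 1110xxxx: 3 bytes */
		len = 3;
		c = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) { /* 11110xxx: 4 bytes */
		len = 4;
		c = s[0] & 0x07;
	} else {
		return 0;                       /* stray continuation or invalid lead byte */
	}

	if (n < len)
		return 0;                       /* truncated sequence */

	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)      /* must be 10xxxxxx */
			return 0;
		c = (c << 6) | (s[i] & 0x3F);
	}
	*cp = c;
	return len;
}
```

Passing the resulting code point to wcwidth((wchar_t)cp) is only meaningful if wchar_t values are Unicode code points in the active locale, which is exactly the assumption questioned above.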

--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
|
|And in Fall, feel "The Dropbear Bard"s ball(s).
|
|The banded bear
|without a care,
|Banged on himself fore'er and e'er
|
|Farewell, dear collar bear

Steffen Nurpmeso

Nov 6, 2024, 9:14:42 PM
to dev mail list
Steffen Nurpmeso wrote in
<20241107013734.gC5CYhMl@steffen%sdaoden.eu>:
|Jinsong Zhao wrote in
| <09350f56-59c1-4a2f...@yeah.net>:
||I was trying to use st on a FreeBSD workstation, and my shell is csh.
||When I use backspace to delete the Chinese character, I observe strange
||behavior.
||
||On the first,
||zjs@freebsd:~ % 中文|

Not to mention that possibly only the wcwidth(3) attributes of
these "So" (Symbol, other) Unicode entries are wrong.
That would then be a bug in the locale tables of FreeBSD.

...
||This behavior is observed under bash, but not under sh.

Bash also uses wcwidth(3); sh seems to use the BSD editline library
instead, and that surely runs the input through many successive
mbtowc and wctomb calls to convert back and forth, and likely
keeps, like e.g. ncurses, "index slots" instead of simple
"character byte data". So when you backspace, all bytes
making up an "index slot" are removed, whereas st (and mksh fwiw)
simply "synchronizes back" on the "character byte data" until it
finds a UTF-8 start byte.
That is: with Unicode combining characters etc. multiple adjacent
UTF-8 characters form a single "grapheme" in Unicode terms;
many languages have / know / require that in Unicode. E.g. bash:

master:lib/readline/rlmbutil.h:# define WCWIDTH(wc) ((_rl_utf8locale && UNICODE_COMBINING_CHAR(wc)) ? 0 : _rl_wcwidth(wc))

With that, backspace in reality has to skip over multiple adjacent
(UTF-8) characters (aka multiple multi-byte sequences).
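
As a rough sketch of the "synchronize back until a UTF-8 start byte" approach described above (not code from st, mksh or bash; utf8_prev_start is a hypothetical name), stepping back one character could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the start byte of the UTF-8 character that
 * ends just before pos. Continuation bytes have the form 10xxxxxx,
 * so we walk left until we hit a byte that is not one of those.
 * Hypothetical helper for illustration only. */
size_t
utf8_prev_start(const unsigned char *buf, size_t pos)
{
	assert(buf != NULL);
	if (pos == 0)
		return 0;
	pos--;
	while (pos > 0 && (buf[pos] & 0xC0) == 0x80)
		pos--;
	return pos;
}
```

For the six bytes of "中文" this steps from 6 to 3 to 0, one whole character at a time. For "e" followed by U+0301 (bytes 65 CC 81) a single step only goes back to index 1, so the combining mark is removed but the base letter stays; an editor that tracks cells has to step again to drop the whole grapheme.
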
For the simple line editor i have written for my MUA i use

tc.tc_novis = (iswprint(wc) == 0);
tc.tc_width = a_tty_wcwidth(wc);

(where it is not wcwidth() because ISO C did not standardize it).
I use cells aka index-slots, too.

Having said that, now i have confused myself. What is plain is that
bash on Linux (glibc 2.40) *can* handle these characters. So likely
the character set data of the actual locale you are using on your
specific FreeBSD does not correctly describe the symbols you
mention. Now it *must* be said that in the latest UnicodeData
i have (from 2019, ooops), i see

3197;IDEOGRAPHIC ANNOTATION MIDDLE MARK;So;0;L;<super> 4E2D;;;;N;KAERITEN TYUU;;;;
32A5;CIRCLED IDEOGRAPH CENTRE;So;0;L;<circle> 4E2D;;;;N;CIRCLED IDEOGRAPH CENTER;;;;
1F22D;SQUARED CJK UNIFIED IDEOGRAPH-4E2D;So;0;L;<square> 4E2D;;;;N;;;;;

2F42;KANGXI RADICAL SCRIPT;So;0;ON;<compat> 6587;;;;N;;;;;
3246;CIRCLED IDEOGRAPH SCHOOL;So;0;L;<circle> 6587;;;;N;;;;;

but *no* other occurrences of U+4E2D or U+6587, so maybe the
fallback for "unknown" code points is wrong. My thing uses

# ifdef mx_HAVE_WCWIDTH
   w = (wc == '\t' ? 1 : wcwidth(wc));
# else
   if(wc == '\t' || iswprint(wc))
      w = 1 + (wc >= 0x1100u); /* S-CText isfullwidth() */
   else
      w = -1;
# endif

which is very shitty, but since both codepoints are above U+1100
we treat them as fullwidth aka of width 2. ...
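
A self-contained sketch of that fallback branch, to show what it yields for the two code points in question. The function name fallback_width and the printable parameter are assumptions made for this example; printable stands in for iswprint() so the result does not depend on the active locale.

```c
#include <assert.h>
#include <stdint.h>

/* Width guess for a code point when no wcwidth() is available.
 * printable replaces the iswprint() call of the original fragment.
 * Returns 1 or 2 columns, or -1 for non-printable input.
 * Hypothetical helper for illustration only. */
int
fallback_width(uint32_t cp, int printable)
{
	assert(cp <= 0x10FFFFu);
	if (cp == '\t' || printable)
		return 1 + (cp >= 0x1100u); /* everything from U+1100 up counts as fullwidth */
	return -1;
}
```

Both U+4E2D and U+6587 come out as width 2 here. The heuristic is coarse, though: a narrow character such as U+1E00 (LATIN CAPITAL LETTER A WITH RING BELOW) is also reported as width 2.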

Hope that helps .. :/

Jinsong Zhao

Nov 7, 2024, 8:57:38 AM
to d...@suckless.org
My locale is C.UTF-8.

The problem disappeared after I upgraded st to 0.9.2. The problem
occurred on my FreeBSD 14.1 with st-0.9.1 installed using pkg, FreeBSD's
package management system.

Sorry for the noise. I should have checked the latest version of st
before posting here.

Best,

Jinsong

Страхиња Радић

Nov 7, 2024, 10:43:21 AM
to dev mail list
On 24/11/07 09:56PM, Jinsong Zhao wrote:
> Sorry for the noise. I should check the latest version of st before posting
> here.

This is why suckless software should not be installed through packages.
Suckless programs are intended to be built from source, possibly after
applying patches and configuring settings in config.h. Prebuilt
packages clash with one of the main ideas behind suckless software.

Dave Blanchard

Apr 13, 2025, 10:03:07 PM
to d...@suckless.org
The main problem is that 'suckless' code like st actually *sucks*. lol @ you attempting to blame a package for your shit code.

People who use real terminals like rxvt, developed by competent programmers, don't have these crazy problems with their terminal. They just get their work done, painlessly.

Dave

Elie Le Vaillant

Apr 14, 2025, 12:46:43 AM
to dev mail list
Hi Dave,

On Mon Apr 14, 2025 at 4:07 AM CEST, Dave Blanchard wrote:
> The main problem is that 'suckless' code like st actually *sucks*. lol @ you attempting to blame a package for your shit code.
>
> People who use real terminals like rxvt, developed by competent programmers, don't have these crazy problems with their terminal. They just get their work done, painlessly.
>
> Dave

Why did you post this to this mailing list? What value do you think
it brings exactly?

Why do you feel so entitled to talk about the experience of suckless
devs/users? Did you make a survey?

Why are you recommending a terminal emulator that has been abandoned
and that doesn't even support Unicode?
Why do you think other projects, such as rxvt, would be bug-free?
What is your justification for this crazy opinion here?

Please, if you don't like suckless software, you're free and very
welcome to write about your bad opinions somewhere other than on
the dev mailing list of suckless.
Cheers,
Elie Le Vaillant

Jeremy

Apr 14, 2025, 2:16:24 AM
to dev mail list
On 04/13/25 09:07PM, Dave Blanchard wrote:
> > On 24/11/07 09:56PM, Jinsong Zhao wrote:
> > This is why suckless software should not be installed through packages.
> > Suckless programs are intended to be built from source, possibly after
> > applying patches and configuring settings in config.h. Prebuilt
> > packages clash with one of the main ideas behind suckless software.
>
> The main problem is that 'suckless' code like st actually *sucks*. lol @ you attempting to blame a package for your shit code.

I agree. Suckless code quality has decayed.

For example, the initial purpose of stest was to list the executable
files in a directory. Now, well, see for yourself: find | stest -x -f

I do not understand config.def.h. I use git to track my changes. I'm
not sure when config.def.h crept into all the repos... Red Hat probably
had something to do with it.

Jeremy

Storkman

Apr 14, 2025, 5:05:55 AM
to dev mail list
On April 14, 2025 6:12:00 AM UTC, Jeremy <j...@jer.cx> wrote:
>> something something short code bad
>
>I agree. Suckless code quality has decayed.

The quality of bait on the mailing list has declined.

--
Storkman

Roberto E. Vargas Caballero

Apr 14, 2025, 3:54:39 PM
to dev mail list
Hi,

On Sun, Apr 13, 2025 at 09:07:54PM -0500, Dave Blanchard wrote:
> The main problem is that 'suckless' code like st actually *sucks*. lol @ you attempting to blame a package for your shit code.
>
> People who use real terminals like rxvt, developed by competent programmers, don't have these crazy problems with their terminal. They just get their work done, painlessly.
>

Take a look at the code of xterm, and then at the code of st. I am
pretty sure that you have read neither of them, and that your intention
is just trolling. I will not feed the troll any further.

Regards,
