Vim is currently susceptible to Unicode steganography attacks such as the one described here. (An example of a commit in which the malware was deployed, and which contains the malicious characters, can be found here.)
By default, Vim does not display such characters at all: nothing appears on screen at the positions they occupy. Yet the characters are still written to the file on :w.
This contrasts with how Vim handles some other invisible Unicode characters, such as U+180E (the Mongolian vowel separator), which is displayed as its codepoint inside angle brackets.
I'd like to propose a behavior where Vim protects the user from such attacks by default, with an option to disable this feature for people who use languages that make legitimate use of such characters.
The behavior would be
(1) to display a warning message whenever such characters enter a buffer (so that people who paste into Vim don't fail to notice that the text contains malicious characters);
and
(2) to display every non-printable character as its codepoint inside angle brackets.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you are subscribed to this thread.
Which characters would those be? I wonder if we can use the 'listchars' setting for this?
—
Here is an example set of characters, extracted from the commit I referenced above:
U+0E0110 󠄐 VARIATION SELECTOR-33 printable=True
U+0E0117 󠄗 VARIATION SELECTOR-40 printable=True
U+0E0118 󠄘 VARIATION SELECTOR-41 printable=True
U+0E0119 󠄙 VARIATION SELECTOR-42 printable=True
U+0E011A 󠄚 VARIATION SELECTOR-43 printable=True
U+0E011B 󠄛 VARIATION SELECTOR-44 printable=True
U+0E011C 󠄜 VARIATION SELECTOR-45 printable=True
U+0E011D 󠄝 VARIATION SELECTOR-46 printable=True
U+0E011E 󠄞 VARIATION SELECTOR-47 printable=True
U+0E0120 󠄠 VARIATION SELECTOR-49 printable=True
U+0E0121 󠄡 VARIATION SELECTOR-50 printable=True
U+0E0122 󠄢 VARIATION SELECTOR-51 printable=True
U+0E0123 󠄣 VARIATION SELECTOR-52 printable=True
U+0E0124 󠄤 VARIATION SELECTOR-53 printable=True
U+0E0125 󠄥 VARIATION SELECTOR-54 printable=True
U+0E0126 󠄦 VARIATION SELECTOR-55 printable=True
U+0E0127 󠄧 VARIATION SELECTOR-56 printable=True
U+0E0128 󠄨 VARIATION SELECTOR-57 printable=True
U+0E0129 󠄩 VARIATION SELECTOR-58 printable=True
U+0E012B 󠄫 VARIATION SELECTOR-60 printable=True
U+0E012D 󠄭 VARIATION SELECTOR-62 printable=True
U+0E012E 󠄮 VARIATION SELECTOR-63 printable=True
U+0E0132 󠄲 VARIATION SELECTOR-67 printable=True
U+0E0134 󠄴 VARIATION SELECTOR-69 printable=True
U+0E0137 󠄷 VARIATION SELECTOR-72 printable=True
U+0E0138 󠄸 VARIATION SELECTOR-73 printable=True
U+0E013E 󠄾 VARIATION SELECTOR-79 printable=True
U+0E013F 󠄿 VARIATION SELECTOR-80 printable=True
U+0E0140 󠅀 VARIATION SELECTOR-81 printable=True
U+0E0143 󠅃 VARIATION SELECTOR-84 printable=True
U+0E0144 󠅄 VARIATION SELECTOR-85 printable=True
U+0E0148 󠅈 VARIATION SELECTOR-89 printable=True
U+0E014B 󠅋 VARIATION SELECTOR-92 printable=True
U+0E014D 󠅍 VARIATION SELECTOR-94 printable=True
U+0E0151 󠅑 VARIATION SELECTOR-98 printable=True
U+0E0152 󠅒 VARIATION SELECTOR-99 printable=True
U+0E0153 󠅓 VARIATION SELECTOR-100 printable=True
U+0E0154 󠅔 VARIATION SELECTOR-101 printable=True
U+0E0155 󠅕 VARIATION SELECTOR-102 printable=True
U+0E0156 󠅖 VARIATION SELECTOR-103 printable=True
U+0E0157 󠅗 VARIATION SELECTOR-104 printable=True
U+0E0158 󠅘 VARIATION SELECTOR-105 printable=True
U+0E0159 󠅙 VARIATION SELECTOR-106 printable=True
U+0E015C 󠅜 VARIATION SELECTOR-109 printable=True
U+0E015D 󠅝 VARIATION SELECTOR-110 printable=True
U+0E015E 󠅞 VARIATION SELECTOR-111 printable=True
U+0E015F 󠅟 VARIATION SELECTOR-112 printable=True
U+0E0160 󠅠 VARIATION SELECTOR-113 printable=True
U+0E0161 󠅡 VARIATION SELECTOR-114 printable=True
U+0E0162 󠅢 VARIATION SELECTOR-115 printable=True
U+0E0163 󠅣 VARIATION SELECTOR-116 printable=True
U+0E0164 󠅤 VARIATION SELECTOR-117 printable=True
U+0E0165 󠅥 VARIATION SELECTOR-118 printable=True
U+0E0166 󠅦 VARIATION SELECTOR-119 printable=True
U+0E0167 󠅧 VARIATION SELECTOR-120 printable=True
U+0E0168 󠅨 VARIATION SELECTOR-121 printable=True
U+0E0169 󠅩 VARIATION SELECTOR-122 printable=True
U+0E016A 󠅪 VARIATION SELECTOR-123 printable=True
U+0E016B 󠅫 VARIATION SELECTOR-124 printable=True
U+0E016D 󠅭 VARIATION SELECTOR-126 printable=True
'printable' is the result of Python's str.isprintable() method, included for comparison. Maybe Vim uses a similar strategy for deciding which characters to print? For completeness: the Mongolian vowel separator (U+180E) returns False for isprintable().
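For anyone who wants to reproduce a listing like the one above, here is a minimal sketch using only the Python standard library. The helper names `describe` and `describe_non_ascii` are made up for illustration, and the exact script used to produce the listing is not shown in this thread, so treat this as an approximation of it.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format one character in the same layout as the rows listed above."""
    # unicodedata.name() raises ValueError for unnamed code points unless
    # a default is supplied, so pass a placeholder.
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):06X} {ch} {name} printable={ch.isprintable()}"


def describe_non_ascii(text: str) -> list[str]:
    """Return one row per distinct non-ASCII code point, in code point order."""
    return [describe(ch) for ch in sorted(set(text)) if ord(ch) > 0x7F]


# Sample input: ASCII text plus one variation selector and U+180E.
sample = "ok\U000E0110\u180e"
for row in describe_non_ascii(sample):
    print(row)
```

Python's isprintable() is driven by the Unicode general category: variation selectors are nonspacing marks, which count as printable, while U+180E falls into a category that does not.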
—
Tread carefully with this - a similar suggestion was put forward (and closed) at issue 19071. There are legitimate uses for variation selectors, and the matter of them being used maliciously seems to be particular to npm, right? That said, perhaps a filetype plugin change for JavaScript/TypeScript would be a suitable place where displaying them differently by default is justifiable?
Incidentally, @chrisbra - what are your thoughts on #18876 (comment) where I suggest having gVim display the text variant of emoji using U+FE0E when toggling 'rop'? It would require only a few changes to gui_dwrite.cpp. (Fwiw, that is a clear illustration of how variation selectors have a legitimate, intended use case.)
—
That would be a good start, but since 'rop' is platform-dependent, it doesn't help us on Linux or macOS.
—
This is not a rendering (DirectWrite) issue. Vim internally classifies U+E0100–E01EF as composing characters in utf_iscomposing() and as zero-width in utf_char2cells() (both in mbyte.c). They are treated as invisible before they ever reach the rendering layer.
To address this, we would need to change how Vim handles these characters — for example, treating them as non-printable so they display as <e0100>, similar to how U+180E is shown. Since variation selectors do have legitimate uses, this should probably be controlled by an option.
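To make the idea concrete outside of Vim, here is a rough Python sketch of the two behaviors discussed in this thread: locating the characters (which could feed a warning) and rendering them as <e0100>-style codepoints. It is not Vim's implementation; the names `VS_SUPPLEMENT`, `find_hidden` and `reveal` are hypothetical, and the range is simply the U+E0100–E01EF block mentioned above.

```python
# Range taken from the comment above: U+E0100 through U+E01EF inclusive.
VS_SUPPLEMENT = range(0xE0100, 0xE01F0)


def find_hidden(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, codepoint) for each hidden selector, 1-based."""
    hits = []
    for lnum, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) in VS_SUPPLEMENT:
                hits.append((lnum, col, f"U+{ord(ch):04X}"))
    return hits


def reveal(text: str) -> str:
    """Replace each hidden selector with its hex codepoint in angle brackets."""
    return "".join(
        f"<{ord(ch):x}>" if ord(ch) in VS_SUPPLEMENT else ch
        for ch in text
    )


pasted = "const x = 1;\U000E0110\U000E0117"
for lnum, col, cp in find_hidden(pasted):
    print(f"warning: hidden character {cp} at line {lnum}, column {col}")
print(reveal(pasted))
```

Note that this only touches the supplementary block, so U+FE0E and U+FE0F (the text and emoji presentation selectors mentioned earlier) pass through unchanged. Whether an option in Vim should draw the line in the same place is an open question.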
—