It has come to my attention that Vim doesn't display Unicode variation selectors: https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)
This issue is being used by malicious actors to hide payloads in plain sight:
https://www.koi.ai/blog/glassworm-first-self-propagating-worm-using-invisible-code-hits-openvsx-marketplace
https://www.reddit.com/r/vim/comments/1obeoog/how_to_display_nonprintable_unicode_characters/
How to reproduce:
~ $ printf '\x0a\x76\x61\x72\x20\x64\x65\x63\x6f\x64\x65\x64\x42\x79\x74\x65\x73\x20\x3d\x20\x64\x65\x63\x6f\x64\x65\x28\x27\x7c\xF3\xA0\x85\x94\xF3\xA0\x85\x9D\xF3\xA0\x84\xB6\xF3\xA0\x85\xA9\xF3\xA0\x84\xB9\xF3\xA0\x84\xB6\xF3\xA0\x84\xA9\xF3\xA0\x85\x96\xF3\xA0\x85\x89\xF3\xA0\x84\xA3\xF3\xA0\x84\xBA\xF3\xA0\x85\x9C\xF3\xA0\x85\x89\xF3\xA0\x85\x88\xF3\xA0\x85\x82\xF3\xA0\x85\x9C\xF3\xA0\x84\xB9\xF3\xA0\x84\xB4\xF3\xA0\x84\xA0\xF3\xA0\x85\x97\xF3\xA0\x85\x84\xF3\xA0\x84\xA2\xF3\xA0\x84\xBA\xF3\xA0\x85\xA1\xF3\xA0\x85\xA5\x27\x29' > output.js
It's explained here: https://www.reddit.com/r/vim/comments/1obeoog/comment/nkonjui/
Open output.js in Vim and see var decodedBytes = decode('|'), instead of all the Unicode characters.
Run :set encoding=latin1, and see that the Unicode characters appear.
To me, this is a security issue. I prefer to see symbols representing unknown Unicode characters rather than no characters at all.
But none of the options I've tried seems to display these Unicode characters: set conceallevel, fileencoding=utf-8, list, listchars=, display+=uhex, binary, noeol, nofixeol, noemoji
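To confirm that the hidden characters really are in the file, independently of any editor, something like the following Python sketch could be used. It assumes the only code points of interest are the two variation-selector blocks (U+FE00 to U+FE0F and U+E0100 to U+E01EF); the function names are made up for illustration, and the sample holds the first two hidden characters from the printf command above.

```python
# Assumption: only the two Unicode variation-selector blocks are of interest.
VS_RANGES = ((0xFE00, 0xFE0F), (0xE0100, 0xE01EF))


def is_variation_selector(ch):
    """True if ch falls in one of the variation-selector blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in VS_RANGES)


def find_variation_selectors(text):
    """Return (line, column, code point) for each variation selector in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if is_variation_selector(ch):
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits


# First two hidden characters from the reproduction, as raw UTF-8 bytes.
sample = b"decode('|\xf3\xa0\x85\x94\xf3\xa0\x85\x9d')".decode("utf-8")
for lineno, col, cp in find_variation_selectors(sample):
    print(f"{lineno}:{col}: {cp}")
```

To scan a real file, read it with open(path, encoding="utf-8") and pass the contents to find_variation_selectors.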
Vim version: 9.1.1882
Operating System: Debian
X: X11 and Xwayland
Terminal:
—
Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Well, isn't the whole idea of Unicode variation selectors to be invisible? I also checked briefly using more and less; they hide those characters as well.
—
Yes, like the control characters. The thing is that, as far as I can tell, there's no way to display this set of characters.
:set list has no effect in this case. It'd be great if we could unhide these characters with :set list or another option.
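As a rough illustration of the kind of "unhide" behaviour meant here, the Python sketch below replaces each variation selector with a visible placeholder. This is not how Vim works or would implement it; the function name and the <U+XXXX> placeholder format are assumptions chosen for the example.

```python
def reveal(text):
    """Replace variation selectors with a visible <U+XXXX> placeholder.

    Assumption: only U+FE00..U+FE0F and U+E0100..U+E01EF are made visible;
    every other character is passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
            out.append(f"<U+{cp:04X}>")
        else:
            out.append(ch)
    return "".join(out)


hidden = "decode('|\U000E0154\U000E015D')"
print(reveal(hidden))
```

Text without variation selectors comes back unchanged, so the filter can be applied to a whole file before viewing it.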
—
This is eerily similar to what was asked, and what I answered, at https://www.reddit.com/r/vim/s/ND6rFiatjg
They are not supposed to be visible per se, but adding them to spaces has no purpose. The challenge is, though, where to stop? For example, when should any non-displayable character be displayed versus not? That’s already done for control codes, plus characters like U+FEFF, but what about ones that aren’t displayed but are legitimately spaces, e.g., en space (U+2002), and what about other invisible/non-displayed private use characters like U+10FFFD - should they display as something if not in your current font or not?
The rationale, actual scope of the risk (none in the editor per se), and potential side effects should be carefully weighed before making any universal change to how these are presented (IMO). A plugin to display such characters could be the go - an optional one for the few who would want/need it?
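To make the "where to stop" question concrete, here is a small Python sketch that prints the Unicode general category of the characters mentioned above, using the standard unicodedata module. The describe helper and the choice of samples are only illustrative; the point is that these invisible characters fall into several different categories, so no single category cleanly separates "should be shown" from "should stay hidden".

```python
import unicodedata


def describe(ch):
    """Return (code point, general category) for a single character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch))


# Illustrative samples taken from the discussion above.
samples = {
    "control code (BEL)": "\u0007",
    "zero width no-break space / BOM": "\ufeff",
    "en space": "\u2002",
    "variation selector": "\U000E0154",
    "private use": "\U0010FFFD",
}

for label, ch in samples.items():
    code_point, category = describe(ch)
    print(f"{code_point:>9}  {category}  {label}")
```

Cc is a control character, Cf a format character, Zs a space separator, Mn a nonspacing mark, and Co private use.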
—