Re: Can't paste Unicode Delta symbol (Δ) into vim nor gVim


Tony Mechelynck

Sep 13, 2012, 3:56:51 PM
to vim...@googlegroups.com, MC Andre
On 13/09/12 19:57, MC Andre wrote:
> I can copy and paste the Unicode Delta symbol (Δ) in a normal Command Prompt window.
>
> I can copy and paste ASCII Roman alphabetic text in vim and gVim.
>
> But I can't paste the Unicode Delta symbol (Δ) into vim nor gVim.
>
> Specs:
>
> * gVim 7.3
> * Windows 7 Professional x64
>
I just pasted it from your mail into my running gvim (7.3.661 Huge with
GTK2-GNOME2 GUI), and ga over it gives me:

<Δ> 916, Hex 0394, Octal 1624

So the next question is: What is your 'encoding' set to? Type

:verbose set encoding?

and see what the answer is. If it is latin1 or cp1252, or indeed any
8-bit encoding other than a Greek one, the reason you can't paste the
delta sign is that you haven't set gvim to be able to represent it in
memory. In that case, see

http://vim.wikia.com/wiki/Working_with_Unicode

about how to set gvim to work in UTF-8 (which can represent anything
that any other encoding can represent) and remember that any change to
'encoding' must happen in your vimrc, before any editfile is loaded,
otherwise the files which _are_ already loaded at the time of the change
will become garbled.
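
To see why an 8-bit 'encoding' such as latin1 or cp1252 cannot hold the character while a Greek one or UTF-8 can, here is a minimal Python sketch. It is not Vim code, and the helper name can_encode is made up for illustration. It also recomputes the three numbers that ga reports for the delta.

```python
# Sketch: which encodings can store U+0394 (GREEK CAPITAL LETTER DELTA)?
DELTA = "\u0394"


def can_encode(char: str, encoding: str) -> bool:
    """Return True if `char` has a representation in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The same three numbers that Vim's `ga` shows: decimal, hex, octal.
code_point = ord(DELTA)
ga_style = f"{code_point}, Hex {code_point:04x}, Octal {code_point:o}"

# Two Western 8-bit encodings, the Greek 8-bit encoding, and UTF-8.
support = {
    enc: can_encode(DELTA, enc)
    for enc in ("latin-1", "cp1252", "iso-8859-7", "utf-8")
}

if __name__ == "__main__":
    print(ga_style)
    for enc, ok in support.items():
        print(f"{enc:12} {'ok' if ok else 'cannot represent it'}")
```

Only the Greek code page and UTF-8 accept the character, which matches the explanation above: with a Western 8-bit 'encoding' there is simply no byte value to store it in.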


Best regards,
Tony.
--
All things are possible, except skiing thru a revolving door.

skeept

Sep 13, 2012, 4:21:10 PM
to vim...@googlegroups.com, MC Andre
You may also have to choose a font that supports these characters.
With the font I am currently using, typing ctrl-k *a (ctrl-k followed by * followed by a) should show the Greek letter alpha, but it shows only a little square instead.

Tony Mechelynck

Sep 13, 2012, 6:39:16 PM
to vim...@googlegroups.com, skeept, MC Andre
On 13/09/12 22:21, skeept wrote:
> On Thursday, September 13, 2012 3:56:56 PM UTC-4, Tony Mechelynck wrote:
>> On 13/09/12 19:57, MC Andre wrote:
>>
>>> I can copy and paste the Unicode Delta symbol (Δ) in a normal Command Prompt window.

Can't your mailer read ISO-8859-7 as set in the Content-Type header,
André? I'll send this one in UTF-8 then.

[...]
>
> You may also have to choose a font that supports these characters.
> With the font I am currently using, typing ctrl-k *a (ctrl-k followed by * followed by a) should show the Greek letter alpha, but it shows only a little square instead.
>

Yes indeed. That is mentioned in the "Additional remarks" section on the
"Working with Unicode" wiki page I mentioned.

The font currently set in my gvim instance is Bitstream Vera Sans Mono,
and it has no problem with the delta character; it can also display
Latin, Greek, Cyrillic, Arabic, and others that I haven't tried; indeed
it has quite a wide charset repertoire. But I don't know if it is
available for Windows. If it isn't, see if you can get DejaVu Sans Mono
which is similar. And otherwise…

On Windows, IIRC "Courier New" has glyphs for many charsets, though it
isn't very pretty; OTOH "Lucida Console" had Cyrillic bold glyphs which
were (when I was on Windows) one pixel wider than its unbold glyphs, and
that wreaked havoc in Vim's display whenever I used both kinds on one
line. I didn't try it with Greek.


Best regards,
Tony.
--
Coward, n.:
One who in a perilous emergency thinks with his legs.
-- Ambrose Bierce, "The Devil's Dictionary"

Ben Fritz

Sep 14, 2012, 10:50:08 AM
to vim...@googlegroups.com, skeept, MC Andre
On Thursday, September 13, 2012 5:39:20 PM UTC-5, Tony Mechelynck wrote:
> On 13/09/12 22:21, skeept wrote:
> > On Thursday, September 13, 2012 3:56:56 PM UTC-4, Tony Mechelynck wrote:
> >> On 13/09/12 19:57, MC Andre wrote:
> >>> I can copy and paste the Unicode Delta symbol (Δ) in a normal Command Prompt window.
>
> Can't your mailer read ISO-8859-7 as set in the Content-Type header,
> André? I'll send this one in UTF-8 then.

On the Google Groups interface, the OP's message looks fine, but yours is garbled, actually. https://groups.google.com/d/topic/vim_dev/Yjv59u5y7Qw/discussion

> > You may also have to choose a font that supports these characters.
> > With the font I am currently using, typing ctrl-k *a (ctrl-k followed by * followed by a) should show the Greek letter alpha, but it shows only a little square instead.
>
> Yes indeed. That is mentioned in the "Additional remarks" section on the
> "Working with Unicode" wiki page I mentioned.
>
> The font currently set in my gvim instance is Bitstream Vera Sans Mono,
> and it has no problem with the delta character; it can also display
> Latin, Greek, Cyrillic, Arabic, and others that I haven't tried; indeed
> it has quite a wide charset repertoire. But I don't know if it is
> available for Windows. If it isn't, see if you can get DejaVu Sans Mono
> which is similar. And otherwise…

DejaVu fonts are certainly available on Windows! They're what I settled on long ago for my Vim font. Get them from http://dejavu-fonts.org/wiki/Download or directly from SourceForge at http://sourceforge.net/projects/dejavu/ .

> On Windows, IIRC "Courier New" has glyphs for many charsets, though it
> isn't very pretty; OTOH "Lucida Console" had Cyrillic bold glyphs which
> were (when I was on Windows) one pixel wider than its unbold glyphs, and
> that wreaked havoc in Vim's display whenever I used both kinds on one
> line. I didn't try it with Greek.

On Windows, if I haven't installed DejaVu yet, I like using Consolas. It doesn't have as many glyphs as DejaVu, but it covers a much wider range than the default Fixedsys font, and looks fairly nice in my opinion. Lucida Console and Lucida Typewriter look OK to me, but I personally don't like them because their glyphs for 0 and O aren't very distinct. I have the same problem with Courier New, which additionally has similar glyphs for 1 and l.

All these fonts I mention have a glyph for the Δ character, and I can paste it into Vim just fine.

Tony Mechelynck

Sep 14, 2012, 1:56:20 PM
to vim...@googlegroups.com, Ben Fritz, skeept, MC Andre
On 14/09/12 16:50, Ben Fritz wrote:
[...]
>
> On the Google Groups interface, the OP's message looks fine, but yours is garbled, actually. https://groups.google.com/d/topic/vim_dev/Yjv59u5y7Qw/discussion
>
[...]

Strange. I get the list messages by POP on my gmail account: the OP's
message reached me in ISO-8859-7 (Greek) and I replied (the first
time) in the same encoding.

My ISP blocks me from sending anything by SMTP except to its own
servers, so my reply was sent to the list (just like this one) with a
From: @google.com but a Received: by relay.skynet.be (and not by
smtp.gmail.com).

I know there are sometimes "weird transcodings" when sending to the
list, and that the charset in the Content-Type header is not always
obeyed. I'm sending this message in UTF-8 because its content wouldn't
tolerate anything else. How do you see the following letters from U+00A0
and above? (I could have added more but I thought these ones were
enough.) (The names are from memory, I didn't cross-check with Unicode.)

for French:
é LATIN SMALL LETTER E WITH ACUTE ACCENT
É LATIN CAPITAL LETTER E WITH ACUTE ACCENT
ù LATIN SMALL LETTER U WITH GRAVE ACCENT
Î LATIN CAPITAL LETTER I WITH CIRCUMFLEX
œ LATIN SMALL LIGATURE OE
Œ LATIN CAPITAL LIGATURE OE

for Danish:
æ LATIN SMALL LIGATURE AE
Æ LATIN CAPITAL LIGATURE AE
ø LATIN SMALL LETTER O WITH DIAGONAL
Ø LATIN CAPITAL LETTER O WITH DIAGONAL

for Icelandic:
Þ LATIN CAPITAL LETTER THORN
ð LATIN SMALL LETTER ETH

for Spanish:
ñ LATIN SMALL LETTER N WITH TILDE
Ñ LATIN CAPITAL LETTER N WITH TILDE

for Polish:
ł LATIN SMALL LETTER L WITH BAR

for Czech, Croatian, etc.:
Č LATIN CAPITAL LETTER C WITH CARON

for Esperanto:
ĉ LATIN SMALL LETTER C WITH CIRCUMFLEX
Ŝ LATIN CAPITAL LETTER S WITH CIRCUMFLEX
ŭ LATIN SMALL LETTER U WITH BREVE
Ŭ LATIN CAPITAL LETTER U WITH BREVE

for Greek: small letters + space + final sigma, then capitals:
αβγδεζηθικλμνξοπρστυφχψω ς
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ

for Russian: lowercase then uppercase (with yo after ye and short i
after i):
абвгдеёжзийклмнопрстуфхцчшщъыьэюя
АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
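
Since the names above were written from memory, one way to cross-check them is against the Unicode character database that ships with Python. This is only a sketch: SAMPLES is just the accented Latin letters from the list above, and the variable names are illustrative.

```python
import unicodedata

# The accented Latin letters listed above, normalized to precomposed (NFC)
# form so that each one is a single code point.
SAMPLES = unicodedata.normalize("NFC", "éÉùÎœŒæÆøØÞðñÑłČĉŜŭŬ")

# Official Unicode name for each letter.
names = {ch: unicodedata.name(ch) for ch in SAMPLES}

# Letters at or below U+00FF also exist in Latin1; the others need a
# different 8-bit code page or UTF-8.
fits_latin1 = [ch for ch in SAMPLES if ord(ch) <= 0xFF]

if __name__ == "__main__":
    for ch, name in names.items():
        print(f"U+{ord(ch):04X} {ch} {name}")
```

The official names differ slightly from the ones recalled above: for example, the database says "WITH ACUTE" rather than "WITH ACUTE ACCENT", and "WITH STROKE" for ø and ł.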


Best regards,
Tony.
--
"This is lemma 1.1. We start a new chapter so the numbers all go back
to one."
-- Prof. Seager, C&O 351

Benjamin Fritz

Sep 14, 2012, 2:04:38 PM
to Tony Mechelynck, vim...@googlegroups.com, skeept, MC Andre
These all look valid to me both on the list and in GMail. Weird.

Tony Mechelynck

Sep 14, 2012, 2:46:23 PM
to Benjamin Fritz, vim...@googlegroups.com, skeept, MC Andre
On 14/09/12 20:04, Benjamin Fritz wrote:
[...]
> These all look valid to me both on the list and in GMail. Weird.
>
Well, I guess it didn't like my ISO-8859-7 in the other post then. For
some reason the original post (as I received it) had its Subject header
in UTF-8 with MIME wrapping but its text in ISO-8859-7: maybe some mail
router along the way silently translated the post (and changed the
Content-Type charset accordingly)? That would have spared it one byte
per Delta in the text (uppercase delta is 0xC4 in ISO-8859-7 but 0xCE
0x94 in UTF-8) but at the cost of five bytes in the Content-Type header
("iso-8859-7" is 5 bytes longer than "utf-8").
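
The byte counts in that trade-off are easy to check. A minimal Python sketch, with illustrative variable names that do not come from any mailer:

```python
# Sketch: size of one capital delta in each encoding, and the length
# difference between the two charset labels in the Content-Type header.
DELTA = "\u0394"

greek_bytes = DELTA.encode("iso-8859-7")  # single byte in the Greek code page
utf8_bytes = DELTA.encode("utf-8")        # two-byte UTF-8 sequence

saved_per_delta = len(utf8_bytes) - len(greek_bytes)
label_cost = len("iso-8859-7") - len("utf-8")

if __name__ == "__main__":
    print("ISO-8859-7:", greek_bytes.hex())
    print("UTF-8:     ", utf8_bytes.hex())
    print("bytes saved per delta:", saved_per_delta)
    print("extra bytes in the charset label:", label_cost)
```

On a standard Python 3 install this reports c4 and ce94 for the two encodings, one byte saved per Δ, and a 5-character difference between the labels.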

I suppose I should always reply to the list in UTF-8, not in the same
charset as whatever I'm replying to, which is my mailer's default if the
charset fits (with fallback to UTF-8 if it doesn't). I can change the
preferences to always reply to everything in "my preferred charset"
which I can set to UTF-8 (or maybe to Latin1, which would still fall back
to UTF-8 if anything higher than U+00FF was encountered).
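
The "use this charset if it fits, otherwise fall back to UTF-8" rule described here can be sketched in a few lines of Python. This is an assumption about how such a preference might behave, not the mailer's actual code; pick_reply_charset is a hypothetical name.

```python
def pick_reply_charset(text: str, wanted: str, fallback: str = "utf-8") -> str:
    """Return `wanted` if every character of `text` fits in it, else `fallback`.

    Hypothetical helper: `wanted` stands for either the charset of the
    message being replied to or a fixed preferred charset.
    """
    try:
        text.encode(wanted)
    except (UnicodeEncodeError, LookupError):
        # A character does not fit, or the charset name is unknown.
        return fallback
    return wanted


if __name__ == "__main__":
    print(pick_reply_charset("delta: \u0394", "iso-8859-7"))  # Greek text fits
    print(pick_reply_charset("Andr\u00e9", "iso-8859-7"))     # e-acute does not
    print(pick_reply_charset("Andr\u00e9", "latin-1"))        # fits in Latin1
    print(pick_reply_charset("\u0142", "latin-1"))            # above U+00FF
```

With a preferred charset of Latin1, any character above U+00FF triggers the fallback, which is the behaviour described in the paragraph above.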


Best regards,
Tony.
--
Eggnog is a traditional holiday drink invented by the English. Many
people wonder where the word "eggnog" comes from. The first syllable
comes from the English word "egg", meaning "egg". I don't know where
the "nog" comes from.

To make eggnog, you'll need rum, whiskey, wine gin and, if they are in
season, eggs...
