Special Characters

Ian Baylis

Aug 10, 2015, 9:37:36 PM
I have a file that has special characters in it. When I open the file in Emacs an ' is represented like:
\200\231 or just \231

What type of characters are they, and is there a list of those characters I can use as a reference in the future?

Thank you

Emanuel Berg

Aug 10, 2015, 10:32:59 PM
to help-gn...@gnu.org
Move point, then use `describe-char' - I have

(defalias 'what-char 'describe-char)

(OT: should that be sharp-quoted as well?)

I think "what-char" is neat as it is faster to type
and consistent with `what-face' etc.

--
underground experts united
http://user.it.uu.se/~embe8573


Emanuel Berg

Aug 10, 2015, 10:35:14 PM
to help-gn...@gnu.org
Emanuel Berg <embe...@student.uu.se> writes:
> Move point, then use `describe-char' - I have
>
> (defalias 'what-char 'describe-char)
>
> (OT: should that be sharp-quoted as well?)
>
> I think "what-char" is neat as it is faster to type
> and consistent with `what-face' etc.

(defun what-face (pos)
  (interactive "d")
  (let* ((point (point))
         (face (or (get-char-property point 'face)
                   (get-char-property point 'read-cf-name))))
    (if face (message " Face: %s" face)
      (message " No face at %d." pos))))

Ian Zimmerman

Aug 10, 2015, 10:43:29 PM
to help-gn...@gnu.org
https://www.cs.tut.fi/~jkorpela/chars/

http://www.eki.ee/letter/

--
Please *no* private copies of mailing list or newsgroup messages.
Rule 420: All persons more than eight miles high to leave the court.


Eli Zaretskii

Aug 10, 2015, 10:47:05 PM
to help-gn...@gnu.org
> Date: Mon, 10 Aug 2015 18:37:34 -0700 (PDT)
> From: Ian Baylis <ibay...@gmail.com>
> Injection-Date: Tue, 11 Aug 2015 01:37:34 +0000
>
> I have a file that has special characters in it. When I open the file in Emacs an ' is represented like:
> \200\231 or just \231
>
> What type of characters are they and is there a list of those characters I can use as reference for the future.

Those are not necessarily "special". They could be normal characters
that Emacs failed to decode, either because other parts of the file were
encoded inconsistently, or because Emacs used the locale-specific
defaults and the file didn't say that it needs to be decoded differently.

Are you sure this is not a UTF-8 encoded file? What happens if you
visit it with "C-x RET c utf-8 RET C-x C-f FILE-NAME RET"?

Ian Baylis

Aug 10, 2015, 10:57:36 PM
I was told it was Unicode, UTF-8 encoded, but then I noticed that \200\231 translated over to ' and not that strange c character.

Rusi

Aug 10, 2015, 11:04:43 PM
On Tuesday, August 11, 2015 at 8:27:36 AM UTC+5:30, Ian Baylis wrote:
> I was told it was Unicode, UTF-8 encoded, but then I noticed that \200\231 translated over to ' and not that strange c character.

I think what Eli is saying is that you may have been told, but you may not have
told Emacs (successfully). Usually Emacs guesses correctly; sometimes it needs a
bit of help. Eli's command ensures that. So please try it and repost.

Emanuel Berg

Aug 10, 2015, 11:13:28 PM
to help-gn...@gnu.org
Emanuel Berg <embe...@student.uu.se> writes:

> (defun ...

Better (?) version:

(defun what-face (pos)
  (interactive "d")
  (let ((face (or (get-char-property pos 'face)
                  (get-char-property pos 'read-cf-name))))
    (message " Face: %s" (or face "(no face!)"))))

Source:

http://user.it.uu.se/~embe8573/conf/emacs-init/faces.el

Marcin Borkowski

Aug 11, 2015, 12:43:40 PM
to help-gn...@gnu.org

On 2015-08-11, at 04:31, Emanuel Berg <embe...@student.uu.se> wrote:
> Move point, then use `describe-char' - I have
>
> (defalias 'what-char 'describe-char)
>
> (OT: should that be sharp-quoted as well?)
>
> I think "what-char" is neat as it is faster to type
> and consistent with `what-face' etc.

And why not just move point to that place and type `C-u C-x ='? On my
keyboard, this is faster even than `M-x what-char RET'.

You're welcome;-)

--
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University

Yuri Khan

Aug 11, 2015, 1:22:12 PM
to Ian Baylis, help-gn...@gnu.org
On Tue, Aug 11, 2015 at 7:37 AM, Ian Baylis <ibay...@gmail.com> wrote:
> I have a file that has special characters in it. When I open the file in Emacs an ' is represented like:
> \200\231 or just \231

The Unicode apostrophe ’ (U+2019 Right single quotation mark) is
encoded in UTF-8 as a sequence of three bytes, whose octal
representation is \342\200\231 (or hexadecimal E2 80 99).
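
As a quick check of the byte values above, here is a minimal sketch in Python (chosen only for illustration; nothing in the thread uses it). It encodes U+2019 as UTF-8 and prints the bytes in octal and in hexadecimal. The helper name `octal_escapes` is made up for this example.

```python
def octal_escapes(text, encoding="utf-8"):
    """Return the encoded bytes of `text` written as backslash-octal escapes."""
    return "".join(f"\\{byte:03o}" for byte in text.encode(encoding))


apostrophe = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# Octal form, the notation Emacs uses for bytes it could not decode.
print(octal_escapes(apostrophe))

# The same three bytes in hexadecimal.
print(apostrophe.encode("utf-8").hex(" ").upper())
```

Each byte is formatted as a three-digit octal number, which matches the \NNN notation seen in the Emacs buffer.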

If your Emacs incorrectly picks e.g. the ISO-8859-1 (aka Latin-1)
encoding for this file, you will see the letter â (U+00E2 Latin small
letter a with circumflex), followed by two codes \200 and \231,
because those do not correspond to printable characters in Latin-1.
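
The misdecoding described above can be reproduced outside Emacs. The following Python sketch is an illustration under the assumption that the file really is UTF-8 and was read as Latin-1; the variable names are arbitrary. It also shows that the damage is reversible as long as the bytes themselves were not altered.

```python
# The three UTF-8 bytes of U+2019 (right single quotation mark).
raw = "\u2019".encode("utf-8")

# Reading those bytes as Latin-1 yields three separate characters:
# U+00E2 (a with circumflex) plus the control codes U+0080 and U+0099.
wrong = raw.decode("latin-1")
print([f"U+{ord(ch):04X}" for ch in wrong])

# Latin-1 maps every byte to exactly one code point, so re-encoding
# recovers the original bytes, which can then be decoded correctly.
repaired = wrong.encode("latin-1").decode("utf-8")
print(repaired == "\u2019")
```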

In order to view the file as intended, you need to re-open that file
using the correct encoding (UTF-8). Eli has given you the command:

C-x RET c utf-8 RET C-x C-f FILE-NAME RET

Alternatively, if you already have a buffer visiting the file, you can
revert it using the correct encoding:

C-x RET r utf-8 RET (you might need to confirm the revert).

You then need to evaluate how often you use files in encodings other
than UTF-8. If rarely, you might want to set UTF-8 as your default
encoding.

Emanuel Berg

Aug 11, 2015, 10:50:59 PM
to help-gn...@gnu.org
Marcin Borkowski <mb...@mbork.pl> writes:

>> I think "what-char" is neat as it is faster to type
>> and consistent with `what-face' etc.
>
> And why not just move point to that place and type
> `C-u C-x ='? On my keyboard, this is faster even
> than `M-x what-char RET'.

It might be faster in some sense but that isn't
something I do every day and neither does the
Joe Emacs Hacker, judging from that bulky keystroke.

If I did use it often, I would assign a better
keystroke, but I don't so I'll stick with "what-char".

I like the idea of asking questions and having the
computer answer them. More people should do interfaces
like that. Here is a good example of what I mean:

http://user.it.uu.se/~embe8573/distance/

Yuri Khan

Aug 12, 2015, 12:19:52 AM
to Ian Baylis, help-gn...@gnu.org
On Tue, Aug 11, 2015 at 11:55 PM, Ian Baylis <ibay...@gmail.com> wrote:
> Thanks for the reply. Is there a list that contains all the octal
> representations of characters like \342\200\231?

If you’re interested, you might want to read a description of the
UTF-8 encoding, then browse the Unicode charts.

http://tools.ietf.org/html/rfc3629
http://www.unicode.org/charts/


However, I must ask: Why do you want to know? Are you going to
hand-decode files that come your way? Why not delegate that work to
computers?
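
One way to delegate that work, sketched in Python purely as an illustration: the hypothetical helper `decode_octal_escapes` below takes text in the \NNN notation Emacs displays, rebuilds the bytes, and decodes them. It assumes the escapes form a complete sequence in the given encoding and ignores anything that is not a three-digit octal escape.

```python
import re


def decode_octal_escapes(shown, encoding="utf-8"):
    """Turn text such as r"\342\200\231" back into the character it encodes."""
    # Collect every backslash followed by exactly three octal digits.
    octal_codes = re.findall(r"\\([0-7]{3})", shown)
    raw = bytes(int(code, 8) for code in octal_codes)
    # Raises UnicodeDecodeError if the bytes are not valid in `encoding`.
    return raw.decode(encoding)


print(decode_octal_escapes(r"\342\200\231"))
```

Feeding it only \200\231, as in the original question, raises `UnicodeDecodeError`, because those two bytes are continuation bytes without their lead byte.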

There are many different character encodings. When people or software
do not agree on which one they use, misdecoding occurs. With some
experience, one can make an accurate guess at which encoding was used
originally, although this becomes less necessary as we migrate to
UTF-8.
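
The guessing can be roughly automated. This Python sketch is only a heuristic under stated assumptions: the function name `guess_decode` and the candidate order are choices made for this example, not an established algorithm. It tries strict UTF-8 first, since arbitrary non-ASCII bytes rarely happen to be valid UTF-8, and falls back to single-byte encodings.

```python
def guess_decode(data, candidates=("utf-8", "cp1252", "latin-1")):
    """Return (encoding_name, text) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            continue  # This candidate does not fit; try the next one.
    raise ValueError("none of the candidate encodings fit")


print(guess_decode(b"\xe2\x80\x99"))  # valid UTF-8
print(guess_decode(b"it\x92s"))       # not UTF-8; fits Windows-1252
```

Latin-1 accepts every byte, so it only makes sense as the last candidate. A clean decode does not prove the guess is right; it only rules out encodings that cannot fit.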


PS: please don’t top-post.

> On Aug 11, 2015 1:22 PM, "Yuri Khan" <yuri....@gmail.com> wrote:
>>
>> On Tue, Aug 11, 2015 at 7:37 AM, Ian Baylis <ibay...@gmail.com> wrote:
>> > I have a file that has special characters in it. When I open the file
>> > in Emacs an ' is represented like:
>> > \200\231 or just \231
>>

Marcin Borkowski

Aug 12, 2015, 3:56:23 AM
to help-gn...@gnu.org

On 2015-08-12, at 04:49, Emanuel Berg <embe...@student.uu.se> wrote:

> Marcin Borkowski <mb...@mbork.pl> writes:
>
>>> I think "what-char" is neat as it is faster to type
>>> and consistent with `what-face' etc.
>>
>> And why not just move point to that place and type
>> `C-u C-x ='? On my keyboard, this is faster even
>> than `M-x what-char RET'.
>
> It might be faster in some sense but that isn't
> something I do every day and neither does the
> Joe Emacs Hacker, judging from that bulky keystroke.

Joking aside, I do it often enough to remember the keychord. And it's
not that bad - just two keys (counting C-x as one), with universal
argument if you want more detailed info.

> I like the idea of asking questions and having the
> computer answer them. More people should do interfaces

[Insert your favorite sci-fi reference joke here.]

> like that. Here is a good example of what I mean:
>
> http://user.it.uu.se/~embe8573/distance/

Cute! I can't find any use-case for me, unfortunately.

Best,

Emanuel Berg

Aug 12, 2015, 9:29:00 PM
to help-gn...@gnu.org
Marcin Borkowski <mb...@mbork.pl> writes:

> Joking aside, I do it often enough to remember the
> keychord. And it's not that bad - just two keys
> (counting C-x as one), with universal argument if
> you want more detailed info.

It is not *very* hard/bad but I don't do it often
enough to be sure I will remember it one year from
now. Also, at least on my keyboard with my hands, the
key `=' isn't the easiest one to hit. ("C--", for some
reason called `C-_', is better.)

Perhaps I'll do a combination of "what-char" and
"what-face" with the universal argument as it works
with `what-cursor-position'.