Search for unicode character


Marek Stepanek

Jul 18, 2018, 4:40:57 AM
to bbe...@googlegroups.com

Hello all!


I pasted some text from OpenOffice into a TeX file in BBEdit, but
compilation stops with:

! Package inputenc Error: Unicode character ̈ (U+308)
(inputenc) not set up for use with LaTeX.

I switched on "Show Invisibles" and ran Zap Gremlins, but nothing helped.
I could not find any hint in the help files on how to search for a
Unicode character, although I was pretty sure this is possible in BBEdit.


Thank you for your help!



marek

Kjetil Rå Hauge

Jul 18, 2018, 6:09:18 AM
to bbe...@googlegroups.com
On 18/07/2018, 10:41, "Marek Stepanek" <ms...@podiuminternational.org> wrote:
...
> I pasted some text from OpenOffice into a TeX file in BBEdit, but
> compilation stops with:

> ! Package inputenc Error: Unicode character ̈ (U+308)
> (inputenc) not set up for use with LaTeX.

In System Preferences > Keyboard, make sure "Show keyboard and emoji viewers in menu bar" is selected. You will then have a little flag in the menu bar, and the full character viewer can be found there, hiding under the playful title "Show Emoji and Symbols". When the viewer appears, select "Unicode" at the bottom and scroll down to "00000300 Combining Diacritical Marks". Bring up the "Find" window in BBEdit and click in its search field, then click on 0308 "Combining diaeresis" in the viewer, and the character will appear in the "Find" window. This should work - I don’t see any reason why it would - er, why it wouldn’t, I mean.

-- Kjetil Rå Hauge, ILOS, Oslo University



Marek Stepanek

Jul 18, 2018, 6:26:09 AM
to bbe...@googlegroups.com
Thank you for your prompt and detailed answer! Unfortunately, this does
not work. I have attached a screenshot of the error.

I was hoping for a more direct solution, for example typing something
like U+308 straight into BBEdit's Find dialog.


Best greetings from Munich


marek
Screen Shot 2018-07-18 at 12.21.45.png

Maarten Sneep

Jul 18, 2018, 8:21:24 AM
to bbe...@googlegroups.com
Close. From the manual (page 173, Grep searching, "Matching Non-Printing Characters"):

\x{0308} should do it; make sure to enable Grep.

Of course, the character is a ‘combining diaeresis’, so I’m not 100% sure it will be matched separately from the letter it combines with.
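
Whether a combining mark can be matched on its own is easy to check outside BBEdit. The following is a minimal Python sketch, not a description of BBEdit's search engine: it assumes the pasted text contains a decomposed "ü" (u followed by U+0308), and the sample string is made up for illustration. Python spells the escape \u0308 where BBEdit's pattern syntax uses \x{0308}.

```python
import re

# Hypothetical sample: the first word uses a decomposed "ü"
# (u + U+0308 COMBINING DIAERESIS), the second a precomposed U+00FC.
text = "Mu\u0308nchen and M\u00fcnchen"

# Search for the combining diaeresis by code point.
pattern = re.compile("\u0308")

# Collect the offset of every stray combining diaeresis.
offsets = [m.start() for m in pattern.finditer(text)]
print(offsets)  # prints [2]
```

In this sketch only the decomposed occurrence is reported; the precomposed ü is a different code point and is not matched.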

Maarten

Rich Siegel

Jul 18, 2018, 8:40:50 AM
to bbe...@googlegroups.com
On 7/18/18 at 8:21 AM, maarte...@xs4all.nl (Maarten Sneep) wrote:

>Close: From the manual (page 173, grep searching, matching non-printing characters):
>
>\x{0308} should do it, make sure to enable grep.

That is the correct character escape, but Grep is not required.

Enjoy,

R.
--
Rich Siegel Bare Bones Software, Inc.
<sie...@barebones.com> <http://www.barebones.com/>

Someday I'll look back on all this and laugh... until they
sedate me.

Kjetil Rå Hauge

Jul 18, 2018, 9:02:16 AM
to bbe...@googlegroups.com
On 18/07/2018, 12:26, "Marek Stepanek" <ms...@podiuminternational.org> wrote:


> Thank you for your prompt and detailed answer! Unfortunately, this does
> not work. I have attached a screenshot of the error.

Have a look at my setup in the enclosed screenshot (macOS 10.13.5, BBEdit 12.1.5). I can type something into the "Replace" field and hit "Replace", and the first diaeresis in the text will be replaced. The character viewer provides a simple, application-independent way of putting exotic characters into any program’s search or replace field.
diaeresis.jpeg

Patrick Woolsey

Jul 18, 2018, 9:42:05 AM
to bbe...@googlegroups.com
On 7/18/18 at 4:40 AM, ms...@podiuminternational.org (Marek
Stepanek) wrote:

>I pasted some text from OpenOffice into a TeX file in BBEdit, but
>compilation stops with:
>
>! Package inputenc Error: Unicode character ̈ (U+308)
>(inputenc) not set up for use with LaTeX.
>
>I switched on "Show Invisibles" and ran Zap Gremlins, but nothing helped.
>I could not find any hint in the help files on how to search for a
>Unicode character, although I was pretty sure this is possible in BBEdit.
>

For future reference, if you apply Zap Gremlins with the
"Replace with code" option, BBEdit will replace all characters
within the current document that meet the chosen "Search for:"
criteria with hex escapes -- which you can then find by
searching and/or visual inspection.

By way of example, this operation will transform:

u (U+0075 LATIN SMALL LETTER U) + ̈ (U+0308 COMBINING DIAERESIS)

into

u\0x0308
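
The transformation above can be imitated in a few lines of Python. This is only a sketch of the idea, not BBEdit's implementation: the function name replace_with_code is hypothetical, and the choice to escape every non-ASCII character with at least four hex digits is an assumption made for the example.

```python
def replace_with_code(text):
    """Rewrite every non-ASCII character as a visible hex escape."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        else:
            # Assumed format: backslash, "0x", then the code point in hex.
            out.append("\\0x{:04X}".format(ord(ch)))
    return "".join(out)

print(replace_with_code("u\u0308"))  # prints u\0x0308
```

Once the invisible character has become ordinary ASCII text, a plain search for 0x0308 will locate it.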

Alternatively, if you want to identify any character(s) which
are visible within a document, you need only select it (them)
and bring up the Character Inspector palette (Window -> Palettes
-> Character Inspector), as per the attached screenshot.
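
For text that is already outside BBEdit, a rough equivalent of this kind of inspection is available in Python's standard unicodedata module. The sketch below is an illustration only; the helper name describe is hypothetical and the input is the same u + U+0308 pair used in the example above.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' line per character in text."""
    return [
        "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

for line in describe("u\u0308"):
    print(line)
# prints:
# U+0075 LATIN SMALL LETTER U
# U+0308 COMBINING DIAERESIS
```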

Hope this helps. :-)


Regards,

Patrick Woolsey
==
Bare Bones Software, Inc. <http://www.barebones.com/>
Screen Shot 2018-07-18 at 09.32.43.png

Marek Stepanek

Jul 18, 2018, 11:49:31 AM
to bbe...@googlegroups.com

Thank you, Patrick, for your detailed answer! Unfortunately, I did not
quite follow it. I vaguely remembered that BBEdit can search directly
for a Unicode code point, with something like: \u{0x0308} ???

Among my text filters I found an old one that I wrote a long time ago
and called "pdf_to_text.pl". If you run it over text that you copied
and pasted from a PDF into your txt file, it replaces the "wrong"
umlauts with the right ones.

In this script, all the special characters outside the ASCII range are
replaced with the real ones (I can't explain it better). I don't know
whether my email client will transform the characters, in which case
this little Perl script will no longer work as posted. In any case, it
worked here, and my special character is gone! Phew!

#!/usr/bin/perl

# This filter looks a bit strange, but text copied from a PDF
# file with German umlauts gives you "double" letters
# (presumably a base letter plus a separate combining mark)
# rather than single Unicode letters. I don't know
# how to explain it more precisely.
# So the letters on the search and replace sides
# below are not the same, even where they look identical.

use warnings;
use strict;


while (<>) {
s/- //g;
s/’/'/g;
s/„/"/g;
s/“/"/g;
s/­ //g;
s/- (?!Euro)//ig;
s/–/-/g;
s/ü/ü/g;
s/Ü/Ü/g;
s/ö/ö/g;
s/Ö/Ö/g;
s/ß/ß/g;
s/ä/ä/g;
s/ç/ç/g;
s/„/<em>„/g;
s/“/“<\/em>/g;
s/è/è/g;
s/é/é/g;
print;
}
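
The umlaut and accent substitutions in this filter look like hand-written Unicode composition: a base letter followed by a combining mark is replaced with the single precomposed letter. Assuming that is what the pasted text contains, the standard NFC normalization does the same job generically. The Python sketch below illustrates that idea; it is not a port of the Perl script, and it does not cover the quote and hyphen substitutions. The sample word is made up for the example.

```python
import unicodedata

# Hypothetical sample: "München" with a decomposed "ü"
# (u + U+0308 COMBINING DIAERESIS), as often produced by copying from a PDF.
decomposed = "Mu\u0308nchen"

# NFC composes base letter + combining mark into one precomposed character.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # prints 8 7
print(composed == "M\u00fcnchen")      # prints True
```

After normalization the text no longer contains U+0308, which is the character the inputenc error complains about.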

Maarten Sneep

Jul 18, 2018, 12:15:01 PM
to bbe...@googlegroups.com


> On 18 Jul 2018, at 17:49, Marek Stepanek <ms...@podiuminternational.org> wrote:
>
> Thank you, Patrick, for your detailed answer! Unfortunately, I did not
> quite follow it. I vaguely remembered that BBEdit can search directly
> for a Unicode code point, with something like: \u{0x0308} ???

I already gave that answer: \x{0308}
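
The step from the U+308 shown in the LaTeX error to the \x{0308} escape is mechanical: take the hex digits and pad them to four places. A small Python sketch of that conversion follows; the function name to_search_escape is hypothetical, and padding to four digits is an assumption based on the escape quoted in this thread.

```python
def to_search_escape(label):
    """Convert a label such as 'U+308' into an x{...}-style search escape."""
    if not label.upper().startswith("U+"):
        raise ValueError("expected a label like U+0308, got {!r}".format(label))
    # Parse the hex digits after "U+" and pad them to four places.
    value = int(label[2:], 16)
    return "\\x{" + format(value, "04X") + "}"

print(to_search_escape("U+308"))  # prints \x{0308}
```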

Maarten

Marek Stepanek

Jul 19, 2018, 10:12:25 AM
to BBEdit Talk

Thank you all for your insight and the answers!

If I search the BBEdit help for "unicode", there are no results.


Best greetings 


marek

Patrick Woolsey

Jul 19, 2018, 11:18:23 AM
to bbe...@googlegroups.com
On 7/18/18 at 2:33 AM, bbe...@googlegroups.com ('Marek Stepanek'
via BBEdit Talk) wrote:

>Thank you all for your insight and the answers!
>
>If I search the BBEdit help for "unicode", there are
>no results.

As a reminder, please refer to BBEdit's manual (Help -> User
Manual) instead of the help book as the former is far more
likely to have any info you need. :-)

For this purpose, there are tables which detail how you can
search for special characters in both Chapters 7 and 8 of the
manual, e.g. under "Matching Non-Printing Characters" (page 173
in the current edition).