Replacing odd characters.

John Culleton

Jan 11, 2008, 4:29:14 PM
to v...@vim.org
On a regular basis I convert pdf files to text before further processing. In
the process characters such as smart quotes, apostrophes etc. get converted
to non-printing characters which in turn are represented by very strange
strings in Gvim. Actually most of these items are two bytes. For example
opening quotes are octal 200 followed by octal 230 and closing quotes are
octal 200 followed by octal 231. The Vim representation is pretty tough to
replicate, starting with the letter a under a circumflex accent and followed
by ~@~T for the octal 200 230 combination.

How would I represent each of these in a substitute command that was effectively

:%s/[octal strings]/replacement character(s)/g

It is the octal strings that give me a fit. Must I prefix each digit with \o?
I use Gvim 7.1 on a Slackware Linux version 12 system.
--
John Culleton

Tony Mechelynck

Jan 11, 2008, 5:46:35 PM
to vim...@googlegroups.com, v...@vim.org

What 'encoding' are you using? Neither 0x80 0x98 nor 0x80 0x99 is valid in
UTF-8; in both UTF-16be and UTF-16le each pair is a CJK hanzi/kanji/hanja.

Well, to replace either sequence by a single quote you could use

:exe "%s/\<Char-128>[\<Char-152>\<Char-153>]/'/g"

see
:help expr-quote
:help <Char>
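
On the \o question: as far as I know, the pattern syntax takes one prefix per
character, not per digit. A rough sketch follows, assuming 'encoding' is a
single-byte encoding such as latin1 so that every byte counts as one character.
The third command is a guess at the data: it assumes the a-circumflex (octal
342) is part of the same sequence, which would make it the UTF-8 form of
U+2018/U+2019.

```vim
" Sketch only; assumes a single-byte 'encoding' such as latin1.
" Outside [] a character is written \%oNNN (octal, up to 377);
" inside [] it is written \oNNN.
:%s/\%o200[\o230\o231]/'/g

" The same substitution with hexadecimal character codes.
:%s/\%x80[\x98\x99]/'/g

" Assumption: the leading a-circumflex (octal 342) belongs to the sequence,
" so the three bytes 342 200 230/231 are replaced together.
:%s/\%o342\%o200[\o230\o231]/'/g
```

See :help /\%o and :help /[] for the exact rules. If 'encoding' is utf-8, these
bytes may be treated differently, so check the result on a copy of the file.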


Best regards,
Tony.
--
I realize that today you have a number of top female athletes such as
Martina Navratilova who can run like deer and bench-press Chevrolet
trucks. But to be brutally frank, women as a group have a long way to
go before they reach the level of intensity and dedication to sports
that enables men to be such incredible jerks about it.
-- Dave Barry, "Sports is a Drag"
