Delete Chinese characters quickly

Yang Luo

Mar 13, 2016, 7:51:43 AM
to vim_use
This is subtitle.srt. I want to delete all the Chinese characters (like this: 距离地球4亿英里,存在这一个迷你太阳系,). How can I do it with a Vim command?
I know dd can delete a line, but there are too many lines.

1
00:00:03,400 --> 00:00:07,730
距离地球4亿英里,存在这一个迷你太阳系,
Four hundred million miles from Earth, exists a mini solar system

2
00:00:07,760 --> 00:00:12,390
60多个卫星围绕着一个强大的气态行星旋转。
of over 60 moons rotating around a powerful planet of gas.

3
00:00:17,770 --> 00:00:20,980
它流动的颜色和斑点保持了奇异的美景,
Its flowing colors and spots hold strange beauty,

4
00:00:21,530 --> 00:00:24,390
然而也包含了强烈的风暴、喷射气流。
but contain violent storms and jet streams.

bst...@gmail.com

Mar 13, 2016, 9:18:41 AM
to vim_use
On Sunday, March 13, 2016 at 7:51:43 PM UTC+8, Yang Luo wrote:
%s/[^ -~]//g

Erik Christiansen

Mar 13, 2016, 11:12:31 PM
to vim_use
On 13.03.16 06:18, bst...@gmail.com wrote:
> On Sunday, March 13, 2016 at 7:51:43 PM UTC+8, Yang Luo wrote:
> > this is a subtitle.srt, I want to delete all the Chinese character(like this:距离地球4亿英里,存在这一个迷你太阳系,), how can I do it using vim command?
...
> %s/[^ -~]//g

Hey, that's good enough to try here. Might want to tweak it, though.
It may be OK for it to scrub e.g. », Ω, and ³, but it's doubtful that tab
(0x09) should go. (Space is 0x20)

[^\t-Ω] (Tab to Omega) is a step forward, covering all those cases.

Better still would be to look up a Unicode (UTF-8) table and pick the
minimum and maximum Chinese characters. (Presumably, the presence of ASCII
characters precludes the Chinese being in another encoding, such as
GB2312, Big5, or CNS-11643.)

Erik
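
A minimal sketch of the "pick min and max" idea, assuming the file is UTF-8, 'encoding' is utf-8, and the Chinese text falls inside the CJK Unified Ideographs block (U+4E00 to U+9FFF). That range is an assumption, not something stated in the thread. CJK punctuation such as 。 (U+3002) lies outside the block and would be left in place.

```vim
" Assumption: Chinese characters are in U+4E00..U+9FFF (CJK Unified Ideographs).
" Remove only those characters; ASCII digits and punctuation on the same line stay.
:%s/[\u4e00-\u9fff]//g
```

Unlike [^ -~], this leaves tabs and accented Latin characters alone, because it names the characters to remove instead of the ones to keep.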

Alexas Chee

Mar 14, 2016, 12:42:36 PM
to vim_use
On Sunday, March 13, 2016 at 7:51:43 PM UTC+8, Yang Luo wrote:
%s/[^\x00-\xFF]//g

or, even better:

g/[^\x00-\xff]/d

Yang Luo

Apr 6, 2016, 11:35:15 AM
to vim_use
Thanks a lot.
After running the command there is another problem: the Chinese lines also contain non-Chinese characters such as "4" and "60". I want to delete the whole line if it contains a Chinese character.
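
For whole-line deletion, a :global command is probably the closest fit: it runs :d on every line matching the pattern, so any "4" or "60" on a matching line goes with it. This is a sketch, assuming the file is UTF-8 and that every character above U+00FF in it belongs to a Chinese subtitle line.

```vim
" Delete every line that contains at least one character outside U+0000..U+00FF.
" Lines made up only of ASCII/Latin-1 (numbers, timestamps, English text) are kept.
:g/[^\x00-\xff]/d
```

On the sample above this should remove the four Chinese lines and leave the counters, timestamps, and English subtitles untouched.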