VT100 to text utility

Gary Dalton

Dec 19, 2001, 12:39:38 PM
I need to remove all escape and control sequences from
a screen VT100 capture file. Does anyone know of a utility
or some regexes that will do the job?

It sure would save me a lot of time! I have the VT100 manual
with the codes.

Thanks very much in advance!

Gary Dalton - Verizon

Ben Harris

Dec 19, 2001, 12:57:31 PM
In article <9vqjcr$l4r$1...@news.gte.com>,

Gary Dalton <gda...@verizon.com> wrote:
>I need to remove all escape and control sequences from
>a screen VT100 capture file. Does anyone know of a utility
>or some regexes that will do the job?

The fragment of Perl I use to do this goes:
# Kill ANSI sequences
s!(\e\[|\x9b)[0-?]*[ -/]*[@-~]!!g;
s!\e[ -/]*[0-~]!!g;

This matches any ANSI control sequence (beginning with CSI), and then any
other ANSI escape sequence. It leaves in all other single-character
controls, but you can easily strip out whichever ones you happen not to
want (you probably want to keep LF, for instance).

--
Ben Harris
Unix Support, University of Cambridge Computing Service.
If I wanted to speak for the University, I'd be in ucam.comp-serv.announce.

Gary Dalton

Dec 19, 2001, 2:45:13 PM

Thanks very much Ben!!

Gary Dalton - Verizon

"Ben Harris" <bj...@cus.cam.ac.uk> wrote in message
news:9vqkeb$5d6$1...@pegasus.csx.cam.ac.uk...

James Carlson

Dec 20, 2001, 10:01:05 AM
"Gary Dalton" <gda...@verizon.com> writes:
> > >I need to remove all escape and control sequences from
> > >a screen VT100 capture file. Does anyone know of a utility
> > >or some regexes that will do the job?

Here's another rotten little program to do this. I wrote it for
scrubbing the output of "man." It implements the state machine found
in the real DEC VT-102, VT-220, and VT-320 terminals. It's pretty old
and stale, but it does the job ...

#include <stdio.h>
#include <string.h>     /* strcmp */

int nobs = 0;           /* set by -b: a backspace erases the previous char */

#define BEL 0x07
#define BS 0x08
#define TAB 0x09
#define LF 0x0A
#define VT 0x0B
#define FF 0x0C
#define CR 0x0D
#define SO 0x0E
#define SI 0x0F
#define CAN 0x18
#define SUB 0x1A
#define ESC 0x1B
#define SS2 0x8E
#define SS3 0x8F
#define DCS 0x90
#define CSI 0x9B
#define ST 0x9C
#define OSC 0x9D
#define PM 0x9E
#define APC 0x9F

enum {
    Normal, Esc, Csi, Dcs, DcsString, DropOne, CSInitial
} vtstate = Normal;

void
filter_out_vt(FILE *fp)
{
    int chr, prevc;

    prevc = 0;
    while ((chr = getc(fp)) != EOF) {
        chr &= 0xFF;
        if (vtstate == DropOne) {
            vtstate = Normal;
            continue;
        }
        /* Handle normal ANSI escape mechanism */
        /* (Note that this terminates DCS strings!) */
        if (vtstate == Esc && chr >= 0x40 && chr <= 0x5F) {
            vtstate = Normal;
            chr += 0x40;
        }
        switch (chr) {
        case CAN:
        case SUB:
            vtstate = Normal;
            break;
        case ESC:
            vtstate = Esc;
            break;
        case CSI:
            vtstate = Csi;
            break;
        case DCS:
        case OSC:       /* VT320 commands */
        case PM:
        case APC:
            vtstate = Dcs;
            break;
        default:
            if ((chr & 0x6F) < 0x20) {  /* Check controls */
                switch (chr) {
                case BS:
                    if (nobs) {
                        prevc = 0;
                        break;
                    }
                    /* VT oddity -- controls go through regardless of state. */
                    /* FALLTHROUGH */
                case BEL:       /* Pass these through */
                case TAB:
                case LF:
                case VT:
                case FF:
                case CR:
                    if (prevc != 0)
                        putchar(prevc);
                    prevc = chr;
                    break;
                }
                break;
            }
            switch (vtstate) {
            case Normal:
                if (prevc != 0)
                    putchar(prevc);
                prevc = chr;
                break;
            case Esc:
                vtstate = Normal;
                switch (chr) {
                case 'c': case '7': case '8':
                case '=': case '>': case '~':
                case 'n': case '\123': case 'o':
                case '|':
                    break;
                case '#': case ' ':
                    vtstate = DropOne;
                    break;
                case '(': case ')': case '*': case '+':
                    vtstate = CSInitial;
                    break;
                }
                break;
            case CSInitial:
            case Csi:
            case Dcs:
                if (chr >= 0x40 && chr <= 0x7E) {
                    if (vtstate == Dcs)
                        vtstate = DcsString;
                    else
                        vtstate = Normal;
                }
                break;
            case DcsString:
                /* Just drop everything here */
                break;
            }
        }
    }
    if (prevc != 0)
        putchar(prevc);
}

int
main(int argc, char **argv)
{
    char *arg;
    FILE *fp;

    if (argv[1] != NULL && strcmp(argv[1], "-b") == 0) {
        nobs = 1;
        argv++;
        argc--;
    }

    if (argc <= 1)
        filter_out_vt(stdin);
    else
        while ((arg = *++argv) != NULL)
            if ((fp = fopen(arg,"r")) == NULL)
                perror(arg);
            else {
                filter_out_vt(fp);
                fclose(fp);
            }
    return 0;
}


--
James Carlson, Solaris Networking <james.d...@east.sun.com>
SUN Microsystems / 1 Network Drive 71.234W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.497N Fax +1 781 442 1677

Gary Dalton

Dec 21, 2001, 11:34:02 AM

Thanks James - works really well!
I really appreciate it.

Gary Dalton - Verizon

"James Carlson" <james.d...@sun.com> wrote in message
news:xoavbsgu...@sun.com...

Gary Dalton

Dec 21, 2001, 11:35:18 AM
Thanks Rob, very much - an excellent utility!

Gary Dalton - Verizon

"Robert de Bath" <rd10...@mayday.cix.co.uk> wrote in message
news:6393af3f...@mayday.cix.co.uk...
> Wrote this ages ago. It does AVATAR too, and you can keep colors if you
> like, and do a CP437->ASCII.
>
> --
> Rob. (Robert de Bath <robert$ @ debath.co.uk>)
> <http://www.cix.co.uk/~mayday>
>

