Best way to remove a page break from a .txt file

Saucer Man

Jun 21, 2005, 4:48:44 PM
I have a .txt file with a page break character somewhere in it. It might be
the last or second to the last character in the file. What would be the
best way to locate and remove this page break using a batch file?

--

Thanks.


Phil Robyn

Jun 21, 2005, 5:23:59 PM

What is a page break character?

--
Phil Robyn
University of California, Berkeley

William Allen

Jun 21, 2005, 5:22:27 PM
"Saucer Man" wrote in message

> I have a .txt file with a page break character somewhere in it. It might be
> the last or second to the last character in the file. What would be the
> best way to locate and remove this page break using a batch file?

One way to remove a form-feed character is with SED (the
Stream EDitor). The metacharacter for a form-feed is \f

Typical SED syntax to remove a form-feed character:
sed s/\f// input.txt>OUTPUT.TXT

Use Google for SED for your OS, or download the version I
happen to use (GNU sed version 3.02.80) from:
http://www.allenware.com/mcsw/bus.htm#ThirdParty
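
One detail of the sed command above is worth seeing in action: the s command replaces only the first match on each line unless the g flag is appended (s/\f//g). The Python sketch below imitates that behaviour so the difference is visible. It is an illustration only, not sed itself; the function name remove_form_feed and its every parameter are hypothetical.

```python
import re


def remove_form_feed(text: bytes, every: bool = False) -> bytes:
    """Imitate sed's s/\\f// (every=False) or s/\\f//g (every=True), line by line."""
    # re.sub treats count=0 as "replace all matches"; count=1 stops after the first.
    count = 0 if every else 1
    lines = text.split(b"\n")
    return b"\n".join(re.sub(rb"\f", b"", line, count=count) for line in lines)


sample = b"a\x0cb\x0c\nc\x0c"
print(remove_form_feed(sample))              # first form-feed on each line only
print(remove_form_feed(sample, every=True))  # all form-feeds
```

If the file holds a single form-feed near its end, as in the original question, both variants give the same result.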

Alternatively, for small files (less than around 64Kbytes),
you could try this. It creates a small assembly language
program that removes all instances of a character, in this
case set to hex 0c (form-feed). It needs no third-party software.

Lines that don't begin with two spaces have wrapped accidentally
====Begin cut-and-paste (omit this line)
  @ECHO OFF
  :: Set the hex code for elided character here
  SET CH=c
  IF [%1]==[] GOTO USAGE
  IF NOT EXIST %1 GOTO USAGE
  ECHO.a90>_SCRIPT
  ECHO.mov si,100>>_SCRIPT
  ECHO.mov di,100>>_SCRIPT
  ECHO.cmp byte ptr [si],%CH%>>_SCRIPT
  ECHO.movsb>>_SCRIPT
  ECHO.jnz 9d>>_SCRIPT
  ECHO.dec di>>_SCRIPT
  ECHO.loop 96>>_SCRIPT
  ECHO.mov cx,di>>_SCRIPT
  ECHO.sub cx,100>>_SCRIPT
  ECHO.>>_SCRIPT
  ECHO.g=90 a5>>_SCRIPT
  FOR %%F IN (w q) DO ECHO.%%F>>_SCRIPT
  debug %1<_SCRIPT>NUL
  ping -n 1 127.0.0.1 >NUL
  DEL _SCRIPT
  GOTO EOF
  :USAGE
  ECHO. %0 FileName.txt
  ECHO. Formfeeds are removed from FileName.txt
  :EOF

====End cut-and-paste (omit this line)
For Win95/98/ME study/demo use. Cut-and-paste as plain-text Batch file.
Batch file troubleshooting: http://www.allenware.com/find?UsualSuspects

It may work in Windows NT/2000/XP (not tested).
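
For readers who do not read debug scripts, the routine assembled above can be sketched in Python. This is an illustration under assumptions, not a replacement for the batch file: the function name strip_byte is hypothetical, and the variables si, di and cx are named after the registers they stand in for.

```python
def strip_byte(data: bytes, ch: int = 0x0C) -> bytes:
    """Return data with every byte equal to ch removed (0x0C is form-feed)."""
    buf = bytearray(data)
    si = 0          # read position, like SI
    di = 0          # write position, like DI
    cx = len(buf)   # bytes left to examine, like CX (debug loads the file size there)
    while cx > 0:
        match = buf[si] == ch   # cmp byte ptr [si],ch
        buf[di] = buf[si]       # movsb copies one byte and advances SI and DI
        si += 1
        di += 1
        if match:
            di -= 1             # dec di: the next copy overwrites the unwanted byte
        cx -= 1                 # loop
    return bytes(buf[:di])      # mov cx,di / sub cx,100 leaves the new length in CX


print(strip_byte(b"page one\r\n\x0cpage two\r\n"))
# prints b'page one\r\npage two\r\n'
```

The file is compacted in place, which is why the debug version can simply write the shortened buffer back with w.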

--
William Allen
Free interactive Batch Course http://www.allenware.com/icsw/icswidx.htm
Batch Reference with examples http://www.allenware.com/icsw/icswref.htm
Header email is rarely checked. Contact us at http://www.allenware.com/


Ted Davis

Jun 21, 2005, 8:05:19 PM

tr might do well enough ... or gsar. tr translates or deletes
characters in a stream, and gsar is a general search and replace
utility. Both are GnuWin32 utilities, free and open source. tr is
part of the CoreUtils package; gsar is its own package.
<http://gnuwin32.sourceforge.net/packages.html>
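
What tr's delete mode does to a stream can be sketched in Python with bytes.translate, which accepts a set of bytes to drop. This is a rough equivalent written for illustration, not the tr utility itself; the function name delete_chars and the file names in the usage note are hypothetical.

```python
def delete_chars(in_path: str, out_path: str, unwanted: bytes = b"\x0c") -> None:
    """Copy in_path to out_path, dropping every byte listed in unwanted."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        # Read in chunks so a large file is never held in memory all at once.
        while chunk := src.read(64 * 1024):
            # translate(None, unwanted) deletes bytes without remapping any others.
            dst.write(chunk.translate(None, unwanted))
```

A call such as delete_chars("input.txt", "output.txt") would write a copy of input.txt with the form-feeds removed.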


--
T.E.D. (tda...@gearbox.maem.umr.edu)

Dr John Stockton

Jun 22, 2005, 9:27:21 AM
JRS: In article <d9a0hg$2b7h$1...@agate.berkeley.edu>, dated Tue, 21 Jun
2005 14:23:59, seen in news:alt.msdos.batch.nt, Phil Robyn
<pro...@berkeley.edu> posted :

>Saucer Man wrote:
>> I have a .txt file with a page break character somewhere in it. It might be
>> the last or second to the last character in the file. What would be the
>> best way to locate and remove this page break using a batch file?

>What is a page break character?

It's standard ASCII, see <URL:http://www.merlyn.demon.co.uk/asciihex.txt>
- number 12, Hex 0C, Ctrl-L, a.k.a. FF or Form Feed.

The OP might do better to replace the character with carriage return,
line feed, or both, particularly if it may not be at the end; a line
separator may be needed at that position.
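
A small Python sketch of the difference between deleting and replacing, offered as an illustration only. The function name replace_form_feed and the sample text are made up for the example.

```python
def replace_form_feed(data: bytes, separator: bytes = b"\r\n") -> bytes:
    """Replace each form-feed byte (0x0C) with separator."""
    return data.replace(b"\x0c", separator)


sample = b"end of page one\x0cstart of page two"
print(sample.replace(b"\x0c", b""))
# prints b'end of page onestart of page two'
print(replace_form_feed(sample))
# prints b'end of page one\r\nstart of page two'
```

With plain deletion the two lines run together; with replacement the boundary survives.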

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk DOS 3.3, 6.20; Win98. ©
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links.
PAS EXE TXT ZIP via <URL:http://www.merlyn.demon.co.uk/programs/00index.htm>
My DOS <URL:http://www.merlyn.demon.co.uk/batfiles.htm> - also batprogs.htm.

Timo Salmi

Jun 22, 2005, 10:00:58 AM

findstr /v "<FF>" myfile.txt

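You'll have to enter a literal formfeed (ASCII 12), or whatever your
pagebreak is, in your editor in place of the <FF>. Note that /v drops
every line containing the character, so this suits a formfeed that
sits on a line of its own.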
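
The effect of a /v style filter, which keeps only the lines that do not match, can be sketched in Python as follows. It is an illustration, not findstr itself; the function name keep_lines_without and the sample lines are hypothetical.

```python
def keep_lines_without(lines, needle="\x0c"):
    """Return only the lines that do not contain needle (form-feed by default)."""
    return [line for line in lines if needle not in line]


sample = ["first page\n", "\x0c\n", "totals\x0c\n", "second page\n"]
print(keep_lines_without(sample))
# prints ['first page\n', 'second page\n']
```

The "totals" line disappears along with its form-feed, so a filter of this kind is safe only when no wanted text shares a line with the page break.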

All the best, Timo

--
Prof. Timo Salmi ftp & http://garbo.uwasa.fi/ archives 193.166.120.5
Department of Accounting and Business Finance ; University of Vaasa
mailto:t...@uwasa.fi <http://www.uwasa.fi/~ts/> ; FIN-65101, Finland
Useful script files and tricks ftp://garbo.uwasa.fi/pc/link/tscmd.zip

Saucer Man

Jun 22, 2005, 5:14:53 PM
Thanks for all the suggestions!


"Timo Salmi" <t...@poiju.uwasa.fi.uwasa.fi> wrote in message
news:d9bquq$raf$1...@haavi.uwasa.fi...
