Batch to capitalize all words in a text file?


Robert

Jan 16, 2005, 1:26:11 AM
I need a batch file that will change the first letter in every word in
a text file to upper case.

Thanks

Robert

Timo Salmi

Jan 16, 2005, 3:14:11 AM
Robert <rm...@hotmail.com> wrote:
> I need a batch file that will change the first letter in every word in
> a text file to upper case.

@echo off & setlocal enableextensions
if not defined mytemp set mytemp=%temp%
echo s/ a/ A/g> %mytemp%\sedcmd.tmp
echo s/ b/ B/g>> %mytemp%\sedcmd.tmp
echo s/ c/ C/g>> %mytemp%\sedcmd.tmp
echo s/ d/ D/g>> %mytemp%\sedcmd.tmp
echo s/ e/ E/g>> %mytemp%\sedcmd.tmp
echo s/ f/ F/g>> %mytemp%\sedcmd.tmp
echo s/ g/ G/g>> %mytemp%\sedcmd.tmp
echo s/ h/ H/g>> %mytemp%\sedcmd.tmp
echo s/ i/ I/g>> %mytemp%\sedcmd.tmp
echo s/ j/ J/g>> %mytemp%\sedcmd.tmp
echo s/ k/ K/g>> %mytemp%\sedcmd.tmp
echo s/ l/ L/g>> %mytemp%\sedcmd.tmp
echo s/ m/ M/g>> %mytemp%\sedcmd.tmp
echo s/ n/ N/g>> %mytemp%\sedcmd.tmp
echo s/ o/ O/g>> %mytemp%\sedcmd.tmp
echo s/ p/ P/g>> %mytemp%\sedcmd.tmp
echo s/ q/ Q/g>> %mytemp%\sedcmd.tmp
echo s/ r/ R/g>> %mytemp%\sedcmd.tmp
echo s/ s/ S/g>> %mytemp%\sedcmd.tmp
echo s/ t/ T/g>> %mytemp%\sedcmd.tmp
echo s/ u/ U/g>> %mytemp%\sedcmd.tmp
echo s/ v/ V/g>> %mytemp%\sedcmd.tmp
echo s/ w/ W/g>> %mytemp%\sedcmd.tmp
echo s/ x/ X/g>> %mytemp%\sedcmd.tmp
echo s/ y/ Y/g>> %mytemp%\sedcmd.tmp
echo s/ z/ Z/g>> %mytemp%\sedcmd.tmp
rem
echo s/^^a/A/g>> %mytemp%\sedcmd.tmp
echo s/^^b/B/g>> %mytemp%\sedcmd.tmp
echo s/^^c/C/g>> %mytemp%\sedcmd.tmp
echo s/^^d/D/g>> %mytemp%\sedcmd.tmp
echo s/^^e/E/g>> %mytemp%\sedcmd.tmp
echo s/^^f/F/g>> %mytemp%\sedcmd.tmp
echo s/^^g/G/g>> %mytemp%\sedcmd.tmp
echo s/^^h/H/g>> %mytemp%\sedcmd.tmp
echo s/^^i/I/g>> %mytemp%\sedcmd.tmp
echo s/^^j/J/g>> %mytemp%\sedcmd.tmp
echo s/^^k/K/g>> %mytemp%\sedcmd.tmp
echo s/^^l/L/g>> %mytemp%\sedcmd.tmp
echo s/^^m/M/g>> %mytemp%\sedcmd.tmp
echo s/^^n/N/g>> %mytemp%\sedcmd.tmp
echo s/^^o/O/g>> %mytemp%\sedcmd.tmp
echo s/^^p/P/g>> %mytemp%\sedcmd.tmp
echo s/^^q/Q/g>> %mytemp%\sedcmd.tmp
echo s/^^r/R/g>> %mytemp%\sedcmd.tmp
echo s/^^s/S/g>> %mytemp%\sedcmd.tmp
echo s/^^t/T/g>> %mytemp%\sedcmd.tmp
echo s/^^u/U/g>> %mytemp%\sedcmd.tmp
echo s/^^v/V/g>> %mytemp%\sedcmd.tmp
echo s/^^w/W/g>> %mytemp%\sedcmd.tmp
echo s/^^x/X/g>> %mytemp%\sedcmd.tmp
echo s/^^y/Y/g>> %mytemp%\sedcmd.tmp
echo s/^^z/Z/g>> %mytemp%\sedcmd.tmp
rem
sed -f%mytemp%\sedcmd.tmp myfile.txt
del %mytemp%\sedcmd.tmp
endlocal & goto :EOF
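
For readers without sed at hand, the rule set that the batch file writes into sedcmd.tmp can be sketched in Python. This is only an illustration of the two rule groups above, not part of the original script; the language choice and the name capitalize_like_sed are assumptions made for the example.

```python
import re

# Hypothetical helper mirroring the two rule groups in sedcmd.tmp:
#   s/ x/ X/g  -> a lower-case ASCII letter right after a space
#   s/^x/X/g   -> a lower-case ASCII letter at the start of a line
_RULE = re.compile(r"(^| )([a-z])")


def capitalize_like_sed(line: str) -> str:
    """Upper-case a-z only at line start or directly after a space."""
    return _RULE.sub(lambda m: m.group(1) + m.group(2).upper(), line)


if __name__ == "__main__":
    for sample in ("hello big world", "tab\tseparated", "cheese-grater"):
        print(repr(capitalize_like_sed(sample)))
```

Because only a space or the start of a line counts as a separator, a word that follows a tab or a hyphen is left alone by these rules.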

All the best, Timo

--
Prof. Timo Salmi ftp & http://garbo.uwasa.fi/ archives 193.166.120.5
Department of Accounting and Business Finance ; University of Vaasa
mailto:t...@uwasa.fi <http://www.uwasa.fi/~ts/> ; FIN-65101, Finland
Useful script files and tricks ftp://garbo.uwasa.fi/pc/link/tscmd.zip

Harlan Grove

Jan 16, 2005, 4:46:33 AM
"Timo Salmi" <t...@UWasa.Fi> wrote...
...
> sed -f%mytemp%\sedcmd.tmp myfile.txt
...

You're assuming the existence of sed. You're also assuming spaces and
newlines would be the only characters separating words. Finally, you're not
redirecting sed's output.

It's more likely the OP has VBScript and Windows Script Host. So an
alternative approach would be

@echo off
echo Do While Not WScript.StdIn.AtEndOfStream > %TEMP%\proper.vbs
echo WScript.StdOut.WriteLine _ >> %TEMP%\proper.vbs
echo Proper(WScript.StdIn.ReadLine) >> %TEMP%\proper.vbs
echo Loop >> %TEMP%\proper.vbs
echo. >> %TEMP%\proper.vbs
echo Function Proper(s) >> %TEMP%\proper.vbs
echo Const LOWERALPHA = "abcdefghijklmnopqrstuvwxyz" >> %TEMP%\proper.vbs
echo Dim re, i, c >> %TEMP%\proper.vbs
echo. >> %TEMP%\proper.vbs
echo Set re = New RegExp >> %TEMP%\proper.vbs
echo re.IgnoreCase = False >> %TEMP%\proper.vbs
echo re.Global = True >> %TEMP%\proper.vbs
echo. >> %TEMP%\proper.vbs
echo Proper = s >> %TEMP%\proper.vbs
echo. >> %TEMP%\proper.vbs
echo For i = 1 To 26 >> %TEMP%\proper.vbs
echo c = Mid(LOWERALPHA, i, 1) >> %TEMP%\proper.vbs
rem The carets below escape the redirection and ampersand characters for echo.
echo If InStr(1, Proper, c, 0) ^> 0 Then >> %TEMP%\proper.vbs
echo re.Pattern = "\b" ^& c >> %TEMP%\proper.vbs
echo Proper = re.Replace(Proper, UCase(c)) >> %TEMP%\proper.vbs
echo End If >> %TEMP%\proper.vbs
echo Next >> %TEMP%\proper.vbs
echo. >> %TEMP%\proper.vbs
echo End Function >> %TEMP%\proper.vbs
rem
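rem Write to a temporary file first. Redirecting straight back into
rem myfile.txt can truncate it before it has been read.
type myfile.txt | cscript %TEMP%\proper.vbs //B //NoLogo > myfile.tmp
move /y myfile.tmp myfile.txt > nul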
del %TEMP%\proper.vbs
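
The word-boundary idea used in proper.vbs (pattern "\b" followed by one lower-case letter) can be sketched in Python as follows. The names proper and filter_stream are hypothetical, and Python's \b is Unicode-aware by default, so results for non-ASCII text may differ from the VBScript engine.

```python
import io
import re

# \b followed by one lower-case ASCII letter, the same shape as the
# pattern that proper.vbs builds for each letter in turn.
_WORD_START = re.compile(r"\b[a-z]")


def proper(line: str) -> str:
    """Upper-case every a-z that sits at a regex word boundary."""
    return _WORD_START.sub(lambda m: m.group(0).upper(), line)


def filter_stream(src, dst) -> None:
    """Copy src to dst line by line, capitalizing as it goes."""
    for line in src:
        dst.write(proper(line))


if __name__ == "__main__":
    out = io.StringIO()
    filter_stream(io.StringIO("don't stop\ncheese-grater\n"), out)
    print(out.getvalue(), end="")
```

A word boundary also falls after an apostrophe or a hyphen, so "don't" comes out as "Don'T" and "cheese-grater" as "Cheese-Grater". Whether that is wanted depends on what counts as a word.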


Then again, if I assume the OP has Perl on his system, the batch file could
be shrunk to 2 lines.

@echo off
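rem -p prints each line after the substitution (-n alone prints nothing);
rem -i.bak edits in place and keeps a backup copy as myfile.txt.bak.
perl -p -i.bak -e "s/(\w+)/\u$1/g;" myfile.txt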
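
For comparison, an in-place edit of the kind Perl's -i switch performs can be sketched in Python by writing to a temporary file and then swapping it in. The function names and the UTF-8 encoding are assumptions made for this illustration, not anything stated in the thread.

```python
import os
import re
import tempfile


def capitalize_words(text: str) -> str:
    """Upper-case the first character of every run of word characters."""
    return re.sub(r"\w+", lambda m: m.group(0)[:1].upper() + m.group(0)[1:], text)


def capitalize_file_in_place(path: str) -> None:
    """Rewrite path via a temporary file created in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # newline="" keeps the file's own line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as src, \
            tempfile.NamedTemporaryFile("w", encoding="utf-8", newline="",
                                        dir=directory, delete=False) as dst:
        for line in src:
            dst.write(capitalize_words(line))
        tmp_name = dst.name
    # Both files are closed here, so the finished copy can replace the original.
    os.replace(tmp_name, path)
```

The original file is only replaced once the new copy is complete, so an interrupted run should leave myfile.txt as it was.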


Timo Salmi

Jan 16, 2005, 8:50:04 AM
In article <41ea366b$1...@127.0.0.1>, Harlan Grove <hrl...@aol.com> wrote:
> "Timo Salmi" <t...@UWasa.Fi> wrote...
> > sed -f%mytemp%\sedcmd.tmp myfile.txt

> You're assuming the existence of sed.

Yes. That indeed can be a problem, especially if the purpose is to
manage a number of computers as is typical for, say, sysadmin tasks.

> You're also assuming spaces and newlines would be the only
> characters separating words.

Yes. Fairly easily customized.

> Finally, you're not redirecting sed's output.

Sure, but that is totally trivial.

> It's more likely the OP has VBScript and Windows Script Host. So an
> alternative approach would be

Excellent! It is good to have various options. In fact, I had the
same alternative next in mind, but you beat me to it. Thus I'll just
point to your posting's Message-Id when I include this task into my
FAQ.

Timo Salmi

Jan 16, 2005, 12:14:22 PM
In article <csdrec$8...@poiju.uwasa.fi>, Timo Salmi <t...@UWasa.Fi> wrote:
> In article <41ea366b$1...@127.0.0.1>, Harlan Grove <hrl...@aol.com> wrote:
> > "Timo Salmi" <t...@UWasa.Fi> wrote...
> > > sed -f%mytemp%\sedcmd.tmp myfile.txt
> > You're assuming the existence of sed.
> Yes. That indeed can be a problem, especially if the purpose is to
> manage a number of computers as is typical for, say, sysadmin tasks.

Apropos:

Sun 16-Jan-2005: Obtained to Garbo archives the update
878915 Oct 25 2003 ftp://garbo.uwasa.fi/win95/unix/UnxUpdates.zip
UnxUpdates.zip Updates for UnxUtils GNU utilities for native Win32

 Length  Method     Size  Ratio  Date        Time   CRC-32    Attr  Name
-------  -------  ------  -----  ----------  -----  --------  ----  ----
  16384  DeflatN    7638    54%  20.06.2003  16:57  23e2c927  --w-  cat.exe
  20992  DeflatN   10884    49%  20.06.2003  16:57  cc74c103  --w-  cksum.exe
  15360  DeflatN    7055    55%  20.06.2003  16:57  b3aecc5c  --w-  comm.exe
  55296  DeflatN   25953    54%  20.06.2003  16:57  b780d6f2  --w-  csplit.exe
  17920  DeflatN    8263    54%  20.06.2003  16:57  12f077a8  --w-  cut.exe
  15360  DeflatN    7068    54%  20.06.2003  16:57  ea59da0b  --w-  expand.exe
  18432  DeflatN    8967    52%  20.06.2003  16:57  c1e12fc6  --w-  fmt.exe
  15872  DeflatN    7410    54%  20.06.2003  16:57  2232d1be  --w-  fold.exe
 204800  DeflatN   93442    55%  02.10.2003  08:17  1cc802e7  --w-  gawk.exe
  19456  DeflatN    9180    53%  20.06.2003  16:57  0c156523  --w-  head.exe
  20992  DeflatN   10167    52%  20.06.2003  16:57  dd7be4c8  --w-  join.exe
  30720  DeflatN   13201    58%  20.06.2003  16:57  c959a7c2  --w-  md5sum.exe
  44544  DeflatN   20292    55%  20.06.2003  16:57  f49bb170  --w-  nl.exe
  30208  DeflatN   14777    52%  20.06.2003  16:57  00fc008e  --w-  od.exe
  15872  DeflatN    7082    56%  20.06.2003  16:57  364b7f92  --w-  paste.exe
  31232  DeflatN   14062    55%  20.06.2003  16:57  b00f2055  --w-  pr.exe
  53760  DeflatN   24491    55%  20.06.2003  16:57  49a786ba  --w-  ptx.exe
  94720  DeflatN   51029    47%  25.10.2003  13:19  62d5366d  --w-  sed.exe
  30720  DeflatN   13201    58%  20.06.2003  16:57  6d9d2093  --w-  sha1sum.exe
  49152  DeflatN   26174    47%  23.10.2003  17:55  a46bb753  --w-  sort.exe
  17920  DeflatN    8282    54%  20.06.2003  16:57  1feafe17  --w-  split.exe
  20480  DeflatN    9977    52%  20.06.2003  16:57  aa3d5dc0  --w-  sum.exe
  41472  DeflatN   18926    55%  20.06.2003  16:57  438c3b4e  --w-  tac.exe
  32768  DeflatN   15856    52%  20.06.2003  16:57  cdce0904  --w-  tail.exe
  26624  DeflatN   12887    52%  20.06.2003  16:57  d08d432a  --w-  tr.exe
  15872  DeflatN    7561    53%  20.06.2003  16:57  8625f342  --w-  tsort.exe
  15360  DeflatN    7218    54%  20.06.2003  16:57  eee4f3e2  --w-  unexpand.exe
  19456  DeflatN    9271    53%  20.06.2003  16:57  7ec466a9  --w-  uniq.exe
  22016  DeflatN   10687    52%  20.06.2003  16:57  1bccf6fb  --w-  wc.exe
 372736  DeflatN  195949    48%  02.10.2003  08:17  4319be70  --w-  zsh.exe
 135680  DeflatN   68116    50%  23.10.2003  17:35  c3248bb6  --w-  grep.exe
 108544  DeflatN   46582    58%  02.10.2003  08:17  6b72f193  --w-  less.exe
  11264  DeflatN    5029    56%  02.10.2003  08:17  c535e5ca  --w-  lesskey.exe
 168448  DeflatN   74870    56%  05.10.2003  15:49  f1f154b2  --w-  make.exe
-------           ------  -----                                     -------
1810432           871547    52%                                     34

Ted Davis

Jan 16, 2005, 12:57:03 PM
On Sun, 16 Jan 2005 01:26:11 -0500, Robert <rm...@hotmail.com> wrote:

>I need a batch file that will change the first letter in every word in
>a text file to upper case.
>

gawk "{for(x=1;x<=NF;x++){sub(/^./,toupper(substr($x,1,1)),$x)}print $0}" source > target

where gawk is a free Windows port of GNU awk, downloadable from
<http://gnuwin32.sourceforge.net/packages.html>, 'source' is the file
containing the words, and 'target' is the file to put the result in.
"Word" is defined as any substring delimited by newlines, BOF/EOF, or
whitespace. If "word" has to be defined in semantic terms, the
program becomes too complex to use as an in-line script.
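
The same whitespace-delimited definition of a word can be sketched in Python. The function name capitalize_fields is hypothetical. The sketch also joins the fields with single spaces, on the assumption that awk rebuilds the line that way once a field has been modified.

```python
def capitalize_fields(line: str) -> str:
    """Capitalize the first character of each whitespace-delimited field."""
    fields = line.split()  # runs of blanks and tabs separate fields
    if not fields:
        return line  # nothing to change on an empty or blank line
    # Rejoin with single spaces, so the original spacing is not preserved.
    return " ".join(field[:1].upper() + field[1:] for field in fields)


if __name__ == "__main__":
    print(capitalize_fields("  hello   big\tworld "))
    print(capitalize_fields("cheese-grater (quoted)"))
```

Only the first character of a field is touched, so a field that starts with punctuation, such as "(quoted)", stays as it is.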


--
T.E.D. (tda...@gearbox.maem.umr.edu)

Timo Salmi

Jan 16, 2005, 2:41:47 PM
Ted Davis <tda...@gearbox.maem.umr.edu> wrote:
> gawk "{for(x=1;x<=NF;x++){sub(/^./,toupper(substr($x,1,1)),$x)}print
> $0}" source > target

That is a neat, concise solution for the English language. The sed
and the vbs solutions have the slight advantage of being easier to
port into other languages, if need be, including Finnish.

Dr John Stockton

Jan 16, 2005, 12:41:16 PM
JRS: In article <602ku01e1464n1vf1...@4ax.com>, dated
Sun, 16 Jan 2005 01:26:11, seen in news:alt.msdos.batch.nt, Robert
<rm...@hotmail.com> posted :

>I need a batch file that will change the first letter in every word in
>a text file to upper case.

Firstly, what is a word? Is cheese-grater one word or is it two? Is
pq5rs 0, 1, or 2 words?

The following should upper-case, in file.txt, each instance of a letter
which is not preceded by a blank (lightly tested)

mtr -x+ -n file.txt - (\b)(\w) = \1\u\2

There, mtr is MiniTrue, via Garbo or below; OK for UNIX, Win32, and DOS
(there, use just mt).

To increase efficiency, adjust to only recognise a-z.

I don't know whether SED has matching recognition capabilities.

Of course, you don't say where you are; a Scandinavian will want to
recognise words starting with accented letters, and a Netherlander may
want to treat ijsselmeer properly.
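
As a rough illustration of the locale point, here is a Python sketch that capitalizes accented letters and can optionally treat a word-initial "ij" as one letter. The function name, the dutch_ij switch and the reliance on Python's Unicode-aware regular expressions are all assumptions made for the example. It also assumes precomposed characters.

```python
import re

# [^\W\d_] matches any letter: a word character that is neither a digit
# nor an underscore. In Python 3 this includes accented letters.
_FIRST_LETTER = re.compile(r"\b[^\W\d_]")


def capitalize_unicode(text: str, dutch_ij: bool = False) -> str:
    """Upper-case the first letter of each word, accented letters included."""
    if dutch_ij:
        # Hypothetical special case: a word-initial "ij" counts as one letter.
        text = re.sub(r"\bij", "IJ", text)
    return _FIRST_LETTER.sub(lambda m: m.group(0).upper(), text)


if __name__ == "__main__":
    print(capitalize_unicode("äiti ja öljy"))
    print(capitalize_unicode("ijsselmeer", dutch_ij=True))
```

The switch is opt-in because a blanket "ij" rule would be wrong for text in most other languages.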

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 MIME. ©
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
I find MiniTrue useful for viewing/searching/altering files, at a DOS prompt;
free, DOS/Win/UNIX, <URL:http://www.idiotsdelight.net/minitrue/> Update hope?

Ted Davis

Jan 16, 2005, 8:25:13 PM
On 16 Jan 2005 21:41:47 +0200, t...@UWasa.Fi (Timo Salmi) wrote:

>Ted Davis <tda...@gearbox.maem.umr.edu> wrote:
>> gawk "{for(x=1;x<=NF;x++){sub(/^./,toupper(substr($x,1,1)),$x)}print
>> $0}" source > target
>
>That is a neat, concise solution for the English language. The sed
>and the vbs solutions have the slight advantage of being easier to
>port into other languages, if need be, including Finnish.

There seems to be an effort under way to add Unicode capability to
gawk - that should solve the problem with non-ASCII alphabets.

However, the existing versions are not completely English specific -
the DOS character set has some accented characters in both cases, and
at least the ones I tested capitalize properly in gawk 3.1.3. It may
have something to do with the character set used by the OS - I have no
real clue.

--
T.E.D. (tda...@gearbox.maem.umr.edu)

Timo Salmi

Jan 17, 2005, 12:16:02 AM
Ted Davis <tda...@gearbox.maem.umr.edu> wrote:
> On 16 Jan 2005 21:41:47 +0200, t...@UWasa.Fi (Timo Salmi) wrote:
> >Ted Davis <tda...@gearbox.maem.umr.edu> wrote:
> >> gawk "{for(x=1;x<=NF;x++){sub(/^./,toupper(substr($x,1,1)),$x)}print
> >> $0}" source > target
> >
> >That is a neat, concise solution for the English language. The sed

> There seems to be an effort under way to add Unicode capability to
> gawk - that should solve the problem with non-ASCII alphabets.

It is nice to hear that we could capitalize on gawk as well.

Sorry, Ted, could not resist.

All the best, Timo (aka Perfesser Pundit in news:rec.humor)

--
Prof. Timo Salmi ftp & http://garbo.uwasa.fi/ archives 193.166.120.5
Department of Accounting and Business Finance ; University of Vaasa
mailto:t...@uwasa.fi <http://www.uwasa.fi/~ts/> ; FIN-65101, Finland

Perfesser's nauseating puns: ftp://garbo.uwasa.fi/pc/ts/tspun23.zip
