How to count how many times a string appears in multiple text files

Mikko Aalto

Jul 15, 2009, 6:58:33 PM
Is it possible, in a plain batch file (without any third-party software), to
count how many times a string appears across multiple text files? For example,
I would like to count how many times the word "orange" appears in all my
*.txt files.

Find can count, but if I run it in a loop, it counts every file separately.

If this is not possible with a batch file, can anyone recommend some free
software that can handle this kind of counting?


foxidrive

Jul 15, 2009, 7:17:39 PM

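file.bat "*.txt" "orange"

@echo off
rem Start from zero in case COUNT is already set in the environment.
set count=0
for /f "delims=" %%a in ('dir "%~1" /a:-d /b') do call :next "%%a" "%~2"
echo found %count% occurrences of "%~2"
pause
GOTO:EOF

:next
rem FIND /C reports how many lines of file %1 contain the string %2.
set num=
for /f "delims=" %%b in ('find /c %2 ^< %1') do set num=%%b
set /a count=count+num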

Mikko Aalto

Jul 15, 2009, 7:36:26 PM
Thank you very much! Your batch file works great!


"foxidrive" <got...@woohoo.invalid> wrote in message
news:2qos55tvoo02atpa7...@4ax.com...

sw0rdfish

Jul 16, 2009, 9:39:31 AM

no, that will not count the occurrences of your word accurately. If the
same word appears more than once on a line, the find command still
counts it as 1, i.e. it counts the number of lines on which the word is
found.

here's a vbscript, and no, it's not third party software. it comes with
your system by default, so use it to your advantage

' Usage: cscript /nologo myscript.vbs <folder> <word>
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objArgs = WScript.Arguments
strFolder = objArgs(0)
strWord = objArgs(1)
Set objFolder = objFS.GetFolder(strFolder)
For Each strFile In objFolder.Files
    Set objFile = objFS.OpenTextFile(strFile.Path)
    c = 0
    Do Until objFile.AtEndOfStream
        strLine = objFile.ReadLine
        ' Split the line on spaces and compare each piece with the word
        ' (exact, case-sensitive match, so "orange," is not counted).
        s = Split(strLine, " ")
        For i = LBound(s) To UBound(s)
            If s(i) = strWord Then
                c = c + 1
            End If
        Next
    Loop
    objFile.Close
    WScript.Echo "There are " & c & " " & strWord & " in " & strFile.Name
Next

on the command line

c:\test> cscript /nologo myscript.vbs c:\temp orange
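
For comparison, the difference between counting matching lines (what FIND /C reports) and counting every occurrence can be sketched in a few lines of Python. This is only an illustration and not one of the batch or VBScript solutions posted here: Python itself and the names count_in_lines and count_in_files are assumptions made for the example. It counts substrings, so "oranges" would also count as a match.

```python
from pathlib import Path


def count_in_lines(lines, word):
    """Return (lines containing word, total occurrences of word)."""
    matching_lines = 0
    occurrences = 0
    for line in lines:
        n = line.count(word)  # non-overlapping substring matches on this line
        if n:
            matching_lines += 1  # a line counts once, however many hits it has
        occurrences += n
    return matching_lines, occurrences


def count_in_files(folder, pattern, word):
    """Apply count_in_lines to every file in folder matching pattern, e.g. "*.txt"."""
    matching_lines = 0
    occurrences = 0
    for path in Path(folder).glob(pattern):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            m, n = count_in_lines(text.splitlines(), word)
            matching_lines += m
            occurrences += n
    return matching_lines, occurrences
```

With the lines "orange orange", "apple" and "an orange", count_in_lines reports 2 matching lines but 3 occurrences, which is exactly the gap between the FIND /C result and the real total.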

Tom Lavedas

Jul 16, 2009, 9:59:02 AM
On Jul 15, 7:17 pm, foxidrive <got...@woohoo.invalid> wrote:
> On Thu, 16 Jul 2009 01:58:33 +0300, "Mikko Aalto"
>

Or maybe ...

@echo off
set count=0
for /f "tokens=2 delims=:" %%a in (
'find /c "%~2" %1') do set /a count +=%%a
echo %count%

Of course, neither of these actually counts the OCCURRENCES of the
string - just the number of lines containing one or more occurrences.
The only way I can think of to do that in batch would be either to build
a hybrid VBS/batch or to send each line containing the string to
a secondary process that searches repeatedly for the presence of more
than one occurrence. FINDSTR with a regular expression might do the
job. For example (it turned out to be easier than I thought, using a
recursive routine) ...

@echo off
if "%3"=="" set count=0 & %0 %1 %2 %2
for /f "delims=" %%f in ('findstr /m "%~2" %1') do (
for /f "delims=" %%a in ('findstr "%~2" ^< %%f') do (
set /a count +=1
)
%0 %1 "%~2.*%~3" "%~3"
)
echo %count%

Tom Lavedas
***********

Mikko Aalto

Jul 16, 2009, 9:59:23 AM
That's a good point about several occurrences on one line. I tried the
script, but I get an error message: scriptname.vbs(22, 4) Microsoft VBScript
compilation error: Syntax error.

"sw0rdfish" <levisl...@rocketmail.com> wrote in message
news:2c7dab77-7c0f-4043...@m3g2000pri.googlegroups.com...

sw0rdfish

Jul 16, 2009, 10:03:35 AM
On Jul 16, 9:59 pm, "Mikko Aalto" <mikko.aa...@eiosoitetta.fi> wrote:
> That's a good point about several occurrences on one line. I tried the
> script, but I get an error message: scriptname.vbs(22, 4) Microsoft VBScript
> compilation error: Syntax error.
>

it works fine for me.

Mikko Aalto

Jul 16, 2009, 10:04:44 AM
My mistake. It works, but it echoes every file and I must press Enter every
time. It would be better if the script did not echo every file separately.
It should echo once after it has searched every file and tell how many
occurrences all the files contain.


"Mikko Aalto" <mikko...@eiosoitetta.fi> wrote in message
news:%mG7m.24460$vi5....@uutiset.elisa.fi...

sw0rdfish

Jul 16, 2009, 10:11:30 AM
On Jul 16, 10:04 pm, "Mikko Aalto" <mikko.aa...@eiosoitetta.fi> wrote:
> My mistake. It works, but it echoes every file and I must press Enter every
> time. It would be better if the script did not echo every file separately.
> It should echo once after it has searched every file and tell how many
> occurrences all the files contain.
>

why do you need to press enter everytime? my code doesn't ask for user
input.
if you want the total, then set another variable to capture all the
counts.

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objArgs = WScript.Arguments
strFolder = objArgs(0)
strWord = objArgs(1)
' Running total across all files in the folder
total = 0
Set objFolder = objFS.GetFolder(strFolder)
For Each strFile In objFolder.Files
    Set objFile = objFS.OpenTextFile(strFile.Path)
    Do Until objFile.AtEndOfStream
        strLine = objFile.ReadLine
        s = Split(strLine, " ")
        For i = LBound(s) To UBound(s)
            If s(i) = strWord Then
                total = total + 1
            End If
        Next
    Loop
    objFile.Close
Next
' Report once, after every file has been searched
WScript.Echo "There are a total of " & total & " " & strWord


Mikko Aalto

Jul 16, 2009, 10:32:05 AM
The script after the "Or maybe" comment works fine, but the last one echoes
something odd. At first it counts correctly, but after that it echoes
"(set /a count +=1)" a few times.

I run the script from the command line like this: scriptname.cmd "*.txt" "orange".

The first script after the "Or maybe" comment can also find the string ": orange".
The last one does not handle this. But this is my fault, because I didn't say so
in the first place: sometimes I want to find a string that starts with ":".

"Tom Lavedas" <tglb...@cox.net> wrote in message
news:188ee91f-7f9f-4d5b...@v20g2000yqm.googlegroups.com...

Mikko Aalto

Jul 16, 2009, 10:37:25 AM
>>why do you need to press enter everytime?

In my environment (XP SP3) it opens a message box after every file.


"sw0rdfish" <levisl...@rocketmail.com> wrote in message

news:c602fa25-5cfa-460b...@v15g2000prn.googlegroups.com...

Mikko Aalto

Jul 16, 2009, 10:40:53 AM
Your new script works fine; it doesn't open a message box on my system.

"sw0rdfish" <levisl...@rocketmail.com> wrote in message

news:c602fa25-5cfa-460b...@v15g2000prn.googlegroups.com...

Mikko Aalto

Jul 16, 2009, 10:56:48 AM
I have to say, now the first script also works fine. Something odd is happening
on my system. Both of your scripts work great. If I want to find a string that
starts with ":", is it possible to do that in the VBScript with a small change
to the code?


"Mikko Aalto" <mikko...@eiosoitetta.fi> wrote in message

news:VZG7m.24466$vi5....@uutiset.elisa.fi...

Tom Lavedas

Jul 16, 2009, 11:03:43 AM
On Jul 16, 10:32 am, "Mikko Aalto" <mikko.aa...@eiosoitetta.fi> wrote:
> The script after the "Or maybe" comment works fine, but the last one echoes
> something odd. At first it counts correctly, but after that it echoes
> "(set /a count +=1)" a few times.
>
> I run the script from the command line like this: scriptname.cmd "*.txt" "orange".
>
> The first script after the "Or maybe" comment can also find the string ": orange".
> The last one does not handle this. But this is my fault, because I didn't say so
> in the first place: sometimes I want to find a string that starts with ":".
>

I should have said that strings with delimiters need to be enclosed in
double quotes, so for the first example, which counts the number of
lines with one or more occurrences, use something like this ...

firstbatname "*.txt" ": orange"

I don't know why the second problem occurred with the other batch.
However, I did find a problem with running the procedure without the
right number of arguments. There might also have been a problem with
the original IF test at the beginning that caused the behavior you saw.

So here is an update ...

@echo off
if "%~1%~2"=="%~1" goto :EOF
if "%~3"=="" (set count=0 & %0 %1 %2 %2)


for /f "delims=" %%f in ('findstr /m "%~2" %1') do (
for /f "delims=" %%a in ('findstr "%~2" ^< %%f') do (
set /a count +=1
)
%0 %1 "%~2.*%~3" "%~3"
)
echo %count%

Also, this approach uses a regular expression search, which means
special care must be taken in creating the search string. For example,
one way to search for the string ": orange" would be to replace the
space with a period, which is a 'wildcard' that stands for 'any
character', that is ...

secondbatchname "*.txt" :.orange

Though this is not perfect, it's simple.
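
The effect of the '.' wildcard, and the alternative of escaping the search string so that every character is taken literally, can be sketched in Python. This is an illustration only: Python's re module is not FINDSTR and the two regular expression dialects differ, and the function names count_regex and count_literal are hypothetical.

```python
import re


def count_regex(text, pattern):
    """Count non-overlapping matches of a regular expression in text."""
    return len(re.findall(pattern, text))


def count_literal(text, needle):
    """Count occurrences of needle taken literally (no wildcards)."""
    # re.escape makes every character in needle match only itself
    return len(re.findall(re.escape(needle), text))


sample = "a: orange b:xorange"
wildcard_hits = count_regex(sample, ":.orange")    # '.' also accepts the 'x'
literal_hits = count_literal(sample, ": orange")   # only the real ": orange"
```

Here the wildcard pattern matches both ": orange" and ":xorange", while the escaped search matches only the first, which is the sense in which the period trick is simple but not perfect.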

Tom Lavedas
***********

Mikko Aalto

Jul 16, 2009, 11:17:50 AM
It works, but there is still a funny little "bug". Here is a screen capture:

> c:\temp>(set /a count +=1 )
2727

c:\temp>(
for /F "delims=" %a in ('findstr ":.orange" < test.txt') do (set /a count
+=1 )
test.cmd.cmd "*.txt" ":.orange.*:.orange" ":.orange"
)

c:\temp>(set /a count +=1 )
28
c:\temp>(set /a count +=1 )
29
c:\temp>(set /a count +=1 )
3030

The right result is 30. For some strange reason the 30 is printed twice, as
3030. The same goes for 2727.


"Tom Lavedas" <tglb...@cox.net> wrote in message

news:00d81b4e-4a28-4fe8...@k19g2000yqn.googlegroups.com...

Tom Lavedas

Jul 16, 2009, 1:57:49 PM
On Jul 16, 11:17 am, "Mikko Aalto" <mikko.aa...@eiosoitetta.fi> wrote:
> It works, but there is still a funny little "bug". Here is a screen capture:
>
> > c:\temp>(set /a count +=1 )
>
> 2727
>
> c:\temp>(
> for /F "delims=" %a in ('findstr ":.orange" < test.txt') do (set /a count
> +=1 )
>  test.cmd.cmd "*.txt" ":.orange.*:.orange" ":.orange"
> )
>
> c:\temp>(set /a count +=1 )
> 28
> c:\temp>(set /a count +=1 )
> 29
> c:\temp>(set /a count +=1 )
> 3030
>
> The right result is 30. For some strange reason the 30 is printed twice, as
> 3030. The same goes for 2727.

OK. I think I've figured it out. The problem was caused by recursing
from inside a FOR loop. I also found another problem with files whose
last line does not end with an end-of-line. The procedure below should
work better.

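@echo off
rem RECURSE holds the next pass to run; it stays empty once nothing matches.
set recurse=
if "%~1%~2"=="%~1" goto :EOF
if "%~3"=="" (set count=0 & %0 %1 %2 %2)
rem Count each matching line in each matching file, then queue another
rem pass with the pattern extended by one more occurrence.
for /f "delims=" %%f in ('findstr /m "%~2" "%~1"') do (
  for /f %%a in ('type "%%f" ^| findstr "%~2"') do (
    set /a count +=1
  )
  set recurse=%0 %1 "%~2.*%~3" "%~3"
)
%recurse%
echo %count%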
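
The idea behind the repeated passes appears to be this: the first pass counts lines with at least one match, the next pass (pattern.*pattern) counts lines with at least two, and so on until nothing matches, so a line with n occurrences is counted in exactly n passes. A minimal Python sketch of the same idea follows, with hypothetical function names and Python's re module standing in for FINDSTR.

```python
import re


def count_by_passes(lines, word):
    """Sum, for k = 1, 2, 3, ..., the number of lines with at least k matches."""
    total = 0
    pattern = re.escape(word)
    while True:
        hits = sum(1 for line in lines if re.search(pattern, line))
        if hits == 0:
            return total  # no line has this many occurrences: stop
        total += hits
        # Require one more occurrence on the next pass
        pattern = pattern + ".*" + re.escape(word)


def count_directly(lines, word):
    """Reference count: add up the occurrences on every line."""
    return sum(line.count(word) for line in lines)


sample = ["orange", "orange and orange", "no fruit", "orange orange orange"]
by_passes = count_by_passes(sample, "orange")
direct = count_directly(sample, "orange")
```

For the sample lines the passes find 3, 2 and 1 matching lines, and the sum agrees with the direct count of 6.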

Tom Lavedas

Mikko Aalto

Jul 16, 2009, 3:44:35 PM
Yes, now it's working fine. Thank you very much!

And thank you to all who helped me with this batch project. This is a wonderful
place. I have been following this newsgroup for a long time and I am amazed at
the great batch files one can see here. All you gurus, please keep up your
excellent work!

"Tom Lavedas" <tglb...@cox.net> wrote in message

news:ad38bf0f-831d-4637...@c36g2000yqn.googlegroups.com...

Todd Vargo

Jul 16, 2009, 4:16:22 PM
Mikko Aalto wrote:
> >>why do you need to press enter everytime?
>
> In my environment (XP SP3) it opens a message box after every file.

This is because you double-clicked the .vbs file instead of running it from
the command line with the command provided.

>
>
> "sw0rdfish" <levisl...@rocketmail.com> wrote in message
> news:c602fa25-5cfa-460b...@v15g2000prn.googlegroups.com...
> On Jul 16, 10:04 pm, "Mikko Aalto" <mikko.aa...@eiosoitetta.fi> wrote:
> > My mistake. It works, but it echoes every file and I must press Enter every
> > time. It would be better if the script did not echo every file separately.
> > It should echo once after it has searched every file and tell how many
> > occurrences all the files contain.
> >
>
> why do you need to press enter everytime? my code doesn't ask for user
> input.


--
Todd Vargo
(Post questions to group only. Remove "z" to email personal messages)

foxidrive

Jul 16, 2009, 10:10:17 PM
On Thu, 16 Jul 2009 16:16:22 -0400, "Todd Vargo" <tlv...@sbcglobal.netz>
wrote:

>Mikko Aalto wrote:
>> >>why do you need to press enter everytime?
>>
>> In my environment (XP SP3) it opens a message box after every file.
>
>This is because you double-clicked the .vbs file instead of running it from
>the command line with the command provided.

I haven't tested this but could it be that the default VBS tool in Windows
is wscript.exe and we have all made cscript.exe the default?

Todd Vargo

Jul 16, 2009, 11:01:42 PM
foxidrive wrote:

> Todd Vargo wrote:
> >Mikko Aalto wrote:
> >> >>why do you need to press enter everytime?
> >>
> >> In my environment (XP SP3) it opens a message box after every file.
> >
> >This is because you double-clicked the .vbs file instead of running it from
> >the command line with the command provided.
>
> I haven't tested this but could it be that the default VBS tool in Windows
> is wscript.exe and we have all made cscript.exe the default?

Yes, wscript.exe is the default host, but I choose to keep wscript.exe as my
default. However, if the instructions at the bottom of sw0rdfish's post were
followed by OP, then cscript.exe would be invoked explicitly and the script
would operate as intended. FWIW, a simple host check could be included when
a scrolling output is required.

If InStr(UCase(WScript.FullName), "WSCRIPT.EXE") Then
    MsgBox "USAGE: Run script from command prompt as follows." _
        & vbCrLf & "{Appropriate syntax example goes here.}", vbCritical
    WScript.Quit
End If
