Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Compare 2 arrays and output difference

1,572 views
Skip to first unread message

Dan

unread,
Mar 7, 2011, 9:54:59 PM3/7/11
to
Hello,
I'm having a bit of an issue with comparing two arrays and outputing
the difference. The idea here is that there are two text files, the
first has a single sting and the second has two columns separated by a
comma. Each array could theoretically be huge and I have no idea
which will be larger each time the script is run.

For example:
file1.txt -
d...@domain.com
Fr...@domain.com
be...@otherdomain.com
j...@yetanotherdomain.com

file2.txt -
d...@domain.com,d...@someotherdomain.com
t...@something.com,t...@somethingelse.com
bettyotherdomain,be...@whatever.com
jo...@domain.com,jo...@somethingelse.com

What I'd like to happen is the following:
1. Take the data from the single column in file1 and match the first
column from file2 (ignoring column 2 in the match process).
2. For all those matches, pull column2 from file2 and column1 from
file1 and put that data into a third array/text file.

In my example above this would be returned:
d...@domain.com,d...@someotherdomain.com
bettyotherdomain,be...@whatever.com

At the end of the day (if someone possibly already has such a thing) I
will take the CN of the value of column 2 in the new output and apply
it to column1's altrecipient value in Active Directory.

Here's the code I've been trying to play with in order to accomplish
my tasK but it simply appends the two initial text files which is not
what I want at all.

Thanks for any help!

----------------------------

Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile _
("c:\file1.txt", ForReading)


'CREATING ARRAY1 FROM FILE
'Create a single string
Do Until objTextFile.AtEndOfStream
strFile1 = strFile1 & objTextFile.Readline
Loop

'Create the array
strFile1 = Left(strFile1,Len(strFile1))
array1 = Split(strFile1 , ",")


'Remove leading and trailing spaces
For i = 0 to Ubound(array1)
array1(i) = Trim(array1(i))
Next


'CREATING ARRAY2 FROM FILE
Set objTextFile = objFSO.OpenTextFile _
("c:\file2.txt", ForReading)

'Create a single string with
Do Until objTextFile.AtEndOfStream
strFile2 = strFile2 & objTextFile.Readline
Loop

'Create the array
strFile2 = Left(strFile2,Len(strFile2))
array2 = Split(strFile2 , ",")

'Remove leading and trailing spaces
For i = 0 to Ubound(array2)
array2(i) = Trim(array2(i))
Next


'Determine minimum array length
If UBound(array1) < UBound(array2) Then
arrayLength = UBound(array1)
Else
arrayLength = Int(UBound(array2))
End If

'Create array 3
ReDim array3(arrayLength)

'Create the optput file
Set objTextFile = objFSO.createTextFile ("output.txt", ForAppending)

'Populate array 3 and write to file
For i = 0 to arrayLength
array3(i) = array1(i) & " / " & array2(i)
objTextFile.writeline array3(i)
Next

Mayayana

unread,
Mar 8, 2011, 12:03:30 AM3/8/11
to
A few thoughts:

- Don't build a string by using ReadLine. It's
very slow and inefficient. Just use ReadAll.
Then split the two input strings into string1
and string2....

Dim A3(), A1, A2, i, i2, i3, s1, s2

A1 = Split(string1, vbcrlf)
string2 = Replace(string2, ",", vbcrlf)
A2 = Split(string2, vbCrLf)

Redim A3(UBound(A1))

'--now you can compare each element
'-- in A1 to every 2nd element in A2:

i3 = 0
For i = 0 to UBound(A1)
s1 = A1(i)
For i2 = 0 to UBound(A2) Step 2
s2 = A2(i2)
if s1 = s2 then
A3(i3) = s1 & "," & s2
i3 = i3 + 1
end if
next
next

newfile = Join(A3, vbCrLf)

'-- you'll have to add the extras like
'-- trimming spaces, accomodating
'-- case-insensitive matches, etc.

------------------------------------------------

Dan

unread,
Mar 8, 2011, 12:57:52 AM3/8/11
to
Hello,
Thanks for the thoughts... unfortunately I'm not fully
understanding...
1. ReadAll instead of ReadLine - no problem
2. Split the two input stings into sting1 and string2 - that's fine,
but I'm not sure I get your example:

string2 = Replace(string2, ",", vbcrlf)
I have the array defined and am not sure how to define string2 as the
array as you seem to indicate I would need to do before hand.
Thanks again!

> | F...@domain.com

> | j...@domain.com,j...@somethingelse.com

Dan

unread,
Mar 8, 2011, 2:20:29 AM3/8/11
to
Last email tonight and I think I'm close. I can see correct data in
both arrays but am not getting anything in the match. Is it possible
I'm comparing items from array1 to items 2,4,6,etc.. in array2 and not
getting any matches. It appears I should be matching on items
1,3,5,etc.. in array2 but can't tell if that's what is happening.
Thanks!

Tom Lavedas

unread,
Mar 8, 2011, 9:11:38 AM3/8/11
to Dan
On Mar 8, 2:20 am, Dan <da...@google.com> wrote:
> Last email tonight and I think I'm close.  I can see correct data in
> both arrays but am not getting anything in the match.  Is it possible
> I'm comparing items from array1 to items 2,4,6,etc.. in array2 and not
> getting any matches.  It appears I should be matching on items
> 1,3,5,etc.. in array2 but can't tell if that's what is happening.
> Thanks!
>

I like using a dictionary for tasks like this. The Exists function is
so helpful, once you build the dictionary, something like this ...

With CreateObject("Scripting.FileSystemObject")
aList1 = Split(.OpenTextFile("f1.txt", 1).ReadAll, vbNewLine)
aList2 = Split(.OpenTextFile("f2.txt", 1).ReadAll, vbNewLine)
set out = .OpenTextFile("Out.txt", 2, true)
end With

Set dUnique = CreateObject("Scripting.Dictionary")
dUnique.CompareMode = vbTextCompare ' case insensitive

For each sLine in aList2
sline = trim(sline)
if sline <> "" Then
aLine = split(sLine, ",")
dUnique.Add aLine(0), aLine(1)
end if
Next

for each sElement in aList1
if dUnique.exists(sElement) then _
out.writeLine sElement & ", " & dUnique.item(sElement)
next
_____________________
Tom Lavedas

Mayayana

unread,
Mar 8, 2011, 10:17:36 AM3/8/11
to
Last email tonight and I think I'm close. I can see correct data in
both arrays but am not getting anything in the match. Is it possible
I'm comparing items from array1 to items 2,4,6,etc.. in array2 and not
getting any matches. It appears I should be matching on items
1,3,5,etc.. in array2 but can't tell if that's what is happening.
Thanks!
>

You should be able to test that easily with
a couple of short sample strings:

string2 = "a-b-4-t-a-b-7-f-e-2"
string1 = "1-2-3-4-5-6-7-8"

If you split those two lines at "-" and
run them you should get matches of 4,4
and 7,7 -- but not 2,2 because that's at
an odd-numbered array index.

The idea
with the Replace call was to create the second
array with each email address as an array
element. After Replace,

d...@domain.com,d...@someotherdomain.com

becomes

d...@domain.com
d...@someotherdomain.com

Then the For/Next loop uses Step 2 to make
the comparison only with every other element.
It sounds like you figured that out. I realize
now that I missed something, though, in my
sample. The way I wrote it gives you the first
file item plus the first column in the second
file. You wanted to match the first column
but take the second column. So it should be
something like this:

For i2 = 0 to UBound(A2) Step 2
s2 = A2(i2)
if s1 = s2 then

s2 = A2(i2 + 1) '-- get second column


A3(i3) = s1 & "," & s2
i3 = i3 + 1
end if
next

Unfortunately, VBS doesn't seem to be able to
directly assign an array element to another array
element. That's why I'm making the string assignments.

If you end up using Tom's method instead, and
you have *very* big files, it'd be interesting to
hear back whether there was any notable speed
difference.


Mayayana

unread,
Mar 8, 2011, 11:21:20 AM3/8/11
to

For i2 = 0 to UBound(A2) Step 2
s2 = A2(i2)
if s1 = s2 then
s2 = A2(i2 + 1) '-- get second column
A3(i3) = s1 & "," & s2
i3 = i3 + 1
Exit For '-- another small improvement
end if
next

The Exit For makes this a bit more efficient.
The original code I posted was just "air code"
to show the basics.


Dan

unread,
Mar 8, 2011, 3:06:17 PM3/8/11
to
@Tom - Wow, that worked incredibly well... and pretty darn fast to
boot! I guess I have yet more things to check out and learn to
understand.

@Mayayan - I still haven't gotten your code to work just right. I'm
still returning null for some reason. I've posted here so if you want
to help clean up I'm happy to run the time comparison test and see
which is more efficient as that will only benefit everyone in the long
run.

Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile _
("c:\file1.txt", ForReading)

'CREATING ARRAY1 FROM FILE
'Create a single string
Do Until objTextFile.AtEndOfStream

s1 = strFile1 & objTextFile.ReadAll
a1 = Split(string1, vbcrlf)
Loop


'CREATING ARRAY2 FROM FILE
Set objTextFile = objFSO.OpenTextFile _

("c:\f2.txt", ForReading)

Do Until objTextFile.AtEndOfStream
string2 = strFile2 & objTextFile.ReadAll

s2 = Replace(s2, ",", vbcrlf)
A2 = Split(s2, vbCrLf)
Loop

Redim A3(UBound(A1))

'--now you can compare each element
'-- in A1 to every 2nd element in A2:

i3 = 0


For i2 = 0 to UBound(A2) Step 2
s2 = A2(i2)
if s1 = s2 then
s2 = A2(i2 + 1) '-- get second column
A3(i3) = s1 & "," & s2
i3 = i3 + 1
Exit For '-- another small improvement
end if
next

newfile = Join(A3, vbCrLf)

Mayayana

unread,
Mar 8, 2011, 4:11:17 PM3/8/11
to
There are several mistakes in that code. You seem
to just be copying by rote:

s1 = strFile1 & objTextFile.ReadAll
a1 = Split(string1, vbcrlf)

There are 3 different variables there for
the string! Here's a version that I think
is fixed. I haven't tested it:

Const ForReading = 1

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objTextFile = objFSO.OpenTextFile _

("c:\file1.txt", ForReading)

'CREATING ARRAY1 FROM FILE

'Create a single string

Do Until objTextFile.AtEndOfStream

s1 = objTextFile.ReadAll

a1 = Split(s1, vbCrLf)

Loop

'CREATING ARRAY2 FROM FILE

Set objTextFile = objFSO.OpenTextFile _

("c:\f2.txt", ForReading)

s2 = objTextFile.ReadAll

s2 = Replace(s2, ",", vbCrLf)

A2 = Split(s2, vbCrLf)

Loop

ReDim A3(UBound(A1))

'--now you can compare each element

'-- in A1 to every 2nd element in A2:

i3 = 0

For i = 0 to UBound(A1)

s1 = A1(i)

For i2 = 0 to UBound(A2) Step 2

s2 = A2(i2)

If s1 = s2 Then

s2 = A2(i2 + 1) '-- get second column

A3(i3) = s1 & "," & s2

i3 = i3 + 1

Exit For '-- another small improvement

End If

Next

newfile = Join(A3, vbCrLf)


0 new messages