Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Maddening Dilemma - Compare each cell within column a to each cell in column b and find unique matches

168 views
Skip to first unread message

pog...@gmail.com

unread,
Sep 28, 2007, 2:42:59 PM9/28/07
to
I have been trying to find a way in which i can have excel search
through two columns to find each unique match between the columns and
somehow change the formatting (like highlighting) to fish out the
matches. The problem is that i cannot think of a way to get each cell
in column a compared to each cell in column b, rinse and repeat for
the entire column a. And i am talking about thousands of rows per
column, and the columns are not necessarily the same size. Maddening!

If anyone has any ideas that do not involve installing shareware
plugins, please let me know!

Thanks!

Tom Ogilvy

unread,
Sep 28, 2007, 3:08:03 PM9/28/07
to
use countif as the formula is part of Conditional Formatting

=countif(C:C,A2)

information on conditional formattiong

Debra Dalgleish
http://www.contextures.com/tiptech.html


Chip Pearson's site on duplicates and uniques
http://www.cpearson.com/Excel/Duplicates.aspx

--
Regards,
Tom Ogilvy

pog...@gmail.com

unread,
Sep 28, 2007, 4:24:49 PM9/28/07
to
Tom,
A great start upon a solution, i am attempting a workaround using the
info you provided, many thanks!

Although my problem is more intricate than i let on. Realistically, i
would like to find duplicate values within the same column, but some
of the values are negative and some ar positive. I need to find out,
and highlight, the values which would cancel each other out. Hence i
need to find duplicates, except that they are not EXACT duplicates,
but one is the positive version of the number and one is the negative
version of the number. Here is an small example of such a list:

A1
10 <-matches such as this and the negative 10 below it both need to
be highlighted
-10
11
12
13 <-does not cancel and doesnt need to be highlighted
-12
-11
15
-15
-15 <-also does not cancel since one positve already canceled with one
negative and doesnt need to be highlighted

as you can see, i am attempting to highlight the corresponding
positive and negative values (the ones which would cancel each other
out) and leave the singular values unhighlighted. The problem is
even futher complicated by the fact that some values might have
multiple duplicates. There may be 10 positive number 11's and
nineteen negative number 11's, meaning i need the 10 positive versions
of the number 11 to cancel and highlight along with the 10 negative
versions of the number 11, but to leave the other 9 number 11's
unhighlighted.

I apologize for how confusing this sounds, but i would choose
confusing over hours of manual labor infused with human error anyday,
seeing as how i have thousands of rows within this one column with
many random and changing values to identify and compare.

If you or anyone can think of a solution more closely tailored to this
particular situation, i would be indebted to you forever.

In the meantime, i will work with what you provided to me Tom.

Thank you kindly.

pog...@gmail.com

unread,
Sep 28, 2007, 4:41:57 PM9/28/07
to
Also, to clarify, the reason i was asking about two columns was
because i used absolute value on all of the negative numbers and put
them in a separate list next to the positive ones to find EXACT
matches. But if there is a way in which this wasnt necessary, that
would be far more ideal.

Thanks again guys!

Pete_UK

unread,
Sep 28, 2007, 8:40:21 PM9/28/07
to
Are you allowed to sort the data, as well as taking the absolute
value? If so, then you are looking for pairs of numbers in adjacent
cells. Assuming that you have your ABS formula in column B, starting
with B1, and that you have sorted the data by column B, then put this
formula in C1:

=IF(B1=B2,"yes1","no")

and this one in C2:

=IF(AND(B2=B1,C1="yes2"),IF(B2=B3,"yes1","no"),IF(B2=B1,"yes2",IF(B2=B3,"yes1","no")))

Copy this formula down column C by double-clicking the fill icon (the
small black square in the bottom right corner of the cursor. It will
give you pairs of "yes1"/"yes2" down the column indicating paired
duplicates, and the occasional "no" meaning a single unpaired (unique)
value.

If you wanted to use conditional formatting on the values in column A,
then you can use Formula Is and then =LEFT(C1,3)="yes" and set your
colour.

I've just realised that this doesn't necessarily pair +10 with -10, so
if you have, say, three +10s and one -10 this would indicate all four
10s would be paired - does this matter?

If you wanted the data in the original order, then you should first
enter a simple sequence in column D (say), i.e. 1, 2, 3, 4 etc before
sorting the data (including column D) on column B and using the
formulae. Then fix the values in column C and re-sort the data by
column D to get it back to the original sequence - column D can then
be deleted.

Anyway, hope this helps.

Pete

Max

unread,
Sep 28, 2007, 11:52:50 PM9/28/07
to
One play which can deliver both the CF desired and a way to extract the
cells in col A which do not cancel out ..

Assuming source numbers within A1:A100, as posted

Put in B1:
=IF(A1="","",COUNTIF($A$1:A1,A1))

Put in C1, then array-enter the formula by pressing CTRL+SHIFT+ENTER
(instead of just pressing ENTER):
=IF(B1="","",IF(ISNUMBER(MATCH(-A1&"_"&B1,$A$1:$A$100&"_"&$B$1:$B$100,0)),"",ROW()))
Select B1:C1, copy down to C100. Col C will "flag" only cells in col A which
do not cancel out.

Then you could easily apply CF to highlight col A pointing to col C
Select col A (A1 active), then apply CF using Formula Is:
=$C1<>""
Format the fill color to taste, OK out

And if you want to extract those cells in col A which do not cancel out in
another col (this might be useful), just put in say E1:
=IF(ROW()>COUNT(C:C),"",INDEX(A:A,SMALL(C:C,ROW())))
Copy down to E100. This will return all cells in col A which do not cancel
out, neatly bunched at the top in col E
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
<pog...@gmail.com> wrote in message
news:1191011089....@w3g2000hsg.googlegroups.com...

Balan

unread,
Sep 30, 2007, 12:41:01 AM9/30/07
to
Mr Tom
Pl see my response to this question. I shall be grateful for your comments
/ suggestions.

Balan

unread,
Sep 30, 2007, 12:35:00 AM9/30/07
to
M/s Pogster,Pete , Max and Tom,
Hi everybody. I am a learner in usage of Excel functions as well as VBA. I
hope the Excel function solutions offered by you experts will provide me more
insights. Meanwhile, I looked at Pogster's problem as a VBA problem and
tried to write a programme ( the first full fledged one I am writing
struggling with the codes & Help feature in VBA). I, understand his
requirement is to mark off one value with equivalent negative value whenever
it appears in the data range. My other assumptions were: i) Even a
neagative value may precede a positive value. ii) If there is only a pair of
positive values or negative value both will remain unhighlighted. iii) The
first opposite value has to be the basis for markoff iv)The marked off
entries have to have colored so that they can be distinguished from the
outstanding ones v) The programme has to proceed to look into the entire
range until it encounters a blank cell in the same column.

I have called the programme markOff. It has to be run through VBA say
using F5. Before commencing the execution, the cursor has to be kept on the
first cell of the range in the excel sheet. I have not prepared any code for
a button or for a message box (say, for asking the user whether the cursor is
in the first cell and if not to keep it there to proceed further), as writing
this programme itself has been almost a day's job taking away my weekend and
I am also required to learning coding for buttons and msgbox. The code is
given below. I request the experts to look into it and suggest improvement,
if any needed to make it more efficient and economical. The data I have
taken as the basis for testing the programme is given first, followed by the
programme.

A
-12
12
15
14
-15
13
-16
15
16
13
16
17
(intentionally left blank to see whether execution stops here or not)
17

The code:
----------------------------------------------------------------
Sub markOff()
'
' markOff Macro
' Macro recorded 29/09/2007 by Balan
'

Dim Num As Range
Dim Val As Double
Dim addr As String
Dim rownum As Integer
Dim colnum As Integer

rownum = ActiveCell.Row + 1
colnum = ActiveCell.Column
addr = ActiveCell.Address
Val = ActiveCell.Value
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum, colnum)

Do While ActiveCell.Value <> ""
If ActiveCell.Value = -Val Then
If ActiveCell.Interior.ColorIndex = 6 Then
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

Else
ActiveCell.Interior.ColorIndex = 6
Range(addr).Interior.ColorIndex = 6
Application.Goto Reference:=Worksheets("Sheet1").Range(addr)
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
Do While ActiveCell.Interior.ColorIndex = 6

rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
Loop


addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

End If
Else
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)
If ActiveCell.Value = "" Then
Application.Goto Reference:=Worksheets("Sheet1").Range(addr)
rownum = ActiveCell.Row + 1
If ActiveCell.Value <> "" Then
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
Do While ActiveCell.Interior.ColorIndex = 6
rownum = ActiveCell.Row + 1
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
Loop
addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum, colnum)
End If
End If
End If
Loop

End Sub


---------------------------------------------
I hope I am not troubling you.

Balan

Balan

unread,
Sep 30, 2007, 12:42:01 AM9/30/07
to
Mr.Max,

Pl see my response to this question. I shall be grateful for your comments
/ suggestions.

pog...@gmail.com

unread,
Sep 30, 2007, 1:42:58 PM9/30/07
to
Balan,

Thanks so much for taking the time to write a script for this
dilemma. In my attempts to execute the VBA macro script in excel, i
recieved a compile erorr: syntax error which caused the macro not to
run. The error occured at this point:

Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)

Beneath the first DoWhile Loop.

I am not much a programmer and so am not sure as to why this
happened.


pog...@gmail.com

unread,
Sep 30, 2007, 2:04:34 PM9/30/07
to
Max,
thank you for your valiant effort on this problem of mine.

To comment on the solution you provided, i used your formulas and used
conditional formatting but received results that were not accurate.
Maybe if i emailed you the actual data set i am working with and
showed to you how your formatting worked out, it would be easier.

In the meantime, what happened was that in column B was that numbers
ranging from 1 to 16 showed up in various places. What did you mean
for this column to do? It had 1's next to values which i knew
repeated in the negative, so i dont think that count was accurate.

With column C, it basically counted each row position in order except
for about 10 or 15 rows in the middle of my long dataset in column
A.

As for the conditional formatting, since column C did not work out
exactly how i think you had anticipated, the formatting was also off.
Although the formatting did what it was supposed to do, just that
column C seemed to be of little use in this case.

I will email you the dataset so that you can see for yourself, the
results i achieved with your formulas.

What exactly did you mean for columns B and C to do? I am not the best
with understanding the code in excel.

THank you again for all of your help in this matter.


pog...@gmail.com

unread,
Sep 30, 2007, 2:07:47 PM9/30/07
to
Pete,

Thanks so much for your attempt at this problem.

It is very important that all of the positives cancel with all of the
negative and leave the unique values untouched. A -10 should only
cancel with a single +10, etc...

I have tried a variation of the suggestion you offered, and though it
works in a case where this does not come into play, my needs require a
very accurate canceling of unique opposing values.

I wish there was a way i could just post the raw list of values i am
working with so it would be easier to understand.

Thanks again for your help Pete.

Pete_UK

unread,
Sep 30, 2007, 3:51:02 PM9/30/07
to
I don't have time to code this for you, but a VBA solution could work
along the lines of:

add a sequence to column B

sort the data in both columns in (say) descending order of column A
(all the positive numbers will come first, down through zero then
negative numbers)

start looking at both ends of the list with two variables - top and
bottom

if cell(top) = cell(bottom) then colour both cells: increment top:
decrement bottom
else if cell(top) > magnitude of cell(bottom) then increment top
else decrement bottom

do this until top>=bottom

re-sort the data back to how it was using sequence in column B

delete column B.

Hope this helps.

Pete

Gord Dibben

unread,
Sep 30, 2007, 3:39:30 PM9/30/07
to
On Sun, 30 Sep 2007 11:07:47 -0700, pog...@gmail.com wrote:

>I wish there was a way i could just post the raw list of values i am
>working with so it would be easier to understand.

Use either of the 2 free filehosts below to upload a sample of your actual
worksheet.
(Desensitize it first, if needed). Then copy & paste the generated link to
your sample file in response here:

http://www.flypicture.com/
http://cjoint.com/index.php

Gord Dibben MS Excel MVP

ilia

unread,
Sep 30, 2007, 4:02:10 PM9/30/07
to
Let's suppose your column that contains the values indicated is a
named range called MyRange. The following multi-cell array formula
will return a unique occurence of each uncancelled item:

=INDEX(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),SMALL(IF(MATCH(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0),
0)=ROW(INDIRECT("1:"&ROWS(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0)))),MATCH(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),
0),IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0),
0),""),ROW(INDIRECT("1:"&ROWS(IF(MyRange>0,ABS(MyRange)*(COUNTIF(MyRange,"="&MyRange)-
COUNTIF(MyRange,"="&-MyRange)),0))))))

If there's an extra negative, it will return the negative, and if
there's an extra positive, it will return the positive. One zero will
be returned for all matching pairs. It will return #NUM error if your
array range is longer than the list. So, in this set:

10
-10
-10
11
-11
12
13
14
-14
15
-15
-15
16
16
-16


The result is this, assuming you array-entered the above formula in an
8-row range:

-10
0
12
13
-15
16
#NUM!
#NUM!


Hope this helps.
-Ilia

pog...@gmail.com

unread,
Sep 30, 2007, 5:30:08 PM9/30/07
to
Heres a link to the source list i am trying to work with. There are
many more where this one came from.

[URL="http://www.flypicture.com/files/MzcyNzY="]Madenning
Dilemma2.xls[/URL]

Thanks Gord for the ftp suggestions, and Pete again for the great
idea. Ill see if i can whip up some code for that.

Ilia, i will try out your solution in a second, thanks!


Max

unread,
Sep 30, 2007, 6:16:29 PM9/30/07
to
Here's a working sample to illustrate:
http://cjoint.com/?kbafj66Yrv
NettOffPosnNeg.xls

Note that the CF suggested earlier
is to highlight cells is col A which *do not* cancel out.

If you want to CF it the other way round,
use the converse formula: =$C1=""

The formulas in col B and C need to be implemented correctly. They need to
be copied all the way down to the last row of data in source col A.

My suggestion as applied to your sample is available here:
http://www.flypicture.com/download/MzcyODY=
Madenning_Dilemma2_1.xls


--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
<pog...@gmail.com> wrote in message

news:1191175474.6...@g4g2000hsf.googlegroups.com...

Max

unread,
Sep 30, 2007, 6:21:00 PM9/30/07
to
Typo, Line:

> is to highlight cells is col A which *do not* cancel out.

should read as:
> is to highlight cells in col A which *do not* cancel out.

Max

unread,
Sep 30, 2007, 7:23:45 PM9/30/07
to
Sorry, pl disregard this earlier sample as I forgot to adapt the range
in col C's array formula to suit the actual extent of your sample data

> My suggestion as applied to your sample is available here:
> http://www.flypicture.com/download/MzcyODY=
> Madenning_Dilemma2_1.xls


Here's the corrected sample:
http://www.flypicture.com/download/MzczMDU=
Madenning_Dilemma2_2.xls

The array formula in the top cell C1 should be:
=IF(B1="","",IF(ISNUMBER(MATCH(-A1&"_"&B1,$A$1:$A$2197&"_"&$B$1:$B$2197,0)),"",ROW()))

since your data was down to that row, 2197

No change to the formulas in B1 and E1.
All formulas in B1:C1, and in E1 to be copied down to row2197

Max

unread,
Sep 30, 2007, 7:52:05 PM9/30/07
to
Some explanations:

Col B serves to assign an arbitrary unique "suffix" to the data in col A.
This suffix assignment is necessary to distinguish between all multiple
occurences of the source numbers which you have in col A. Eg there could be
multiple occurences of 10, -10, 11, -11, etc all the way down in col A.

Col C then checks for the match of the original source numbers in col A
concatenated with the corresponding suffixes in col B against the full table
array composed. Where it matches (ie cancels out), the expression returns a
null string. Where it doesn't match, it returns an arbitrary row number as a
flag. This flag can then be used in the CF formula to format col A for all
the non-cancellations, ie all the unique source numbers in col A which do
not cancel out

The flag could also be used as well in another col E, to "float up" all the
non-cancellations from col A.

Easiest to see what's happening by referring
to this small working sample (as posted earlier):
http://cjoint.com/?kbafj66Yrv
NettOffPosnNeg.xls

Balan

unread,
Sep 30, 2007, 10:02:09 PM9/30/07
to
Pogster,
I think the problem is on account of the wrapping up of the code to fit in
the page while posting it in this site. I have noticed this has happened in
all the lines "Application.Goto

Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)"
This should appear as a single continuous line or should be separated by an
underscore "_" at the place where the code is being cut and taken to next
line. Kindly to the end of the line after Cells(rownum, and press delete to
bring the "colnum)" up to the same line or try typing after one space an "_".
Either should work. This application.goto line is appearing more than once
and in all cases it has happened. You may have to correct all the way I have
mentioned above.

Then I hope the data is in sheet 1, otherwise, pl see the code and where
ever "Sheet 1" is appearing change the name of the sheet appropriately and
try.

Pl keep me posted. If the code worked, press "Yes" button at the bottom (Pl
see next to "was this post helpful to you", I shall understand.

Best Wishes. If you still encounter problems do not hesitate to write to
me, I shall try to correct the code suitably.

Balan

unread,
Oct 1, 2007, 10:53:03 AM10/1/07
to
Mr Pogster,
Pl see my suggestions to solve the problem. One of my sentences, I think,
is not clear. How it happened I do not know. May be while typing I might
have made some mistake. The thing I wanted to convey was that that the code

"Application.Goto
Reference:=Worksheets("Sheet1").Cells(rownum,
colnum)"
should entirely appear in a single line ( i.e., Application.Goto....colnum)
should be there in a single line or if you want to break it, it can be done
only using an underscore at the point where you want to break it . So, try
to bring every thing in one line.

As I mentioned this is not the only line, the same code is appearing in many
places and every where it is broken ( due to wrapping of text while copying
it from my PC to this site). Mend them also.

pog...@gmail.com

unread,
Oct 3, 2007, 1:41:00 PM10/3/07
to
Balan!

Thanks so much for the amazin macro you wrote. The Macro works as you
meant for it to, but unfortunatley it has some shortcomings.

First: On a column of data above 250 rows, the macro locks up and
freezes and can only be exited by END TASKING excel itself.
Second: With this dataset, the macro failed to identify all of the
pairs, but it did identify 1 set of pairs. The problem seems to be
that only consecutive double pairs are identified:

-10 -highlighted
10 -highlighted
10
11 - highlighted
11 -NOT highlighted
-11 -highlighted
-11 - NOT highlighted
12
12
-13
-13
9 -highlighted
-9 -highlighted
-program breaks at an empty cell as it should.
14 -ignored
-14 -ignored

Second Set of Data:
11 -highlighted
-11 -highlighted
11 -highlighted
-11 -highlighted
-break

As you can see, for some reason (and i do not understand the code
perfectly) the macro finds and identifies the first set of pairs and
any subsequent sets of pairs correctly but if the pairs are somehow
broken up, like in the first example, the macro fails to indentify
them. Did you mean for this to happen? For this particular
application it is essential that ALL sets are found, even duplicate
cancelling sets which are spaced apart (because in reality the
matching pairs could be in row 10 and row 1000. If there are 2 +11
and 2 -11, they must both cancel and be highlighted, regardless of
their position in relation to each other.

Is there a way to change the code to achieve this objective?

Balan, thank you so much for the time and effort you have already
spent on this side-project. Hopefully it will challenge you as much
as it has been challenging me!!!
Thanks again, and let me know if you figure out a way to update the
Macro!

Thanks again!

pog...@gmail.com

unread,
Oct 3, 2007, 2:11:22 PM10/3/07
to
Hi llia,

Thank you so much for your suggested solution! Turns out, that when
i enter your array-formula into cell C1, after naming my dataset
"MyRange", it gives me a formula error for some reason? I have no
clue why, seeing as how i have a hard time even understanding what
your formula does. Again, thanks for the attempt, but im not sure why
im getting an error? I cant get it working properly enough to test
it.

Any suggestions?

Thanks again!


Max

unread,
Oct 3, 2007, 7:14:29 PM10/3/07
to
Reply to OP's email received:

Welcome, but kindly keep all discussions online in the newsgroup thread, not
via private email to me. Upload the link to your new sample and post your
responses there.

> Particularly 60,000,000.00 and its opposite showed up as "not cancelling"

Assuming the formulas are all correctly installed, and all source numbers in
col A are real numbers, one possibility is that the 2 numbers are not
exactly equal. One may be fractionally off the other (the real underlying
value), despite how they appear in the cells.

Try a simple test.
If the 2 numbers are in cells A5 and A10 (say),
in an empty cell, put: =ABS(A5)=ABS(A10)
If the numbers are really equal,
the return should be TRUE

To cater for the above possibility, we can use ROUND() in the array formula
in col C to round off all source values to say, 2 dp in the comparison

So in C1, array-entered, copied down:
=IF(B1="","",IF(ISNUMBER(MATCH(-ROUND(A1,2)&"_"&ROUND(B1,2),ROUND($A$1:$A$100,2)&"_"&ROUND($B$1:$B$100,2),0)),"",ROW()))

Adapt the ranges to suit.


--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---

--- pogster wrote:

> Hey Max,
>
> Thanks so much for attempting a solution to my
> Madenning Dilemma.
> Your solution is a great one, although it does not
> completely work to
> my dataset.
> I attempted your formulas on my dataset on my own (i
> did not look at
> your application on my test sample), and the
> formulas failed to flag a
> bunch of cancelling pairs.
>
> Particularly 60,000,000.00 and its opposite showed
> up as "not
> cancelling", when they in fact should cancel out. I
> will upload or
> email to you a copy of MY test results sometime
> tonight.
>
> Thanks again for your great attempt! But this one
> seems uncrackable so
> far.
>
> -Pogster
>
>


Max

unread,
Oct 3, 2007, 7:28:13 PM10/3/07
to
Sorry, slight overkill there. Only the source numbers in col A need to be
rounded.

Should have read as


> So in C1, array-entered, copied down:

=IF(B1="","",IF(ISNUMBER(MATCH(-ROUND(A1,2)&"_"&B1,ROUND($A$1:$A$100,2)&"_"&$B$1:$B$100,0)),"",ROW()))

Balan

unread,
Oct 3, 2007, 9:03:00 PM10/3/07
to
Posgster,
About the macro freezing at row 250 I will see separately. I need some time
to think of some data and test them.

As regards the pairs not highlighted, I understood your requirement as one
in which only a negative number will be paired against its positive( i.e., a
11 to be marked off against a -11). I never thought you want a 11 to be
marked off against another 11. That is why two positive numbers are not
highlighted even if they are of same magnitude. If you confirm that you want
every first occurence of similar number whether it is a negative or positive
equivalent to be marked off, I can modify the code.

I thought a minus figures represents completion of a transaction. Pl reply

pog...@gmail.com

unread,
Oct 6, 2007, 11:39:21 AM10/6/07
to
Max,

Your solution works wonders, and achieves close to a perfect result
every time. What change to the formula in column C would i need to
make to round it off to the tenths place, instead of 2 decimal
places. I am working with foreign currencies and conversions to USD
usually throw it off a little bit.

Now i just need to work out a way to convert the process into a macro
since its essentially exactly the same for every spreadsheet i do.

I cant thank you and everyone else who has posted on this forum
enough, you guys n gals rock. Thanks so much!

-Pogster

Max

unread,
Oct 6, 2007, 6:24:28 PM10/6/07
to
> .. What change to the formula in column C would i need to

> make to round it off to the tenths place, instead of 2 decimal
> places.

Just amend the 2 in ROUND(..,2) to ROUND(..,10)

Place instead in C1, array-entered with CSE, then copy down:
=IF(B1="","",IF(ISNUMBER(MATCH(-ROUND(A1,10)&"_"&B1,ROUND($A$1:$A$100,10)&"_"&$B$1:$B$100,0)),"",ROW()))

(Remember to change the ranges to suit)

> Now i just need to work out a way to convert the process into a macro
> since its essentially exactly the same for every spreadsheet i do.

Suggest you try a new posting in .programming


--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---

<pog...@gmail.com> wrote in message
news:1191685161....@w3g2000hsg.googlegroups.com...

Pete_UK

unread,
Oct 6, 2007, 8:08:59 PM10/6/07
to
Hi Pogster,

I see you are still following this thread. Here's a macro I put
together a few days ago based on the algorithm I gave you last
weekend. It should be quite quick, even with large amounts of data. It
uses column B, so if you have any data in there you should insert a
new column B at the beginning of the macro (it gets deleted at the
end):

Sub Mark_duplicates()
'
' 04/10/2007, Pete Ashurst
'
Dim my_top As Long
Dim my_bottom As Long
Application.ScreenUpdating = False
Range("B1").Select
ActiveCell.Value = "1"
Range(Selection, Selection.End(xlDown)).Select
Selection.DataSeries Rowcol:=xlColumns, _
Type:=xlLinear, Date:=xlDay, _
Step:=1, Trend:=False
Columns("A:B").Select
Selection.Sort Key1:=Range("A1"), Order1:=xlDescending, _
Header:=xlGuess, OrderCustom:=1, _
MatchCase:=False, Orientation:=xlTopToBottom
Range("A1").Select
my_top = 1
my_bottom = Cells(Rows.Count, "A").End(xlUp).Row
Do Until my_top >= my_bottom
If Cells(my_top, 1).Value = Abs(Cells(my_bottom, 1).Value)
Then
Range("A" & my_top).Interior.ColorIndex = 4
Range("A" & my_bottom).Interior.ColorIndex = 4
my_top = my_top + 1
my_bottom = my_bottom - 1
ElseIf Cells(my_top, 1).Value > Abs(Cells(my_bottom, 1).Value)
Then
my_top = my_top + 1
Else
my_bottom = my_bottom - 1
End If
Loop
Columns("A:B").Select
Selection.Sort Key1:=Range("B1"), Order1:=xlAscending, _
Header:=xlGuess, OrderCustom:=1, _
MatchCase:=False, Orientation:=xlTopToBottom
Columns("B:B").Select
Selection.Delete Shift:=xlToLeft
Range("B1").Select
Application.ScreenUpdating = True
End Sub

Beware of spurious line-wraps on some of the long lines.

Hope this helps.

Pete

Pete_UK

unread,
Oct 6, 2007, 8:11:47 PM10/6/07
to
Yes, the lines starting with If and ElseIf have both wrapped - the
Then should be on the same line.

Pete

> > -Pogster- Hide quoted text -
>
> - Show quoted text -


Pete_UK

unread,
Oct 8, 2007, 4:22:17 AM10/8/07
to
I downloaded your file (2200 rows) and tried the macro which I posted
the other day with it - it took less than 2 seconds.

Pete

pog...@gmail.com

unread,
Oct 8, 2007, 9:23:17 AM10/8/07
to
Pete!

Your macro is very effecient and effective...and speedy too. Thanks
so much for the time you put into coding this!

On the sample data that I provided, i ran the macro and found that for
some odd reason, it brought the value: $41,711,328,951.02 to the top
of the list after running.
It achieved the right answer with the highlighting for sure, but i am
not sure why this one value was brought up. Specifically, it was
brought up to A1 from the position of A297. It is important that the
values remain in the same order, as i am copying them out of a larger
source-set that is sorted specifically, so the monetary ammounts need
to retain their original order.

Again, this only happened with this particular dataset, not sure why.

I ran it multiple times and had the same result, but this did not
happen with other datasets....strange... Did you notice this
happening?

Thanks again for your incredible effort, i will certainly use your
macro to aid my projects!

-pogster

pog...@gmail.com

unread,
Oct 8, 2007, 9:25:26 AM10/8/07
to
Pete,

Also, as in max's example, is there a way for the macro to round the
numbers it looks for to the TENTHS (thats one decimal place) place?
As in, a value of 1001.19 is simply rounded off at 1001.1. Not like
rounding up or down, but just ignoring the hundreths place all-
together. I posted my reasoning to Max for this, so you can see up
top for a description.

If this is too difficult, forget it, but it would help!

Thanks again.

-pogster

pog...@gmail.com

unread,
Oct 8, 2007, 9:33:38 AM10/8/07
to
Balan,

Thanks for looking into it. You understood my requirements correctly,
but your macro looks specifically for the first pair, or occurrence of
a match. Negatives must indeed cancel with positives, but every
occurrence of a pair of cancelling numbers, must cancel. Not just
some pairs, and not others.

In a simple dataset such as this:

11 - A1
11 - B1
-11 - A2
-11 - B2

A1 should cancel with the A2, and B1 should cancel with B2....but what
happens in the macro is that A1 correctly cancels with A2, but B1 does
not Cancel with B2 as it should. This only happens when the numbers
are arranged in this order...A1B1,A2B2. Does this make sense? This
is a case that your macro does not successfully handle, and i am not
sure why.

I would appreciate if you thought about it, but its okay if not. Take
a look at Pete's Macro, his correctly identifies the matches, maybe
you will find inspiration there?

Thank you again Balan for all of your effort. You rock!

-Pogster

Max

unread,
Oct 8, 2007, 9:40:48 AM10/8/07
to
Going by your latest explanation to Pete
for your earlier line
>> .. round it off to the tenths place

think the line:


> Just amend the 2 in ROUND(..,2) to ROUND(..,10)

should have read as:
> Just amend the 2 in ROUND(..,2) to ROUND(..,1)

Pete_UK

unread,
Oct 8, 2007, 9:59:36 AM10/8/07
to
Yes, I was able to reproduce this, and it is caused by the two sort
routines allowing Excel to guess if there is a header or not (there
shouldn't be). Change both instances of:

Header:=xlGuess, OrderCustom:=1, _

to this:

Header:=xlNo, OrderCustom:=1, _

then this shouldn't happen again.

Hope this helps.

Pete

Pete_UK

unread,
Oct 8, 2007, 10:05:34 AM10/8/07
to
To do this you need to change one line in the macro - the first IF, as
follows:

If Int(Cells(my_top, 1).Value * 10) / 10 = Int(Abs(Cells(my_bottom,
1).Value) * 10) / 10 Then

All one line - be wary of any line-breaks that the newsgroups put in.

This will not change any of your values. It merely ignores anything
beyond the first decimal place in the comparison.

Hope this helps.

Pete

Balan

unread,
Oct 8, 2007, 12:56:03 PM10/8/07
to
Pogster,
Thanks for the feedback. I thought you have given up.
Pete's macro is excellent and efficient. You can't compare his coding with
mine. He is an expert. As I have mentioned in my first reply to your
question, I am a novice learning programming. I have learnt from Pete's code
that "Application.Screenupdating " procedure will reduce the time taken by
the code to execute.

I have solved the problem of second pairs being ignored. But it is the same
approach which I had with the first macro which I posted here - adopting an
iterative process which consumes lot of time - the macro reads each entry and
compares it with every other until it reaches its pair. That takes time.
When you have 1000's of rows of data, the process takes lot of time. I think
you mistook this for freezing . When you use control break or end task you
perhaps saw only execution upt 250 or so rows. I tried with 1375 entries of
6 digits each. It took 2mts 40 secs to complete the task. From Pete's
macro, I realise there could be more efficient ways of solving this. But I
have a tight schedule of my own, which is not allowing me to look into the
code. May be for another 15-20 days, I will not be able to find time for
this. So I am presenting the revised one here for you to try:

( Cursor should be on the first row; the macro will identify pairs of one
positive and one negative of same magnitude, it will ignore pairs of same
sign [postive or negative], if numbers of opposite sign are not available for
mark off. (Pete's macro pairs, if I have seen the results properly, even
pairs of same sign - was it OK ? I am still not clear. (In that case I have
a modified one, which I am not posting for the present). You have mentioned
about rounding off to the first decimal. I have not attempted that also.
The macro will retains the values as they are. It will simply color the
pairs of values of opposite sign. The macro which I call as MarkOff3 is as
follows:

---------------------------------
Sub MARKOFF3()

Dim Num As Range
Dim Val As Double
Dim addr As String
Dim begrow As Integer
Dim endrow As Integer
Dim rownum As Integer
Dim colnum As Integer
begrow = ActiveCell.Row
colnum = ActiveCell.Column
rownum = ActiveCell.Row
Range("b14").End(xlDown).Select
endrow = Range(Cells(rownum, colnum), Cells(rownum, colnum)) _
.End(xlDown).Row + 1
rownum = 0
Application.Goto Reference:=Worksheets("Sheet1").Cells _
(begrow, colnum)
Do While ActiveCell.Row <> endrow

Do While ActiveCell.Interior.ColorIndex = 6

Application.ScreenUpdating = False
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells _
(rownum, colnum)
Loop

addr = ActiveCell.Address
Val = ActiveCell.Value
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells _
(rownum, colnum)

Do While ActiveCell.Row <> endrow
Application.ScreenUpdating = False
If ActiveCell.Value = -Val Then
If ActiveCell.Interior.ColorIndex = 6 Then
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1"). _
Cells(rownum, colnum)

Else
ActiveCell.Interior.ColorIndex = 6
Range(addr).Interior.ColorIndex = 6
Application.Goto Reference:=Worksheets("Sheet1").Range _
(addr)
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1").Cells _
(rownum, colnum)
Exit Do

End If
Else
If ActiveCell.Row + 1 = endrow Then
Application.Goto Reference:=Worksheets("Sheet1"). _
Range(addr)
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1"). _
Cells(rownum, colnum)
Exit Do
Else
rownum = ActiveCell.Row + 1
Application.Goto Reference:=Worksheets("Sheet1"). _
Cells(rownum, colnum)
End If


End If

Loop
Loop


Application.ScreenUpdating = True
End Sub

----------------------

Lots of editing may be necessary to keep the coding crisp. But as I have
mentioned I couldn't find time for the same.

Best Wishes. Keep me posted.

Balan

Balan

unread,
Oct 8, 2007, 1:18:02 PM10/8/07
to
Poster,
I am sorry in my today's post I had mentioned that Pete's macro pairs even
numbers of same sign, which is not correct. Only difference is it
presupposes that a positive number will precede a negative number. I had
coded my macro based on the presumption that a positive or negative value may
precede and it will be followed by the same value of opposite sign
subsequently for mark off. This means my first value could be a negative or
positive. The macro will start looking for a value of opposite sign and then
color it. One more thing, my macro takes lot of time, it will stop if you
break the execution. Pete's macro executes in a jiffy. "pog...@gmail.com"
wrote:

pog...@gmail.com

unread,
Oct 8, 2007, 4:07:27 PM10/8/07
to
Balan,

Your macro does in fact work, but you are correct, it is very time
consuming (and resource consuming) due to its recursive nature.

Thank you again for all of your time and effort, you have done very
well! Especially as some just getting into VBA programming!

Well Done, Bravo!

You do not need to worry about this particular problem anymore, do not
let it distract you.
I will work with what Pete and Max have provided and figure it out
from there.

Thank you again Balan, you have been invaluable!

-Pogster

pog...@gmail.com

unread,
Oct 8, 2007, 4:15:51 PM10/8/07
to
Pete,

The changes you suggested worked like a charm!

Your macro is fast, efficient, and does exactly what it is supposed
to.

It is amazing, the number of shortfalls Excel 2003 has compared to the
new 2007 version (which allows me to conditionally format by searching
for duplicates)
But your macro has brought 2003 up to speed, at least in this one
case.

Thanks so much for your help, this macro should definitley be posted
somewhere where it can be easily found, it is very useful, and
modifiable to suit other specific needs.

Thank you again for the time you spent in helping me with this issue,
and coding a great solution.
This forum is made great by people like yourself and the others who
posted with replies to my issue

You are all great! Many Thanks!

-pogster

Pete_UK

unread,
Oct 8, 2007, 6:10:16 PM10/8/07
to
Well, thanks for such effusive praise - I'm glad the macro does what
you want it to do.

As regards storing it somewhere, well Google maintains archives of
these newsgroups, so it should be available for anyone who searches
for it.

If after thorough testing you are convinced that it does the job, then
you might like to apply it directly to your other files, as you said
in one post that you extract this data from a file and that the order
was very important so that you can paste it back to the same file - so
this could save you a bit of time. Add two new lines near the
beginning of the macro to insert two new columns, so that it would
read:

Dim my_top As Long
Dim my_bottom As Long
Application.ScreenUpdating = False

Columns("B:C").Select 'new
Selection.Insert Shift:=xlToRight 'new
Range("B1").Select

and so on ...

Then near the end of the macro change this line:

Columns("B:B").Select

to this:

Columns("B:C").Select

You could now safely apply the macro to your other files, as the
contents of the other columns get shifted out of the way - well, only
if the numbers are in column A of the other files.

Perhaps you could explain what it is you are working with sometime ...

Pete

pog...@gmail.com

unread,
Oct 9, 2007, 4:33:15 PM10/9/07
to
Actually, the numbers are always in column O.

Thanks for the suggestion though.

What i am working with are balance sheets for clients of an insurance
company. I am weeding out all of the unnessecary entries (which
cancel) to track down the entries which actually contribute to the
final account balance. There are hundreds of balance sheets, spanning
an average of 2000 rows apiece. Its a big job to do manually, but this
macro really helps.

The only other variable to consider (one which is probably impossible
to code for), is that some of the numbers which DO NOT contribute to
the balance (and cancel out) are not just single numbres, but made up
of 2 or more other individual entries. A simple example: -100 cancel
with SUM(50+10+40) Except in reality the numbers are not that simple
or easy to locate.

To accomplish this, you would need a smart macro that could detect
numbers that could add together to create other numbers in this 2000
row sheet. That would take alot of code and thought, i think. No need
to bother with it. Its easier to try to do that part by hand. But
cancelling identical cancelling numbers is 90% of the task.

Like i said, hours of tedius head-ache inducing searching, simplified
to two clicks by your great macro. Cant thank you enough.

Hope this answers your question, which is a nice change from you
answering mine =)

Cheers.

-Pogster

Pete_UK

unread,
Oct 9, 2007, 6:24:10 PM10/9/07
to
Yeah, thanks for taking the trouble to post back - it's always
interesting to find out what the real situation is, as often we just
get a snippet.

The other problem you refer to crops up quite often in the groups,
where people want to reconcile payments against invoices. I think
Harlan Grove had a macro to do it, but it is a very number-intensive
exercise (not one that I'm going to attempt).

I dabbled a bit with the macro last night in making the colour cycle
through a range of values, rather than just one colour (green), i.e.
the colour changes each time a pairing is found - would you be
interested in a multi-colour version?

You can't actually do very much with coloured cells - would you like
the macro to put something in column B (eg "Y") to indicate that the
cell has been paired, so that it can help you to eliminate them and
thus concentrate on your other problem?

Pete

pog...@gmail.com

unread,
Oct 15, 2007, 3:43:15 PM10/15/07
to
Hey, sorry for the late post. Been busy with work lately.

As for the color changing, its really not necssary but thanks for the
suggestion.

As for the marker in column B, the trouble with moving the numbers out
of the order i have them in, or extracting unique values is that each
number also has about 11 columns of other information such as acct
numbers and journal ID's and dates and descriptions etc. that go along
with that particular value. So moving it out of the order i get it in
has the possibility of mucking everything up for me.

Its easier just to point out, visually, the values to ignore (or focus
on), and in the process retain the order.

If the marker just marked off the Unique values in column B, i could
of course insert a column next to col O and use it that way somehow,
to sort it, which would actually be quite helpful i think. Maybe if
the macro could look directly at column O, create a column to use for
its data, which it will delete, but also insert a column to the right
of Col O and place the markers there.

Would this be difficult to accomplish?

If its easier, yes, just placing a marker in column B would work just
as well, i can shift columns manually without a problem.

Thanks for the good idea though, as this would allow me to sort by
Unique or cancelled value (since you cannot sort by formatting in
2003)

-Pogster

Pete_UK

unread,
Oct 15, 2007, 4:53:05 PM10/15/07
to
I was thinking that you would do this in your extracted file - in
addition to ending up with numbers in column A which are highlighted
if paired, you would also have a column B which would have some marker
in to indicate these pairings (like "x"). You can still copy column A
back to column O of your original sheet, and copy column B to the next
empty column in your original sheet (which might be column P, but I
didn't know the layout of your data in the original file).

I thought you could then apply autofilter to this new column, along
the lines of:

Custom, Not Equal To, x ...

so that only the rows which have not been paired would then be
visible, so that you can then concentrate on doing what you need to do
with those. Of course, you could sort the data by this new column, to
bunch up the unpaired amounts and then concentrate on them - it's up
to you how you use it.

The thing is - would you find it useful?

OK, I'll ditch the multi-coloured idea.

Pete

pog...@gmail.com

unread,
Oct 16, 2007, 10:30:32 AM10/16/07
to
Hey Pete,

Yes, absolutely, that would probably be useful. Thing is, management
is going to look at what ive done and decide what format they want it
in anyway. So i could group them by paired or unpaired and then they
could just go and undo it with some stupid autofilter, but i believe
it would look better and be easier to digest in such a sorted format.

In short: yes this would be useful !!!

As for the multicolor idea which i shot down, i can see how it would
be useful if say a seperate color was applied to ammount under 50k,
50k - 150k, 150k - 250k, 250 - 500k, 500k - 1M, etc. etc.

This way the colors could represent ranges of values.

Again, i do not need this functionality, but it would be interesting
if you wanted to implement it for others use.

Thanks again for keeping up with this, i hope your code can be put to
good use outside of my particular dilemma, because i think its such
useful functionality.

-Pogster


Pete_UK

unread,
Oct 16, 2007, 8:31:44 PM10/16/07
to
Okay, here's an amended macro in full, along the lines you have
suggested:

Sub Mark_duplicates_a()
'
' 04/10/2007, Pete Ashurst
' amended 17/10/2007
'


Dim my_top As Long
Dim my_bottom As Long

Dim colour As Integer
Application.ScreenUpdating = False
Columns("B:D").Select
Selection.Insert Shift:=xlToRight
Range("C1").Select


ActiveCell.Value = "1"
Range(Selection, Selection.End(xlDown)).Select
Selection.DataSeries Rowcol:=xlColumns, _
Type:=xlLinear, Date:=xlDay, _
Step:=1, Trend:=False

Columns("A:C").Select


Selection.Sort Key1:=Range("A1"), Order1:=xlDescending, _

Header:=xlNo, OrderCustom:=1, _

MatchCase:=False, Orientation:=xlTopToBottom
Range("A1").Select
my_top = 1
my_bottom = Cells(Rows.Count, "A").End(xlUp).Row
Do Until my_top >= my_bottom

If Int(Cells(my_top, 1).Value * 10) / 10 =
Int(Abs(Cells(my_bottom, 1).Value) * 10) / 10 Then

Select Case Cells(my_top, 1).Value
Case Is < 50000
colour = 4 'Bright Green
Case Is < 150000
colour = 6 'Yellow
Case Is < 250000
colour = 8 'Turquoise
Case Is < 500000
colour = 39 'Lavendar
Case Else
colour = 15 'Grey
End Select
Range("A" & my_top).Interior.ColorIndex = colour
Cells(my_top, 2).Value = "Y"
Range("A" & my_bottom).Interior.ColorIndex = colour
Cells(my_bottom, 2).Value = "Y"


my_top = my_top + 1
my_bottom = my_bottom - 1
ElseIf Cells(my_top, 1).Value > Abs(Cells(my_bottom, 1).Value)
Then
my_top = my_top + 1
Else
my_bottom = my_bottom - 1
End If
Loop

Columns("A:C").Select
Selection.Sort Key1:=Range("C1"), Order1:=xlAscending, _


Header:=xlNo, OrderCustom:=1, _

MatchCase:=False, Orientation:=xlTopToBottom
Columns("C:D").Select
Selection.Delete Shift:=xlToLeft
Range("C1").Select


Application.ScreenUpdating = True
End Sub

This inserts a new column B, so any other data on the sheet will be
moved to the right. Column B will contain "Y" wherever there is a
pairing - you could change this to "paired" or some-such by making the
obvious two changes mid-way in the macro.

The macro also applies colour banding for the ranges you suggested. It
should be fairly obvious how to introduce other ranges in the CASE
part of the macro (just keep the numbers in sequence), and you can
easily change colours if you don't like mine - here's some other
numbers you might like to play about with:

Red - 3, Aqua - 42, Orange - 46, Pink - 7, Tan - 40

Maybe marginally slower, but still less than 3 seconds on my test data
of nearly 2200 values.

Hope this helps.

Pete

pog...@gmail.com

unread,
Oct 17, 2007, 1:22:12 PM10/17/07
to
Pete,

Great mod! The color coding works great, though in my version of the
macro i removed your Select case as i did not need the color coding.
But the column B identifier is also very useful. Thank you for this
valuable addition to your already superb macro!

The speed decrease is completely trivial, as it is barely noticeable.

I will toy around with this macro and possibly add more comments to
the file itself as i figure out its full functionality.

You are so the man.

Thanks pete.

-Pogster

Pete_UK

unread,
Oct 17, 2007, 8:28:26 PM10/17/07
to
Cheers! You're welcome.

Pete

Pete_UK

unread,
Oct 20, 2007, 8:34:54 PM10/20/07
to
Hi Pogster,

another variation - instead of just a "Y" marker in column B, I could
introduce a count that gets incremented each time a pairing is found.
That way the pairs can easily be identified later if there was any
query, and you could still use a Filter on column B (looking for
blanks) to hide the pairs and concentrate on accounting for the
unpaired values.

By the way, here's two links that might help in your second task:

http://www.tushar-mehta.com/excel/templates/match_values/index.html

http://groups.google.com/group/microsoft.public.excel.misc/browse_thr...

Hope this helps.

Pete

pog...@gmail.com

unread,
Oct 22, 2007, 11:27:05 AM10/22/07
to
Hi pete,

The numerical count is a good idea, would it assign a single value to
both numbers in each pair? Or a value to each number with the status
"paired"? Does this distinction make sense to you?

Either way, it is a good idea, i would like to see that code
modification, see if i can apply its functionality.

As for the links to aid in my second maddening dilemma, the first link
provides good explanations and some good solutions, though the second
link you posted is either incomplete or somehow wrong, the page does
not seem to exist.

Thanks for the info and yet another good code suggestion!

-pogster

Pete_UK

unread,
Oct 22, 2007, 7:14:46 PM10/22/07
to
Here's the second link in full:

http://groups.google.com/group/microsoft.public.excel.misc/browse_thread/thread/61a0a92a07c8c414/d483eb205d2d987b?hl=en

Hope this takes you there this time.

I'll post a revised macro later on - I had thought that the first pair
of numbers found would both be numbered 1, then 2 for the second pair,
3 for the 3rd pair etc. instead of just "Y".

Pete

Pete_UK

unread,
Oct 22, 2007, 9:06:14 PM10/22/07
to
Nearly forgot to post the revised code before going to bed - here it
is in full:

Sub Mark_duplicates_b()
'
' 04/10/2007, Pete Ashurst
' amended 17/10/07
' amended 22/10/07


'
Dim my_top As Long
Dim my_bottom As Long
Dim colour As Integer

Dim my_pair As Integer


Application.ScreenUpdating = False
Columns("B:D").Select
Selection.Insert Shift:=xlToRight

Selection.NumberFormat = "General"


Range("C1").Select
ActiveCell.Value = "1"
Range(Selection, Selection.End(xlDown)).Select
Selection.DataSeries Rowcol:=xlColumns, _
Type:=xlLinear, Date:=xlDay, _
Step:=1, Trend:=False
Columns("A:C").Select
Selection.Sort Key1:=Range("A1"), Order1:=xlDescending, _
Header:=xlNo, OrderCustom:=1, _
MatchCase:=False, Orientation:=xlTopToBottom
Range("A1").Select

my_pair = 1


my_top = 1
my_bottom = Cells(Rows.Count, "A").End(xlUp).Row
Do Until my_top >= my_bottom
If Int(Cells(my_top, 1).Value * 10) / 10 =
Int(Abs(Cells(my_bottom, 1).Value) * 10) / 10 Then
Select Case Cells(my_top, 1).Value
Case Is < 50000
colour = 4 'Bright Green
Case Is < 150000
colour = 6 'Yellow
Case Is < 250000
colour = 8 'Turquoise
Case Is < 500000
colour = 39 'Lavendar
Case Else
colour = 15 'Grey
End Select
Range("A" & my_top).Interior.ColorIndex = colour

Cells(my_top, 2).Value = my_pair


Range("A" & my_bottom).Interior.ColorIndex = colour

Cells(my_bottom, 2).Value = my_pair


my_top = my_top + 1
my_bottom = my_bottom - 1

my_pair = my_pair + 1


ElseIf Cells(my_top, 1).Value > Abs(Cells(my_bottom, 1).Value)
Then
my_top = my_top + 1
Else
my_bottom = my_bottom - 1
End If
Loop
Columns("A:C").Select
Selection.Sort Key1:=Range("C1"), Order1:=xlAscending, _
Header:=xlNo, OrderCustom:=1, _
MatchCase:=False, Orientation:=xlTopToBottom
Columns("C:D").Select
Selection.Delete Shift:=xlToLeft
Range("C1").Select
Application.ScreenUpdating = True
End Sub

I've left the colour banding in, as per the previous version. One
advantage of this approach is that you can see exactly how many pairs
have been found (523 in my test data of approx 2200 numbers). You
might want to change the line:

Cells(my_bottom, 2).Value = my_pair
to:
Cells(my_bottom, 2).Value = - my_pair

to show these as negative numbers.

By the way, the revised second link works okay for me, even though it
doesn't appear in full in the post and seems exactly the same as I had
posted previously - strange !!

Hope this helps.

Pete

On Oct 23, 12:14 am, Pete_UK <pashu...@auditel.net> wrote:
> Here's the second link in full:
>

> http://groups.google.com/group/microsoft.public.excel.misc/browse_thr...


>
> Hope this takes you there this time.
>
> I'll post a revised macro later on - I had thought that the first pair
> of numbers found would both be numbered 1, then 2 for the second pair,
> 3 for the 3rd pair etc. instead of just "Y".
>
> Pete
>
> On Oct 22, 4:27 pm, pogs...@gmail.com wrote:
>
>
>
> > Hi pete,
>
> > The numerical count is a good idea, would it assign a single value to
> > both numbers in each pair? Or a value to each number with the status
> > "paired"? Does this distinction make sense to you?
>
> > Either way, it is a good idea, i would like to see that code
> > modification, see if i can apply its functionality.
>
> > As for the links to aid in my second maddening dilemma, the first link
> > provides good explanations and some good solutions, though the second
> > link you posted is either incomplete or somehow wrong, the page does
> > not seem to exist.
>
> > Thanks for the info and yet another good code suggestion!
>

> > -pogster- Hide quoted text -

pog...@gmail.com

unread,
Oct 23, 2007, 3:57:16 PM10/23/07
to
Pete,

I believe, after autofiltering the numbers you generated, i realized
the macro finds the largest values first. The color banding gave that
one away, and i really did not notice that before. Very nice touch
with marking negative pairs with a negative numbered value. This
modification could be mighty useful for data analysis later on, thanks
so much for the code!

As for the link, which does in fact work now, i read into it and it
seems that the process is mighty complicated, and rather impractical
given the size of my data. With nearly 2200 entries being my average
data size, a FindSum program like Harlons would take a billion years
to complete on my computer...literally. And even if it could complete
in 2 seconds, it would generate billions of possible results, which i
could never sift through to find my desired answer. Unfortunatley, i
do not think i will be investigating those possibilities any
further.

But i thank you for referring me to those great sources!

Again, you have thought of a positive addition to an already great
macro. Thank you again for all of your effort, and for sticking with
this thread and with me throughout this learning process!!

You have taught me much ;)

I really cant think of any ways in which this macro could get any
better, at least for my application. I think you are officially off
the hook pete. But if you think of anything else, feel free to post
here, ill check back if i see updates!

Thanks pete.

-Pogster

Pete_UK

unread,
Oct 23, 2007, 5:53:34 PM10/23/07
to
I think one of the main differences with your application, though, is
that you have both positive and negative numbers, and that you want to
match positive with negative. Presumably you could have one positive
number which matched with two, three or more negatives (or is it the
other way round?). Obviously after the first scan in the current macro
to get a one-to-one pairing, there are fewer numbers left, and a
similar scan could then try to get a one-to-two matching, then a one-
to-three matching etc. Once a number has been matched, then it
wouldn't need to be considered in later scans, thus speeding up the
process further.

You have the experience of doing this manually - do you have many one-
to-ten or one-to-twenty pairings? Do you have to account for every
number in the list?

Perhaps if you could put up another set of test data which shows how
you have matched the numbers, I might have a go at revising the macro
further (sometime).

Thanks for your good wishes.

Pete

pog...@gmail.com

unread,
Nov 12, 2007, 1:44:06 PM11/12/07
to
Wow Pete,

Inredibly sorry for taking the better part of a month in getting back
to you here.

To answer your first question, the numbers do not have muliple
matches, i.e. -12 matches with +12 once, and they are counted, end of
story. That is how your macro works and that is the correct method.
But there can be multiple matches in the sense that there are more
than one -12 cancelling with more than one +12, and obv. if there is
an odd number of either, one will be uncancelled in the end, or a
numerous ammount of equal value and sign, which also remain
uncancelled.

Every number in the list is accounted for, but only in the sense of a
1-to-1 pairing, or lack thereof.

In continuing to use your very successful macro for most of this past
month, i have noticed that, the order my data is in (by date)
conflicts with the way the macro searches through the data.

For example, one entry of -100 should cancel with a +100 about three
entires down. But there is a +100 at the very bottom of the list. So
the first, and last -/+ 100 cancel out, are highlighted and forgotten
by the macro...whereas that second +100 a few down from the first
value, was the one which actually cancelled with that first value,
while the one at the bottom is simply a new outstanding balance.

Your macro processes one by one top value, compared to bottom, and
advances each variable position closer until they essentially meet in
the middle, am i correct?

Given this pattern, i sometimes have problems with the wrong pairs
being created. Usually not too many so going through by hand and
correcting doesnt take too long. The pairs are not wrong when
speaking strictly numerically, but in regards to real-world scenarios,
the wrong ammounts are being paired.

I have tried resorting my data in a way that would better suit the
macro's processes, but theres really no way i can think of to do it
properly.

To explain the reason for this (and i will attach an expanded dataset
as well) is that some of these journal entires are marked "N" for
manual, "B" for Will reverse at the beginning of next period, and "R"
for this is a reversal entry. Any reversal entries will obviously be
cancelling something out prior to the date at which that reversal is
posted and should be highlighted. The N is simply a manual entry
which could be unique (uncancelled) or a manual reversal as well.
Whereas the B marks an entry which will be automatically reversed at
the beginning of the next month/period. If the B is in this period, it
is highly probable that it remains uncancelled, unless some idiot went
in and reversed it manually...which would only create a problem later
since the system will still autoreverse.

I guess this will be easiest to understand with a dataset so i will
attach one for you to look at.
A Few tips:
The journal ID's usually coincide with B's and their corresponding
R's...unless it is a manual N entry, in which case the Journal ID's
could differ. I thought about this when i was trying to resort my data
to better cater to the macro's process.

I hope this answers your questions and doesnt cause too many
problems. If you do not feel the need to work on this project, that
is totally fine, as you have done so much for me already. I will
simply continue forward manually(highly likely) until i become a VBA
guru (highly unlikely).

If you are still interested in taking a further crack at this problem,
your help would be greatly appreciated!!
I will post a link to a secure website containing the dataset of which
I speak tonight (as i cant upload anything here at work). It will
expire in 7 days, let me know if the link does not work and i will
repost.

Thanks again pete.

-Pogster

P.S. - i swear not to let this forum get too dusty, i will check back
regularly from now on!

pog...@gmail.com

unread,
Nov 12, 2007, 2:17:48 PM11/12/07
to
Pete,

Also, part of it may be that i requested the macro just check till the
TENTHS place (first after the decimal), when in fact, it may be wiser
to check to the hundreth's place, just to be safe.

Can you modify the code to do that? Or how can I?

Link to sample will be up tonight.

Thanks Pete.

Pete_UK

unread,
Nov 12, 2007, 6:03:22 PM11/12/07
to
Thanks for resurrecting the thread and giving some feedback on using
the macro.

As I understand it, you extract just a single column of numbers from a
multi-column file, and the macro works on that column, ensuring it
remains in the same order at the end so that you can paste it back
into the original file.

But, if you also copied the dates associated with each number into
your extracted file, then it would be possible to sort the data not
only by number, but also by date, such that you have the largest
positive number going down to the largest negative number, but also
ensuring that equal (positive) values are listed in increasing date
order and equal negative values are listed in decreasing date order.
That way the macro would find the earliest +100 and match it with the
earliest -100.

However, this might not be exactly what you want, though. Imagine you
have -100 on 1st Nov, +100 on 3rd Nov and -100 on 4th Nov. YOU might
have reasons for pairing the 3rd and 4th November numbers and leaving
the 1st Nov value unpaired, but if these were the only values of 100
then the (date-enhanced) macro would pair the 1st and 3rd Nov values
together.

This being the case, would you prefer to match equal numbers on the
minimum days spread? I'd have to think a bit about an appropriate
algorithm for that !! Or could it be that you always have a +ve number
first, and the -ve match to this always occurs after the +ve? (or vice-
versa). Is it possible to incorporate unpaired values from an earlier
month (or whatever period your file covers), and then begin matching
those?

Of course, another approach would be to have the macro applied
directly to your original file, as I mentioned in an earlier post.
There might be other items of data in there that would help to
identify the appropriate pairings, and it would cut down on you having
to extract the data as you do now. You can easily use File | Save As
to save the file with a different name, so that the original is not
changed, and even this can be automated quite easily within the macro.

If you feel you can trust me, you might like to send a file directly
to me:

pashurst <at> auditel.net

Change the obvious.

Though I prefer to keep discussions going in the newsgroups, so that
everyone can benefit, there are times when sensitive data needs to be
kept more secure.

Pete

Pete_UK

unread,
Nov 12, 2007, 6:30:44 PM11/12/07
to
I've just been checking back on earlier posts. My first macro was
comparing exact values, so you could re-instate that line if you wish.
A few days later I gave you a modification to look only at the tenths.
If you now want to look at the hundredths, then you should modify that
line in the macro and change 10 to 100 (four times in the line).

I also noticed that your posting on October 9th said that you may have
a one-to-three way cancelling, which contradicts what you said earlier
- just trying to get a more complete understanding. <bg>

Pete

pog...@gmail.com

unread,
Nov 14, 2007, 9:10:40 AM11/14/07
to
Hey Pete,

Well now that you have the source file, you can understand the kind of
situation i am dealing with. The order matters because of the other
data attached to each value.

As for the one-to-three way cancelling, im not sure exactly what i may
have said before, but the definitive version is that one value will
cancel with its opposite once. There may be multiples of that value
in a + or - fashion, but if there are an even number of each sign,
all of the values will probably cancel.

As for the macro running directly in the source file, it would be
useful. I could just sort the numbers how i saw fit, and then run the
macro right there. Unfortunatley, dates and value sizes arent the
only things i need to worry about when developing a solution to
cancelling out values without making any mistakes. Its hard to cover
all of the bases because there are so many variables...date of
posting, current period, journal status (N, B, or R), etc.

If you wanted to alter the macro to make it run in the source file
(which resembles what i sent you), i would welcome that change. It
would be nice if you could just add comments to the code so i knew how
to change which columns it looked at, etc...

I will revert the tenths to the hundreths place in the macro's search,
that was my mistake for changing it, it is better off being more
precise.

As for my earlier problem of the wrong pairs being created (unique N's
being highlighted instead of the B and R entires that should cancel),
it is a pretty easy fix manually, all i do is sort out the uncancelled
values, and sort by Journal type to weed out all the R's which i know
should cancel. Then i find their corresponding N value which was
highlighted mistakenly, and just reverse the highlighting so the
correct one is left unhighlighted (N).

I believe a combination of resorting the data, possibly in the way in
which you suggested, along with comparing numbers to the hundreths
place, and possible having the macro pay attention to journal entry
type (N, B, R) will solve this problem.

Let me know once you have looked at the dataset if you had any ideas
as to how to cater to this particular issue.

Thanks again for all of your help Pete.

-Pogster

Pete_UK

unread,
Nov 14, 2007, 2:57:59 PM11/14/07
to
Hi Pogster,

You said in an earlier thread that your data was from column O of your
source file, but the file I now have has this in column G, so it is
not an exact copy of one of your source files. To apply the macro to
the source file I would need to know what columns are used in your
source file and where the data actually starts, as you seem to have a
header row now. Could you send me one of your source files with all
the data removed, so I can see what it looks like? You can add
comments to the file instead of data to point out anything you think I
should know.

What I have in mind is for the macro to be located in its own file -
let's say Compare_master.xls. When the macro is run it will ask you to
identify the folder and file with a normal File|Open dialogue, and it
will then open the source file and act upon it. At the end of the
macro the amended file will be saved automatically with a different
name (eg with "_a" added on to the filename), so that the original
source file and the Compare_master file remain unchanged, and you can
do what you like with the amended file.

In the first file I downloaded, your values only had 2 decimal places,
whereas the latest has 3. The macro could easily be made to scan
through the data looking for exact matches first, then looking for
matches on 1/100ths, and finally looking for matches on the tenths
(maybe an appropriate use of the colour coding?!). This shouldn't slow
things down too much - maximum it would be is three times the time
taken now (i.e. 6 seconds).

To get the date sorted as I suggested the other day, it's just a
matter of making the sign of the number representing the date the same
as the dollar amount, so a negative number will have a negative date
(this won't display correctly, but we're not bothered about that).
However, I have to bear in mind what you have said about the other
codes, so I'm still getting my head around that. These codes might be
obvious to you because you use them everyday, but it will take me a
while to get used to them. Any other codes I should know about?

Pete

pog...@gmail.com

unread,
Nov 16, 2007, 11:10:45 AM11/16/07
to
Pete,

Having the macro in its own file and having it run directly within my
source file is a great idea!

As for scanning multiple times with different criteria of decimal
place; the issue here isnt even matching just numbers anymore, that is
the easy part as you have already coded a suitable solution. The
problem now is cancelling the correct matching numbers, with the added
difficulty of "number status" as i explained earlier with the Reversal
column (B,R,N). I don't think that feature will be necessary,
scanning to the hundreths should be sufficient for my application.
There will always be one or two mistakes, i will do a visual search
myself to insure accuracy. No application is perfect.

But it is a good suggestion, maybe just one that i wont need to
implement at this moment. But if you wish to have two seperate
versions of the macro, please do, by all means. But the one without
this added code may be more useful to me.

That is a very interesting suggestion with the date sorting and making
negative numbers negative dates. Very smart solution. If the macro
could auto sort in the desired pattern, run itself, and then get rid
of the negative dates (normalize them again), that would be immensly
useful. The sort can remain, i can always autofilter and sort later
on without a problem.

As for the file, i have sent the full file to you, but it comes with a
sidenote: the queries run to generate these files change all the
time, different people modify them for different reasons, and i do as
well. Therefore, the order and placement of the columns is not
completely static. Therefore, if the macro is column-centric, it
would be nice to have some comments along with the code so i can
revise it myself if need be. If its not too much trouble.

One other way around this would be to look for the column names (which
are in fact static), and then adjust the column values automatically
at which the macro looks. Would this be a possible solution? Dont
worry about it for now though. Feel free to assume the columns dont
move, but just letting you know this is not always the case.

I hope you are being challenged with this macro dilemma, so far you
have been passing with flying colors, and helping me alot in the
process.

Thanks again Pete,

-Pogster

Pete_UK

unread,
Nov 16, 2007, 12:44:39 PM11/16/07
to
The main challenge is actually finding time to sit down undisturbed
and get the coding done - perhaps over the weekend, as it may be too
cold to go digging in the garden !! <bg>

From what you have said about the Reversal column, this has to take a
higher priority than the date, although date may come into it if we
have matching reversal codes. In my earlier posts I was hoping that
the date would be the key to identifying the matches, but it is clear
now that that is not the only criterion.

It's a pity about the moveable columns in your source sheet, but I
could include in the Compare_master file a facility whereby you could
specify which column is which before running the macro. Picking up on
the column names is another way, but I don't like relying on "... this
will never change ...", as invariably it will sometime in the future.

I had been planning to use the sheet in the compare_master file,
anyway, to allow you to specify the default path to the folder where
your file is, so it's just an extension of that.

Pete

pog...@gmail.com

unread,
Nov 16, 2007, 2:27:24 PM11/16/07
to
Pete,

You are probably right, its better to plan around every possibility
instead of settling on assumptions. Specifying which is which column
would probably be best.

It is going to be freezing this weekend, but please, by all means,
take your time. There is no rush on this, the last thing i want is
for you to lose sleep or massive ammounts of time over this. Like i
said before, the original macro is almost perfect, and finding the
errors does not take that much time, just a matter of simple
autofiltering.

Please let me know how it goes though, either way, or if you need any
more information from me. Ill be checking back here as often as
possible.

Again, thank you for all of your effort Pete.
Have a nice weekend!

-pogster

0 new messages