Similar arrays

11 views
Skip to first unread message

Nilu

unread,
Feb 11, 2012, 12:54:10 AM2/11/12
to nextge...@googlegroups.com
Find if two arrays are similar, i.e, both contain same number of elements equally.

For eg., first_array = {4, 5, 1, 4, 6, 6, 9, 2}, , and second_array = {1, 4, 4, 9, 2, 6, 6, 5} are similar because both contain same number of elements and the same set of elements.

Narendra Chennupati

unread,
Feb 11, 2012, 1:53:46 PM2/11/12
to nilu...@gmail.com, nextge...@googlegroups.com
I can think of two solutions

Soulution-1
-----------------
 Sort the both arrays and compare --> O (N log N) time complexity, and constant space complexity

Solution-2
---------------
Hash the elements in Array-1 with their counts
Iterate through Array-2 and check if each element is present in the hash, if yes reduce the count....if the count becomes zero, don't consider that element.

Time complexity --> O (N) and Space complexity O(N)

-Narendra

Nilu

unread,
Feb 12, 2012, 12:31:41 PM2/12/12
to nextge...@googlegroups.com, nilu...@gmail.com
Yes, the above two solutions would work, but how about any other way of findoing out (i.e, not using sorting or hashing)?

Narendra Chennupati

unread,
Feb 12, 2012, 12:48:28 PM2/12/12
to nextge...@googlegroups.com, nilu...@gmail.com
Yeah...XOR both the arrays and if both are same then the result should be zero.

Jyoti

unread,
Feb 13, 2012, 12:21:45 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
Curious problem, ingenious solution in O(n)... In practical situation we would want to make it O(1)... how?

siddharth agrawal

unread,
Feb 13, 2012, 2:35:50 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
I don't think this is correct solution. Let's say A1={1,2,3,4,4} and A2={1,2,3}; then even in this case A1^A2==0

Jyoti

unread,
Feb 13, 2012, 2:38:02 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
Nice catch! Any comment on the O(1) challenge that I put?

Narendra Chennupati

unread,
Feb 13, 2012, 3:12:36 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
I am curious how you solve this with out even iterating through the array....

Narendra Chennupati

unread,
Feb 13, 2012, 3:11:42 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
Two simple counters no_of_elements_in_array1 and no_of_elements_in_array2  will mitigate that problem.

Check if both the counters are sames.

Jyoti

unread,
Feb 13, 2012, 3:27:17 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
You already know the solution. Maybe my question is not descriptive enough... if you have your class array what would you do to immediately figure out if two of your array objects are similar... the point is you have the liberty to add features to your array class, you designed it. It becomes simple now... There cannot be a solution in O(1) otherwise (that can be proven, right?).

Narendra Chennupati

unread,
Feb 13, 2012, 3:49:02 AM2/13/12
to nextge...@googlegroups.com
I see.

You might need to add two more member variables to your Class

XOR
No Of Elements

siddharth agrawal

unread,
Feb 13, 2012, 3:57:46 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
Even that won't help in cases like A1={1,1} and A2={2,2}

Narendra Chennupati

unread,
Feb 13, 2012, 4:07:06 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
storing element's sum may help?

Jyoti

unread,
Feb 13, 2012, 4:15:25 AM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
Siddharth is "stress testing"! :-) And he would have already written a program to do the task and is sitting there and grinning at us... ;-)

There can be a number of ways once you have to achieve it if you have the liberty to design he class... basically the definition of 'similar array' above as given by Nilesh reminds of multiset acutely! And then... surprise... C++ set or multiset does not have an equality operator. That tells us at least that it is not trivial. And another surprise... the term multiset (and thus the study of this math construct) goes only as back as 1970s.

Nilu

unread,
Feb 13, 2012, 10:57:31 PM2/13/12
to nextge...@googlegroups.com, nilu...@gmail.com
I think there are few other ways but they would not be very efficient.

1) Let us take the example given, first_array = {4, 5, 1, 4, 6, 6, 9, 2}, , and second_array = {1, 4, 4, 9, 2, 6, 6, 5}. A number, in general, can be factored into its primes, so, if we consider each element of an array to be denoting nth prime, n being an element in array, and then computing its product. If products from both arrays are same, then they would be similar

So, as per example, for first_array, 4th prime * 5th prime * 1st prime * 4th prime * 6th prime * 6th prime * 9th prime * 2nd prime. Similarly, for second_array. So, product would be same, but with large data sets, it would overflow (it becomes a different problem then).

2) We could also consider first_array and second_array to be permutations of each other. In such a case, we could start with a smaller array (which in this case would be second_array because it is 14492665 which is lesser than first_array, i.e, 45146692), and then start computing next permutation(in C++, std::next_permutation). If first_array is not achieved from second_array, then they are not similar.

But with Narendra's solution with the following criteria :
1) Number of elements in both arrays
2) first_array ^ second_array == 0
3) sum(first_array) == sum(second_array)

Are there cases where it would fail?

siddharth agrawal

unread,
Feb 14, 2012, 1:25:50 AM2/14/12
to nextge...@googlegroups.com, nilu...@gmail.com
1st solution is considerable for smaller numbers, but u would need a database for all the prime numbers in this case. What would be the complexity in second solution? Regarding Narendra's solution, i think even this one is not correct. Let's say, 
A1={1,2,3} and A2={3,0,3}
now, 
  1. n(A1)==n(A2)
  2. A1^A2==0
  3. sum(A1)==sum(A2)
All the conditions are satisfied but as we can see that both the arrays are not similar.

-Siddharth

Narendra Chennupati

unread,
Feb 14, 2012, 1:57:52 AM2/14/12
to nextge...@googlegroups.com, 4u.sid...@gmail.com, nilu...@gmail.com
Awesome test case. I was about to say one more condition XOR(A1) == XOR(A2). But then I looked at your counter example

I think the time complexity of the solution 2 is O(n!) which may be considered as polynomial time complexity.
320.gif

Jyoti

unread,
Feb 14, 2012, 4:43:58 AM2/14/12
to nextge...@googlegroups.com, 4u.sid...@gmail.com, nilu...@gmail.com
I would rule out using prime numbers. Apart from the difficulty of 'bignums' what if they are 'objects'? Of course you can still take hash of object and work with 'that' number, but that is superfluous because you can directly use the hash value with a simpler algorithm anyways.

In my view I thought the simplest algorithm to do it would be O(pow(n,2)). You take each element from one array, go through each (unmarked) element in the second array and mark the element found (one n bit array used).The good part is that this algorithm will not need hashing and its 'worst' case is O(pow(n,2)).

Till now the best I see in the discussion here is the solution originally proposed by Narendra, which will work on generic objects and work in O(n) for time as well as space complexity and makes a great way to tackle the challenge in a generic way! I would be willing to believe that this is the best solution if I am not allowed to touch the design of the array data structure. Multisets are implemented as either hash-tables or some-sort-of-tree depending on what operations would you want to be faster.

Here is one link that tells you how mathematical equality of multisets is:

How am i even claiming that this can be done in constant-time? If there is hash function with the below mentioned property I would just maintain a hash value of the array at all the times: 
compute_new_hash(old_hash, operation_add_element, element)
compute_new_hash(old_hash, operation_removed_element, element)

Sad to say that the commonly known hash functions like md5 and sha1 do not have this property. Humm... Well it turns out that there are hash functions with the above property!  

Matlab does provide equality operation on multisets, I wonder if they do the O(pow(n,2)) or O(n) or O(1) internally? I would have implemented the O(pow(n,2)) or got someone else to implement and validate the O(1). ;-)

Reply all
Reply to author
Forward
0 new messages