Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How to tell if an a date occurs multiple times in an array/collection

1 view
Skip to first unread message

laredotornado

unread,
Nov 18, 2009, 1:07:34 PM11/18/09
to
Hi,

I have an array of Calendar objects. How can I tell if any of the
object's values in the array occurs more than once (e.g. two objects
with values "01-01-1999 9:00 AM")? If it is easier, I can convert the
Calendar[] array to some other type of collection.

Thanks, - Dave

Peter Duniho

unread,
Nov 18, 2009, 1:31:19 PM11/18/09
to

The usual "most-efficient" approach would be to sort the data and then
look for consecutive duplicates.

A less efficient alternative is to examine each element and search the
array for a duplicate. But that's O(n^2) instead of O(n log n). You'd
only want to do it that way if your collection is extremely small or for
some reason sorting (optionally copying before) the data was impossible.

If you have some other advance knowledge about the array or the data,
there may be other more efficient ways to perform the test. But given
the information so far, it seems to me that those are your main options.

Pete

Patricia Shanahan

unread,
Nov 18, 2009, 1:54:13 PM11/18/09
to

Do your Calendar objects all have the same settings for the things,
other than time, that affect Calendar equality? See
http://72.5.124.55/javase/6/docs/api/java/util/Calendar.html#equals%28java.lang.Object%29

If so, you can do the test using a HashSet<Calendar>. Add each Calendar
to the set, checking the add method's return value. A false result
indicates a duplicate.

Patricia

Lew

unread,
Nov 18, 2009, 2:07:43 PM11/18/09
to

If you load the data into a 'Set <Calendar>' you cannot have any
duplicates.

Set <Calendar> calends =
new HashSet <Calendar> ( Arrays.asList( getArrayOfCalendars() ));

You can iterate over the array and load the counts into a
'Map <Calendar, Integer>'.

Map <Calendar, Integer> counts =
new HashMap <Calendar, Integer>
( getArrayOfCalendars() .length * 4 / 3 );
for ( Calendar calend : getArrayOfCalendars() )
{
Integer k = counts.get( calend );
counts.put( calend, (k == null? 1 : k + 1) );
}
for ( Map.Entry <Calendar, Integer> entry : counts.entrySet() )
{
if ( entry.getValue() > 1 )
{
System.out.println( "Duplicates found for "+ entry.getKey() );
}
}

This should be about O(n) for performance.

--
Lew

Dagon

unread,
Nov 18, 2009, 7:25:34 PM11/18/09
to
>laredotornado wrote:
>> I have an array of Calendar objects. How can I tell if any of the
>> object's values in the array occurs more than once (e.g. two objects
>> with values "01-01-1999 9:00 AM")? If it is easier, I can convert the
>> Calendar[] array to some other type of collection.

Heh, it's been a long time since I've seen this interview question. I have a
secret hope that it's a real problem, but it seems unlikely.

I'm going to assume that you care about millisecond precision, but do not care
about what non-Date data (like timezone) the Calendar contains. Thus, I'll
compare the getTimeInMillis() rather than just using .equals() or doing
something more complicated to truncate to minutes as your example shows.

Peter Duniho <NpOeS...@NnOwSlPiAnMk.com> wrote:
>The usual "most-efficient" approach would be to sort the data and then
>look for consecutive duplicates.

That's most efficient only if you don't want to allocate additional space.
If you don't mind having a separate data structure, it's more efficient
to build a hash as you check for duplicates.

O(n) for a hash-based solution, O(n * logn) for the sort.

>A less efficient alternative is to examine each element and search the
>array for a duplicate. But that's O(n^2) instead of O(n log n). You'd
>only want to do it that way if your collection is extremely small or for
>some reason sorting (optionally copying before) the data was impossible.

Build the search as you go. something like:

Map<Long, Calendar> calendars = new HashMap<Long, Calendar>();
for (Calendar c : calendarArray) {
long calValue = c.getTimeInMillis();
if (calendars.containsKey(calValue)) {
// you have a duplicate! Do whatever you must!
}
calendars.put(calValue, c);
}
--
Mark Rafn da...@dagon.net <http://www.dagon.net/>

Arne Vajhøj

unread,
Nov 18, 2009, 10:17:32 PM11/18/09
to

Set is probably the solution.

But the implementation will depend a bit on what you consider same
value.

same second, timezone does not matter
same millisecond, timezone does not matter
same second, same timezone
same millisecond, same timezone
same second localtime
same millisecond localtime

Arne

Roedy Green

unread,
Nov 19, 2009, 1:06:25 PM11/19/09
to
On Wed, 18 Nov 2009 10:07:34 -0800 (PST), laredotornado
<laredo...@zipmail.com> wrote, quoted or indirectly quoted someone
who said :

An array is probably the most convenient and fast format.

Just do a sort with Arrays.sort so the dups will be side by side.

then scan down the array comparing current with prev marking dups as
null.

Track how many are left. Allocate an array that big and copy non-nulls
into it.

see http://mindprod.com/jgloss/sort.html
--
Roedy Green Canadian Mind Products
http://mindprod.com
Finding a bug is a sign you were asleep a the switch when coding. Stop debugging, and go back over your code line by line.

Roedy Green

unread,
Nov 19, 2009, 1:08:33 PM11/19/09
to
On Wed, 18 Nov 2009 10:31:19 -0800, Peter Duniho
<NpOeS...@NnOwSlPiAnMk.com> wrote, quoted or indirectly quoted
someone who said :

>The usual "most-efficient" approach would be to sort the data and then
>look for consecutive duplicates.

Another approach is to use a HashSet. Fill the set, it will
automatically eliminate dups then extract the set as an array.
The code for this is simpler, but I suspect it is slower.

Peter Duniho

unread,
Nov 19, 2009, 1:45:37 PM11/19/09
to
Roedy Green wrote:
> On Wed, 18 Nov 2009 10:31:19 -0800, Peter Duniho
> <NpOeS...@NnOwSlPiAnMk.com> wrote, quoted or indirectly quoted
> someone who said :
>
>> The usual "most-efficient" approach would be to sort the data and then
>> look for consecutive duplicates.
>
> Another approach is to use a HashSet. Fill the set, it will
> automatically eliminate dups then extract the set as an array.
> The code for this is simpler, but I suspect it is slower.

Well, as others have pointed out, if it's acceptable to create a new
data structure (which if you're going to copy before sorting, it must
be), hashing is _faster_ than sorting. It will be highly dependent on
the input and the cost of computing the hash value as compared to simply
performing ordering comparisons.

But I think you'd have to work pretty hard to find an example where
hashing is slower than sorting, and even once you do I doubt it'd be all
that interesting an example (probably requires very long unique strings,
rather than Calendar objects, and also probably a relatively small data
set). And even if you have to extract the set back out as an array,
that's still only O(n) total (hashing and extraction) rather than O(n
log n).

Pete

Teraposa Lunodas

unread,
Nov 24, 2009, 7:34:24 AM11/24/09
to
Do this
0 new messages