array - array to be searched, where time member is ordered
count - size of array
t - time
i - desired index where (array[i].time<t<array[i+1].time)
int i=count/2,n=count;
for(;;)
{
n=n/2;
if (t>array[i].time)
i+=n;
else
i-=n;
if (n<=1)
return i;
}
It 'usually' works, but I can always find conditions where it doesn't
and I'm not sure why. It works fine for sorted integers.
Noting the wikipedia page on binary search, it seems most programmers
mis-program a binary so I've probably made the same mistake!
A linear search like this always gives me the right answer, but takes
longer:
for(int i=0;i<count;++i)
{
if (t>array[i].time && t<=array[i].time)
return i;
}
I need speed though!
Thanks.
I can't follow your logic, probably because of the flaw.
So is it always supposed to find a match? What happens if the match
does not exist?
I suggest calling your variables top and bottom.
And there are 3 cases, not two. One is your pivot point is less than
the target, your pivot point is greater than the target or your pivot
point equals the target. You exit if top<bottom.
And i+=n and i-=n make no sense at all. In a true binary search just
keep adjusting your top and bottom, and since you didn't match on your
pivot increment it or decrement it depending on the case.
To be honest, I'm surprised that code works ever.
If what you are trying to do is drastically different than finding a
value in a sorted list, then I apologize. I guess I don't understand.
Further, I suggest you make test cases and test any function you write
with a couple thousand test cases of random data with different array
sizes.
Also, are these just integers? What are their ranges? There may be
something far faster than a search (like a hash or array lookup) if
speed is really so important.
1. What is the case when something isn't found
2. What is the data type
3. How many items are there
4. What are the data ranges on the items
5. Are there ever duplicates
6. Anything extra? (They are expected to be sequential, they need to be
able to remove quickly etc)
With this information we may be able to suggest a better approach (even
if I did do lousy in my last topcoder match . . bleh)
--
LTP
:)
Oh! One more question. What language is this? And what else is in the
data item?
Keep in mind, if you need to add or remove items and find quickly, you
may want to use a tree and many modern languages have pre-existing tree
structures. For instance, TreeSet in Java.
.NET libraries have a convenient Dictionary class, which you may find
useful.
STL libraries can be extended in C++. etc. etc.
Though it is a good exercise if you don't know how, you shouldn't HAVE
TO re-invent the wheel ;)
--
LTP
:)
> I suggest calling your variables top and bottom.
>
> And there are 3 cases, not two. One is your pivot point is less than
> the target, your pivot point is greater than the target or your pivot
> point equals the target. You exit if top<bottom.
To clarify, what he's suggesting is this:
(NOTE: logic completely done in newsreader, no syntax or correctness
checking done. But this should be easy to debug.)
top = array.length
bottom = 0
foundIndex = -1
while (top > bottom)
{
mid = (top + bottom) / 2
if array [mid] == desiredValue
foundIndex = mid
break
else if array [mid] > desiredValue
top = mid
else
bottom = mid
}
return (foundIndex)
--
Please take off your pants or I won't read your e-mail.
I will not, no matter how "good" the deal, patronise any business which sends
unsolicited commercial e-mail or that advertises in discussion newsgroups.
Afraid this doesn't work, as it assumes desiredValue is an integer, or
at least directly present in the list. eg I could have
0.5
1.2
5.6
5.8
7.4
9.0
for desiredValue=6.2 then I would like index 3, because index 3&4
bracket the desiredValue.
It was not as easy as I initially assumed, even though my algorithm
works for most cases.
It isn't finding a match. The item may not be in the list. I'm looking
for items that bracket it.
> I suggest calling your variables top and bottom.
>
> And there are 3 cases, not two. One is your pivot point is less than
> the target, your pivot point is greater than the target or your pivot
> point equals the target. You exit if top<bottom.
This, and trivial list cases, were 'removed for clarity'.
> And i+=n and i-=n make no sense at all. In a true binary search just
> keep adjusting your top and bottom, and since you didn't match on your
> pivot increment it or decrement it depending on the case.
yes I did, I just didn't use a top and bottom, but rather an index 'i'
and and offset 'n' which keeps halving.
> To be honest, I'm surprised that code works ever.
It's been working for literally years. I'm now chasing bugs.
> If what you are trying to do is drastically different than finding a
> value in a sorted list, then I apologize. I guess I don't understand.
The list is a list of floats, and the value is a float that may, but is
probably not, in the list. There will be values that bracket it though.
Imagine recording (and time-stamped) 'frames', and attempting to find
the two frames around the current playback time (perhaps for interpolation).
> Further, I suggest you make test cases and test any function you write
> with a couple thousand test cases of random data with different array
> sizes.
Will do. But i have nothing to start with.
> Also, are these just integers?
I thought it was clear. Timestamps are unlikely to be integers.
I have noticed that my code seems to work fine for integers however, but
this is not useful to me.
> What are their ranges? There may be
> something far faster than a search (like a hash or array lookup) if
> speed is really so important.
Nope.
Thanks for the reply!
C++. It's actually a template function that can take an array of
anything, as long as it has a 'time' member of either float or double.
Arrays of such things are quite common in my code, and I often need to
find the one nearest the current time (or specifically, the two that
bracket it).
> Keep in mind, if you need to add or remove items and find quickly, you
> may want to use a tree and many modern languages have pre-existing tree
> structures.
The lists are generally immutable. Manipulating them is not a problem
performance wise. Searching them is.
> For instance, TreeSet in Java.
Java? I said I needed speed:)
> STL libraries can be extended in C++. etc. etc.
I have had bad experiences with the STL (which no longer exists as such
since it was integrated into C++98 - bad idea IMO) and try to avoid its use.
> Though it is a good exercise if you don't know how, you shouldn't HAVE
> TO re-invent the wheel ;)
Sometimes that's simpler. Once you've seen a wheel working after all...
Anyway, it's seems to be quite hard finding cases where this has been
done before. Perhaps the description 'binary search' is wrong (implying
integers?), and a search of this sort is normally called something else
and that's part of my problem.
Cheers
template <class T> long binary_search
(double t,T *array,long count)
{
//handle trivial cases:
if (count<2) return -1;
if (count==2) return 0;
if (t<array[0].time)
return 0;
if (t>array[count-2].time)
return count-2;
//binary search:
long top=count-2;
long bottom=0;
for(;;)
{
long mid=(top+bottom)/2;
if (t>array[mid].time && t<=array[mid+1].time)
return mid;
else if (t>array[mid+1].time)
bottom=mid;
else //if (t<=array[mid].time)
top=mid;
}
}
> > top = array.length
> > bottom = 0
> > foundIndex = -1
> > while (top > bottom)
> > {
> > mid = (top + bottom) / 2
> > if array [mid] == desiredValue
> > foundIndex = mid
> > break
> > else if array [mid] > desiredValue
> > top = mid
> > else
> > bottom = mid
> > }
> >
> > return (foundIndex)
>
> Afraid this doesn't work, as it assumes desiredValue is an integer, or
> at least directly present in the list. eg I could have
>
> 0.5
> 1.2
> 5.6
> 5.8
> 7.4
> 9.0
>
> for desiredValue=6.2 then I would like index 3, because index 3&4
> bracket the desiredValue.
Ah, that wasn't in the original spec. I thought you were searching for
an exact value.
> It was not as easy as I initially assumed, even though my algorithm
> works for most cases.
If the value does not exist in the array, then what is it you'd like to
be returned? 1 greater than the largest index still smaller than the
value? I don't think your other solution (posted after this) does that.
I agree - this is nothing like what you originally asked for.
It is important to define exactly what it is that you want - this
enables you to test it and know if it is working.
It doesn't have to be elaborate. For example:
A function which returns the first item in a sorted array greater than
or equal to a given value.
I work in QA right now, and it is my job to find minor discrepancies
such as these :)
--
LTP
:)
The linear example I gave in the original post should have made it
clear. I wanted something functionally equivalent but faster.