Partial sorting of an array

139 views
Skip to first unread message

Muneer

unread,
May 28, 2015, 2:31:42 PM5/28/15
to arrayfi...@googlegroups.com
Hi guys,

I'd like to get the top 1% of values in an array. 

In C++ I'd do this with std::nth_element or std::partial_sort. But what's the best way of doing this in ArrayFire?

Cheers,
Muneer

Pavan Yalamanchili

unread,
May 28, 2015, 2:38:44 PM5/28/15
to Muneer, arrayfi...@googlegroups.com
Hi Muneer,

Currently there are no functions that perform what you are asking for. If the input is distributed uniformly, you could use where + sort to get a fuzzy answer in the following manner

sort(where(input > max<double>(input) * 0.01));

We can perhaps look into implementing this in the future.

--
Pavan

--
You received this message because you are subscribed to the Google Groups "ArrayFire Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to arrayfire-use...@googlegroups.com.
To post to this group, send email to arrayfi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/arrayfire-users/ebda45cc-b4dd-4f46-86dd-5eb51bb4668c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Muneer

unread,
May 28, 2015, 3:21:01 PM5/28/15
to arrayfi...@googlegroups.com, mun....@gmail.com
Hi Pavan,

Thank you for your amazingly fast reply as usual.

Yes currently I'm doing something similar to what you suggest (ie. taking a random sample, getting the max and then filtering the input with that threshold value). It's inaccurate though and suffers some calibration need (ie. size of the sample / the 0.01 value), which if I get wrong it messes up.

If I replace the estimate methodology with sort() it increases the runtime by about 4x the estimate method.

I guess all I need is something that would tell me what would the value be at position n if the array was sorted. I don't need the array to be sorted, not even the first part of it to be sorted (like std::nth_element guarantees). Because I can then use that threshold value to find the guys in the array that are greater than it.

What would be the best way to get the threshold value accurately? Do I have to resort to sort or is there another way?

Kind regards,
Muneer

Pavan Yalamanchili

unread,
May 29, 2015, 2:59:02 AM5/29/15
to Muneer, arrayfi...@googlegroups.com
Hi Muneer, 

After doing some research, I don't think there is an easy way to have a parallel version of this algorithm that performs well on the GPUs. In most cases the sort might be faster than any alternatives we come up with.

If we implement "partition" instead of where, you could do something like this: 

Implementing partition in place of where can be done with just a little bit of work.

However if the loop runs more than 4 times, I think it wont provide a faster solution in your case.


William Tambellini

unread,
Apr 6, 2017, 12:42:18 PM4/6/17
to ArrayFire Users, mun....@gmail.com
Hi Pavan
I ve seen some references of partial sorting for parallel computing (from slide 29):
http://on-demand.gputechconf.com/gtc/2009/presentations/1034-Multi-GPU-Recommendation-System.pdf
Would you know if there is any cuda functions to do that from AF (libafcuda) ?
Cheers
W.

John Melonakos

unread,
Sep 27, 2021, 3:01:33 PM9/27/21
to ArrayFire Users
I saw that this thread is the 26th reference in a recently published paper, https://arxiv.org/pdf/2109.08219.pdf

Here is the documentation for ArrayFire's topk function, which was released more recently than this thread:

https://arrayfire.org/docs/group__stat__func__topk.htm
Reply all
Reply to author
Forward
0 new messages