Python-like std::argmax?

775 views
Skip to first unread message

Andrew Hoelscher

unread,
Nov 11, 2013, 8:36:08 AM11/11/13
to std-pr...@isocpp.org
I frequently find myself needing to iterate over a container and get an iterator to the element that maximizes or minimizes some fairly expensive function, and I don't think there's a good algorithm in the standard library to do this as well as possible. Here's what I mean:

template<class ForwardIterator, class UnaryFunction>
ForwardIterator argmax(ForwardIterator first, ForwardIterator last, UnaryFunction func)
{
 
if( first == last ) return last;

 
auto best_score_so_far = func(*first);
 
ForwardIterator best = first;

 
for(auto i = first + 1; i != last; i++)
 
{
   
const auto current_score = func(*i);
       
   
if( current_score > best_score_so_far )
   
{
      best_score_so_far
= current_score;
      best
= i;
   
}
 
}

 
return best;
}

This implementation uses O(1) space and exactly N calls to func. All the implementations that I can think of that use calls to standard algorithms either use more space or make more calls to func. If func is some hellaciously-expensive function, then the extra calls can be problematic.  If you're running in a memory-constrained environment or are just working with very long lists, the extra memory used can be problematic (or could blow out your cache).


Here are two solutions I can see to use. The first one is a pretty naive implementation using max element that calls func 2(N-1) times:

template<class ForwardIterator, class UnaryFunction>
ForwardIterator argmax(ForwardIterator first, ForwardIterator last, UnaryFunction func)
{
 
return std::max_element(first, last,
   
[&](typename ForwardIterator::reference a, typename ForwardIterator::reference b)
   
{
     
return func(a) < func(b);
   
}
 
);
}


Since the problem is that the implementation calls func too many times, we could use std::transform to get a vector of all of the results of func in the range [first, last), and then use max_element over that vector, but now we're using O(N) space, traversing the range twice (once for std::distance and the other for std::transform), and we have a dynamic allocation (for std::vector):

template<class ForwardIterator, class UnaryFunction>
ForwardIterator argmax_spacehog(ForwardIterator first, ForwardIterator last, UnaryFunction func)
{
 
typedef decltype(func(*first)) return_t;

  std
::vector<return_t> scores(std::distance(first, last));

  std
::transform(first, last, scores.begin(), func);

 
auto highest_score = std::max_element(scores.begin(), scores.end());

 
return first + std::distance(scores.begin(), highest_score);
}


The upshot is, I've found that an argmax function has been a bottleneck in a few of my projects, and I've ended up including the hand-rolled code snippet quite a few times. Is this something that the library working group might be interested in or has considered before? I know this is pretty trivial compared to move constructors, xvalues, and threading libraries, but it's something that I would use a bunch! As far as other languages and implementations (and the title of the post), this is how Python deals with its max function (as well as min and sort). 

David Krauss

unread,
Nov 12, 2013, 4:08:13 AM11/12/13
to std-pr...@isocpp.org
On 11/11/13 9:36 PM, Andrew Hoelscher wrote:
> I frequently find myself needing to iterate over a container and get an
> iterator to the element that maximizes or minimizes some fairly expensive
> function, and I don't think there's a good algorithm in the standard
> library to do this as well as possible.

Using the Standard as a reference, it's sort of hidden at the end of
25.4.7 under "Sorting and related operations," but you are describing
std::(min|max|minmax)_element.

David Krauss

unread,
Nov 12, 2013, 4:18:44 AM11/12/13
to std-pr...@isocpp.org
On 11/11/13 9:36 PM, Andrew Hoelscher wrote:
> Here are two solutions I can see to use. The first one is a pretty naive
> implementation using max element that calls func 2(N-1) times:

Oh, you want to cache the function result. Never mind my previous message.

Several O(N) algorithms likewise could be computed on "filtered" values.
Not sure this is really the domain of the standard library.

> The upshot is, I've found that an argmax function has been a bottleneck in
> a few of my projects, and I've ended up including the hand-rolled code
> snippet quite a few times. Is this something that the library working group
> might be interested in or has considered before? I know this is pretty
> trivial compared to move constructors, xvalues, and threading libraries,
> but it's something that I would use a bunch! As far as other languages and
> implementations (and the title of the post), this is how Python deals with
> its max function (as well as min and sort).

Perhaps a standard caching facility would be more useful. A map or
hash-map with built-in LRU or pseudo-LRU would generalize the problem
nicely.

Edward Catmur

unread,
Nov 12, 2013, 9:08:25 AM11/12/13
to std-pr...@isocpp.org

All the implementations that I can think of that use calls to standard algorithms either use more space or make more calls to func. If func is some hellaciously-expensive function, then the extra calls can be problematic.

Have you considered a transform iterator? There's two options if you want to retain the result of the transform: lazy (in which case you'd want to use std::optional<decltype(func(*first))>> for storage) or eager (in which case you'd need to track the last iterator to avoid dereferencing a past-the-end, so it would properly be a transform range).
Reply all
Reply to author
Forward
0 new messages