[Boost-users] [iterators][range][rangeex] range concatenation

104 views
Skip to first unread message

Dmitry Vinogradov

unread,
Apr 5, 2009, 1:33:42 PM4/5/09
to boost...@lists.boost.org
Is it possible with Boost (or RangeEx) to concatenate ranges?

Sample usage:

template <class Range> void process_range(Range const &);

std::vector<foo> v = ...;
std::list<foo> l = ...;
process_range(concat(v, l));

_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

er

unread,
Apr 5, 2009, 9:46:21 PM4/5/09
to boost...@lists.boost.org
Dmitry Vinogradov wrote:
> Is it possible with Boost (or RangeEx) to concatenate ranges?
>
> Sample usage:
>
> template <class Range> void process_range(Range const &);
>
> std::vector<foo> v = ...;
> std::list<foo> l = ...;
> process_range(concat(v, l));

I take it you have a good reason not to simply copy v and l into a new
container. iterator_facade might help, and iterator_range to obtain a
range.

Thorsten Ottosen

unread,
Apr 6, 2009, 12:39:52 PM4/6/09
to boost...@lists.boost.org
er skrev:

> Dmitry Vinogradov wrote:
>> Is it possible with Boost (or RangeEx) to concatenate ranges?
>>
>> Sample usage:
>>
>> template <class Range> void process_range(Range const &);
>>
>> std::vector<foo> v = ...;
>> std::list<foo> l = ...;
>> process_range(concat(v, l));
>
> I take it you have a good reason not to simply copy v and l into a new
> container. iterator_facade might help, and iterator_range to obtain a
> range.

It will be impossible to implement a generic iterator that allows
for fast iteration over the two ranges. The problem is that the
iterator would have to check when the new container starts, similar to
segments in a std::deque<T>.

So I would just call process_range() twice. If you really need to
view the two ranges as one range, then you have to pay an abstraction
penalty, albeit it might be more efficiant than copying the two
containers into a new one (and it is probably not that trivial to write
the iterator that depends on two or more other iterator types).

If "foo" is a heavy object, then copying the references of the elements
to a new container std::vector<boost::reference_wrapper<foo>> is
probably the most efficient alternative.

-Thorsten

Dmitry Vinogradov

unread,
Apr 6, 2009, 4:09:01 PM4/6/09
to boost...@lists.boost.org
Thorsten Ottosen wrote:
> er skrev:
>> Dmitry Vinogradov wrote:
>>> Is it possible with Boost (or RangeEx) to concatenate ranges?
>>>
>>> Sample usage:
>>>
>>> template <class Range> void process_range(Range const &);
>>>
>>> std::vector<foo> v = ...;
>>> std::list<foo> l = ...;
>>> process_range(concat(v, l));
>>
>> I take it you have a good reason not to simply copy v and l into a new
>> container. iterator_facade might help, and iterator_range to obtain a
>> range.
>
> It will be impossible to implement a generic iterator that allows
> for fast iteration over the two ranges. The problem is that the
> iterator would have to check when the new container starts, similar to
> segments in a std::deque<T>.
>
> So I would just call process_range() twice. If you really need to
> view the two ranges as one range, then you have to pay an abstraction
> penalty, albeit it might be more efficiant than copying the two
> containers into a new one (and it is probably not that trivial to write
> the iterator that depends on two or more other iterator types).
>
> If "foo" is a heavy object, then copying the references of the elements
> to a new container std::vector<boost::reference_wrapper<foo>> is
> probably the most efficient alternative.
>

Calling process_range() twice is not a solution in my case. Copying
references to a new container is a better way.

Regarding efficiency, can you look thru my rough concat() implementation
attached to discover any its disadvantages?

test.cpp
concat.h

Thorsten Ottosen

unread,
Apr 6, 2009, 5:46:34 PM4/6/09
to boost...@lists.boost.org
Dmitry Vinogradov skrev:

> Calling process_range() twice is not a solution in my case. Copying
> references to a new container is a better way.
>
> Regarding efficiency, can you look thru my rough concat() implementation
> attached to discover any its disadvantages?

Wow. Nice work!

Some comments:

0. I prefer to use unsigned instead of std::size_t. std::size_t can be
larger than a word IIRC.

1.
You can call Stabilize() in increment().

2.
You should probably check It == End1 before Range == 0, because
Range == 0 will be true most of the time during the iteration through
the first range.

3. How can you avoid to store the second end iterator?

How does the performace compare to creating a vector of references?
(remember to call reserve()).

If the performance is good,this baby should be included in the range
library asap.

Neil Groves

unread,
Apr 6, 2009, 6:30:23 PM4/6/09
to boost...@lists.boost.org
On Mon, Apr 6, 2009 at 10:46 PM, Thorsten Ottosen <thorsten...@dezide.com> wrote:
Dmitry Vinogradov skrev:


Calling process_range() twice is not a solution in my case. Copying references to a new container is a better way.

Regarding efficiency, can you look thru my rough concat() implementation attached to discover any its disadvantages?

It is undeniably more elegant than calling algorithms n times as a general solution. It is possible to improve performance by reordering some of the boolean expressions and by reducing the use of boost variant.
 

If the performance is good,this baby should be included in the range library asap.

I would like to implement a new version of this functionality, since I have some code I had prototyped previously. I believe that the idea is good, but the functionality should efficiently support two random access iterators, and that the performance can be improved by reducing the use of boost::variant, although I would have to measure the variant approach to be certain. I can add this to the list of things to do in response to the Boost.RangeEx review.
 

-Thorsten

Neil
 

Dmitry Vinogradov

unread,
Apr 6, 2009, 8:31:50 PM4/6/09
to boost...@lists.boost.org
Thorsten Ottosen wrote:
> Dmitry Vinogradov skrev:

>> Regarding efficiency, can you look thru my rough concat()
>> implementation attached to discover any its disadvantages?
>
> Wow. Nice work!
>
> Some comments:
>
> 0. I prefer to use unsigned instead of std::size_t. std::size_t can be
> larger than a word IIRC.
>
> 1.
> You can call Stabilize() in increment().
>
> 2.
> You should probably check It == End1 before Range == 0, because
> Range == 0 will be true most of the time during the iteration through
> the first range.

Thanks, I'll use these hints.

> 3. How can you avoid to store the second end iterator?

I need to store three iterators: current state (It) - it's for sure, and
the "gap" between ranges - Range1.end() and Range2.begin() for backward
and forward jumping between Range1 and Range2. These iterators is all
that I store. I can store Range1.end() and Range2.begin() as usual
iterators, not as variants. Currently they are stored as variants just
for simple comparision with `It' variant.

It's possible to get rid of Range member and use It.which() instead, but
I think it's hard to control `which' member of variant while having
typeof(TIterator1)==typeof(TIterator2), because "It = SomeIterator;"
statement will lead always to It.which()==0. Self-invented variant could
help here.

> How does the performace compare to creating a vector of references?
> (remember to call reserve()).

I think it's anyway very useful to have concat(). One can make his own
decision between concat() and vector of references for every particular
case. Creating vector will always lead to heap allocation and IFAIK it's
not time-cheap. Add time for copying references. Summary time will
depend a lot of particular algorithm is used for a range, so again I
think everyone must have a choice.

> If the performance is good,this baby should be included in the range
> library asap.

In additional for all, my implementation miss one big thing: it doesn't
have an auto-detection of ConcatIterator traversal category. So, I bet
for Neil's implementation he noticed in this thread ;)

Thorsten Ottosen

unread,
Apr 7, 2009, 2:49:00 AM4/7/09
to boost...@lists.boost.org
Neil Groves skrev:

>
> On Mon, Apr 6, 2009 at 10:46 PM, Thorsten Ottosen
> <thorsten...@dezide.com <mailto:thorsten...@dezide.com>> wrote:
>
> Dmitry Vinogradov skrev:
>
>
> Calling process_range() twice is not a solution in my case.
> Copying references to a new container is a better way.
>
> Regarding efficiency, can you look thru my rough concat()
> implementation attached to discover any its disadvantages?
>
>
> It is undeniably more elegant than calling algorithms n times as a
> general solution. It is possible to improve performance by reordering
> some of the boolean expressions and by reducing the use of boost variant.

Right on.

>
>
> If the performance is good,this baby should be included in the range
> library asap.
>
>
> I would like to implement a new version of this functionality, since I
> have some code I had prototyped previously. I believe that the idea is
> good, but the functionality should efficiently support two random access
> iterators, and that the performance can be improved by reducing the use
> of boost::variant, although I would have to measure the variant approach
> to be certain. I can add this to the list of things to do in response to
> the Boost.RangeEx review.

Great. I'm *sure* that the variant thing adds overhead:

1. when the iterators have different size

2. because we check two times that we have a certain case:

if( range == 0 )

also implies that we should increment a certain iterator, else another.

I wonder if we could call this beast

join_view(r1,r2)

or something. Perhaps support for 3 and 4 ranges would be nice.

Neil Groves

unread,
Apr 7, 2009, 12:55:22 PM4/7/09
to boost...@lists.boost.org
On Tue, Apr 7, 2009 at 7:49 AM, Thorsten Ottosen <thorsten...@dezide.com> wrote:

I wonder if we could call this beast

  join_view(r1,r2)

or something. Perhaps support for 3 and 4 ranges would be nice.

How about r1 | join(r2) | join(r3) ?

I was thinking that this looks very much like a job for a range adaptor. The support for multiple ranges is almost inherent because you can chain the iterator such that the second iterator type is a join_iterator. Is there any good reason that this should not be a range adaptor that anyone can think of?
 


-Thorsten
____________________________________________

Best Wishes,
Neil Groves

Thorsten Ottosen

unread,
Apr 7, 2009, 2:28:53 PM4/7/09
to boost...@lists.boost.org
Neil Groves skrev:

>
>
> On Tue, Apr 7, 2009 at 7:49 AM, Thorsten Ottosen
> <thorsten...@dezide.com <mailto:thorsten...@dezide.com>> wrote:
>
>
> I wonder if we could call this beast
>
> join_view(r1,r2)
>
> or something. Perhaps support for 3 and 4 ranges would be nice.
>
>
> How about r1 | join(r2) | join(r3) ?
>
> I was thinking that this looks very much like a job for a range adaptor.
> The support for multiple ranges is almost inherent because you can chain
> the iterator such that the second iterator type is a join_iterator.

I suspect that r1 | join(r2) | join(r3) will be less efficient than
join(r1,r2,r3), because the former will lead to double checks
of which range that is being iterated, unless there is some clever way
to avoid that. But the | joined(r1,r2) syntax or whatever is fine with me.

Steven Watanabe

unread,
Apr 7, 2009, 2:44:11 PM4/7/09
to boost...@lists.boost.org
AMDG

Thorsten Ottosen wrote:
> I suspect that r1 | join(r2) | join(r3) will be less efficient than
> join(r1,r2,r3), because the former will lead to double checks
> of which range that is being iterated, unless there is some clever way
> to avoid that. But the | joined(r1,r2) syntax or whatever is fine with
> me.

It would be pretty easy to make r1 | join(r2) recognize whether
r1 and r1 are themselves joint_views.

In Christ,
Steven Watanabe

Thorsten Ottosen

unread,
Apr 7, 2009, 3:54:23 PM4/7/09
to boost...@lists.boost.org
Steven Watanabe skrev:

> AMDG
>
> Thorsten Ottosen wrote:
>> I suspect that r1 | join(r2) | join(r3) will be less efficient than
>> join(r1,r2,r3), because the former will lead to double checks
>> of which range that is being iterated, unless there is some clever way
>> to avoid that. But the | joined(r1,r2) syntax or whatever is fine with
>> me.
>
> It would be pretty easy to make r1 | join(r2) recognize whether
> r1 and r1 are themselves joint_views.

Right, but is it easy to "unroll" the necessary information
inside e.g. increment()?

-Thorsten

Steven Watanabe

unread,
Apr 7, 2009, 4:32:41 PM4/7/09
to boost...@lists.boost.org
AMDG

Thorsten Ottosen wrote:
> Steven Watanabe skrev:
>> AMDG
>>
>> Thorsten Ottosen wrote:
>>> I suspect that r1 | join(r2) | join(r3) will be less efficient than
>>> join(r1,r2,r3), because the former will lead to double checks
>>> of which range that is being iterated, unless there is some clever way
>>> to avoid that. But the | joined(r1,r2) syntax or whatever is fine
>>> with me.
>>
>> It would be pretty easy to make r1 | join(r2) recognize whether
>> r1 and r1 are themselves joint_views.
>
> Right, but is it easy to "unroll" the necessary information
> inside e.g. increment()?

You don't necessarily have to.

template<class Sequences>
class joint_view;

template<class S, class R>
joint_view<typename mpl::push_back<S, R>::type>
operator|(const joint_view<S>& v, R& r);

In Christ,
Steven Watanabe

Thorsten Ottosen

unread,
Apr 8, 2009, 2:45:21 AM4/8/09
to boost...@lists.boost.org
Steven Watanabe skrev:
> AMDG
>
> Thorsten Ottosen wrote:
>> Steven Watanabe skrev:
>>> AMDG
>>>
>>> Thorsten Ottosen wrote:
>>>> I suspect that r1 | join(r2) | join(r3) will be less efficient than
>>>> join(r1,r2,r3), because the former will lead to double checks
>>>> of which range that is being iterated, unless there is some clever way
>>>> to avoid that. But the | joined(r1,r2) syntax or whatever is fine
>>>> with me.
>>>
>>> It would be pretty easy to make r1 | join(r2) recognize whether
>>> r1 and r1 are themselves joint_views.
>>
>> Right, but is it easy to "unroll" the necessary information
>> inside e.g. increment()?
>
> You don't necessarily have to.
>
> template<class Sequences>
> class joint_view;
>
> template<class S, class R>
> joint_view<typename mpl::push_back<S, R>::type>
> operator|(const joint_view<S>& v, R& r);

The hard part is the implementation of increment()/decrement().

-Thorsten

Steven Watanabe

unread,
Apr 8, 2009, 9:51:54 AM4/8/09
to boost...@lists.boost.org
AMDG

Thorsten Ottosen wrote:
>>>> Thorsten Ottosen wrote:
>>>>> I suspect that r1 | join(r2) | join(r3) will be less efficient than
>>>>> join(r1,r2,r3), because the former will lead to double checks
>>>>> of which range that is being iterated, unless there is some clever
>>>>> way
>>>>> to avoid that. But the | joined(r1,r2) syntax or whatever is fine
>>>>> with me.
>

> The hard part is the implementation of increment()/decrement().

Right, but what I was trying to say was that
join(r1, r2, r3) is not significantly different
from r1 | join(r2) | join(r3). If we can implement
one efficiently, we can implement the other efficiently.

In Christ,
Steven Watanabe

er

unread,
Aug 21, 2009, 10:25:08 PM8/21/09
to boost...@lists.boost.org
Dmitry Vinogradov wrote:
> Thorsten Ottosen wrote:
>> Dmitry Vinogradov skrev:
>>> Regarding efficiency, can you look thru my rough concat()

Was this file finally implemented and made available somewhere? Thanks.

er

unread,
Jan 7, 2010, 2:21:03 PM1/7/10
to boost...@lists.boost.org

> If the performance is good,this baby should be included in the range
> library asap.
>
> -Thorsten

Hello,

I am wondering if perhaps someone responsible for range_ex has
implemented concat and would care to kindly share it (apparently it's
not included in the vault).

Many thanks.

er

unread,
Jan 9, 2010, 8:13:57 AM1/9/10
to boost...@lists.boost.org
er wrote:

> I am wondering if perhaps someone responsible for range_ex has
> implemented concat and would care to kindly share it (apparently it's
> not included in the vault).
>

Someone tells me I overlooked chain which works similarly, albeit for
single pass only. Very useful, thanks!

#include <iostream>
#include <vector>
#include <list>
#include <iterator>
#include <algorithm>
#include <boost/range.hpp>
#include <boost/range/chain.hpp>
#include <boost/typeof/typeof.hpp>
#include <boost/next_prior.hpp>

int main (int argc, char * const argv[]) {

typedef double val_;
typedef std::vector<val_> vec_;
typedef std::list<val_> list_;

vec_ seq0; seq0.push_back(1);
list_ seq1; seq1.push_back(2);
vec_ seq2; seq2.push_back(3); seq2.push_back(4);
vec_ seq3; seq2.push_back(5);

BOOST_AUTO(
chained,
boost::chain(seq0,
boost::chain(seq1,
boost::chain(seq2,seq3)
)
)
);

BOOST_ASSERT( *boost::begin(chained) == 1 ); // Warning : reference to
temporary
BOOST_ASSERT( *boost::next(boost::begin(chained),1) == 2);
BOOST_ASSERT( *boost::next(boost::begin(chained),2) == 3);
BOOST_ASSERT( *boost::next(boost::begin(chained),3) == 4);
BOOST_ASSERT( *boost::next(boost::begin(chained),4) == 5);

std::copy(
boost::begin(chained),
boost::end(chained),
std::ostream_iterator<val_>(std::cout, " ")
);

return 0;

er

unread,
Jan 12, 2010, 4:59:03 PM1/12/10
to boost...@lists.boost.org

>
> Someone tells me I overlooked chain which works similarly, albeit for
> single pass only. Very useful, thanks!
>

Here's a suggestion for generalizing range::chain
https://svn.boost.org/svn/boost/sandbox/statistics/detail/range_ex/

Reply all
Reply to author
Forward
0 new messages