Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

vectors of large objects

85 views
Skip to first unread message

Paul

unread,
Jan 24, 2016, 4:03:52 PM1/24/16
to
Suppose we want to swap the first two elements of a vector<T> Suppose further that the type T is very large so we want to avoid copying.

Suppose v is a vector<T>
We can do T temp = v[0]; v[0] = v[1]; v[1] = temp; However, this seems to involve unnecessary copying. I therefore suggested that, for an operation that involves a lot of swapping (for example, a sort), we should use a vector of pointers to T rather than a vector<T> and swap via the pointers.

However, someone (in a private conversation so I don't want to name the person) told me that T temp = v[0]; v[0] = v[1]; v[1] = temp; does not in fact move large objects around, and only moves pointers, so that my issue doesn't really exist.

But it does exist, doesn't it? For example, T temp = v[0]; is copying a member of T and we don't want to do that. Can the issue be handled instead via references -- T& temp = v[0]; v[0] = v[1]; v[1] = temp; However, doesn't the line v[0] = v[1] involve an assignment of a large type which is likely to be expensive?

Any ideas?

Thanks,

Paul

Jorgen Grahn

unread,
Jan 24, 2016, 4:11:51 PM1/24/16
to
On Sun, 2016-01-24, Paul wrote:
> Suppose we want to swap the first two elements of a vector<T>
> Suppose further that the type T is very large so we want to avoid
> copying.

What's wrong with std::swap(v[0], v[1]) ?

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Paul

unread,
Jan 24, 2016, 4:15:20 PM1/24/16
to
On Sunday, January 24, 2016 at 9:11:51 PM UTC, Jorgen Grahn wrote:
> On Sun, 2016-01-24, Paul wrote:
> > Suppose we want to swap the first two elements of a vector<T>
> > Suppose further that the type T is very large so we want to avoid
> > copying.
>
> What's wrong with std::swap(v[0], v[1]) ?
>
Nothing. But I want to write the swap myself.

Paul

Ben Bacarisse

unread,
Jan 24, 2016, 5:14:57 PM1/24/16
to
Paul <peps...@gmail.com> writes:

> Suppose we want to swap the first two elements of a vector<T> Suppose
> further that the type T is very large so we want to avoid copying.
>
> Suppose v is a vector<T>
> We can do T temp = v[0]; v[0] = v[1]; v[1] = temp; However, this seems
> to involve unnecessary copying. I therefore suggested that, for an
> operation that involves a lot of swapping (for example, a sort), we
> should use a vector of pointers to T rather than a vector<T> and swap
> via the pointers.

The word "via" bothers me. If you swap the pointers, yes, you gain that
way. But you are not swapping via the pointers -- you are just swapping
the pointers.

> However, someone (in a private conversation so I don't want to name
> the person) told me that T temp = v[0]; v[0] = v[1]; v[1] = temp; does
> not in fact move large objects around, and only moves pointers, so
> that my issue doesn't really exist.

The stuff about pointers sounds crazy enough that, if it were me, I'd
want to to check the I'd understood the person. It's true that using a
temporary need not do all that copying, because the compiler might know
enough about the types to be able to optimise the code, but that's not
really what your correspondent seems to have been saying.

> But it does exist, doesn't it? For example, T temp = v[0]; is copying
> a member of T and we don't want to do that. Can the issue be handled
> instead via references -- T& temp = v[0]; v[0] = v[1]; v[1] = temp;

Costs aside, that won't work. The first initialisation makes temp and
v[0] be aliases -- two names for the same object. The second assignment
(to v[1]) will simply copy what it in v[0] which you've already made a
copy of v[1].

> However, doesn't the line v[0] = v[1] involve an assignment of a large
> type which is likely to be expensive?

If you want to swap two objects in any literal sense of the word, you
must be prepared to pay the cost a reading them both and writing them
both. You can make the cost only a very little more by doing the swap
byte by byte, but that may not be faster than going via a whole extra
copy to a temporary. With luck, the std::swap algorithm will be
implemented in a way that makes it the best choice (I know you are
asking about doing it yourself).

--
Ben.

Öö Tiib

unread,
Jan 24, 2016, 5:18:44 PM1/24/16
to
On Sunday, 24 January 2016 23:03:52 UTC+2, Paul wrote:
> Suppose we want to swap the first two elements of a vector<T> Suppose further that the type T is very large so we want to avoid copying.
>
> Suppose v is a vector<T>
> We can do T temp = v[0]; v[0] = v[1]; v[1] = temp; However, this seems
> to involve unnecessary copying. I therefore suggested that, for an
> operation that involves a lot of swapping (for example, a sort), we
> should use a vector of pointers to T rather than a vector<T> and swap via
> the pointers.

If you need sorted things then keep them in sorted containers.

>
> However, someone (in a private conversation so I don't want to name
> the person) told me that T temp = v[0]; v[0] = v[1]; v[1] = temp; does not
> in fact move large objects around, and only moves pointers, so that my
> issue doesn't really exist.

That is not correct.

>
> But it does exist, doesn't it? For example, T temp = v[0]; is copying
> a member of T and we don't want to do that. Can the issue be handled
> instead via references -- T& temp = v[0]; v[0] = v[1]; v[1] = temp;
> However, doesn't the line v[0] = v[1] involve an assignment of a large
> type which is likely to be expensive?
>
> Any ideas?

If the type 'T' has move assignment then you can write:

T temp = std::move(v[0]);
v[0] = std::move(v[1]);
v[1] = std::move(temp);

However 'std::swap(v[0],v[1])' does same.

Luca Risolia

unread,
Jan 24, 2016, 7:28:41 PM1/24/16
to
Il 24/01/2016 22:11, Jorgen Grahn ha scritto:

> What's wrong with std::swap(v[0], v[1]) ?

This is better:

using std::swap;
swap(v[0], v[1]);

as it does not disable ADL, which is the general advice in the STL.

Paul

unread,
Jan 25, 2016, 3:12:05 AM1/25/16
to
I don't understand why using std::swap is better. Can you expand please? Thanks.

Paul

Luca Risolia

unread,
Jan 25, 2016, 3:51:17 AM1/25/16
to
On 25/01/2016 09:11, Paul wrote:
> On Monday, January 25, 2016 at 12:28:41 AM UTC, Luca Risolia wrote:

>> using std::swap;
>> swap(v[0], v[1]);

> I don't understand why using std::swap is better. Can you expand please? Thanks.

"std::swap may be specialized in namespace std for user-defined types,
but such specializations are not found by ADL (the namespace std is not
the associated namespace for the user-defined type). The expected way to
make a user-defined type swappable is to provide a non-member function
swap in the same namespace as the type" (cppreference.com)

"std::swap(v[0], v[1])" is a scope-qualified call, so it disables ADL,
while the "swap(v[0], v[1])" does not, as it is not a scope-qualified call.

For example:

namespace X {
struct Y {};
void swap(Y&, Y&) {
std::cout << "my swap\n";
}
}

int main() {
X::Y a, b;

std::swap(a, b); // X::swap() is ignored

using std::swap;
swap(a, b); // X::swap() is used
}

Paavo Helde

unread,
Jan 25, 2016, 3:55:11 AM1/25/16
to
The above specifically *avoids* using std::swap if there is a specific
swap() function present for elements of v.

See e.g. http://en.cppreference.com/w/cpp/language/adl, especially the
Notes section.


Paul

unread,
Jan 25, 2016, 4:02:51 AM1/25/16
to
Thanks to everyone on this thread. I understand the issues, both with regard to my original question and my follow-up about ADL.

Paul

Paavo Helde

unread,
Jan 25, 2016, 4:08:05 AM1/25/16
to
On 25.01.2016 10:54, Paavo Helde wrote:
> On 25.01.2016 10:11, Paul wrote:
>> On Monday, January 25, 2016 at 12:28:41 AM UTC, Luca Risolia wrote:
>>> Il 24/01/2016 22:11, Jorgen Grahn ha scritto:
>>>
>>>> What's wrong with std::swap(v[0], v[1]) ?
>>>
>>> This is better:
>>>
>>> using std::swap;
>>> swap(v[0], v[1]);
>>>
>>> as it does not disable ADL, which is the general advice in the STL.
>>
>> I don't understand why using std::swap is better. Can you expand
>> please? Thanks.
>
> The above specifically *avoids* using std::swap if there is a specific
> swap() function present for elements of v.

Yep, to avoid using std::swap() one must say 'using std::swap()'.
Welcome to C++! Reminds me the 'throw()' declaration.



Juha Nieminen

unread,
Jan 25, 2016, 4:25:45 AM1/25/16
to
Why? If the type is move-constructible and move-assignable, std::swap()
will automatically move them rather than copy them.

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

SG

unread,
Jan 25, 2016, 6:46:58 AM1/25/16
to
On Sunday, January 24, 2016 at 10:03:52 PM UTC+1, Paul wrote:
> Suppose we want to swap the first two elements of a vector<T>
> Suppose further that the type T is very large so we want to avoid
> copying.

What kind of size are we talking about? sizeof(T)? Or size in terms
of how costly copying would be? The sizeof() a vector is rather small
(just three pointers) but copying a vector might involve copying lots
of other things that are not *directly* part of the vector.

> Suppose v is a vector<T>
> We can do
> T temp = v[0]; v[0] = v[1]; v[1] = temp;
> However, this seems to involve unnecessary copying.

You should do

using std::swap; // fall back on std::swap
swap(v[0], v[1]); // might invoke another swap via ADL

instead because it might resolve to a special swap function that has
been optimized for the type T (via ADL or a specialized std::swap).
std::swap's default implementation also makes use of move semantics:

template<class T>
void swap(T& a, T& b) noexcept(...) {
T t = std::move(a); // tries to move a into t
a = std::move(b); // tries to move b into a
b = std::move(t); // tries to move t into b
}

which is obviously better if T has a move constructor and move
assignment operator that is more efficient that their respective
copy versions.

> I therefore suggested that, for an operation that involves a lot of
> swapping (for example, a sort), we should use a vector of pointers
> to T rather than a vector<T> and swap via the pointers.

Yeah, if sizeof(T) is huge, this indirection might help. I would also
suggest to test this approach (std::vector<unique_ptr<T>> or boost::
ptr_vector<T>). If your T is a vector or a string, for example, this
is however not the case.

> However, someone (in a private conversation so I don't want to name
> the person) told me that T temp = v[0]; v[0] = v[1]; v[1] = temp;
> does not in fact move large objects around, and only moves pointers,
> so that my issue doesn't really exist.

First of all, it does not move anything because you didn't ask for it.
That's what std::move is for.

T temp = v[0]; // this *copies* v[0] to temp

There are situations where std::move is not necessary and in those
situations using std::move is actually somewhat harmful (since it
disables the otherwise applicable copy elision optimizations) but this
is not one of them, so you need to use std::move here.

But then it also depends on what T is and goes back to my first
question. If T is something like a std::vector<U> then you *don't*
need this additional layer of indirection because a vector *already*
stores its "elements" indirectly and has very efficient move
operations. These move operations are doing the pointer
manipulations. If sizeof(T) is huge, this is another story.

Cheers!
sg

Öö Tiib

unread,
Jan 25, 2016, 7:19:51 AM1/25/16
to
That example is so but OP case is different. There are no 'a' and 'b'
but 'v[0]' and 'v[1]' that are of type 'std::vector<X::Y>::reference'.
So the story must be is (or isn't?) somewhat trickier?

SG

unread,
Jan 25, 2016, 8:59:14 AM1/25/16
to
On Monday, January 25, 2016 at 1:19:51 PM UTC+1, Öö Tiib wrote:
>
> That example is so but OP case is different. There are no 'a' and 'b'
> but 'v[0]' and 'v[1]' that are of type 'std::vector<X::Y>::reference'.

If v is a std::vector<T> where T != bool then v[0] and v[1] are lvalue
expressions of type T. Let's ignore the T==bool case here, OK?

> So the story must be is (or isn't?) somewhat trickier?

No. It's not trickier. Given

T a = ...;
T b = ...;
std::vector<T> v = ...;

there is no difference between the expressions a and v[0] with respect
to their type (T) and value category (lvalue).

using std::swap;
swap(a, b);
swap(v[0], v[1]); // works just as well as above
// if v contains at least two elements. :-)

There is a difference if you use decltype but that's because decltype
distinguishes between multiple cases and can look "beyond" the
expression. IIRC decltype basically distinguishes between three
cases:

* the declared type of some entity.
Example: decltype(a) is T

* the declared return type of a function.
Example: decltype(v[0]) is T&
Example: decltype(std::move(a)) is T&&

* The type and "lvalueness" of an arbitrary expression. You can
force this case with an additional set of parentheses around the
expression.
Example: decltype((a)) is T&
Example: decltype((v[0])) is T&
Example: decltype((std::move(a))) is T
In all cases the type of the expression is T. But only in the
first two cases they are lvalue expressions which is why you
get an lvalue reference type. Xvalues are basically ignored
in this case.

Isn't this nicely complicated? Exactly what you would expect from C++!
;-)

Cheers!
sg

Luca Risolia

unread,
Jan 25, 2016, 9:36:51 AM1/25/16
to
On 25/01/2016 13:19, Öö Tiib wrote:
> That example is so but OP case is different. There are no 'a' and 'b'
> but 'v[0]' and 'v[1]' that are of type
> 'std::vector<X::Y>::reference'. So the story must be is (or isn't?)
> somewhat trickier?

The OP said elsewhere - before my answer:

"[...] But I want to write the swap myself."

He wants to write a swap for his type T.

Usually you want to declare swap in the same namespace as T, and the
typical advice to do make use of it is what I mentioned before.

Another conceptually similar and more familiar example in the standard -
which works because of ADL - is about operator<<(std::ostream&, T)
taking user-defined types as right hand argument. It's desiderable to
declare both your operator<< and T in the same namespace, e.g.:

namespace X {
struct T {};
std::ostream& operator<<(std::ostream&, const T& t);
}

int main() {
X::T t;
std::cout << t; // ADL
}

Öö Tiib

unread,
Jan 25, 2016, 11:59:42 AM1/25/16
to
On Monday, 25 January 2016 16:36:51 UTC+2, Luca Risolia wrote:
> On 25/01/2016 13:19, Öö Tiib wrote:
> > That example is so but OP case is different. There are no 'a' and 'b'
> > but 'v[0]' and 'v[1]' that are of type
> > 'std::vector<X::Y>::reference'. So the story must be is (or isn't?)
> > somewhat trickier?
>
> The OP said elsewhere - before my answer:
>
> "[...] But I want to write the swap myself."
>
> He wants to write a swap for his type T.
>
> Usually you want to declare swap in the same namespace as T, and the
> typical advice to do make use of it is what I mentioned before.

Ok, I think I get it now, it is for case when he wants but is unsure
if he writes 'void swap(Y&, Y&)' in his namespace X?
So the 'using namespace std' serves as way to fall back to 'std::swap'
if he for some reason did not write 'X::swap' despite he said he wants.

Same fallback (without 'using namespace std' needed) works with
'std::iter_swap':

std::iter_swap(&a, &b); // X::swap() is used if it exists otherwise std::swap()

It may be someone considers it worse to read that way but code seems
to be is generated as good.

>
> Another conceptually similar and more familiar example in the standard -
> which works because of ADL - is about operator<<(std::ostream&, T)
> taking user-defined types as right hand argument. It's desiderable to
> declare both your operator<< and T in the same namespace, e.g.:
>
> namespace X {
> struct T {};
> std::ostream& operator<<(std::ostream&, const T& t);
> }
>
> int main() {
> X::T t;
> std::cout << t; // ADL
> }

The 'std::ostream' is used rarely lately ... 'swap' is more common. Also
here are no need for that 'using namespace std'.

Öö Tiib

unread,
Jan 25, 2016, 12:14:08 PM1/25/16
to
On Monday, 25 January 2016 15:59:14 UTC+2, SG wrote:
> On Monday, January 25, 2016 at 1:19:51 PM UTC+1, Öö Tiib wrote:
> >
> > That example is so but OP case is different. There are no 'a' and 'b'
> > but 'v[0]' and 'v[1]' that are of type 'std::vector<X::Y>::reference'.
>
> If v is a std::vector<T> where T != bool then v[0] and v[1] are lvalue
> expressions of type T. Let's ignore the T==bool case here, OK?
>
> > So the story must be is (or isn't?) somewhat trickier?
>
> No. It's not trickier. Given
>
> T a = ...;
> T b = ...;
> std::vector<T> v = ...;
>
> there is no difference between the expressions a and v[0] with respect
> to their type (T) and value category (lvalue).
>
> using std::swap;
> swap(a, b);
> swap(v[0], v[1]); // works just as well as above
> // if v contains at least two elements. :-)

Ok. Got it.

>
> There is a difference if you use decltype but that's because decltype
> distinguishes between multiple cases and can look "beyond" the
> expression. IIRC decltype basically distinguishes between three
> cases:
>
> * the declared type of some entity.
> Example: decltype(a) is T
>
> * the declared return type of a function.
> Example: decltype(v[0]) is T&
> Example: decltype(std::move(a)) is T&&
>
> * The type and "lvalueness" of an arbitrary expression. You can
> force this case with an additional set of parentheses around the
> expression.
> Example: decltype((a)) is T&
> Example: decltype((v[0])) is T&
> Example: decltype((std::move(a))) is T
> In all cases the type of the expression is T. But only in the
> first two cases they are lvalue expressions which is why you
> get an lvalue reference type. Xvalues are basically ignored
> in this case.
>
> Isn't this nicely complicated? Exactly what you would expect from C++!
> ;-)

Indeed. Sometimes I wish that there were no references whatsoever
and considering if things are moved, copied or passed by reference
was decided by compiler somehow (like it works with "by-value"
return values). However it is sort of hard to come out with good clear
recipe to it.

Paavo Helde

unread,
Jan 25, 2016, 3:35:24 PM1/25/16
to
On 25.01.2016 18:59, 嘱 Tiib wrote:
> On Monday, 25 January 2016 16:36:51 UTC+2, Luca Risolia wrote:
>> On 25/01/2016 13:19, 嘱 Tiib wrote:
>>> That example is so but OP case is different. There are no 'a' and 'b'
>>> but 'v[0]' and 'v[1]' that are of type
>>> 'std::vector<X::Y>::reference'. So the story must be is (or isn't?)
>>> somewhat trickier?
>>
>> The OP said elsewhere - before my answer:
>>
>> "[...] But I want to write the swap myself."
>>
>> He wants to write a swap for his type T.
>>
>> Usually you want to declare swap in the same namespace as T, and the
>> typical advice to do make use of it is what I mentioned before.
>
> Ok, I think I get it now, it is for case when he wants but is unsure
> if he writes 'void swap(Y&, Y&)' in his namespace X?
> So the 'using namespace std' serves as way to fall back to 'std::swap'
> if he for some reason did not write 'X::swap' despite he said he wants.

It's more for generic templated code where you have no idea if there is
any special swapping support defined for type T or not, but you still
want to swap T-s.

Cheers
Paavo



0 new messages