Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Vector sizes and other missed optimizations

238 views
Skip to first unread message

Andy Lutomirski

unread,
May 16, 2013, 3:45:36 PM5/16/13
to
This has been annoying me for a while:

#include <stddef.h>

template<typename T>
class Vec
{
public:
Vec(size_t len) : len_(len) { data_ = new T[len]; }
~Vec() { delete data_; }
Vec(const Vec &) = delete;
void operator = (const Vec &) = delete;

size_t size() const { return len_; }
T &operator [] (size_t pos) { return data_[pos]; }
private:
size_t len_;
T * data_;
};

template<typename T>
void IncrementVec(Vec<T> &v)
{
for (size_t i = 0; i < v.size(); i++)
++v[i];
}

void f1(Vec<size_t> &v)
{
// This is slow -- the compiler can't figure out that data_[i]
// does not alias len_;
IncrementVec(v);
}

void f2(Vec<short> &v)
{
// This is fast -- type-based alias analysis works.
IncrementVec(v);
}

The code generation for f1 loads len_ on every iteration of the loop.
libstdc++'s vector avoids this problem by storing a one-past-the-end
pointer instead of a size, but that has other issues (in my use case,
I'm storing a multidimensional object, and encoding strides in pointers
gets awkward).

Are there any ways, idiomatic or otherwise, to tell a compiler (even in
theory) that data_ can't point to len_? Adding a restrict qualifier to
data_ has no effect in g++ (correctly AFAICT -- data_ isn't a function
parameter).

Thanks,
Andy


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Wil Evers

unread,
May 16, 2013, 7:47:07 PM5/16/13
to
Andy Lutomirski wrote:

[snip]
> template<typename T>
> class Vec
> {
> public:
[snip]
> size_t size() const { return len_; }
> T &operator [] (size_t pos) { return data_[pos]; }
> private:
> size_t len_;
> T * data_;
> };
>
> template<typename T>
> void IncrementVec(Vec<T> &v)
> {
> for (size_t i = 0; i < v.size(); i++)
> ++v[i];
> }
>
> void f1(Vec<size_t> &v)
> {
> // This is slow -- the compiler can't figure out that data_[i]
> // does not alias len_;
> IncrementVec(v);
> }
>
> void f2(Vec<short> &v)
> {
> // This is fast -- type-based alias analysis works.
> IncrementVec(v);
> }
>
> The code generation for f1 loads len_ on every iteration of the
> loop.

[snip]

> Are there any ways, idiomatic or otherwise, to tell a compiler (even
> in theory) that data_ can't point to len_? Adding a restrict
> qualifier to data_ has no effect in g++ (correctly AFAICT -- data_
> isn't a function parameter).

What about fetching the size only once, and storing it in a local
variable:

template<typename T>
void IncrementVec(Vec<T> &v)
{
size_t sz = v.size();
for (size_t i = 0; i < sz; i++)
++v[i];
}

I would expect any halfway decent compiler to detect that the address
of the local variable 'sz' is not taken, and therefore cannot be
modified by incrementing the vector's elements.

Hope this helps,

- Wil

Joshua Maurice

unread,
May 16, 2013, 7:47:52 PM5/16/13
to
On May 16, 12:45 pm, Andy Lutomirski <l...@amacapital.net> wrote:
> This has been annoying me for a while:
>
> #include <stddef.h>
>
> template<typename T>
> class Vec
> {
> public:
> Vec(size_t len) : len_(len) { data_ = new T[len]; }
> ~Vec() { delete data_; }
> Vec(const Vec &) = delete;
> void operator = (const Vec &) = delete;
>
> size_t size() const { return len_; }
> T &operator [] (size_t pos) { return data_[pos]; }
> private:
> size_t len_;
> T * data_;
> };
[...]
> template<typename T>
> void IncrementVec(Vec<T> &v)
> {
> for (size_t i = 0; i < v.size(); i++)
> ++v[i];
>
> }
[...]
> Are there any ways, idiomatic or otherwise, to tell a compiler (even in
> theory) that data_ can't point to len_? Adding a restrict qualifier to
> data_ has no effect in g++ (correctly AFAICT -- data_ isn't a function
> parameter).

An idiomatic solution to this particular problem? Call size() once
outside the loop, store the result in a local variable, and use the
local variable in the "for expression" instead of size().

Ex:
size_t n = v.size();
for (size_t i = 0; i < n; i++)
++v[i];

I'd expect a good compiler to be able to optimize that as opposed to
the old code. In the old code, a compiler couldn't (easily) know if
they were aliasing, but as soon as you create a new local variable,
escape analysis should be able to kick in and show that nothing
aliases the local variable. At least, that's what I'd expect on a good
compiler. Test of course. Also, I don't think this is what you wanted,
but it is an "idiomatic" way of solving this very particular problem.

That would be an interesting language feature though. I wonder how you
might specify it correctly and in such a way that it's useful as a
general purpose thing. Perhaps that restrict on a member X of a class
Y means that the member X and any directly or indirectly pointed-to or
contained objects of X will not alias any other member of Y for the
same Y object. Maybe.

Richard

unread,
May 16, 2013, 7:48:55 PM5/16/13
to
[Please do not mail me a copy of your followup]

Andy Lutomirski <lu...@amacapital.net> spake the secret code
<51952d65$0$52762$742e...@news.sonic.net> thusly:

>This has been annoying me for a while:
>
>#include <stddef.h>
>
>template<typename T>
>class Vec
>{
>public:
> Vec(size_t len) : len_(len) { data_ = new T[len]; }
> ~Vec() { delete data_; }

You mean mismatched array new and scalar delete?

> // This is slow -- the compiler can't figure out that data_[i]
> // does not alias len_;

Oh, you're talking about something else.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

Richard

unread,
May 17, 2013, 1:35:43 AM5/17/13
to
[Please do not mail me a copy of your followup]

Joshua Maurice <joshua...@googlemail.com> spake the secret code
<1a1ac670-b9ca-4a3d...@d8g2000pbe.googlegroups.com> thusly:

>An idiomatic solution to this particular problem? Call size() once
>outside the loop, store the result in a local variable, and use the
>local variable in the "for expression" instead of size().
>
>Ex:
> size_t n = v.size();
> for (size_t i = 0; i < n; i++)
> ++v[i];

or even:

for (size_t i = 0, n = v.size(); i < n; ++i)
++v[i];

--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>


Ulrich Eckhardt

unread,
May 17, 2013, 5:59:06 AM5/17/13
to
Am 16.05.2013 21:45, schrieb Andy Lutomirski:
> This has been annoying me for a while:
[...]
> template<typename T>
> void IncrementVec(Vec<T> &v)
> {
> for (size_t i = 0; i < v.size(); i++)
> ++v[i];
> }
[...]
> The code generation for f1 loads len_ on every iteration of the loop.

I'd say the default for-loop should look like this:

for(size_t i=0, size=v.size(); i!=size; ++i)

or for iterators:

for(iterator it=v.begin(), end=v.end(); it!=end; ++it)

This avoids any function call or load overhead on the loop variables.
Also note that using "!=" and prefix-increment should be automatic,
no-brain-involved actions of your typing hands anyway, since these are
guaranteed to work and perform.

That said, being able to annotate to the compiler/optimizer that some
things don't alias would be nice. I think that such subtle hints could
be used to improve quite some code where the compiler up to now has to
assume that some things could be modified by side effects.

Uli

Andy Champ

unread,
May 17, 2013, 10:29:26 AM5/17/13
to
On 17/05/2013 10:59, Ulrich Eckhardt wrote:
> I'd say the default for-loop should look like this:
>
> for(size_t i=0, size=v.size(); i!=size; ++i)
>
The paranoid aming us will always code that as
for(size_t i=0, size=v.size(); i<size; ++i)

> or for iterators:
>
> for(iterator it=v.begin(), end=v.end(); it!=end; ++it)

.... and always feel ever-so-slightly uncomfortable that you can't do
less-than on iterators.

Andy

Greg Marr

unread,
May 17, 2013, 5:34:21 PM5/17/13
to

On Friday, May 17, 2013 10:29:26 AM UTC-4, Andy Champ wrote:
>> > or for iterators:
> >
> > for(iterator it=v.begin(), end=v.end(); it!=end; ++it)
>
> .... and always feel ever-so-slightly uncomfortable that you can't do
> less-than on iterators.

You can, as long as they're random-access iterators.

If you're uncomfortable with that, maybe you'd be better off using

for(auto const &elem : v) {}

or

for_each(v.begin(), v.end(), [](auto const &elem){});

Daniel Krügler

unread,
May 17, 2013, 5:35:00 PM5/17/13
to

Am 17.05.2013 16:29, schrieb Andy Champ:
>> or for iterators:
>>
>> for(iterator it=v.begin(), end=v.end(); it!=end; ++it)
>
> .... and always feel ever-so-slightly uncomfortable that you can't do
> less-than on iterators.

Sure you can - if they are random-access iterator.

HTH & Greetings from Bremen,

Daniel Kr�gler

Andy Lutomirski

unread,
May 17, 2013, 5:36:37 PM5/17/13
to

On 05/17/2013 02:59 AM, Ulrich Eckhardt wrote:
> Am 16.05.2013 21:45, schrieb Andy Lutomirski:
> > This has been annoying me for a while:
> [...]
> > template<typename T>
> > void IncrementVec(Vec<T> &v)
> > {
> > for (size_t i = 0; i < v.size(); i++)
> > ++v[i];
> > }
> [...]
> > The code generation for f1 loads len_ on every iteration of the loop.
>
> I'd say the default for-loop should look like this:
>
> for(size_t i=0, size=v.size(); i!=size; ++i)

Ugh. This works, of course (although it now may result in a pointless
spill to the stack, depending on what's in the loop). But I don't want
to make my users think about it.

My real usecase is for a multidimensional vector-like thing, in which
the class contains the strides as private member variables. There isn't
a clean way to manually hoist the accesses out of loops.

More generally, any class that contains a pointer to the same scalar
type as one of its private members may be difficult for the compiler to
optimize well.

--Andy

Andy Champ

unread,
May 19, 2013, 3:10:15 AM5/19/13
to

On 17/05/2013 22:34, Greg Marr wrote:
> On Friday, May 17, 2013 10:29:26 AM UTC-4, Andy Champ wrote:
>>>> >> >or for iterators:
>>> > >
>>> > > for(iterator it=v.begin(), end=v.end(); it!=end; ++it)
>> >
>> >.... and always feel ever-so-slightly uncomfortable that you can't
>> >do less-than on iterators.
> You can, as long as they're random-access iterators.

Didn't know that. Don't know why I didn't know, but I didn't. It makes
sense that you can.

> If you're uncomfortable with that, maybe you'd be better off using
>
> for(auto const &elem : v) {}
>

Only if the compiler is new enough :(

> or
>
> for_each(v.begin(), v.end(), [](auto const &elem){});

... and that's what I tend to do now - even though in the MS world
that involves two function calls - one to for_each, and then one to
the lambda _on_ _each_ _iteration_ :(

Hasn't been a performance problem yet.

Andy

Greg Marr

unread,
May 20, 2013, 3:07:21 AM5/20/13
to

> Only if the compiler is new enough :(

Still on VS2010, huh? Me too.

> > for_each(v.begin(), v.end(), [](auto const &elem){});
>
> ... and that's what I tend to do now - even though in the MS world
> that involves two function calls - one to for_each, and then one to
> the lambda _on_ _each_ _iteration_ :(

They're all templated, so they should get inlined in the release
build.
0 new messages