Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Do vector iterators remain valid after resize() if base pointer of elements remains unchanged?

2 views
Skip to first unread message

zr

unread,
Mar 24, 2009, 6:56:24 AM3/24/09
to
I hope the following example illustrates the question:

#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> sorted_image_id;
//..
sorted_image_id::pointer base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id().size()*4);

if (&sorted_image_id.front() == base)
{
//Is it safe to assume all iterators are still valid?
} else
{
//Need to reassign iterators
}
//...

return 0;
}

zr

unread,
Mar 24, 2009, 7:05:14 AM3/24/09
to

Here is a corrected version that passes compilation:

int _tmain(int argc, _TCHAR* argv[])
{

float resize_factor = 1.3;


std::vector<int> sorted_image_id;
//..

int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize

Victor Bazarov

unread,
Mar 24, 2009, 8:20:30 AM3/24/09
to
zr wrote:
> On Mar 24, 12:56 pm, zr <zvir...@gmail.com> wrote:
>> I hope the following example illustrates the question:
>>
>> #include <vector>
>>
>> int _tmain(int argc, _TCHAR* argv[])
>> {
>> std::vector<int> sorted_image_id;
>> //..
>> sorted_image_id::pointer base = &sorted_image_id.front();
>> sorted_image_id.resize(sorted_image_id().size()*4);
>>
>> if (&sorted_image_id.front() == base)
>> {
>> //Is it safe to assume all iterators are still valid?
>> } else
>> {
>> //Need to reassign iterators
>> }
>> //...
>>
>> return 0;
>>
>> }
>
> Here is a corrected version that passes compilation:

"Passes compilation"? Not on my compiler. '_TCHAR' is undefined.

>
> int _tmain(int argc, _TCHAR* argv[])
> {
> float resize_factor = 1.3;
> std::vector<int> sorted_image_id;
> //..
> int* base = &sorted_image_id.front();
> sorted_image_id.resize(sorted_image_id.size()*resize_factor);
>
> // optimization: reuse iterators after resize
> if (&sorted_image_id.front() == base)
> {
> //Is it safe to assume all iterators are still valid?
> } else
> {
> //Need to reassign iterators
> }
> //...
>
> return 0;
> }

'std::vector' is very sensitive WRT its iterators. If reallocation
occurs, all iterators are invalidated. So, if your pointer comparison
is designed to see whether the reallocation has occurred (which it would
seem to do), then you're probably OK. I don't know much about the
systems on which loading an invalid pointer into a register can trip the
hardware signal, so I can't vouch for the cases when 'base' is compared
with the address of the first element after becoming invalid on such
systems. FAIK, it could be a trap. The only value guaranteed to work
(when compared with any other pointer) is a null pointer value.

Of course, hardware interrupt upon loading of a pointer value that just
became "invalid" is highly unlikely, so you're *most likely* fine.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Balog Pal

unread,
Mar 24, 2009, 9:50:32 AM3/24/09
to
int* base = &sorted_image_id.front();
sorted_image_id.resize(sorted_image_id.size()*resize_factor);

// optimization: reuse iterators after resize
if (&sorted_image_id.front() == base)

if reallocation happened base has an invalid pointer, so using it this way
(lvalue to rvalue conversion) is undefined behavior.

You may try to play with capacity() before resize()


Fred Zwarts

unread,
Mar 24, 2009, 10:43:16 AM3/24/09
to
"zr" <zvi...@gmail.com> wrote in message news:1e21430a-6060-4167...@o36g2000yqh.googlegroups.com...

On Mar 24, 12:56 pm, zr <zvir...@gmail.com> wrote:
> I hope the following example illustrates the question:
>
> #include <vector>
>
>
> int _tmain(int argc, _TCHAR* argv[])
> {
> float resize_factor = 1.3;
> std::vector<int> sorted_image_id;
> //..
> int* base = &sorted_image_id.front();
> sorted_image_id.resize(sorted_image_id.size()*resize_factor);
>
> // optimization: reuse iterators after resize
> if (&sorted_image_id.front() == base)
> {
> //Is it safe to assume all iterators are still valid?
> } else
> {
> //Need to reassign iterators
> }
> //...
>
> return 0;
> }

I am not sure.
Could sorted_image_id.end() depend on the size of the vector even if no reallocation occurs?
If so, an iterator with this value before the resize may have a value after the
resize with at least a questionable value.

rep_movsd

unread,
Mar 24, 2009, 1:21:49 PM3/24/09
to
A vector always stores elements in sequence, provides o(1) random
access to its elements and amortized o(1) for back insertions and
deletions.

I am not sure if this is in the standard, but it would be stupid to
implement it any other way other than as an array that resizes.
There are reams of code that depend on this...

So if the &*v.begin() is unchanged there should be no sane reason why
iterators will be invalid since they are effectively pointers...

The OP is clearly not programming an esoteric architecture that faults
on invalid pointer assignment ( as opposed to dereference ) as clearly
evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
on Windows on x86. So I think musing about that is not fruitful.

Also since vector iterators are effectively pointers, there is nothing
special about end() , its just a pointer to data one element past the
last one.

The most sensible thing to do is to not use iterators but indexes...

There is really no significant performance difference between
accessing an element by an iterator or simply using the [] operator
with an index. [] gets inlined, its the compilers job....

Bo Persson

unread,
Mar 24, 2009, 4:42:16 PM3/24/09
to
rep_movsd wrote:
> A vector always stores elements in sequence, provides o(1) random
> access to its elements and amortized o(1) for back insertions and
> deletions.
>
> I am not sure if this is in the standard, but it would be stupid to
> implement it any other way other than as an array that resizes.
> There are reams of code that depend on this...
>
> So if the &*v.begin() is unchanged there should be no sane reason
> why iterators will be invalid since they are effectively pointers...
>
> The OP is clearly not programming an esoteric architecture that
> faults on invalid pointer assignment ( as opposed to dereference )
> as clearly evidenced by the use of "_TCHAR*" - Its garden variety
> MS Visual C++ on Windows on x86. So I think musing about that is
> not fruitful.

There is nothing wrong in pointing out the undefined behaviour, should
the OP release his "working" code for others to reuse. Remenber that
the x86 does have checking in segmented mode, we are just not using
that most of the time.


Bo Persson


zr

unread,
Mar 25, 2009, 1:54:50 AM3/25/09
to

Now I'm just curious
On x86, two pointers are compared using "compare integer"
instructions, so if one the operands is actually a dangling pointer,
no exception would be raised. Is there any platform where this is not
the case?

Fred Zwarts

unread,
Mar 25, 2009, 4:14:30 AM3/25/09
to

"rep_movsd" <rep....@gmail.com> wrote in message news:d54270f8-2ccf-402b...@v35g2000pro.googlegroups.com...

>A vector always stores elements in sequence, provides o(1) random
> access to its elements and amortized o(1) for back insertions and
> deletions.
>
> I am not sure if this is in the standard, but it would be stupid to
> implement it any other way other than as an array that resizes.
> There are reams of code that depend on this...
>
> So if the &*v.begin() is unchanged there should be no sane reason why
> iterators will be invalid since they are effectively pointers...
>
> The OP is clearly not programming an esoteric architecture that faults
> on invalid pointer assignment ( as opposed to dereference ) as clearly
> evidenced by the use of "_TCHAR*" - Its garden variety MS Visual C++
> on Windows on x86. So I think musing about that is not fruitful.
>
> Also since vector iterators are effectively pointers, there is nothing
> special about end() , its just a pointer to data one element past the
> last one.

But if end() was saved in an iterator, after the resize this iterator no longer equals end().
If that iterator was used as a shortcut for end(), this iterator can no longer be used for this purpose after the resize.
The iterator may still point to something, but no longer to the end of the vector.
It depends on the meaning of "valid" if that iterator is still valid.

James Kanze

unread,
Mar 25, 2009, 5:00:25 AM3/25/09
to
On Mar 25, 6:54 am, zr <zvir...@gmail.com> wrote:
> On Mar 24, 10:42 pm, "Bo Persson" <b...@gmb.dk> wrote:
> > rep_movsd wrote:

[...]


> > There is nothing wrong in pointing out the undefined
> > behaviour, should the OP release his "working" code for
> > others to reuse. Remenber that the x86 does have checking in
> > segmented mode, we are just not using that most of the time.

> Now I'm just curious On x86, two pointers are compared using


> "compare integer" instructions, so if one the operands is
> actually a dangling pointer, no exception would be raised. Is
> there any platform where this is not the case?

On an x86, pointers are 48 bits, and the compare integer
instruction only works on 32 bits (at least on the 32 bit
x86's). So something more is needed. The usual solutions are
either to ignore the segment (thus seriously restricting the
programmers), or to load the addresses with something like the
les instruction, then move the segment into a general register
in order to compare it. If the address is invalid, I think that
the les instruction will trap.

The reason this bit of undefined behavior was introduced into
the standard was precisely to support the x86.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Mar 25, 2009, 5:06:53 AM3/25/09
to
On Mar 24, 3:43 pm, "Fred Zwarts" <F.Zwa...@KVI.nl> wrote:
> "zr" <zvir...@gmail.com> wrote in
> messagenews:1e21430a-6060-4167...@o36g2000yqh.googlegroups.com...

> On Mar 24, 12:56 pm, zr <zvir...@gmail.com> wrote:

> > I hope the following example illustrates the question:

> > #include <vector>

> > int _tmain(int argc, _TCHAR* argv[])
> > {
> > float resize_factor = 1.3;
> > std::vector<int> sorted_image_id;
> > //..
> > int* base = &sorted_image_id.front();
> > sorted_image_id.resize(sorted_image_id.size()*resize_factor);

> > // optimization: reuse iterators after resize
> > if (&sorted_image_id.front() == base)
> > {
> > //Is it safe to assume all iterators are still valid?
> > } else
> > {
> > //Need to reassign iterators
> > }
> > //...
> > return 0;
> > }

> I am not sure.
> Could sorted_image_id.end() depend on the size of the vector
> even if no reallocation occurs?

Or course it could.

> If so, an iterator with this value before the resize may have
> a value after the resize with at least a questionable value.

I think it's evident that he can only be talking about valid,
dereferenceable iterators. His code should work, but IMHO, it
doesn't look very idiomatic, and would raise questions in a code
review. I think something like:

bool iteratorsValid = array.capacity() >= newSize ;
array.resize( newSize ) ;
if ( iteratorsValid ) // ...

would be more typical and easier to understand.

zr

unread,
Mar 25, 2009, 7:56:11 PM3/25/09
to
On Mar 25, 11:00 am, James Kanze <james.ka...@gmail.com> wrote:
> On Mar 25, 6:54 am, zr <zvir...@gmail.com> wrote:
>
> > On Mar 24, 10:42 pm, "Bo Persson" <b...@gmb.dk> wrote:
> > > rep_movsd wrote:
>
>     [...]
>
> > > There is nothing wrong in pointing out the undefined
> > > behaviour, should the OP release his "working" code for
> > > others to reuse. Remenber that the x86 does have checking in
> > > segmented mode, we are just not using that most of the time.
> > Now I'm just curious On x86, two pointers are compared using
> > "compare integer" instructions, so if one the operands is
> > actually a dangling pointer, no exception would be raised. Is
> > there any platform where this is not the case?
>
> On an x86, pointers are 48 bits, and the compare integer
> instruction only works on 32 bits (at least on the 32 bit
> x86's).  So something more is needed.  The usual solutions are
> either to ignore the segment (thus seriously restricting the
> programmers), or to load the addresses with something like the
> les instruction, then move the segment into a general register
> in order to compare it.  If the address is invalid, I think that
> the les instruction will trap.
>
> The reason this bit of undefined behavior was introduced into
> the standard was precisely to support the x86.
>
> --
> James Kanze (GABI Software)             email:james.ka...@gmail.com

> Conseils en informatique orientée objet/
>                    Beratung in objektorientierter Datenverarbeitung
> 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

But sizeof(void*) will return 4 bytes, which is 32 bits, so how
exactly does the language implementor support far pointers?

rep_movsd

unread,
Mar 26, 2009, 2:21:09 AM3/26/09
to
On Mar 25, 2:00 pm, James Kanze <james.ka...@gmail.com> wrote:
> On Mar 25, 6:54 am, zr <zvir...@gmail.com> wrote:
>
> On an x86, pointers are 48 bits, and the compare integer
> instruction only works on 32 bits (at least on the 32 bit
> x86's).  So something more is needed.  The usual solutions are
> either to ignore the segment (thus seriously restricting the
> programmers), or to load the addresses with something like the
> les instruction, then move the segment into a general register
> in order to compare it.  If the address is invalid, I think that
> the les instruction will trap.
>
> The reason this bit of undefined behavior was introduced into
> the standard was precisely to support the x86.

Not true.....
on 32 bit compilers sizeof(void*) == sizeof(int)
I've never heard of 48 bit pointers.
The memory model for 32 bit protected mode from the application and
compilers point of view is flat.

On 64 bit compilers this int/void* guarantee is broken...

Loading an invalid address into a register will not trap!


James Kanze

unread,
Mar 26, 2009, 5:59:32 AM3/26/09
to
On Mar 26, 12:56 am, zr <zvir...@gmail.com> wrote:
> On Mar 25, 11:00 am, James Kanze <james.ka...@gmail.com> wrote:

> > On an x86, pointers are 48 bits, and the compare integer
> > instruction only works on 32 bits (at least on the 32 bit
> > x86's). So something more is needed. The usual solutions
> > are either to ignore the segment (thus seriously restricting
> > the programmers), or to load the addresses with something
> > like the les instruction, then move the segment into a
> > general register in order to compare it. If the address is
> > invalid, I think that the les instruction will trap.

> > The reason this bit of undefined behavior was introduced
> > into the standard was precisely to support the x86.

> But sizeof(void*) will return 4 bytes, which is 32 bits, so


> how exactly does the language implementor support far
> pointers?

That depends on the implementation. I've used C++ compilers for
the 80x86 where sizeof(void*) was 6. (And sizeof(long) was 4,
and there was no long long, so there was no integral type large
enough to handle a pointer.) And of course, sizeof(void*) could
be 4 on the earlier 16 bit processors.

--
James Kanze (GABI Software) email:james...@gmail.com

James Kanze

unread,
Mar 26, 2009, 6:01:22 AM3/26/09
to
On Mar 26, 7:21 am, rep_movsd <rep.mo...@gmail.com> wrote:
> On Mar 25, 2:00 pm, James Kanze <james.ka...@gmail.com> wrote:

> > On Mar 25, 6:54 am, zr <zvir...@gmail.com> wrote:

> > On an x86, pointers are 48 bits, and the compare integer
> > instruction only works on 32 bits (at least on the 32 bit
> > x86's). So something more is needed. The usual solutions
> > are either to ignore the segment (thus seriously restricting
> > the programmers), or to load the addresses with something
> > like the les instruction, then move the segment into a
> > general register in order to compare it. If the address is
> > invalid, I think that the les instruction will trap.

> > The reason this bit of undefined behavior was introduced
> > into the standard was precisely to support the x86.

> Not true.....

Obviously, you aren't familiar with the Intel architecture.

> on 32 bit compilers sizeof(void*) == sizeof(int)
> I've never heard of 48 bit pointers.

And I've used them. On an Intel 80386.

> The memory model for 32 bit protected mode from the
> application and compilers point of view is flat.

Only if the OS artificially chooses to restrict the users to
that.

> On 64 bit compilers this int/void* guarantee is broken...

> Loading an invalid address into a register will not trap!

It does on an Intel 80x86. I've actually seen it happen.

zr

unread,
Mar 26, 2009, 10:31:47 AM3/26/09
to
On Mar 26, 11:59 am, James Kanze <james.ka...@gmail.com> wrote:
> On Mar 26, 12:56 am, zr <zvir...@gmail.com> wrote:
>
> > On Mar 25, 11:00 am, James Kanze <james.ka...@gmail.com> wrote:
> > > On an x86, pointers are 48 bits, and the compare integer
> > > instruction only works on 32 bits (at least on the 32 bit
> > > x86's).  So something more is needed.  The usual solutions
> > > are either to ignore the segment (thus seriously restricting
> > > the programmers), or to load the addresses with something
> > > like the les instruction, then move the segment into a
> > > general register in order to compare it.  If the address is
> > > invalid, I think that the les instruction will trap.
> > > The reason this bit of undefined behavior was introduced
> > > into the standard was precisely to support the x86.
> > But sizeof(void*) will return 4 bytes, which is 32 bits, so
> > how exactly does the language implementor support far
> > pointers?
>
> That depends on the implementation.  I've used C++ compilers for
> the 80x86 where sizeof(void*) was 6.  (And sizeof(long) was 4,
> and there was no long long, so there was no integral type large
> enough to handle a pointer.)  And of course, sizeof(void*) could
> be 4 on the earlier 16 bit processors.
>
> --
> James Kanze (GABI Software)             email:james.ka...@gmail.com

> Conseils en informatique orientée objet/
>                    Beratung in objektorientierter Datenverarbeitung
> 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Very interesting, thanks.

rep_movsd

unread,
Mar 27, 2009, 4:56:00 PM3/27/09
to
> Obviously, you aren't familiar with the Intel architecture.
>
> > on 32 bit compilers sizeof(void*) == sizeof(int)
> > I've never heard of 48 bit pointers.
>
> And I've used them.  On an Intel 80386.

All Right, I heard of them now...

>
> > The memory model for 32 bit protected mode from the
> > application and compilers point of view is flat.
>
> Only if the OS artificially chooses to restrict the users to
> that.

Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ), I don't get how 48 bit
addresses would be beneficial. Could you elaborate ( since you
actually used them ) ?

>
> > On 64 bit compilers this int/void* guarantee is broken...
> > Loading an invalid address into a register will not trap!
>
> It does on an Intel 80x86.  I've actually seen it happen.

Which instruction(s) specifically?

Balog Pal

unread,
Mar 27, 2009, 9:35:48 PM3/27/09
to
"rep_movsd" <rep....@gmail.com>

>> Obviously, you aren't familiar with the Intel architecture.
>All Right, I heard of them now...

Guess Intel's architecture manuals and data sheets are available for DL to
anyone actually interested to learn how stuff works.

>Well, considering that a 32 bit CPU can never address more than that
anyway ( forget the 36 bit PAE shenanigans ),

Says who? Normally the X-bit means the width of the processor's data bus.
Some other times the general-purpose registers. It has absolutely nothing to
do with the address bus width (-> accessible phisical memory) or the
internal address space.

> I don't get how 48 bit addresses would be beneficial. Could you elaborate
> ( since you
actually used them ) ?

For the discussed architecture the 48 bit refers to the pointer not the
address. The pointer has a 16 bit selector and a 32 bit offset. The
selector has a few control bits, the rest is used to select a *descriptor*
(from the descriptor table), that defines a memory segment in the virtual
address space with its base, length, access rights, etc.

>> It does on an Intel 80x86. I've actually seen it happen.

>Which instruction(s) specifically?

anything that loads the segment

Windows 3.1 used the segmented 16 bit model, with 32 bit pointers having
selector+16 bit offset. To pass a pointer value to a function sometimes it
used

void foo(void * vp);

void * p; // not inited
foo(p);
--> assy as

LES bx, [_p]
push bx
push es
call _foo

that even made sense if opt for space was asked for, it takes 1 byte less of
opcode...
Certainly if the loaded value was not a valid selector, it causes a trap,
and the famous 'unrecoverable application error'.


James Kanze

unread,
Mar 28, 2009, 7:19:13 AM3/28/09
to
On Mar 27, 9:56 pm, rep_movsd <rep.mo...@gmail.com> wrote:
> > Obviously, you aren't familiar with the Intel architecture.

> > > on 32 bit compilers sizeof(void*) == sizeof(int)
> > > I've never heard of 48 bit pointers.

> > And I've used them. On an Intel 80386.

> All Right, I heard of them now...

> > > The memory model for 32 bit protected mode from the
> > > application and compilers point of view is flat.

> > Only if the OS artificially chooses to restrict the users to
> > that.

> Well, considering that a 32 bit CPU can never address more than that
> anyway ( forget the 36 bit PAE shenanigans ),

Why? Modern Intel processors do have a 36 bit physical memory
bus. But that's not the point; different segments can be mapped
in different manners.

> I don't get how 48 bit addresses would be beneficial. Could
> you elaborate ( since you actually used them ) ?

In one case, there was no disk, so no virtual memory;
segmentation was the protection mechanism. In another, the "OS"
was MS-DOS, which at the time only ran in native, 16 bit mode;
with no support for virtual memory, again, segments were used to
go beyond the 1 MB supported by the OS.

> > > On 64 bit compilers this int/void* guarantee is broken...
> > > Loading an invalid address into a register will not trap!

> > It does on an Intel 80x86. I've actually seen it happen.

> Which instruction(s) specifically?

Any instruction which loads a segment register: LDS, LES, LFS,
LGS, and possibly some others. This goes back 25 or 30 years
ago, so I've forgotten the details. But the argument given in
the C committee for having just reading an invalide pointer be
undefined behavior was based on the Intel (and other machines at
the time which had "address" registers, but I've no experience
with those).

0 new messages