Safe reuse of allocated storage

Nikolay Ivchenkov

Feb 5, 2011, 7:06:53 AM

Consider the following example:

#include <memory>

struct X
{
    X(int &r) : ref(r) {}
    int &ref;
};

int m;
int n;

int main()
{
    std::allocator<X> a;
    X *p = a.allocate(1);

    a.construct(p, m);
    p->ref = 1; // well-defined
    a.destroy(p);

    a.construct(p, n);
    p->ref = 1; // leads to undefined behavior
    a.destroy(p);

    a.deallocate(p, 1);
}

This program sequentially creates two objects of type X at the same
memory location. I will call the object of type X created first "the
first object" and the object of type X created second "the second
object".

According to N3225 - 3.8/7:
------------------------------------------
If, after the lifetime of an object has ended and before the storage
which the object occupied is reused or released, a new object is
created at the storage location which the original object occupied, a
pointer that pointed to the original object, a reference that referred
to the original object, or the name of the original object will
automatically refer to the new object and, once the lifetime of the
new object has started, can be used to manipulate the new object, if:

- the storage for the new object exactly overlays the storage location
which the original object occupied, and

- the new object is of the same type as the original object (ignoring
the top-level cv-qualifiers), and

- the type of the original object is not const-qualified, and, if a
class type, does not contain any non-static data member whose type is
const-qualified or a reference type, and

- the original object was a most derived object (1.8) of type T and
the new object is a most derived object of type T (that is, they are
not base class subobjects).
------------------------------------------

According to N3225 - 3.9.2/3:
------------------------------------------
If an object of type T is located at an address A, a pointer of type
cv T* whose value is the address A is said to point to that object,
regardless of how the value was obtained.
------------------------------------------

Thus, p cannot be used to access the member ref of the second object
(a compiler is probably allowed to assume that p->ref refers to the
same object in both cases, even though it does not). Presumably, the
following approach should not imply undefined behavior:

#include <memory>

struct X
{
    X(int &r) : ref(r) {}
    int &ref;
};

int m;
int n;

int main()
{
    std::allocator<X> a;
    void *pv = a.allocate(1);

    X *p1 = static_cast<X *>(pv);
    a.construct(p1, m);
    p1->ref = 1;
    a.destroy(p1);

    X *p2 = static_cast<X *>(pv);
    a.construct(p2, n);
    p2->ref = 1; // well-defined now?
    a.destroy(p2);

    a.deallocate(p2, 1);
}

Here, formally, pv can never point to the first object, because its type
is not "cv1 pointer to cv2 X". Pointer p2 never points to that object
either. But what happens if we use a single pointer p (as shown below)
instead of p1 and p2?

#include <memory>

struct X
{
    X(int &r) : ref(r) {}
    int &ref;
};

int m;
int n;

int main()
{
    std::allocator<X> a;
    void *pv = a.allocate(1);

    X *p = static_cast<X *>(pv);
    a.construct(p, m);
    p->ref = 1;
    a.destroy(p);

    p = static_cast<X *>(pv); // reassignment
    a.construct(p, n);
    p->ref = 1; // well-defined?
    a.destroy(p);

    a.deallocate(p, 1);
}

Does the reassignment with the same pointer value help to avoid
undefined behavior in this case?
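
For comparison, here is a minimal sketch of how the single-pointer version could be written under later revisions of the language, assuming a C++17 compiler. std::launder was added in C++17 (so it is not available under N3225) and yields a pointer to the object currently located at the given address. std::allocator_traits is used because the member construct/destroy functions were removed from std::allocator in C++20. The function name reuse_with_launder is made up for illustration, and the sketch does not answer what N3225 itself permits.

```cpp
#include <cassert>
#include <memory>
#include <new>

struct X
{
    explicit X(int &r) : ref(r) {}
    int &ref;
};

// Hypothetical helper: creates two X objects one after another in the
// same storage and writes through each object's reference member.
inline int reuse_with_launder(int &m, int &n)
{
    using Alloc = std::allocator<X>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc a;
    X *p = Traits::allocate(a, 1);

    Traits::construct(a, p, m);   // first object, ref bound to m
    p->ref = 1;                   // writes m
    Traits::destroy(a, p);

    Traits::construct(a, p, n);   // second object in the same storage
    X *q = std::launder(p);       // pointer to the second object (C++17)
    q->ref = 2;                   // writes n
    Traits::destroy(a, q);

    Traits::deallocate(a, p, 1);
    return n;
}
```

Calling reuse_with_launder(m, n) with two distinct ints leaves 1 in m and 2 in n.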

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Johannes Schaub (litb)

Feb 5, 2011, 11:07:06 AM

Nikolay Ivchenkov wrote:

I have a question about these two texts. Why do pointers need to be
explicitly updated to point to the second object by 3.8/7, when 3.9.2/3
already says that the pointer will point to the second object? Why is
3.8/7 not redundant in the case of pointers? Let me give an example:

int a[2][1];
int *p = a[0] + 1;
*p = 0;

This "p" is a past-the-end pointer for a[0], but it happens to "point to"
the integer at a[1][0]. Does the spec say somewhere that "p" may be
assumed to point at garbage instead of the object of type "int" located at
&a[1][0]?

If my code is valid, I can't understand why your code would be invalid.

Joshua Maurice

Feb 6, 2011, 3:50:10 AM

The rules in the C and C++ standards that govern memory pooling
allocators written on top of new and malloc are, IMO, FUBAR. I've had
a thread up on comp.std.c++ for a while now about this - no replies,
and another more recent discussion going on in comp.std.c - this one
with no particularly interesting replies.

Yours appears to be yet another example of a contradiction between
well-accepted practice and the ISO standard in this particular area. "What do
you mean I can't create an object with a reference data member in
memory returned from a pooling memory allocator which might have
previously contained an object of the same type?"

The Rules As Written make new and malloc special to the compiler and
language, and basically disallow general purpose memory allocators
written on top of system allocators (like new and malloc). This is
despite well accepted practice to the contrary, such as various open
source general purpose pooling memory allocators. I find it hard to
fathom that the C and C++ standards really intended to prohibit
general purpose userland pooling memory allocators.

I wanted to say what I think the intent is, but the problem is I can't
formalize it sufficiently well to bother posting it. (Something along
the lines of allowing the compiler to assume that a reference is not
reseated unless something in scope of this function or a called
function, transitively, reseats the reference - eg the storage has
been reused to create a new reference object.) That's the problem with
all of these rules IMO. I almost see what they wanted, but the
formalization is so lacking, and the rules don't make any sense when
you really start looking at them.

Sorry that I'm unable to help. I do hope that maybe this thread,
unlike the ones before it, comes to some actual conclusion.

Juan Pedro Bolivar Puente

Feb 7, 2011, 2:53:02 PM

>>
>> Does the reassignment with the same pointer value help to avoid
>> undefined behavior in this case?

In my understanding of the standard fragments you cited (I am far from
an expert in the standard), yes, because 'p' no longer points to the original
object, but to the memory that 'pv' points to -- which happens to be the
same memory that contained the original object, but by definition pv
is a void* pointing to raw memory and not to the original object.

Btw, have you tried the code with real-world compilers? What were the
results?

>
> The Rules As Written make new and malloc special to the compiler and
> language, and basically disallow general purpose memory allocators
> written on top of system allocators (like new and malloc). This is
> despite well accepted practice to the contrary, such as various open
> source general purpose pooling memory allocators. I find it hard to
> fathom that the C and C++ standards really intended to prohibit
> general purpose userland pooling memory allocators.
>

I don't understand why the technique shown by the OP would be
unsuitable for building such a pooling memory allocator. All you have to do is
keep the pointer in the pool as a raw void*, something that you
should be doing anyway to keep generated code bloat low.

JP

Joshua Maurice

Feb 8, 2011, 5:37:27 AM

On Feb 7, 11:53 am, Juan Pedro Bolivar Puente <raskolni...@es.gnu.org>
wrote:

> > The Rules As Written make new and malloc special to the compiler and
> > language, and basically disallow general purpose memory allocators
> > written on top of system allocators (like new and malloc). This is
> > despite well accepted practice to the contrary, such as various open
> > source general purpose pooling memory allocators. I find it hard to
> > fathom that the C and C++ standards really intended to prohibit
> > general purpose userland pooling memory allocators.
>
> I don't understand why the technique exposed by the OP would be
> unsuitable to build such pooling memory allocator. All you have to do is
> to keep the pointer in the pool as raw void*, something that actually
> you should be doing anyway to keep generated code bloat low.

I don't understand what the void* has to do with anything. Let me
explain with the following (incomplete) program:

#include <new>
#include <stddef.h>

class UserspaceAllocator
{
public:
    void* alloc(size_t );
    void free(void* );
};

int main()
{
    UserspaceAllocator alloc;

    struct Foo { Foo(int& x) : x_(x) {} int& x_; };

    int x, y;

    void* p = alloc.alloc(sizeof(Foo));
    Foo* foo = new(p) Foo(x);
    foo -> ~Foo();
    alloc.free(p);

    // Line A

    p = alloc.alloc(sizeof(Foo)); // Line B
    foo = new(p) Foo(y); // Line C
    return foo->x_;
}

Let's assume that the userspace allocator returns the same piece of
memory for both allocation requests.

Now, let's look at the text of N3225 - 3.8/7:

> If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied,

The object's lifetime ends at line A. The storage is not reused at
line A, and it is not released at line A. Releasing memory has a very
specific definition in the C++ standard which means that the memory is
returned to "the system" ala std::free, operator delete, and so on.
Simply returning it to a userspace memory allocator isn't the same.
That's what the "reuse" is meant to cover. (At least, I'm pretty sure.
That appears to be the only sensible reading.)

> a pointer that pointed to the original object [...] can be used to manipulate the new object, if:

Now, it's a little unclear here. Do they mean pointer value, or
pointer variable, or pointer object? At lines B and C, I have the same
pointer variable, the same pointer object, and under this thought
experiment the same pointer value as we assumed that the userspace
memory allocator returned the same piece of memory for both allocation
requests. So, while it is vague, I satisfy all reasonable
interpretations.

> [...]

> - the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and

Well, damn. Our type Foo has a non-static data member whose type is a
reference type. As I don't satisfy this bullet, I don't satisfy "a
pointer that pointed to the original object [...] can be used to
manipulate the new object, if:", which means that I cannot use that
pointer to manipulate the new object. This means that using the
pointer to manipulate the new object is undefined behavior. Thus my
(incomplete) program above has undefined behavior when the userspace
allocator returns the same piece of memory for both allocation
requests, and thus a general purpose userspace pooling memory
allocator cannot be written in C++ under the draft standard N3225.

You might be able to weasel your way out by saying that the two
pointer values aren't the same pointer value, even though they have
the same bit representation. You could argue that the data flow which
led to the pointer value, such as whether it came from a memberof
expression or an explicit cast, affects the pointer value (as was just
done on comp.std.c by some person), but I really don't like such
reasoning. I don't like it because it doesn't appear anywhere in the
standard AFAIK, and because it violates the community's general
understanding of what it means to say two pointer values are
equivalent.

Daniel Krügler

Feb 8, 2011, 11:43:50 AM

On 2011-02-05 17:07, Johannes Schaub (litb) wrote:

[..]

> I have a question about these two texts. Why do pointers need to explicitly
> be updated to point to the second object by 3.8/7, when 3.9.2/3 already says
> that the pointer will point to the second object? Why is 3.8/7 not redundant
> in the case of pointers? Let me make an example
>
> int a[2][1];
> int *p = a[0] + 1;
> *p = 0;
>
> This "p" is a past-the-end pointer for a[0], but it happens to "point to"
> the integer at a[1][0]. Does the spec say somewhere that "p" is allowed to
> assume to point at garbage, instead of the object of type "int" located at
> &a[1][0] ?
>
> If my code is valid, I can't understand why your code would be invalid.

If we consider pointers that are invalidated by a deallocation, I believe
this is active core issue 523:

http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#523

Greetings from Bremen,

Daniel Krügler

Joshua Maurice

Feb 9, 2011, 6:24:37 PM

On Feb 8, 2:37 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> #include <new>
> #include <stddef.h>
>
> class UserspaceAllocator
> {
> public:
> void* alloc(size_t );
> void free(void* );
> };
>
> int main()
> {
> UserspaceAllocator alloc;
>
> struct Foo { Foo(int& x) : x_(x) {} int& x_; };
>
> int x, y;
>
> void* p = alloc.alloc(sizeof(Foo));
> Foo* foo = new(p) Foo(x);
> foo -> ~Foo();
> alloc.free(p);
>
> // Line A
>
> p = alloc.alloc(sizeof(Foo)); // Line B
> foo = new(p) Foo(y); // Line C
> return foo->x_;
> }

Minor correction. The above program also has UB because it reads an
uninitialized int object. My mistake there. Once you fix that by
changing
int x, y;
to
int x = 1, y = 2;
then that program ought to have no UB, but it still would under N3225
- 3.8/7, so N3225 - 3.8/7 is a defect.
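
To make the scenario concrete, here is a minimal sketch of the corrected program with a stand-in allocator. OneSlotAllocator and run_pool_example are hypothetical names invented for this sketch. The allocator hands out the same buffer on every request, which is the case assumed in the discussion above. The sketch only reproduces the situation under debate. It does not settle how 3.8/7 should be read.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical single-slot pool: every allocation returns the same buffer,
// so the second request is guaranteed to reuse the first object's storage.
class OneSlotAllocator
{
public:
    void *alloc(std::size_t size)
    {
        assert(size <= sizeof buffer_);
        assert(!in_use_);
        in_use_ = true;
        return buffer_;
    }

    void free(void *p)
    {
        assert(p == buffer_);
        (void)p;
        in_use_ = false;
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[64];
    bool in_use_ = false;
};

struct Foo
{
    explicit Foo(int &x) : x_(x) {}
    int &x_;
};

inline int run_pool_example()
{
    OneSlotAllocator alloc;
    int x = 1, y = 2;                 // initialized, per the correction

    void *p = alloc.alloc(sizeof(Foo));
    Foo *foo = new (p) Foo(x);        // first object
    foo->~Foo();
    alloc.free(p);                    // returned to the pool, not the system

    void *q = alloc.alloc(sizeof(Foo));
    assert(q == p);                   // same storage handed out again
    foo = new (q) Foo(y);             // second object; foo holds the new-expression result
    int result = foo->x_;             // reads y through the reference member
    foo->~Foo();
    alloc.free(q);
    return result;
}
```

On the implementations one would expect to encounter, run_pool_example() returns 2, the value of y.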

itaj sherman

Feb 10, 2011, 6:01:17 PM

My guess is that the restriction on const members exists because of
optimizations a compiler might make. Maybe in some cases the
compiler can avoid initializing const members while
constructing the object, which may involve using a different layout for
the class.
References are just like const pointers.

If that's the reason, then keeping the pointer value in another
variable and reassigning wouldn't help. You'll need to get the value
in the pointer that is passed to the later placement-new command that
created the later object.

However, if that had been the reason, I would expect that section to
require it recursively:
"...does not contain any non-static data member whose type is
const-qualified or a reference type"

I think it should have added: and neither do any of its non-static data
members or base classes, recursively.

It doesn't state that recursively, for whatever reason. So formally, if
you wrap your class X in a wrapper and allocate Wrapper<X> instead:

template< typename T >
class Wrapper
{
//data
private: T m; //non const member

//method
public: T* get() { return &(this->m); }

//forwarding constructors...
template< ... > Wrapper( ... ) { ... }
};

Then you are allowed to use the old Wrapper<X>* with its get() method
(because it "can be used to manipulate the new object").
* However, a Foo* pointer kept from a previous call to get() becomes
invalid by the wording of the standard.

If you can use some kind of smart pointer that keeps the Wrapper<X>*
and converts to Foo* by calling get(), you could work around that.
It does, however, require trusting that the wording of the standard
intentionally doesn't apply the restriction recursively.

template< typename T >
class Pointer
{
    //data
    private: Wrapper<T>* m;

    //xtor
    public: Pointer( Wrapper<T>* r )
    :
    m(r)
    {}

    //methods
    T* operator->() const
    {
        return m->get();
    }
};

int main()
{
std::allocator<Wrapper<X>> a;
Pointer<X> p = a.allocate(1);

a.construct(p, m);
p->ref = 1; // well-defined
a.destroy(p);

a.construct(p, n);
p->ref = 1; // now this is well-defined per the current wording
a.destroy(p);

a.deallocate(p, 1);
}

itaj

itaj sherman

Feb 12, 2011, 9:10:52 AM

I should explain a few things more clearly in my post:

On Feb 11, 1:01 am, itaj sherman <itajsher...@gmail.com> wrote:
> On Feb 5, 2:06 pm, Nikolay Ivchenkov <ts...@mail.ru> wrote:
>
>
>

...

>
> However, if that had been the reason, I would expect that section to
> require it recursively.
> "...does not contain any non-static data member whose type is const-
> qualified or a reference type"
> I think should have added: and neither any of its non-static data
> members or base class recursively.

I meant: and neither do any of its non-static data members or base
classes contain such members, recursively.

> template< typename T >
> class Wrapper
> {
> //data
> private: T m; //non const member
>
> //method
> public: T* get() { return &(this->m); }
>
> //forwarding constructors...
> template< ... > Wrapper( ... ) { ... }
>
> };
>
> Then you are allowed to use the old Wrapper<X>* with it's get() method
> (because "can be used to manipulate the new object").
> * However, keeping the Foo* pointer returned by a previous call to
> get() will become invalid by the wording of the standard.
>
> If you can use some kind of smart pointer that keeps the Wrapper<X>*
> and converts to Foo* by calling get() you could work-around that.
> Although, it requires trusting that the wording of the standard
> intentionally don't mention non-const recursively.

I meant your class X, not Foo. I confused it with class Foo from Maurice's
post.

>
> template< typename T >
> Pointer
> {
> //data
> private: Wrapper<T>* m;
>
> //xtor
> public: Pointer( Wrapper<T>* r )
> :
> m(r)
> {}
>
> //methods
> T* operator->() const
> {
> return m->get();
> }
>
> };
>
> int main()
> {
> std::allocator<Wrapper<X>> a;
> Pointer<X> p = a.allocate(1);
>
> a.construct(p, m);
> p->ref = 1; // well-defined
> a.destroy(p);
>
> a.construct(p, n);
> p->ref = 1; // now this is well-defined per the current wording
> a.destroy(p);
>
> a.deallocate(p, 1);
>
> }
>
> itaj
>

Maybe I wasn't clear enough. The point is that Wrapper<X> doesn't have
any const or reference non-static members (unlike X). And thus, by the
current wording, the old Wrapper<X>* is valid after the later object
is created. Note that I use the valid Wrapper<X>* in order to access
the Wrapper<X> object and re-retrieve an X*. I do not use an old X*.

That is well-defined per the current wording, but I'm not sure whether
they really meant that at all or, on the other hand, whether they
intentionally made the const/reference member restriction
non-recursive.
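
A compilable sketch of the Wrapper/Pointer idea, under some assumptions: the elided forwarding constructor is filled in here with a variadic template (one possible choice, not necessarily the intended one), std::allocator_traits is used in place of the member construct/destroy calls, and run_wrapper_example is a made-up name. The raw Wrapper<X>* is kept next to the Pointer<X> so it can be handed to the allocator. Whether the standard really intends this to be well-defined is the open question raised above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct X
{
    explicit X(int &r) : ref(r) {}
    int &ref;
};

template <typename T>
class Wrapper
{
public:
    // Assumed form of the forwarding constructor elided in the post.
    template <typename... Args>
    explicit Wrapper(Args &&... args) : m(std::forward<Args>(args)...) {}

    T *get() { return &m; }

private:
    T m; // non-const, non-reference member
};

template <typename T>
class Pointer
{
public:
    explicit Pointer(Wrapper<T> *r) : m(r) {}

    // Re-fetches the T* from the wrapper on every access instead of
    // caching a T* obtained from an earlier object.
    T *operator->() const { return m->get(); }

private:
    Wrapper<T> *m;
};

inline int run_wrapper_example(int &first, int &second)
{
    using Alloc = std::allocator<Wrapper<X>>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc a;
    Wrapper<X> *w = Traits::allocate(a, 1);
    Pointer<X> p(w);

    Traits::construct(a, w, first);   // first Wrapper<X>, ref bound to `first`
    p->ref = 1;                       // writes `first`
    Traits::destroy(a, w);

    Traits::construct(a, w, second);  // second Wrapper<X> in the same storage
    p->ref = 2;                       // X* obtained anew through the old Wrapper<X>*
    Traits::destroy(a, w);

    Traits::deallocate(a, w, 1);
    return second;
}
```

The point of the design is that no X* survives across the reconstruction. Only the Wrapper<X>* does, and Wrapper<X> itself has no const or reference members.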