Predynamic Storage Duration

Andrew Tomazos

Oct 14, 2013, 12:25:29 AM
to std-pr...@isocpp.org
I want this:

    std::string s = "foo";
    std::vector<int> v = {1,2,3};

to be as efficient as this:

    char s[] = "foo";
    int v[] = {1,2,3};

and I want to write this:

    constexpr std::vector<int> f(...)
    {
        std::vector<int> v;

        for (...)
        {
             ...

             v.push_back(...);
        }

        return v;
    }

Basically, I want std::string, std::vector and similar classes to be literal types.

The main reason they cannot currently be literal types is because they need to allocate objects of dynamic storage duration, and the heap isn't available during translation.

So here are the proposed changes:

- Allow new expressions and delete expressions within constexpr functions (provided the operands are of literal type).

- A new expression evaluated within a constexpr function in constant-context returns a pointer to an object of predynamic storage duration

- A delete expression evaluated within a constexpr function in constant-context deletes an object of predynamic storage duration

- A delete expression evaluated at run-time deletes an object of either predynamic or dynamic storage duration

Essentially, predynamic storage duration starts during translation and can end as late as run-time.

During translation, when evaluating a new expression within a constexpr function, the implementation allocates the object within a memory pool (using whatever internal system it uses to hold variables of literal type), which we'll call the preheap.  When it encounters a delete expression in a constexpr function, it deletes the operand from the preheap.  Any predynamic objects left over in the preheap after translation has completed are arranged into the program image in a .preheap section.  When the program loads, the .preheap section is copied into memory and serves as part of the initial state of the run-time heap.

    constexpr int* f()
    {
        int a[] = {1,2,3};

        int* b = new int[3];  // predynamic object allocated during translation

        for (size_t i = 0; i < 3; i++)
            b[i] = a[i];

        return b;
    }

    constexpr int* p = f(); // f called during translation, returns pointer to predynamic object

    int main()
    {
        delete[] p; // deletes predynamic array at runtime; delete[] matches new[]
    }

Feedback/thoughts appreciated.


Billy O'Neal

Oct 14, 2013, 12:53:05 AM
to std-proposals
I can't think of a real world use case where this would have a significant performance impact, and it adds significant complexity penalties into the core language.

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



Richard Smith

Oct 14, 2013, 1:05:02 AM
to std-pr...@isocpp.org
Supporting allocate-during-translation/deallocate-during-runtime is pretty tricky from an implementation perspective, especially since the usual allocation and deallocation functions can be replaced. We cannot know in advance what layout a custom heap will take in memory, so we cannot prepopulate a "preheap" as a static data structure and expect a replacement deallocation function to be able to cope with it. (Also, practically-speaking, the compiler vendor typically does not know how the library vendor will choose to implement the heap.) This can be mitigated by requiring the implementation to repeatedly call the (replacement) ::operator new to allocate the preheap during program startup, but that removes a lot of the value of the proposal.
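
As a rough illustration of the replacement problem, the sketch below replaces the global allocation functions with malloc-based versions that also count calls. The counter `g_allocation_count` and the helper `allocations_for_one_request` are names made up for this example; a real replacement may use any bookkeeping and memory layout it likes, and none of that is visible to the compiler during translation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping: counts calls to the replaced ::operator new.
static std::size_t g_allocation_count = 0;

// Replacement global allocation function.
void* operator new(std::size_t size)
{
    ++g_allocation_count;
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

// Matching replacement deallocation functions (unsized and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many allocations a single direct request triggers.
std::size_t allocations_for_one_request()
{
    const std::size_t before = g_allocation_count;
    void* p = ::operator new(16);
    ::operator delete(p);
    return g_allocation_count - before;
}
```

With this replacement in place, a block that the compiler had laid out statically could not safely be handed to `operator delete`, since `std::free` only accepts pointers that came from `std::malloc`.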

More difficult is supporting types like std::vector, which want to manipulate bytes of storage, not just allocated objects. Constant expression evaluation very deliberately does not allow accessing the object representation of a type (that is, in many cases, impossible to implement during translation, because -- for instance -- the bit pattern of types containing pointers is not yet known). A suitably careful implementation of std::vector<T> could probably sidestep this issue, perhaps through 'new Uninitialized<T>[N];', where Uninitialized is a union type containing T, but it's unlikely that any such approach would be compatible with the other constraints on std::vector, such as guaranteed array-like contiguous allocation (so data()[N] works) and use of a custom allocator.

I considered the possibility of permitting new-expressions and delete-expressions during the design for N3597; the feedback I received from other members of the committee at that time was that this was a step too far for C++14, and I was inclined to agree.


Having said the above, I did come up with a design for this which I believe works. It's somewhat more restricted than what you were describing, but is enough to get many uses of dynamic storage to work:

 * In order to support deleting objects in a destructor, allow destructors to be marked 'constexpr', and implicitly mark all trivial destructors as 'constexpr'. Require a literal type to have a constexpr destructor rather than a trivial one.

 * Allow new-expressions and delete-expressions in core constant expressions. The rules for constant expressions are unchanged, so they can still only refer to objects of static storage duration (not of automatic, thread, or [now possible] dynamic storage duration). This allows temporary usage of dynamic allocation during constant expression evaluation, but does not allow it to leak outside the computation. Under N3664, an implementation is permitted to elide this allocation rather than actually invoking the allocation/deallocation functions (and during constant expression evaluation, it would be expected to do so).

 * Objects declared 'constexpr' can have non-trivial (but constexpr) destructors, but only if the evaluation of the destructor on the object is itself a constant expression (since the object is 'const', we can check this during translation). For such an object, we allow subobjects to be pointers or references to dynamic storage, so long as the evaluation of the destructor deletes the storage. Additionally, make it undefined behavior to delete such objects outside of the destructor.


But on to your example:

On Sun, Oct 13, 2013 at 9:25 PM, Andrew Tomazos <andrew...@gmail.com> wrote:
I want this:

    std::string s = "foo";
    std::vector<int> v = {1,2,3};

to be as efficient as this:

    char s[] = "foo";
    int v[] = {1,2,3};

All you need for this is a sufficiently talented compiler. The standard already permits heap allocation to be promoted to stack or static storage under N3664, and the library doesn't specify when ::operator new will be called to provide memory for std::allocator, nor how much will be requested, so what you're asking for is already allowed. Implementations are getting close to being able to do these optimizations.

David Krauss

Oct 14, 2013, 2:23:19 AM
to std-pr...@isocpp.org
On 10/14/13 12:53 PM, Billy O'Neal wrote:
> I can't think of a real world use case where this would have a significant
> performance impact

It can't do anything for performance but shift predetermined
computations to compile time.

I think the motivation is just to promote code reuse between compile
time and runtime.

Andrew Tomazos

Oct 14, 2013, 2:47:31 AM
to std-pr...@isocpp.org

On Monday, October 14, 2013 6:53:05 AM UTC+2, Billy O'Neal wrote:
I can't think of a real world use case where this would have a significant performance impact, and it adds significant complexity penalties into the core language.

The main use case would be enabling the use of strings, vectors and other types that use dynamic storage as parameters, local variables and return types of constexpr functions.  Constexpr functions can be evaluated during translation from within a constant expression, so this enables trading one unit of compile-time for at least n units of run-time, where n is the number of times the program is executed, for a net benefit of at least n-1 units of performance.

The specific use cases of using strings and vectors in functions are too numerous to mention.  One example would be writing a character encoding transcoder.  Such a function may take a string in the input character encoding and return a string in the output character encoding and use a local vector object to hold an intermediate representation:

    constexpr std::string transcode(const std::string& input)
    {
        std::vector<int> points;

        for (char c : input)
        {
             ...
             if (b)
                 points.push_back(x);
        }

        std::string output;

        for (int x : points)
        {
            if (...)
               output += ...;
        }

        return output;
    }

    constexpr std::string s = transcode("foo");

In the above, transcode is called during translation and s is then constant-initialized.  It doesn't take much creativity to think of a whole host of other uses.
   -Andrew.

Billy O'Neal

Oct 14, 2013, 2:50:18 AM
to std-proposals
>The main use case would be enabling the use of strings, vectors and other types that use dynamic storage as parameters, local variables and return types of constexpr functions
 
Ok, I agree that's interesting. But your proposal started out with "I want [code sample] to be as efficient as [code sample]."

Billy O'Neal
Malware Response Instructor - BleepingComputer.com


Andrew Tomazos

Oct 14, 2013, 3:00:09 AM
to std-pr...@isocpp.org
On Monday, October 14, 2013 8:50:18 AM UTC+2, Billy O'Neal wrote:
>The main use case would be enabling the use of strings, vectors and other types that use dynamic storage as parameters, local variables and return types of constexpr functions
 
Ok, I agree that's interesting. But your proposal started out with "I want [code sample] to be as efficient as [code sample]."

Yeah sorry, I should have spent more time on motivations.

Thiago Macieira

Oct 14, 2013, 3:22:55 AM
to std-pr...@isocpp.org
On domingo, 13 de outubro de 2013 21:25:29, Andrew Tomazos wrote:
> - A new expression evaluated within a constexpr function in
> constant-context returns a pointer to an object of predynamic storage
> duration
>
> - A delete expression evaluated within a constexpr function in
> constant-context deletes an object of predynamic storage duration

Maybe what you want is a constexpr overload of operator new with a suitable ()
parameter that does a "predynamic" allocation.

That is, it doesn't allocate on the heap, but only allocates space of global
static duration. The equivalent operator delete would be a no-op.

That is

new (std::predynamic) internal_string;

would be equivalent to having a global, unnamed static of type internal_string
and it would return its address.
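
A run-time approximation of this idea can be sketched with a tagged placement allocation function. Everything here is hypothetical: `predynamic_t`, `predynamic`, the fixed-size arena `g_arena` and the helper `in_arena` are stand-ins, and nothing in the sketch is constexpr, so it only mirrors the "allocate from static storage, delete is a no-op" behaviour and not the compile-time part.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical tag type standing in for the suggested std::predynamic.
struct predynamic_t {};
constexpr predynamic_t predynamic{};

// Fixed static arena standing in for compiler-reserved static storage.
alignas(std::max_align_t) static unsigned char g_arena[1024];
static std::size_t g_arena_used = 0;

// Placement allocation function selected by `new (predynamic) T`.
void* operator new(std::size_t size, predynamic_t)
{
    constexpr std::size_t align = alignof(std::max_align_t);
    if (size > sizeof(g_arena))
        throw std::bad_alloc();
    const std::size_t rounded = (size + align - 1) / align * align;
    if (rounded > sizeof(g_arena) - g_arena_used)
        throw std::bad_alloc();
    void* p = g_arena + g_arena_used;
    g_arena_used += rounded;
    return p;
}

// Matching placement deallocation function: a no-op.
// It is only called automatically if a constructor throws.
void operator delete(void*, predynamic_t) noexcept {}

// True if p points into the static arena.
bool in_arena(const void* p)
{
    const unsigned char* c = static_cast<const unsigned char*>(p);
    std::less<const unsigned char*> before;  // total order even for unrelated pointers
    return !before(c, g_arena) && before(c, g_arena + sizeof(g_arena));
}
```

An object created with `new (predynamic) int(1)` lives in the arena until the program exits and must never be passed to an ordinary delete expression.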

We're doing this in Qt by way of lambdas:

(simplified)
#define QStringLiteral(str) []() -> QString { \
    static const QStringData p = { sizeof(str), u"" str "" }; \
    return QString(&p); \
}()

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Andrew Tomazos

Oct 14, 2013, 4:24:32 AM
to std-pr...@isocpp.org
On Monday, October 14, 2013 7:05:02 AM UTC+2, Richard Smith wrote:
Supporting allocate-during-translation/deallocate-during-runtime is pretty tricky from an implementation perspective, especially since the usual allocation and deallocation functions can be replaced. We cannot know in advance what layout a custom heap will take in memory, so we cannot prepopulate a "preheap" as a static data structure and expect a replacement deallocation function to be able to cope with it. (Also, practically-speaking, the compiler vendor typically does not know how the library vendor will choose to implement the heap.) This can be mitigated by requiring the implementation to repeatedly call the (replacement) ::operator new to allocate the preheap during program startup, but that removes a lot of the value of the proposal.

I wouldn't be so hasty to discard your "repeatedly calling new on the preheap at load-time" idea.  Even with such a workaround, I think it would still be useful to be able to do "allocate-during-translation/deallocate-during-runtime".
 
More difficult is supporting types like std::vector, which want to manipulate bytes of storage, not just allocated objects. Constant expression evaluation very deliberately does not allow accessing the object representation of a type (that is, in many cases, impossible to implement during translation, because -- for instance -- the bit pattern of types containing pointers is not yet known). A suitably careful implementation of std::vector<T> could probably sidestep this issue, perhaps through 'new Uninitialized<T>[N];', where Uninitialized is a union type containing T, but it's unlikely that any such approach would be compatible with the other constraints on std::vector, such as guaranteed array-like contiguous allocation (so data()[N] works) and use of a custom allocator.
 
What you are highlighting here are additional problems on top of the dynamic allocation issue to making `std::vector` a literal type.  I agree with your assessment, however given the proposed feature I think it may be possible for a library author to implement a `literal_vector` class with a similar interface to vector that sidesteps these issues.  We are seeing similar things for `std::array` where people are defining their own `literal_array` classes to sidestep some of the existing interface issues.  The `std::vector` issues are much larger as you have pointed out, but perhaps they can be solved with additional future relaxations.  I think we can agree that it would be worthwhile for `std::vector` to be literal if possible, and in the interim implementing your own `literal_vector` class would be nice to be able to do too.
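
As a point of comparison, a `literal_vector`-like class can already be sketched under C++14 constexpr rules if it gives up dynamic storage and uses a fixed capacity instead. This is a swapped-in technique, not the proposed feature: the class below, its `Capacity` parameter and the `squares` helper are assumptions made up for illustration.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity stand-in for a literal_vector.
// Storage is an in-object array, so no new-expression is needed; T must be
// a default-constructible literal type.
template <class T, std::size_t Capacity>
class literal_vector
{
public:
    constexpr literal_vector() : data_{}, size_(0) {}

    constexpr void push_back(const T& value)
    {
        // No bounds check: during constant evaluation an out-of-range write
        // is rejected by the compiler; at run time it would be undefined.
        data_[size_++] = value;
    }

    constexpr const T& operator[](std::size_t i) const { return data_[i]; }
    constexpr std::size_t size() const { return size_; }

private:
    T data_[Capacity];
    std::size_t size_;
};

// Example use: collect the first n squares during translation.
constexpr literal_vector<int, 8> squares(int n)
{
    literal_vector<int, 8> v;
    for (int i = 1; i <= n; ++i)
        v.push_back(i * i);
    return v;
}

constexpr literal_vector<int, 8> table = squares(3);
static_assert(table.size() == 3 && table[2] == 9, "evaluated during translation");
```

The fixed capacity is what the proposal aims to remove: with new and delete usable in constant expressions, the same interface could grow its storage on demand.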

Having said the above, I did come up with a design for this which I believe works. It's somewhat more restricted than what you were describing, but is enough to get many uses of dynamic storage to work:

 * In order to support deleting objects in a destructor, allow destructors to be marked 'constexpr', and implicitly mark all trivial destructors as 'constexpr'. Require a literal type to have a constexpr destructor rather than a trivial one.

 * Allow new-expressions and delete-expressions in core constant expressions. The rules for constant expressions are unchanged, so they can still only refer to objects of static storage duration (not of automatic, thread, or [now possible] dynamic storage duration). This allows temporary usage of dynamic allocation during constant expression evaluation, but does not allow it to leak outside the computation. Under N3664, an implementation is permitted to elide this allocation rather than actually invoking the allocation/deallocation functions (and during constant expression evaluation, it would be expected to do so).

 * Objects declared 'constexpr' can have non-trivial (but constexpr) destructors, but only if the evaluation of the destructor on the object is itself a constant expression (since the object is 'const', we can check this during translation). For such an object, we allow subobjects to be pointers or references to dynamic storage, so long as the evaluation of the destructor deletes the storage. Additionally, make it undefined behavior to delete such objects outside of the destructor.
 
I think I see where you are going here, but why couldn't a pointer to a predynamic object be an address constant expression, just like a pointer to a static storage duration object?

When a constexpr-specified object of static storage duration is encountered during translation, the implementation must track its value and pointers to it.  The pointers are in symbol+addend form, and the constant expression rules are set up to support this.  Handling predynamic objects in the preheap should be no different to handling objects of static storage duration in this regard.

At the end of translation the preheap objects could be stored in the program image in the same way as static storage duration objects.  Then at load-time as you suggest we could call operator new on each preheap object copying it into the heap.  Pointers to preheap objects are then relocated as for pointers to static storage duration objects.

This doesn't address mutability of preheap objects, but I don't see how this is any different to mutability of local variables in constexpr functions.  Implementations already have to deal with that for automatic storage duration objects, so why couldn't they deal with it the same way for predynamic storage duration objects?
-Andrew

Andrew Tomazos

Oct 14, 2013, 4:45:48 AM
to std-pr...@isocpp.org

Having the delete operator be a no-op creates a memory leak.  If we imagine a vector-like class that reallocates on powers-of-2 for example being used as a stack within a constexpr function, then it would leave a trail of memory blocks during translation as it grows and shrinks past its thresholds.  Amusingly these blocks may be cleaned up at link-time through the optimizer somehow, but still the same dynamic memory management techniques used at run-time should have the same semantics during translation.

Thiago Macieira

Oct 14, 2013, 11:08:21 AM
to std-pr...@isocpp.org
On segunda-feira, 14 de outubro de 2013 01:45:48, Andrew Tomazos wrote:
> Having the delete operator be a no-op creates a memory leak. If we imagine
> a vector-like class that reallocates on powers-of-2 for example being used
> as a stack within a constexpr function, then it would leave a trail of
> memory blocks during translation as it grows and shrinks past its
> thresholds. Amusingly these blocks may be cleaned-up at link-time through
> the optimizer somehow, but still the same dynamic memory management
> techniques used at run-time should have the same semantics during
> translation.

There's no leak because the memory is static. It's freed by the program
unloading itself after exit.

When I said that it's a no-op, I meant that it doesn't do anything at runtime.
At compile time, if the compiler does detect the deletion, it might deallocate
and drop the memory area.

But it's not required to. Therefore, this may cause increased (static) memory
use if the constexpr function does alloc/dealloc multiple times. The
algorithms using this new operator new should be made so that they allocate
only once.

Mathias Gaunard

Oct 14, 2013, 12:36:49 PM
to std-pr...@isocpp.org
On 14/10/13 06:25, Andrew Tomazos wrote:
> I want this:
>
> std::string s = "foo";
> std::vector<int> v = {1,2,3};
>
> to be as efficient as this:
>
> char s[] = "foo";
> int v[] = {1,2,3};

The only way to make them as efficient is to use COW, which isn't
allowed for those data types.

In the first case you can change the size, in the second you cannot.
This is the real source of the overhead.

Thiago Macieira

Oct 14, 2013, 1:34:44 PM
to std-pr...@isocpp.org
On segunda-feira, 14 de outubro de 2013 18:36:49, Mathias Gaunard wrote:
> > std::string s = "foo";
> > std::vector<int> v = {1,2,3};
> >
> > to be as efficient as this:
> > char s[] = "foo";
> > int v[] = {1,2,3};
>
> The only way to make them as efficient is to use COW, which isn't
> allowed for those data types.
>
> In the first case you can change the size, in the second you cannot.
> This is the real source of the overhead.

True, but it might be possible for a const version of those:

constexpr std::string s = "foo long string so SSO doesn't kick in";

This requires "allocating" the internal representation of std::string
somewhere on .rodata.

That requires first that we can overload a constexpr constructor with a
non-constexpr one, and it will require the same for a constexpr destructor.

A simple implementation would do:

string(const CharT *str) constexpr
    : _M_p(_allocdup(str))
{ }

CharT *_allocdup(const CharT *str) constexpr
{
    auto result = static_cast<_M_Internal *>(operator new (strlen(str) +
        HeaderSize, std::predynamic));
    new (result) _M_Internal(str);
    return result;
}

Andrew Tomazos

Oct 15, 2013, 1:17:31 AM
to std-pr...@isocpp.org
On Monday, October 14, 2013 7:34:44 PM UTC+2, Thiago Macieira wrote:
  string(const CharT *str) constexpr
    : _M_p(_allocdup(str))
  { }

  CharT *_allocdup(const CharT *str) constexpr
  {
    auto result = static_cast<_M_Internal *>(operator new (strlen(str) +
        HeaderSize, std::predynamic));
    new (result) _M_Internal(str);
    return result;
  }

I think you're missing the consequences/properties of the proposed change.

The point is that you could write a literal type that uses dynamic memory:

    struct String
    {
        size_t len;
        char* p;

        String(const char* s)
            : len(strlen(s))
            , p(new char[len+1])
        {
            for (size_t i = 0; i < len+1; i++)
                p[i] = s[i];
        }

        ~String()
        {
            delete [] p;
        }
    };

    constexpr String s = "foo";

    int main(int argc, char** argv)
    {
        String t = argv[0];
    }

In the above, s is constructed in a constant context, so the internal new expression returns a predynamic storage duration object allocated on the preheap.  That object, as it isn't deleted during translation, is carried over to run-time and becomes a normal dynamic storage duration object. This allows the String class to be a literal type, but still use dynamic memory.

t is constructed at run-time, so the internal new expression returns a dynamic storage duration object as usual.

Notice that this is achieved with the _same_ definition of String.  Only the underlying semantics of new and delete expressions have been extended.  The change is at a layer _below_ SSO or COW.  You can use either optimization and it will transparently just work.

With the proposed change and some additional relaxations, standard types such as `std::string`, `std::vector` and others could be made literal types.  This works independently of, and below, how they share/manage their internal buffers.

Thiago Macieira

Oct 15, 2013, 11:58:17 AM
to std-pr...@isocpp.org
On segunda-feira, 14 de outubro de 2013 22:17:31, Andrew Tomazos wrote:
> I think you're missing the consequences/properties of the proposed change.

I might be actually discarding it, intentionally.

> The point is that you could write a literal type that uses dynamic memory:
>
> struct String
> {
> size_t len;
> char* p;
>
> String(const char* s)
>
> : len(strlen(s))
>
> , p(new char[len+1])
> {
> for (size_t i = 0; i < len+1; i++)
> p[i] = s[i];
> }
>
> ~String()
> {
> delete [] p;
> }
> };
>
> constexpr String s = "foo";
>
> int main(int argc, char** argv)
> {
> String t = argv[0];
> }
>
> In the above, s is constructed in constant-context, the internal new
> expression returns a predynamic storage duration object allocated on the
> preheap. That object, as it isn't deleted during translation, is carried
> over to run-time and becomes a normal dynamic storage duration object. This
> allows the String class to be a literal type, but still use dynamic memory.

Please, no. The object s is constexpr and the memory it allocated cannot be
deleted at runtime. The preheap is the read-only data segment of the
executable, not the normal heap.

It's quite impossible for the compiler to create a preheap in the model you're
describing, since the compiler doesn't know what malloc() or operator new()
do. The standard allows operator new() to be overridden by the user.

The only way to call the user's operator new() would be at runtime and we
already have that solution in the form of dynamic initialisation at load time.

Andrew Tomazos

Oct 15, 2013, 2:23:24 PM
to std-pr...@isocpp.org
On Tuesday, October 15, 2013 5:58:17 PM UTC+2, Thiago Macieira wrote:
Please, no. The object s is constexpr and the memory it allocated cannot be
deleted at runtime. The preheap is the read-only data segment of the
executable, not the normal heap.

That is not what I proposed, no.  Objects allocated (and not yet deleted) on the preheap during translation are saved in the program and logically become part of the heap at run-time.  That is the point, and what makes it useful - because you can effectively use new/delete in literal types.  Literal types can be used freely both during translation and at run-time with a single reusable type definition.


It's quite impossible for the compiler to create a preheap in the model you're
describing, since the compiler doesn't know what malloc() or operator new()
do. The standard allows operator new() to be overridden by the user.

No, it is possible.  Preheap objects are represented during translation in the same way as literal temporaries, local variables in constexpr functions and constexpr objects of static storage duration.  At load-time preheap objects are automatically loaded into the heap.  As Richard suggests implementations can allocate run-time storage for preheap objects at load-time using the allocation functions at that time (overridden or not) - although they would be free to elide the "copying" you are imagining in favor of more efficient ways given sufficient integration/intelligence of the different components of the implementation.  But _with or without_ such load-time copying, it is still enormously useful.  I wouldn't get too hung up on the particular mechanism and overlook the larger utility.
 
The only way to call the user's operator new() would be at runtime and we
already have that solution in the form of dynamic initialisation at load time.
 
You're confusing the value of an object with the storage/representation for that object.  In particular, just because a preheap object is already logically constructed during translation doesn't mean we can't use a custom allocation function to get the run-time storage for it later at load-time.  Note that the utility of preheap objects extends way beyond what can be achieved with dynamic initialization.  In particular they can be used during translation within literal types _before_ dynamic initialization takes place at run-time.  Admittedly my opening example did not do a good job of demonstrating this, but it is a clear consequence of the proposed change.

Magnus Fromreide

Oct 15, 2013, 3:18:20 PM
to std-pr...@isocpp.org
But what prevents a sufficiently advanced compiler from doing this using
C++11 (or for that matter, C++98)?

This sounds like the as-if rule being applied to object initializations.

/MF


Thiago Macieira

Oct 15, 2013, 5:09:24 PM
to std-pr...@isocpp.org
On terça-feira, 15 de outubro de 2013 21:18:20, Magnus Fromreide wrote:
> But what prevents a sufficiently advanced compiler from doing this using
> C++11 (or for that matter, C++98)?

I don't think it can do that. The compiler does not know what other side
effects operator new() might have. It violates the requirements for constexpr.

Thiago Macieira

Oct 15, 2013, 5:39:17 PM
to std-pr...@isocpp.org
On terça-feira, 15 de outubro de 2013 11:23:24, Andrew Tomazos wrote:
> On Tuesday, October 15, 2013 5:58:17 PM UTC+2, Thiago Macieira wrote:
> > Please, no. The object s is constexpr and the memory it allocated cannot
> > be
> > deleted at runtime. The preheap is the read-only data segment of the
> > executable, not the normal heap.
>
> That is not what I proposed, no.

I know.

>> It's quite impossible for the compiler to create a preheap in the model
>> you're
>> describing, since the compiler doesn't know what malloc() or operator new()
>> do. The standard allows operator new() to be overridden by the user.
>
> No, it is possible. Preheap objects are represented during translation in
> the same way as literal temporaries, local variables in constexpr functions
> and constexpr objects of static storage duration. At load-time preheap
> objects are automatically loaded into the heap. As Richard suggests
> implementations can allocate run-time storage for preheap objects at
> load-time using the allocation functions at that time (overridden or not) -
> although they would be free to elide the "copying" you are imagining in
> favor of more efficient ways given sufficient integration/intelligence of
> the different components of the implementation. But _with or without_ such
> load-time copying, it is still enormously useful. I wouldn't get too hung
> up on the particular mechanism and overlook the larger utility.

It might be useful if you describe what you expect to happen during
compilation, during load time, during runtime, and during unload time. The way
I am currently seeing things, you're asking for a lot in the compiler for very
little benefit: since the allocation must be redone at load-time, there's no
gain.

Let's take your simple but not trivial constructor from a few emails back:

    String(const char* s)
        : len(strlen(s))
        , p(new char[len+1])
    {
        for (size_t i = 0; i < len+1; i++)
            p[i] = s[i];
    }

And we have the following global declaration:
constexpr String mystring("Hello");

During compilation, what will the compiler do? The first thing it needs to do
is call strlen("Hello"). Let's assume that it is constexpr, so the compiler
can know that it will return 5.

Now we have a call to ::operator new(6). What will the compiler do?

I'm going to assume the compiler allocates a "preheap" -- that is, some static
storage of length 6. The compiler knows the return value since it allocated
the address. After that, it will execute the loop.

At the end of the function, it could conclude that the memory layout should be
(pseudo-assembly):

    .section .rodata, read-only, sharable
Lalloc_size_6:
    .asciz "Hello"
mystring:
    .quad 5
    .quad Lalloc_size_6

If nothing happened at load time, we would have a perfectly valid String
object, with len == 5 and p pointing to a read-only section of memory.
Obviously, you can't delete p. This is already enormously useful for QString,
which uses copy-on-write semantics. All we need is a flag indicating that this
string is immutable, which QString has.
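
The layout in the pseudo-assembly can be approximated in today's language without any allocation. The sketch below is an illustration only: `StaticString`, `length_of` and `make_static_string` are hypothetical names, and whether the object really ends up in a read-only section is up to the implementation.

```cpp
#include <cstddef>

// Hypothetical immutable string: a length plus a pointer into static storage,
// mirroring the two .quad entries in the pseudo-assembly.
struct StaticString
{
    std::size_t len;
    const char* p;
};

// constexpr replacement for strlen (std::strlen is not constexpr).
constexpr std::size_t length_of(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

constexpr StaticString make_static_string(const char* s)
{
    return StaticString{length_of(s), s};
}

// Constant-initialized: no initialization code has to run at load time.
constexpr StaticString mystring = make_static_string("Hello");
static_assert(mystring.len == 5, "length computed during translation");
```

Since `p` points at a string literal, nothing here may be deleted or modified, which matches the immutable-string case described above.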

std::string has no such semantic, true. But all it requires to support an
immutable string data is a special allocator template, which causes the
destructor to be empty and the move constructor to be equal to the copy
constructor. Since the object is already const, no mutator functions can be
called, so no reallocation will ever happen.

Now, you said "At load-time preheap objects are automatically loaded into the
heap." That means the compiler must emit load- and unload-time functions like:

    void init()
    {
        decltype(mystring.p) tmp = new char[6];
        std::memcpy(tmp, mystring.p, 6);
        mystring.p = tmp;
    }
    void fini()
    {
        delete[] mystring.p;
    }

The second consequence is that the "mystring" object cannot be placed in a
read-only & sharable section of memory, since we needed to modify mystring.p.
The compiler needs to place this constexpr object in a read-write section of
memory. At best, with the help from the program loader, the memory page can be
set to read-only after initialisation (similar to the GNU -z relro solution).

Here's my question: what's the benefit? If we replace that constexpr with a
simple const keyword, current C++98 code would generate the following load-
and unload-time functions (after inlining the constructor and destructor):

    void init()
    {
        mystring.len = 5;
        decltype(mystring.p) tmp = new char[6];
        std::memcpy(tmp, "Hello", 6);
        mystring.p = tmp;
    }
    void fini()
    {
        delete[] mystring.p;
    }

The difference is the assignment of mystring.len. That's hardly worth the
effort, IMHO.

> > The only way to call the user's operator new() would be at runtime and we
> > already have that solution in the form of dynamic initialisation at load
> > time.
>
> You're confusing the value of an object, with the storage/representation
> for that object. In particular just because a preheap object is already
> logically constructed during translation, doesn't mean we can't use a
> custom allocation function to get the run-time storage for it later at
> load-time. Note that the utility of preheap objects extends way beyond
> what can be achieved with dynamic initialization. In particular they can
> be used during translation within literal types _before_ dynamic
> initialization takes place at run-time. Admittedly my opening example did
> not do a good job of demonstrating this, but it is a clear consequence of
> the proposed change.

I'm not questioning the usefulness of having new in constexpr. They'd come in
extremely handy.

I'm questioning the need to reload preheap objects into the real heap. That
kills most of the benefit of the functionality.

Andrew Tomazos

Oct 16, 2013, 3:10:54 AM10/16/13
to std-pr...@isocpp.org
On Tuesday, October 15, 2013 11:39:17 PM UTC+2, Thiago Macieira wrote:
>>> It's quite impossible for the compiler to create a preheap in the model
>>> you're describing, since the compiler doesn't know what malloc() or
>>> operator new() do. The standard allows operator new() to be overridden by
>>> the user.
>>
>> No, it is possible.  Preheap objects are represented during translation in
>> the same way as literal temporaries, local variables in constexpr functions
>> and constexpr objects of static storage duration.  At load-time preheap
>> objects are automatically loaded into the heap.  As Richard suggests
>> implementations can allocate run-time storage for preheap objects at
>> load-time using the allocation functions at that time (overridden or not) -
>> although they would be free to elide the "copying" you are imagining in
>> favor of more efficient ways given sufficient integration/intelligence of
>> the different components of the implementation.  But _with or without_ such
>> load-time copying, it is still enormously useful.  I wouldn't get too hung
>> up on the particular mechanism and overlook the larger utility.
>
> It might be useful if you describe what you expect to happen during
> compilation, during load time, during runtime, and during unload time. The way
> I am currently seeing things, you're asking for a lot in the compiler for very
> little benefit: since the allocation must be redone at load-time, there's no
> gain.

No, the run-time storage allocation is not redone at load-time, it is done for the first time at load-time.
 
> Let's take your simple but not trivial constructor from a few emails back:
>
>         String(const char* s)
>             : len(strlen(s))
>             , p(new char[len+1])
>         {
>             for (size_t i = 0; i < len+1; i++)
>                 p[i] = s[i];
>         }
>
> And we have the following global declaration:
>         constexpr String mystring("Hello");
>
> During compilation, what will the compiler do? The first thing it needs to do
> is call strlen("Hello"). Let's assume that it is constexpr, so the compiler
> can know that it will return 5.
>
> Now we have a call to ::operator new(6). What will the compiler do?
 
No, that is not how it works.  `new char[6]` creates a literal preheap object of type array of 6 char.  It does not call new(bytes).  This is the same as what happens in the following currently:

    typedef char A[6];

    constexpr char e = A{'H' , 'e', 'l', 'l', 'o', '\0'}[1];

In the above, the compiler creates a temporary of type array of 6 char during translation.  It can do this because that type is a literal type.  It does not allocate storage, nor does it know the run-time representation (the specific bit pattern).  The literal type is represented symbolically within the implementation's compile-time system.  The preheap at compile-time is simply a collection of such literal temporaries with their values represented symbolically.
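A small sketch of the same idea in C++14 terms, assuming a compiler with relaxed constexpr support: a local array inside a constexpr function exists only within the constant evaluation when the call is evaluated during translation. The function names `second_char` and `check_at_run_time` are made up for this example.

```cpp
#include <cassert>

// A literal array that lives only inside the constant evaluation (C++14).
constexpr char second_char() {
    char a[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
    return a[1];
}

// Evaluated during translation.
static_assert(second_char() == 'e', "evaluated by the compiler");

// The same function is also callable at run time.
inline void check_at_run_time() {
    assert(second_char() == 'e');
}
```

The point of the sketch is only that one piece of code serves both translation and run time; it does not involve new or delete.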

> I'm going to assume the compiler allocates a "preheap" -- that is, some static
> storage of length 6. The compiler knows the return value since it allocated
> the address. After that, it will execute the loop.

No, it does not know the specific address.  This is the same as:

     int x;
     constexpr int* p = &x;

During translation p is represented symbolically.  It doesn't have the final address until load-time.  p can still be used in a constant expression because the rules are set up to support this symbolic literal representation.  It cannot, however, undergo a reinterpret_cast to an integer, for example, precisely for this reason (it doesn't have the bits yet).
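A minimal sketch of that behaviour in C++11 as it stands: the pointer can be compared and stored in constant expressions even though its numeric value is not known to the compiler. `x` is declared `static` here only to make its storage duration explicit, and `write_through_p` is a made-up helper name.

```cpp
#include <cassert>

static int x;               // static storage duration
constexpr int* p = &x;      // address constant: final value fixed at link/load time

// Comparisons that do not need the actual bits are allowed.
static_assert(p == &x, "p designates x");
static_assert(p != nullptr, "p is not null");

// Not allowed in a constant expression, because the bits are not known yet:
// constexpr auto bits = reinterpret_cast<unsigned long long>(p);

inline void write_through_p() {
    *p = 7;                 // p is a const pointer to a mutable int
    assert(x == 7);
}
```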
 
> At the end of the function, it could conclude that the memory layout should be
> (pseudo-assembly):
>
>         .section .rodata, read-only, sharable
>         Lalloc_size_6:
>                 .asciz        "Hello"
>         mystring:
>                 .quad        5
>                 .quad        Lalloc_size_6

No, not what I am proposing.  You are focusing on the mechanism and not the logical behaviour.  As one way of implementing the mechanism: a new section separate from .rodata could be created to hold preheap objects called .preheap.  At load-time the implementation loads that section read-only into memory, calls new(bytes) for each preheap object, copies some bits from the .preheap section to the newly allocated run-time storage, then based on the returned pointers relocates pointers to preheap objects, then _unloads_ the .preheap section (alternatively it will just page out back to the program image because the .preheap section will never be read again, but that is an implementation detail).  The .preheap section doesn't need to occupy resources during run-time, it is released after loading.  But again, that is all just mechanism, and overlooks the utility.

Notice how the preheap objects are created and used during translation, the run-time storage is allocated at load-time, and they can then continue to be used at run-time and then deleted during run-time.  That is the point.  So you can write code that uses new and delete, and it will work the same during translation or at run-time.
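The load-time steps described above (allocate, copy bits, relocate pointers) can be sketched as ordinary run-time code. This is only an illustration under stated assumptions: `PreheapRecord` and `load_preheap` are hypothetical names, the record layout is invented, and a real implementation would be free to do something quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical record describing one preheap object in the program image.
struct PreheapRecord {
    const void* image;  // the object's bytes as stored in the image
    std::size_t size;   // size of the object in bytes
    void** slot;        // the pointer that must be relocated to the new storage
};

// Load step: allocate run-time storage, copy the bits, relocate the pointer.
// A plain byte copy like this only suits trivially copyable objects.
inline void load_preheap(const PreheapRecord* records, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        assert(records[i].slot != nullptr);
        void* storage = ::operator new(records[i].size);  // may be user-replaced
        std::memcpy(storage, records[i].image, records[i].size);
        *records[i].slot = storage;
    }
}
```

After `load_preheap` returns, nothing refers to the image bytes any more, which is what would let the section holding them be released; the relocated storage can later be freed with `::operator delete`.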
 
> If nothing happened at load time, we would have a perfectly valid String
> object, with len == 5 and p pointing to a read-only section of memory.
> Obviously, you can't delete p. This is already enormously useful for QString,
> which uses copy-on-write semantics. All we need is a flag indicating that this
> string is immutable, which QString has.

You're talking about something different now.  Literal types are not necessarily immutable, as for example literal temporaries and local variables in constexpr functions are not.  Also a dynamic object must be able to be deleted during run-time in order to release its resources.  Objects of static storage duration do not have this property.

> std::string has no such semantic, true. But all it requires to support an
> immutable string data is a special allocator template, which causes the
> destructor to be empty and the move constructor to be equal to the copy
> constructor. Since the object is already const, no mutator functions can be
> called, so no reallocation will ever happen.

Again, you are talking about something different now: how to design a string class with respect to immutability.  That is separate from preheap objects, and overlooks the utility.
 
> Now, you said "At load-time preheap objects are automatically loaded into the
> heap." That means the compiler must emit load- and unload-time functions like:
>
>         void init()
>         {
>                 decltype(mystring.p) tmp = new char[6];
>                 std::memcpy(tmp, mystring.p, 6);
>                 mystring.p = tmp;
>         }
>         void fini()
>         {
>                 delete[] mystring.p;
>         }
>
> The second consequence is that the "mystring" object cannot be placed in a
> read-only & sharable section of memory, since we needed to modify mystring.p.
> The compiler needs to place this constexpr object in a read-write section of
> memory. At best, with the help from the program loader, the memory page can be
> set to read-only after initialisation (similar to the GNU -z relro solution).
 
No, in my example the array of char is not defined with constexpr.  Preheap objects can be mutated at both compile-time and run-time.  They only have to be literal so they can be used at compile-time.

> Here's my question: what's the benefit? If we replace that constexpr with a
> simple const keyword, current C++98 code would generate the following load-
> and unload-time functions (after inlining the constructor and destructor):
>
> I'm questioning the need to reload preheap objects into the real heap. That
> kills most of the benefit of the functionality.

No, it doesn't.  You're confusing immutability of strings with the utility of what is proposed.

I agree I need to write a more elaborate demo to show off the full utility of what I propose.  Clearly you are not seeing it yet.
  -Andrew.

Thiago Macieira

Oct 16, 2013, 11:41:02 AM10/16/13
to std-pr...@isocpp.org
On Wednesday, 16 October 2013 00:10:54, Andrew Tomazos wrote:
> I agree I need to write a more elaborate demo to show off the full utility
> of what I propose. Clearly you are not seeing it yet.

Indeed, I'm not.

You've said "no" to almost everything I said, but then followed up by
explaining exactly the same that I had already explained, just with different
words or by changing something that represents no net effect (for example, it
doesn't matter that the "Hello\0" is in .rodata or .preheap; something needs
to know its address and allocate memory for it).

Again, I'm not questioning the benefit of the feature of using operator new()
in constexpr. I am questioning the "reallocate preheap into real heap at load-
time".

Nevin Liber

Oct 16, 2013, 11:46:09 AM10/16/13
to std-pr...@isocpp.org
On 16 October 2013 02:10, Andrew Tomazos <andrew...@gmail.com> wrote:

> No, not what I am proposing.  You are focusing on the mechanism and not the logical behaviour.

Because you are trying to shoehorn this into existing types where there is behavior you care about and behavior you don't.  We have to make sure that the behavior you don't care about isn't observable behavior according to the standard.

If this were a proposal for basic_string<char, char_traits<char>, magic_allocator<char>>, a lot of the problem goes away, but so does the interoperability.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Andrew Tomazos

Oct 17, 2013, 1:39:12 AM10/17/13
to std-pr...@isocpp.org
On Wednesday, October 16, 2013 5:41:02 PM UTC+2, Thiago Macieira wrote:
> it doesn't matter that the "Hello\0" is in .rodata or .preheap; something needs
> to know its address and allocate memory for it.
 
.rodata needs to be accessible for the lifetime of the program, the .preheap section can be unloaded immediately after loading has completed, reclaiming its resources.  (If you mix them together it will lock the pages in memory.)  The concern is about resource management.  If we didn't care about resource usage then making the current normal delete a no-op would still result in a correct program, it would just use extra memory over its lifetime, making the heap into a memory pool.

> Again, I'm not questioning the benefit of the feature of using operator new()
> in constexpr. I am questioning the "reallocate preheap into real heap at load-
> time".

Such an architecture would use less resources.  Placing preheap objects in the heap allows them to be deleted during run-time, reclaiming their resources.  If we placed them in .rodata they could not be deleted.

Consider the following program:

    typedef int T;

    constexpr T* p = new T(3);

    static_assert(*p == 3);

    int main()
    {
        /* loader() */
        first_half();
        delete p;
        second_half();
    }

In the above, for the duration of second_half() the resources used to back the predynamic object are available for use.  This is achieved because preheap objects are loaded into the heap.  Note the .preheap section is unloaded by the end of `loader()`.

If we allocated it as a static storage duration object as you suggest, the resources would still be occupied for second_half().

Thiago Macieira

Oct 17, 2013, 2:43:23 AM10/17/13
to std-pr...@isocpp.org
On Wednesday, 16 October 2013 22:39:12, Andrew Tomazos wrote:
> Such an architecture would use less resources. Placing preheap objects in
> the heap allows them to be deleted during run-time, reclaiming their
> resources. If we placed them in .rodata they could not be deleted.

It actually uses a different type of resource. Data stored in .rodata is
sharable, always clean memory that the OS virtual memory manager can discard
at will, since it can simply reload it from disk.

The regular heap, however, is unsharable memory that can become dirty (it can
be written to). If the OS virtual memory manager needs to reclaim pages and
this page is dirty, it will first need to write to the swap / pagefile.

In that sense, regular heap is more expensive to store the same bytes.

Add to that:
- the overhead that malloc() has -- 8 to 32 bytes per allocation, depending
on the implementation
- the cost of calling malloc() during load-time (startup performance penalty)

>     typedef int T;
>
>     constexpr T* p = new T(3);
>
>     static_assert(*p == 3);
>
>     int main()
>     {
>         /* loader() */
>         first_half();
>         delete p;
>         second_half();
>     }

We don't need new in constexpr for that. The above program becomes better if
we replace constexpr with const, plus it will work with C++98.

Andrew Tomazos

Oct 17, 2013, 3:20:30 AM10/17/13
to std-pr...@isocpp.org
On Thursday, October 17, 2013 8:43:23 AM UTC+2, Thiago Macieira wrote:
> On Wednesday, 16 October 2013 22:39:12, Andrew Tomazos wrote:
>> Such an architecture would use less resources.  Placing preheap objects in
>> the heap allows them to be deleted during run-time, reclaiming their
>> resources.  If we placed them in .rodata they could not be deleted.
>
> It actually uses a different type of resource. Data stored in .rodata is
> sharable, always clean memory that the OS virtual memory manager can discard
> at will, since it can simply reload it from disk.
 
But if the .rodata section contains objects of static storage duration as well as preheap objects then use of the static storage duration objects at run-time will load the pages into memory, also unnecessarily loading any preheap objects along with them.

The shareability is only helpful if you have multiple instances of the program running simultaneously; in general that is not the case.  When only one instance of the program is running, the swap file backs the heap in the same way that the program image backs the .rodata section.  Also note that preheap objects can be mutable, so what you are actually suggesting is placing them in the .rodata, .bss or .data section as appropriate; the read-only aspect is not relevant.

> The regular heap, however, is unsharable memory that can become dirty (it can
> be written to). If the OS virtual memory manager needs to reclaim pages and
> this page is dirty, it will first need to write to the swap / pagefile.
 
Preheap objects can be written to.  They are not necessarily read-only.

> In that sense, regular heap is more expensive to store the same bytes.
>
> Add to that:
>  - the overhead that malloc() has -- 8 to 32 bytes per allocation, depending
>    on the implementation
>  - the cost of calling malloc() during load-time (startup performance penalty)
 
Those overheads are implementation-dependent and generally not asymptotically significant.

>>     typedef int T;
>>
>>     constexpr T* p = new T(3);
>>
>>     static_assert(*p == 3);
>>
>>     int main()
>>     {
>>         /* loader() */
>>         first_half();
>>         delete p;
>>         second_half();
>>     }
>
> We don't need new in constexpr for that. The above program becomes better if
> we replace constexpr with const, plus it will work with C++98.

Then the static_assert wouldn't work, as the object wouldn't be a preheap object, and would be unavailable during translation for use in constant expressions.  That was the whole point of predynamic storage duration to begin with.
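To make the trade-off concrete, here is a sketch of the `const` variant as it behaves without the proposal. The helper name `read_and_release` is invented for the example; the commented-out line marks what is lost.

```cpp
#include <cassert>

typedef int T;

// Dynamic initialization: the allocation runs at program start-up.
static T* const p = new T(3);

// static_assert(*p == 3, "");  // rejected: *p is not a constant expression

// Reads the value and then releases the storage; call at most once.
inline T read_and_release() {
    assert(p != nullptr);
    T value = *p;
    delete p;
    return value;
}
```

The object can be deleted at run time, as in the program above, but its value is only observable at run time.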

Andrew Tomazos

Oct 17, 2013, 3:24:35 AM10/17/13
to std-pr...@isocpp.org


On Thursday, October 17, 2013 9:20:30 AM UTC+2, Andrew Tomazos wrote:
On Thursday, October 17, 2013 8:43:23 AM UTC+2, Thiago Macieira wrote:
 
>> In that sense, regular heap is more expensive to store the same bytes.
>>
>> Add to that:
>>  - the overhead that malloc() has -- 8 to 32 bytes per allocation, depending
>>    on the implementation
>>  - the cost of calling malloc() during load-time (startup performance penalty)
>
> Those overheads are implementation-dependent and generally not asymptotically significant.

Sorry, I meant the first one.  Yes the cost of loading them may be significant, but it is no different to the cost of initializing a std::string with a string literal now.  The point is preheap objects can be used during translation and run-time.  Dynamic storage duration objects are not available during translation currently.


Billy O'Neal

Oct 17, 2013, 3:55:34 AM10/17/13
to std-proposals

Erm, *none* of these things are asymptotically significant.

All of this talk about binary sections sounds awfully platform specific for the standard at this point. You're talking about ELF sections, right? Not all platforms use ELF. As I said earlier, none of this is going to be significant in run time in the real world. If the point is to allow the types to be constexpr that's interesting, but the question there is "how does the constexpr language feature interact with operator new inside string/vector", not where in the binary various things go. The standard defines language semantics, not binary layout.

Another possible implementation would be to have no binary preheap concept. Instead, the preheap is transformed by the compiler into code that generates the correct memory state at run time, which may involve calls to operator new. If involved in something constexpr, the compiler could figure out the right answers, but the behavior under constexpr and non-constexpr conditions could be completely different.
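One way to picture that alternative with C++14 as it stands, as a sketch only: the values are computed during translation into a plain literal aggregate, and the heap-backed container is built from them by ordinary run-time code. `Table`, `make_table` and `make_vector` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <vector>

struct Table { int data[3]; };

// Computed entirely by the compiler (C++14 relaxed constexpr).
constexpr Table make_table() {
    Table t = {};
    for (int i = 0; i < 3; ++i)
        t.data[i] = i + 1;
    return t;
}

constexpr Table table = make_table();
static_assert(table.data[2] == 3, "known during translation");

// Run-time side: ordinary heap allocation, filled from the constant data.
inline std::vector<int> make_vector() {
    std::vector<int> v(table.data, table.data + 3);
    assert(v.size() == 3);
    return v;
}
```

Here the constexpr path and the run-time path are visibly different code, which is the property this alternative accepts.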
