Securely wiped containers?

Rob Meijer

Jul 22, 2014, 3:35:40 AM

to std-pr...@isocpp.org

Trying to write a library that handles a lot of sensitive security tokens in std::strings, I queried some C++ fora about the possibilities of ensuring these tokens do not linger in memory. Next to the usual half-informed responses, the apparently most clueful responses pointed out that anything I and others suggested was either STL-implementation-specific or something that the compiler would (and was within its rights to) optimize away.

Given that this is apparently a rather hard problem, and that I'm not deep enough into the subject to propose a solid solution, I would rather not make one or more likely-stupid proposals. Instead I would just like to put the problem before others who have a solid understanding of the internals of the standard libraries and of the optimization freedoms of compilers.

Could we think of any library proposal that would allow standard containers to optionally get securely wiped, without the compiler legally optimizing away the actual wiping?

Rob

Dean Michael Berris

Jul 22, 2014, 3:38:48 AM

to std-pr...@isocpp.org

Can you not do this with a custom allocator for your types?

> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "ISO C++ Standard - Future Proposals" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to std-proposal...@isocpp.org.
> To post to this group, send email to std-pr...@isocpp.org.
> Visit this group at
> http://groups.google.com/a/isocpp.org/group/std-proposals/.

David Krauss

Jul 22, 2014, 3:48:42 AM

to std-pr...@isocpp.org

On 2014–07–22, at 3:43 PM, David Krauss <pot...@gmail.com> wrote:

I’ve never thought about the compiler being allowed to eliminate a change to an object merely because the change cannot be observed before its life ends. It doesn’t sound likely in practice.

To be sure, it’s commonplace for stack objects, I’ve just not heard of such analysis for objects of dynamic lifetime. It would require the deallocation function (operator delete or free) to have a compiler hook for that specific purpose.

David Krauss

Jul 22, 2014, 3:51:15 AM

to std-pr...@isocpp.org

On 2014–07–22, at 3:35 PM, Rob Meijer <pib...@gmail.com> wrote:

Trying to write a library that handles a lot of sensitive security tokens in std::strings, I queried some C++ fora about the possibilities of ensuring these tokens do not linger in memory. Next to the usual half-informed responses, the apparently most clueful responses pointed out that anything I and others suggested was either STL-implementation-specific or something that the compiler would (and was within its rights to) optimize away.

Not sure what you mean by either STL implementation specific or optimization.

You can provide a custom allocator to std::basic_string which wipes memory in its deallocate() method. This receives a size_type argument. However, it causes the string to have a different type than std::string, which can cause great pain.

As of C++14, you can wipe memory by replacing your library’s ::operator delete and requesting a std::size_t second argument. This will take care of every deallocation in the program, not only from std::string.

I’ve never thought about the compiler being allowed to eliminate a change to an object merely because the change cannot be observed before its life ends. It doesn’t sound likely in practice. But, you can take care of the possibility by either marking the bytes as volatile while wiping them, or passing a reference into the allocation block to an extern function after wiping it.
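As a rough illustration of the allocator approach, here is a minimal sketch of a C++11 allocator that zeroes each block in deallocate() through a volatile pointer. The names WipingAllocator, wipe_bytes, wiped_byte_count and secret_string are hypothetical, the counter exists only so the effect can be observed, and relying on volatile stores to survive optimization is an assumption about how compilers behave in practice rather than something this sketch can prove.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: overwrite n bytes through a volatile pointer so the
// stores are treated as observable and are not dropped as dead stores.
inline void wipe_bytes(void* p, std::size_t n) {
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) bytes[i] = 0;
}

// Total number of bytes wiped so far; present only so the effect is checkable.
inline std::size_t& wiped_byte_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal allocator that zeroes every block before releasing it.
template <typename T>
struct WipingAllocator {
    using value_type = T;

    WipingAllocator() = default;
    template <typename U>
    WipingAllocator(const WipingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        wipe_bytes(p, n * sizeof(T));          // uses the size_type argument
        wiped_byte_count() += n * sizeof(T);
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const WipingAllocator<T>&, const WipingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const WipingAllocator<T>&, const WipingAllocator<U>&) { return false; }

// A distinct type from std::string, which is the interoperability cost.
using secret_string =
    std::basic_string<char, std::char_traits<char>, WipingAllocator<char>>;
```

A secret_string holding 100 characters is wiped when it goes out of scope; a very short one may never reach the allocator at all if the implementation stores it inline.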

Rob Meijer

Jul 22, 2014, 5:43:06 AM

to std-pr...@isocpp.org

There are two problems with that:

* The allocator supplied to basic_string is apparently not guaranteed to be used for smaller string sizes. If it's not used, then no deallocate is called and no sensitive data gets wiped.

* Code mutating memory locations that will no longer be used after the mutation could simply be legally optimized away by the compiler. Not sure if/how 'volatile' could fix this?
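The first concern can be observed with a small experiment: a sketch, using a hypothetical CountingAllocator, that reports how many times the allocator is asked for memory when a string of a given length is constructed. The result for short strings depends on the standard library in use, so nothing here should be read as a guarantee either way.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Counts calls to allocate(); hypothetical instrumentation for illustration.
inline int& allocation_count() {
    static int count = 0;
    return count;
}

template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using counted_string =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Returns how many allocations happened while constructing a string of the
// given length. With a small-string buffer, short lengths may yield 0.
inline int allocations_for_length(std::size_t length) {
    const int before = allocation_count();
    counted_string s(length, 'x');
    return allocation_count() - before;
}
```

Comparing allocations_for_length(3) with allocations_for_length(200) on a given implementation shows whether short tokens bypass the allocator there.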

Thiago Macieira

Jul 22, 2014, 2:57:49 PM

to std-pr...@isocpp.org

Hi Rob

The original question was: how do I ensure the bytes in a std::basic_string
get wiped?

The problems with giving a full answer to that are:

1) libstdc++ used a refcounted basic_string because C++98 didn't outright ban
it, so you can't be sure that the last reference was dropped

2) the standard does permit the "small string optimisation" (SSO) solution,
which allows basic_string to skip the allocator

3) memset can be optimised out of existence -- see Greg K-H's post
https://plus.google.com/111049168280159033135/posts/dr2pf4ku3Cd and follow-up
on how BSD solves it:
https://plus.google.com/111049168280159033135/posts/YTDoSRTrktc

Let's ignore case #1 for the moment, since C++11 ruled that refcounting was
not permitted.

If you want to scrub, there are three options:
- before basic_string's destructor
- during its destructor
- after its destructor

Obviously, the standard cannot recommend doing it before the destructor.
Unlike QByteArray, basic_string does not return a non-const pointer to its
data.

Doing it during the destructor is the expected case of using the allocator,
but it won't work if SSO is in operation. And even if that were the case, the
compiler can prove the memory isn't getting used again, so problem #3 applies.

So the only case available is to do it *after* the destructor has run. That
requires you legitimately have access to the memory area where the
basic_string was placed. The only way I see that happening is if you have an
indirection to the basic_string in the first place.

Such as:
struct Wrapper
{
    std::basic_string<char, std::char_traits<char>, MyAllocator> data;
    void *operator new(std::size_t);
    void operator delete(void *);
};

And always allocating that on the heap, so the operator delete member gets to
run.

At that point, it's easier to ditch std::basic_string and just new char[N]
yourself, thus avoiding a double indirection to the data in the non-small
case.
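A minimal sketch of the wrapper idea follows, assuming a class-specific sized operator delete and a volatile byte loop for the wipe. It uses plain std::string for brevity, so only the string object itself (including any inline small-string buffer) is covered; the heap buffer of a long string would still need a wiping allocator. The last_wiped_size counter is hypothetical and exists only to make the effect checkable.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Bytes wiped by the most recent Wrapper deallocation (for checking only).
inline std::size_t& last_wiped_size() {
    static std::size_t n = 0;
    return n;
}

struct Wrapper {
    // std::string for brevity; an allocator-aware basic_string would be
    // needed to wipe the non-small case as well.
    std::string data;

    explicit Wrapper(std::string s) : data(std::move(s)) {}

    static void* operator new(std::size_t size) { return ::operator new(size); }

    // Called after ~Wrapper() has run: the storage is raw memory again, so
    // overwriting it here also covers a small-string buffer inside `data`.
    static void operator delete(void* p, std::size_t size) noexcept {
        if (!p) return;
        volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
        for (std::size_t i = 0; i < size; ++i) bytes[i] = 0;
        last_wiped_size() = size;
        ::operator delete(p);
    }
};
```

This only helps when every Wrapper is created with new and destroyed with delete; an automatic or static Wrapper never reaches the member operator delete.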
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

David Rodríguez Ibeas

Jul 22, 2014, 5:25:45 PM

to std-pr...@isocpp.org

On Tue, Jul 22, 2014 at 2:57 PM, Thiago Macieira <thi...@macieira.org> wrote:

If you want to scrub, there are three options:
- before basic_string's destructor
- during its destructor
- after its destructor

Obviously, the standard cannot recommend doing it before the destructor.
Unlike QByteArray, basic_string does not return a non-const pointer to its
data.

But you *do* have access to it:
{
std::string str = f();
memset(&str[0], 0, str.size());
}

Now you have to make sure that the optimizer does not remove that memset, but since security trumps performance, you could hide that in an extern function:

void reset(std::string & str) {
memset(&str[0], 0, str.size());
}

{
std::string str = f();
reset(str);
}

And now you are left with only link time optimization...

David
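One commonly used way to keep such a call opaque even when link-time optimization can see the function body is to call memset through a volatile function pointer. The sketch below assumes that technique; memset_ptr is a hypothetical name, and the standard gives no explicit guarantee here, so treat it as a practical hedge rather than a proof.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Because the pointer object is volatile, its value must be re-read at each
// call, so the compiler cannot assume the callee is memset and elide it.
static void* (*const volatile memset_ptr)(void*, int, std::size_t) = std::memset;

// Overwrites the characters of a live string in place. &str[0] is used
// because data() returns a pointer to const in C++11/14.
inline void reset(std::string& str) {
    if (!str.empty()) memset_ptr(&str[0], 0, str.size());
}
```

Note that this wipes only the current contents: earlier reallocations of the string may already have left copies of the token in freed memory.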

Thiago Macieira

Jul 23, 2014, 12:08:42 AM

to std-pr...@isocpp.org

On Tuesday 22 July 2014 17:25:42 David Rodríguez Ibeas wrote:
> But you *do* have access to it:
> {
> std::string str = f();
> memset(&str[0], 0, str.size());
> }

Interesting, I had never noticed that operator[] is non-const.

Does the standard require that basic_string keep data contiguously? It could
collapse into one string only when you call c_str().

If it requires contiguous data, why is operator[] returning non-const but
data() is only const?

Daniel Krügler

Jul 23, 2014, 2:07:04 AM

to std-pr...@isocpp.org

2014-07-23 6:08 GMT+02:00 Thiago Macieira <thi...@macieira.org>:
> Does the standard require that basic_string keep data contiguously? It could
> collapse into one string only when you call c_str().

It is guaranteed, see [string.require] p4:

The char-like objects in a basic_string object shall be stored
contiguously. That is, for any basic_string object s, the identity
&*(s.begin() + n) == &*s.begin() + n shall hold for all values of n
such that 0 <= n < s.size().

> If it requires contiguous data, why is operator[] returning non-const but
> data() is only const?

Presumably it could be changed, see

http://cplusplus.github.io/LWG/lwg-active.html#2391

There exists the requirement that the character at position == size()
is equal to charT(), so you can not *arbitrarily* scribble into it,
therefore the current hesitation (I guess).

- Daniel
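The contiguity identity quoted from [string.require] can be checked directly. The helper below, with the hypothetical name is_contiguous, simply evaluates that identity for every valid n; on a conforming C++11 implementation it should always return true.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Evaluates &*(s.begin() + n) == &*s.begin() + n for all 0 <= n < s.size().
inline bool is_contiguous(const std::string& s) {
    typedef std::string::difference_type diff_t;
    for (std::size_t n = 0; n < s.size(); ++n) {
        const diff_t offset = static_cast<diff_t>(n);
        if (&*(s.begin() + offset) != &*s.begin() + offset) return false;
    }
    return true;
}
```

This is what makes memset(&str[0], 0, str.size()) cover the whole character sequence of a non-empty string.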

David Krauss

Jul 23, 2014, 2:50:02 AM

to std-pr...@isocpp.org

On 2014–07–23, at 2:07 PM, Daniel Krügler <daniel....@gmail.com> wrote:

There exists the requirement that the character at position == size()
is equal to charT(), so you can not *arbitrarily* scribble into it,
therefore the current hesitation (I guess).

There has been some churn, but if I recall correctly, operator[] ( n ) with n ≥ size() must return a null (default-constructed) character value but this does not imply that there is a character stored at position = size(). The implementation is free to leave the null terminator uninitialized until c_str() is called, but this doesn’t apply to data(). (Scribbling on it would be UB because it’s not part of the sequence, though!)

I cannot recall how the semantics of s.data() are supposed to differ from &s[0], if at all. The latter, if performed on a non-const name s, invalidates a null terminator from a prior c_str() regardless of whether anything is subsequently overwritten.

The fundamental question, IIRC, is whether the user can do c_str() and then data() and then still use the C string. Currently it is allowed, but non-const data() could in theory break something. The difference is so pedantic that it’s starting to give me a headache already.

Sebastian Gesemann

Jul 23, 2014, 4:28:05 AM

to std-pr...@isocpp.org

On Tue, Jul 22, 2014 at 9:35 AM, Rob Meijer <pib...@gmail.com> wrote:
> Trying to write a library that handles a lot of sensitive security tokens in
> std::strings, I queried some C++ fora about the possibilities of ensuring
> these tokens do not linger in memory. Next to the usual half-informed
> responses, the apparently most clueful responses pointed out that anything
> I and others suggested was either STL-implementation-specific or something
> that the compiler would (and was within its rights to) optimize away.

Maybe Botan's secure_vector is an option:
http://botan.randombit.net/manual/secmem.html

I'm not sure if it even prevents the memory from being swapped to disk
(which you might want to add if it's not already there. I'm sure
modern operating systems support it). At least, it does zeroing before
deallocation.

For the reasons mentioned (small string optimization, copy-on-write),
I would use a vector-like type for the secrets instead of std::string.

> Could we think of any library proposal that would allow standard containers
> to optionally get securely wiped, without the compiler legally optimizing
> away the actual wiping ?

It's not my call but I would guess that standardizing something like
this is not justified.

Ion Gaztañaga

Jul 23, 2014, 3:41:56 PM

to std-pr...@isocpp.org

El 22/07/2014 23:25, David Rodríguez Ibeas wrote:
> But you *do* have access to it:
> {
> std::string str = f();
> memset(&str[0], 0, str.size());
> }
>
> Now you have to make sure that the optimizer does not remove that
> memset, but since security trumps performance, you could hide that in a
> extern function:
>
> void reset(std::string & str) {
> memset(&str[0], 0, str.size());
> }
>
> {
> std::string str = f();
> reset(str);
> }
>
> And now you are left with only link time optimization...
>
> David

Or use memset_s:

https://www.securecoding.cert.org/confluence/display/seccode/MSC06-C.+Beware+of+compiler+optimizations

Best,

Ion
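memset_s belongs to the optional Annex K of C11, and many C libraries do not ship it, so a portable wrapper needs a fallback. The sketch below assumes the standard feature-test macros __STDC_WANT_LIB_EXT1__ and __STDC_LIB_EXT1__ and falls back to stores through a volatile pointer; secure_zero is a hypothetical name.

```cpp
// Must be defined before <string.h> is included to request Annex K functions.
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>

#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical wrapper: uses memset_s where the C library advertises Annex K,
// otherwise falls back to a byte loop through a volatile pointer.
inline void secure_zero(void* p, std::size_t n) {
    if (p == nullptr || n == 0) return;
#ifdef __STDC_LIB_EXT1__
    memset_s(p, n, 0, n);
#else
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) bytes[i] = 0;
#endif
}
```

Some platforms offer their own equivalents, such as explicit_bzero on the BSDs and SecureZeroMemory on Windows, which could be slotted into the same wrapper.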

Sean Middleditch

Jul 23, 2014, 11:48:10 PM

to std-pr...@isocpp.org

None of the answers so far are taking into account the security problems on common general-purpose OSes with memory safety. "Wiping" memory before deallocation is simply not enough.

General purpose OSes can use a virtual memory system to move physical pages around in memory, to disk, etc. Just because you clear some data doesn't mean that the OS didn't already make copies of the data elsewhere in RAM or on disk. Many OSes provide some calls to "lock" virtual pages disallowing any such copies or ensuring that the original copied-from pages are cleared by the OS after copy. Using these usually incurs some kind of performance overhead (across all processes) or requires special process privileges.

An STL allocator wouldn't just want to clear memory but also ensure that all "secure" allocations come out of a pool of pages that are properly protected. It would essentially be a separate heap/free-store just for secure block allocations.

Given that not all platforms offer these features, that some offer them only under different conditions or limitations, and that they work in different ways, standardizing them may be a little difficult. It is possibly worth it, but it would take a bit more than just a new allocator or container.
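On POSIX systems the page-locking call is mlock. The following is a minimal sketch, assuming a POSIX platform, of a hypothetical LockedBuffer that tries to lock its storage and wipes it before release. It is deliberately naive: mlock works on whole pages, so a real implementation would draw from a dedicated page-aligned pool as described above, and the lock can fail without special privileges or a large enough RLIMIT_MEMLOCK.

```cpp
#include <sys/mman.h>  // mlock, munlock (POSIX; not available everywhere)

#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical RAII holder for a secret: tries to lock its pages in RAM and
// zeroes them before unlocking and releasing the memory.
class LockedBuffer {
public:
    explicit LockedBuffer(std::size_t size)
        : bytes_(size),
          locked_(size != 0 && mlock(bytes_.data(), size) == 0) {}

    ~LockedBuffer() {
        wipe();
        if (locked_) munlock(bytes_.data(), bytes_.size());
    }

    LockedBuffer(const LockedBuffer&) = delete;
    LockedBuffer& operator=(const LockedBuffer&) = delete;

    unsigned char* data() { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }

    // False when the OS refused the lock; callers decide whether to proceed.
    bool locked() const { return locked_; }

    // Zeroes the contents through a volatile pointer.
    void wipe() {
        volatile unsigned char* p = bytes_.data();
        for (std::size_t i = 0; i < bytes_.size(); ++i) p[i] = 0;
    }

private:
    std::vector<unsigned char> bytes_;  // declared first: initialized first
    bool locked_;
};
```

Checking locked() before storing a secret lets the caller refuse to continue when the pages could still be swapped out.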

David Krauss

Jul 24, 2014, 12:12:58 AM

to std-pr...@isocpp.org

Security always requires end-to-end, holistic engineering. A locked page on a machine running inside a VM will still be written to disk when the hypervisor suspends it.

The best solution is to encrypt the (physical) hard drive, including the VM backing store. OS support varies. Also, don’t connect the machine to the Internet!

It might be nice if POSIX offered queries about hardware-level security, so the kernel could let you know if it detects that your CPU silicon has an MMU bug. IMHO C++ goes far enough already in its custom memory management, not to mention the ability to derive from std::string. A separate datatype with non-public inheritance is probably appropriate, to avoid accidental treatment of secure data as insecure.

Olaf van der Spek

Jul 24, 2014, 5:21:21 AM

to std-pr...@isocpp.org

On Wednesday, July 23, 2014 8:50:02 AM UTC+2, David Krauss wrote:

The implementation is free to leave the null terminator uninitialized until c_str() is called, but this doesn’t apply to data().

Is it?

http://www.cplusplus.com/reference/string/string/data/ says they're equivalent.

David Krauss

Jul 24, 2014, 5:37:07 AM

to std-pr...@isocpp.org

Is that an appeal to authority? Who even edits that site? It’s clickbait.

Olaf van der Spek

Jul 24, 2014, 5:43:48 AM

to std-pr...@isocpp.org

I know, I know, I should refer to the actual standard...

> Who even edits that site?

I've got no idea. Haven't been able to find any names.

> It’s clickbait.

--

Olaf

Ville Voutilainen

Jul 24, 2014, 6:12:18 AM

to std-pr...@isocpp.org

On 24 July 2014 12:43, Olaf van der Spek <olafv...@gmail.com> wrote:
>> Is that an appeal to authority?
>
> I know, I know, I should refer to the actual standard...

[string.accessors] specifies that .c_str() and .data() have identical effects.

David Krauss

Jul 24, 2014, 6:19:17 AM

to std-pr...@isocpp.org

On 2014–07–24, at 6:12 PM, Ville Voutilainen <ville.vo...@gmail.com> wrote:

[string.accessors] specifies that .c_str() and .data() have identical effects.

Well, it constrains the two functions identically. But, it does require data() to initialize the null terminator (tricky spec, mind the range), which is surprising and may be a defect. I’d certainly expect data() to be a simple one-instruction deal like &s[0].

Ville Voutilainen

Jul 24, 2014, 6:25:26 AM

to std-pr...@isocpp.org

data() can be a one-instruction deal, as long as string-modifying operations and
constructors already make sure that the terminator is initialized.

Thiago Macieira

Jul 24, 2014, 11:03:30 AM

to std-pr...@isocpp.org

On Thursday 24 July 2014 11:43:46 Olaf van der Spek wrote:
> > Who even edits that site?
>
> I've got no idea. Haven't been able to find any names.

WHOIS data shows a person in California. Domain registered in 1999.

dr.al....@googlemail.com

Aug 9, 2014, 2:01:54 PM

to std-pr...@isocpp.org

> Could we think of any library proposal that would allow standard containers to optionally get securely wiped, without the compiler legally optimizing away the actual wiping?

Back to the original question: The request could basically be solved by something like this:

template<typename OutputIterator>
void wipe(OutputIterator it, OutputIterator end) {
    // exposition only, the implementation is required to make
    // sure that this won't get optimized away
    while (it != end) {
        *it = /* implementation-defined */;
        ++it;
    }
}

I think having something like this in the stdlib would be a very good idea.

Ross Smith

Aug 10, 2014, 5:03:51 PM

to std-pr...@isocpp.org

> I think having something like this in the stdlib would be a *very* good
> idea.

Rather than leaving the wipe value implementation defined, it might be
better to have something like a "std::secure_fill()" algorithm, with
interface and semantics the same as std::fill(), apart from the
additional security requirement.

Simply requiring the compiler not to optimize the code away would not
be enough by itself; you would also need to add implicit memory
barriers before and after the secure_fill() call (or an
acquire/release pair). We're assuming the fill loop doesn't touch
anything used by nearby code (that's why it's in danger of being
optimized out of existence), so the compiler might feel free to
move code from after the loop to before it (opening a potential
security hole) without the barriers.

Ross Smith
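A user-level approximation of such a secure_fill might look like the sketch below. It is restricted to contiguous ranges of arithmetic types, writes through a volatile pointer so the stores are not treated as dead, and brackets the loop with std::atomic_signal_fence on the assumption that this acts as a compiler-level reordering barrier in practice. None of this carries the guarantee a standardized facility would.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical secure_fill: same shape as std::fill for pointer ranges, plus
// volatile stores and compiler fences before and after the loop.
template <typename T>
void secure_fill(T* first, T* last, const T& value) {
    static_assert(std::is_arithmetic<T>::value,
                  "sketch only supports arithmetic element types");
    std::atomic_signal_fence(std::memory_order_seq_cst);
    for (volatile T* p = first; p != last; ++p) {
        *p = value;  // volatile store: expected to be emitted as written
    }
    std::atomic_signal_fence(std::memory_order_seq_cst);
}
```

For example, secure_fill(v.data(), v.data() + v.size(), '\0') overwrites a std::vector<char> v in place before it is destroyed.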

Thiago Macieira

Aug 10, 2014, 5:25:23 PM

to std-pr...@isocpp.org

On Monday 11 August 2014 09:03:34 Ross Smith wrote:
> Simply requiring the compiler not to optimize the code away would not
> be enough by itself; you would also need to add implicit memory
> barriers before and after the secure_fill() call (or an
> acquire/release pair). We're assuming the fill loop doesn't touch
> anything used by nearby code (that's why it's in danger of being
> optimized out of existence), so the compiler might feel free to
> move code from after the loop to before it (opening a potential
> security hole) without the barriers.

As discussed in this thread, secure filling after use is not enough. You need
to securely allocate the memory so it is memory-locked and won't get swapped
out.
