
question: how can I make a simple byte buffer using std::vector but no initialization?


anhongl...@gmail.com

Jan 22, 2019, 12:00:04 AM
Hi experts,

I am new to C++, I would like to have a buffer using std::vector, like

```cpp
std::vector<char> buf;

buf.resize(64 * 1024 * 1024 * 1024);
fread(buf.data(), buf.size(), 1, fp);

func(buf.data()); // consuming it

```

I like vector, but the resize will initialize the whole vector as zero.
Which is unwanted, I don't even need it, it's a waste!

How could I skip this?
Some people warned me the following code is dangerous...
Because accessing data() outside of range [data, data + size) is undefined:


```cpp
std::vector<char> buf;

buf.reserve(64 * 1024 * 1024 * 1024);
fread(buf.data(), buf.size(), 1, fp);

func(buf.data()); // consuming it

```

then what's the proper/elegant way?
I need pretty big buffer so I have to dynamically allocate it.


Thanks,
Anhong

Daniel

Jan 22, 2019, 12:27:34 AM
On Tuesday, January 22, 2019 at 12:00:04 AM UTC-5, anhongl...@gmail.com wrote:
> Hi experts,
>
> I am new to C++, I would like to have a buffer using std::vector, like
>
> ```cpp
> std::vector<char> buf;
>
> buf.resize(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>
> ```
>
> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!
>
> How could I skip this?

Some alternatives:

http://andreoffringa.org/?q=uvector

Personally, I wouldn't worry about it.

Daniel

Barry Schwarz

Jan 22, 2019, 1:31:00 AM
On Mon, 21 Jan 2019 20:59:52 -0800 (PST), anhongl...@gmail.com
wrote:

>Hi experts,
>
>I am new to C++, I would like to have a buffer using std::vector, like
>
>```cpp
>std::vector<char> buf;
>
>buf.resize(64 * 1024 * 1024 * 1024);
>fread(buf.data(), buf.size(), 1, fp);

Your call to fread specifies a single object 64GB in size. How will
you know exactly how much data was actually read in?

Just out of curiosity, how long will it take your system to read a
64GB block?

>func(buf.data()); // consuming it
>
>```
>
>I like vector, but the resize will initialize the whole vector as zero.
>Which is unwanted, I don't even need it, it's a waste!
>
>How could I skip this?
>Some people warned me the following code is dangerous...
>Because accessing data() outside of range [data, data + size) is undefined:

It's no more dangerous than any other indexing operation. fread will
certainly not access memory beyond the end of the array. It is up to
you to make sure that func won't either.
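
On the question of how much was actually read, here is a minimal sketch. The helper name `read_bytes` is made up for illustration and is not from the original post. Passing an element size of 1 and a count of `buf.size()` makes `fread` return a byte count:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: fills buf from fp and returns the number of bytes
// that were actually read (which may be less than buf.size()).
std::size_t read_bytes(std::FILE* fp, std::vector<char>& buf) {
    assert(fp != nullptr);
    // Element size 1, element count buf.size(): fread returns the number of
    // complete elements read, so here the return value is a byte count.
    return std::fread(buf.data(), 1, buf.size(), fp);
}
```

With an element size of `buf.size()` and a count of 1, as in the original call, a short read returns 0 and the number of bytes delivered is lost.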

--
Remove del for email

Jorgen Grahn

Jan 22, 2019, 1:45:00 AM
On Tue, 2019-01-22, anhongl...@gmail.com wrote:
> Hi experts,
>
> I am new to C++, I would like to have a buffer using std::vector, like
>
> ```cpp
> std::vector<char> buf;
>
> buf.resize(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>
> ```

Is this your actual use case -- to read a 64 gigabyte file into memory
in one big chunk, then process the data?

> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!

I can see why you don't want it in this particular case, but normally
it's a very good thing: std::vector<T> should help protect the invariant
of T.

If you want the huge file in memory as-is, one alternative is mmap().
It's a Unix function, but Windows has something similar and I suspect
Boost has a portable API for it.
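
A rough sketch of the mmap() route on a POSIX system. The class name `MappedFile` is hypothetical, the error handling is minimal, and this is only one way to wrap the calls:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper: maps a whole file read-only and unmaps it
// again in the destructor.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        const int fd = ::open(path, O_RDONLY);
        if (fd == -1) throw std::runtime_error("open failed");
        struct stat st{};
        if (::fstat(fd, &st) == -1) {
            ::close(fd);
            throw std::runtime_error("fstat failed");
        }
        size_ = static_cast<std::size_t>(st.st_size);
        if (size_ > 0) {  // mmap rejects a zero-length mapping
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("mmap failed");
            }
            data_ = static_cast<const char*>(p);
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }
    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
    char operator[](std::size_t i) const { assert(i < size_); return data_[i]; }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

Pages are brought in by the kernel when they are first touched, so nothing is zeroed or copied up front.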

Another option is to read the file in smaller chunks and process the
data as it arrives, discarding it when you don't need it anymore. But
that may not be possible depending on the file format and the task.
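
The chunked approach could look roughly like this. It is a sketch only: `process_in_chunks` and its callback signature are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads fp in pieces of at most chunk_size bytes and
// hands each piece to consume(data, n). Returns the total bytes processed.
template <class Consumer>
std::size_t process_in_chunks(std::FILE* fp, std::size_t chunk_size,
                              Consumer consume) {
    assert(fp != nullptr && chunk_size > 0);
    std::vector<char> chunk(chunk_size);  // zeroed once, then reused
    std::size_t total = 0;
    for (;;) {
        const std::size_t n = std::fread(chunk.data(), 1, chunk.size(), fp);
        if (n == 0) break;  // end of file or read error
        consume(chunk.data(), n);
        total += n;
    }
    return total;
}
```

The buffer is small and zeroed only once, so the initialization cost the original post worries about becomes negligible.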

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Paavo Helde

Jan 22, 2019, 2:23:23 AM
On 22.01.2019 6:59, anhongl...@gmail.com wrote:
> Hi experts,
>
> I am new to C++, I would like to have a buffer using std::vector, like
>
> ```cpp
> std::vector<char> buf;
>
> buf.resize(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>

Wow, 64 GB? If this is really the case, you should use memory mapping
instead of reading it in in one go (but memory mapping is non-standard
and has its own caveats).

> ```
>
> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!

Programming is an engineering discipline, which means you have to be
pragmatic. As the file reading will most probably be several orders of
magnitude slower than the memory zeroing, it is a safe bet one can just
ignore the time spent on zeroing in this scenario. Of course you can
measure the timings first and then decide if this is something to spend
time on or not.

> How could I skip this?

Buffer zeroing may cause significant overhead in other scenarios (not
related to file reading). I ended up with making my own small wrapper
class wrapping (the equivalent of) malloc() directly. But I did this
only after repeated performance profiling had convinced me that one
could win more than 10% this way, and there seemed to be no other way to
squeeze the performance.

TL;DR: first profile your program and then decide which parts of it need
speeding up.
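
For a quick first measurement, something along these lines may do. It is a sketch with a made-up helper name, `seconds_to_zero`; a real profiler gives a fuller picture.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the wall-clock seconds spent in resize(n)
// on an empty std::vector<char>, i.e. allocation plus zeroing.
double seconds_to_zero(std::size_t n) {
    std::vector<char> buf;
    const auto start = std::chrono::steady_clock::now();
    buf.resize(n);  // value-initializes n chars, i.e. writes n zero bytes
    const auto stop = std::chrono::steady_clock::now();
    assert(buf.size() == n);
    // Read one element so the compiler cannot discard the buffer entirely.
    volatile char sink = buf.empty() ? 0 : buf[n / 2];
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing this number with the time the `fread` call takes on the same machine shows whether the zeroing is worth worrying about.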

Juha Nieminen

Jan 22, 2019, 3:12:53 AM
anhongl...@gmail.com wrote:
> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!

As others have pointed out, in your particular scenario allocating a 64GB
block of memory might not be really what you want (mmap is usually what's
used to handle huge raw files as efficiently as possible, even though it's
not a standard function). But anyway, to answer your actual question:

std::vector isn't really designed to manage uninitialized memory, and this
is by design.

Perhaps the easiest way to get a block of uninitialized memory, C++ style,
is this:

```cpp
std::unique_ptr<unsigned char[]> data
    ((unsigned char*)::operator new[](1234));
```

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

David Brown

Jan 22, 2019, 3:36:21 AM
On 22/01/2019 07:30, Barry Schwarz wrote:
> On Mon, 21 Jan 2019 20:59:52 -0800 (PST), anhongl...@gmail.com
> wrote:
>
>> Hi experts,
>>
>> I am new to C++, I would like to have a buffer using std::vector, like
>>
>> ```cpp
>> std::vector<char> buf;
>>
>> buf.resize(64 * 1024 * 1024 * 1024);
>> fread(buf.data(), buf.size(), 1, fp);
>
> Your call to fread specifies a single object 64GB in size. How will
> you know exactly how much data was actually read in?
>
> Just out of curiosity, how long will it take your system to read a
> 64GB block?
>

My guess is that the OP is looking at this as an easy way to read in the
whole file as simply as possible. Allocate a memory chunk that is
bigger than any file he might want, but don't access that memory so that
it is left unmapped (on many systems). Read the file into the buffer -
no need to worry about buffer overrun with such a large buffer, and
fread a larger size than any file might be. That way you are sure
you've got everything in the file, with just a few lines of code. No
need to read in chunks, or move or copy memory around.

This is, of course, a /highly/ questionable approach in many ways. So
my advice is for the OP to take a step back and look at what he wants to
achieve - ask about that, rather than details of an attempted
implementation. mmap() is probably a better answer, but that might
depend on the OS, the types of file, etc.

Or question the choice of C++ in the first place, especially if you are
new to the language - make sure you use the right language for the job.
In Python, this is:

```python
data = open("xxx.dat", "rb").read()
```


But if you /really/ want to get a buffer of 64 GB space without touching
any of it, use malloc(), not a C++ container. Just wrap it in an RAII
class to be sure it is freed.

Paavo Helde

Jan 22, 2019, 3:47:09 AM
On 22.01.2019 9:23, Paavo Helde wrote:
> On 22.01.2019 6:59, anhongl...@gmail.com wrote:
>> Hi experts,
>>
>> I am new to C++, I would like to have a buffer using std::vector, like
>>
>> ```cpp
>> std::vector<char> buf;
>>
>> buf.resize(64 * 1024 * 1024 * 1024);

Side remark: as written, this line is UB because of signed integer
overflow, on most (all?) platforms.

leigh.v....@googlemail.com

Jan 22, 2019, 4:27:21 AM
On Tuesday, January 22, 2019 at 5:00:04 AM UTC, anhongl...@gmail.com wrote:
> Hi experts,
>
> I am new to C++, I would like to have a buffer using std::vector, like
>
> ```cpp
> std::vector<char> buf;
>
> buf.resize(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>
> ```
>
> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!
>
> How could I skip this?

Simply use a dynamic array instead:

```cpp
char* buf = new char[64 * 1024 * 1024 * 1024];
```

Allocating uninitialised buffers (including allocating space for placement new) is about the only use case for dynamic arrays these days.

/Leigh

Alf P. Steinbach

Jan 22, 2019, 5:10:15 AM
Platforms with 64-bit `int` include the HAL Computer Systems port of
Solaris to the SPARC64 and Classic UNICOS, according to Wikipedia's
article on the issue.


Cheers!,

- Alf

Öö Tiib

Jan 22, 2019, 5:19:22 AM
On Tuesday, 22 January 2019 07:00:04 UTC+2, anhongl...@gmail.com wrote:
> Hi experts,
>
> I am new to C++, I would like to have a buffer using std::vector, like
>
> ```cpp
> std::vector<char> buf;
>
> buf.resize(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>
> ```
>
> I like vector, but the resize will initialize the whole vector as zero.
> Which is unwanted, I don't even need it, it's a waste!
>
> How could I skip this?
> Some people warned me the following code is dangerous...
> Because accessing data() outside of range [data, data + size) is undefined:
>
>
> ```cpp
> std::vector<char> buf;
>
> buf.reserve(64 * 1024 * 1024 * 1024);

That is buf.reserve(0); gcc 8.1.0 also gives the following warning:
warning: integer overflow in expression of type 'int' results in '0' [-Woverflow]


> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it
>
> ```
>
> then what's the proper/elegant way?
> I need pretty big buffer so I have to dynamically allocate it.

The elegant way is to write code that compiles, runs, and does
what you actually want it to do, then post it here as a whole.
It does not matter how fast our customers get incorrect results.

Let's look at performance.
DRAM is typically about an order of magnitude faster than an SSD,
so the bottleneck in your posted code is reading the file, not
zeroing memory. Even if you buy a very decent SSD, fread-ing
64 GB will take over 20 seconds no matter whether the buffer
was initialized or not. If you want better performance, then
you should make your algorithm less naive: make it
work with subsets of the whole data and load them on an as-needed
basis. Full HD video is about 3 GB/hour.
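
The arithmetic behind the 20-second figure can be sketched as follows. The rates are illustrative assumptions (about 3 GB/s sequential for a fast NVMe SSD, about 0.5 GB/s for a SATA one), and `transfer_seconds` is a made-up helper:

```cpp
#include <cassert>

// Seconds needed to move size_gb gigabytes at gb_per_second gigabytes/second.
double transfer_seconds(double size_gb, double gb_per_second) {
    assert(gb_per_second > 0.0);
    return size_gb / gb_per_second;
}

// Example values under the assumed rates:
//   transfer_seconds(64.0, 3.0)  is a little over 21 seconds (fast NVMe SSD)
//   transfer_seconds(64.0, 0.5)  is 128 seconds              (SATA SSD)
```

Either way the read dominates a zeroing pass that runs at memory speed.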


fir

Jan 22, 2019, 5:23:59 AM
On Tuesday, 22 January 2019 08:23:23 UTC+1, Paavo Helde wrote:
> As the file reading will most probably be several orders of
> magnitude slower than the memory zeroing,

On a good PC, zeroing 64 GB should take about 6 seconds on one core. If you use 6 cores
it could maybe take 1 second, if they have independent bandwidth (probably yes).

Reading from disk: some people write that a good SSD has 500 MB/s, but I never tested it and I believe it may not be exact (too high). But let's take it: then it's about 2 minutes.

Öö Tiib

Jan 22, 2019, 5:24:12 AM
Yes, but Anhong won't get anywhere near one of such platforms
anytime soon.

Öö Tiib

Jan 22, 2019, 5:25:28 AM
On Tuesday, 22 January 2019 12:10:15 UTC+2, Alf P. Steinbach wrote:

Bonita Montero

Jan 22, 2019, 5:28:32 AM
> Wow, 64 GB? If this is really the case, you should use memory mapping
> instead of reading it in in one go (but memory mapping is non-standard
> and has its own caveats).

Memory-mapping is slower and/or (depending on the platform) consumes
more CPU resources because there is usually no prefetching and a fault on
every page (thus the higher CPU overhead).

> Buffer zeroing may cause significant overhead in other scenarios (not
> related to file reading).

Is a buffer of std::byte really zeroed? I think std::vector simply calls
the default-constructor, which does nothing for std::byte and thereby
the whole internal initialization loop when enlarging the vector should
be optimized away.

Alf P. Steinbach

Jan 22, 2019, 5:29:02 AM
I would not do that, because `std::unique_ptr` for an array effectively
uses an ordinary `delete[]`-expression to deallocate, while the above
requires a direct call of `::operator delete[]`.

There is a difference: a `new[]`-expression may ask `operator new[]` for
more than the size of the array, in order to store a count of objects
(so that destructors can be called on the appropriate number of items),
and the `delete[]` expression that `unique_ptr` uses by default, takes
that into account, i.e. it may perform a pointer adjustment. As I recall
it's unspecified whether this happens for an array of POD type. And then
it /can/ happen, i.e. the code wouldn't be portable.

A simple remedy is to just use an ordinary `new[]`-expression, or
`make_unique`.
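
One caveat, as far as I know: `std::make_unique<T[]>(n)` value-initializes the array, so for a `char` buffer it zeroes just like `resize` does. Only the plain `new[]`-expression leaves the elements uninitialized (C++20 adds `std::make_unique_for_overwrite` for that case). A small sketch, with function names made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Plain new[]-expression: elements are default-initialized, which for
// unsigned char means no initialization is performed at all.
std::unique_ptr<unsigned char[]> make_uninitialized(std::size_t n) {
    return std::unique_ptr<unsigned char[]>(new unsigned char[n]);
}

// make_unique for arrays uses new T[n](), so every element is zeroed.
std::unique_ptr<unsigned char[]> make_zeroed(std::size_t n) {
    return std::make_unique<unsigned char[]>(n);
}

// Helper for checking a buffer that is known to be initialized.
bool all_zero(const unsigned char* p, std::size_t n) {
    assert(p != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] != 0) return false;
    }
    return true;
}
```

Both results are released with the ordinary `delete[]` that `unique_ptr` uses by default, so the mismatch described above does not arise.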


Cheers!,

- Alf

fir

Jan 22, 2019, 5:30:47 AM
3GB/s SSD ?

Öö Tiib

Jan 22, 2019, 5:37:02 AM
No, vector::resize must value-initialize the new elements if no value was provided.
With char that means zeroes.

Öö Tiib

Jan 22, 2019, 5:42:00 AM
Yes. 64/3 > 20

Bonita Montero

Jan 22, 2019, 5:46:43 AM
>> Is a buffer of std::byte really zeroed? I think std::vector simply calls
>> the default-constructor, which does nothing for std::byte and thereby
>> the whole internal initialization loop when enlarging the vector should
>> be optimized away.

> No, vector::resize must value-initialize the new elements if no value was provided.
> With char that means zeroes.

The container doesn't know anything about zeroes; it calls the
default-constructor, which is a NOP in case of std::byte.

fir

Jan 22, 2019, 5:54:53 AM
But where have you seen ones that fast?
I have not checked for some 2 years, but it used to be 500 MB/s.

fir

Jan 22, 2019, 6:00:45 AM
OK, indeed I found some:

https://www.techradar.com/reviews/wd-black-sn750-nvme-ssd

Also interesting is how much the speed drops on non-sequential, highly random reads (0.5MB, it's a disaster).

Alf P. Steinbach

Jan 22, 2019, 6:05:34 AM
The effect of insertion (or construction or resizing) is described as a
/default-insertion/. Which in turn in C++17 is described by §26.2.1/15.2
as-if initialized by `allocator_traits<A>::construct(m, p)` where `m` is
an lvalue of the allocator type `A` and `p` points to the uninitialized
storage in the vector buffer. Then §23.10.8.2/5 describes that as either
calling `a.construct(p, std::forward<Args>(args)... )`, if that is
well-formed, or otherwise that it invokes `::new
(static_cast<void*>(p)) T(std::forward<Args>(args)...)`. Since with
C++11 and later there is no `construct` in the default allocator it's
the latter, and with no args it's a value initialization. Which zeroes.

But it means that one /could/ use a vector with a custom allocator that
defines a `construct` that does nothing:


```cpp
#include <cstddef>      // std::size_t
#include <memory>       // std::allocator
#include <type_traits>  // std::true_type
#include <utility>
#include <vector>

template< class Type > using P_ = Type*;

template< class Type >
class Non_zeroing_alloc
{
    std::allocator<Type> m_std;

public:
    using value_type = Type;
    using propagate_on_container_move_assignment = std::true_type;
    using is_always_equal = std::true_type;

    auto allocate( const std::size_t n ) -> P_<Type> { return m_std.allocate( n ); }
    void deallocate( const P_<Type> p, const std::size_t n ) { m_std.deallocate( p, n ); }

    // Deliberately does nothing, so the storage is left uninitialized.
    template< class... Args >
    void construct( P_<Type> storage, Args&&... )
    { (void) storage; }     //{ *storage = 'A'; }

    Non_zeroing_alloc() noexcept {}
    Non_zeroing_alloc( const Non_zeroing_alloc& ) noexcept {}
    template< class U > Non_zeroing_alloc( const Non_zeroing_alloc<U>& ) noexcept {}

    // The allocator requirements ask for equality comparison; all instances are equal.
    friend bool operator==( const Non_zeroing_alloc&, const Non_zeroing_alloc& ) noexcept { return true; }
    friend bool operator!=( const Non_zeroing_alloc&, const Non_zeroing_alloc& ) noexcept { return false; }
};

#include <iostream>
auto main()
    -> int
{
    using namespace std;
    vector<char, Non_zeroing_alloc<char>> v( 13 );
    for( const char ch : v ) { cout << +ch << ' '; }
    cout << endl;
}
```


Cheers!,

- Alf

Öö Tiib

Jan 22, 2019, 6:15:07 AM
On what platform? On platforms that have C++17 compiler and that I can
reach ... std::vector<std::byte>::resize does zero-initialize.

Bonita Montero

Jan 22, 2019, 7:27:41 AM
> On what platform? On platforms that have C++17 compiler and that I
> can reach ... std::vector<std::byte>::resize does zero-initialize.

THE COMPILER DOESN'T KNOW ANYTHING ABOUT ZEROES.
It calls the default-constructor, which is in
case of std::byte a nop.

Bonita Montero

Jan 22, 2019, 7:31:55 AM
> The effect of insertion (or construction or resizing) is described as a
> /default-insertion/. Which in turn in C++17 is described by §26.2.1/15.2
> as-if initialized by `allocator_traits<A>::construct(m, p)` where `m` is
> an lvalue of the allocator type `A` and `p` points to the uninitialized
> storage in the vector buffer. Then §23.10.8.2/5 describes that as either
> calling `a.construct(p, std::forward<Args>(args)... )`, if that is
> well-formed, or otherwise that it invokes  `::new
> (static_cast<void*>(p)) T(std::forward<Args>(args)...)`. Since with
> C++11 and later there is no `construct` in the default allocator it's
> the latter, and with no args it's a value initialization. Which zeroes.

The default-constructor of any built-in type doesn't zero. Just try it
yourself by doing a malloc and afterwards an iterated placement-new on
every item.
If you get a zero-initialization, e.g. when you reserve a range so long
that it isn't backed by the fragmented heap but by mmap() or VirtualAlloc(),
which map to a common zero-page until there is a write access to any
page, this just happens by accident.

Alf P. Steinbach

Jan 22, 2019, 8:11:35 AM
On 22.01.2019 13:31, Bonita Montero wrote:
>> The effect of insertion (or construction or resizing) is described as
>> a /default-insertion/. Which in turn in C++17 is described by
>> §26.2.1/15.2 as-if initialized by `allocator_traits<A>::construct(m,
>> p)` where `m` is an lvalue of the allocator type `A` and `p` points to
>> the uninitialized storage in the vector buffer. Then §23.10.8.2/5
>> describes that as either calling `a.construct(p,
>> std::forward<Args>(args)... )`, if that is well-formed, or otherwise
>> that it invokes  `::new (static_cast<void*>(p))
>> T(std::forward<Args>(args)...)`. Since with C++11 and later there is
>> no `construct` in the default allocator it's the latter, and with no
>> args it's a value initialization. Which zeroes.
>
> The default-constructor of any built-in type doesn't zero. Just try it
> yourself by doing a malloc and afterwards an iterated placement-new on
> every item.

You're confusing default initialization and value initialization.

In a way you're in good company: that confusion was there in C++98.

But it was corrected in C++03, by Andrew Koenig, and now we're some 16
years later.


> If you get a zero-initialization, e.g. when you reserve a range so long
> that it isn't backed by the fragmented heap but mmap() or VirtualAlloc()
> which map to a common zero-page until there is a write-access to any
> page, this just happens by accident.

No.


Cheers & hth.,

- Alf

Chris Vine

Jan 22, 2019, 8:47:35 AM
On Tue, 22 Jan 2019 09:36:09 +0100
David Brown <david...@hesbynett.no> wrote:
> But if you /really/ want to get a buffer of 64 GB space without touching
> any of it, use malloc(), not a C++ container. Just wrap it in an RAII
> class to be sure it is freed.

Leaving aside the undesirability you mention of allocating such a large
array, I should use std::unique_ptr for RAII rather than a home-grown
class, since it is available:

```cpp
std::unique_ptr<char[], decltype(&free)> buf{(char*)malloc([my-big-size]),
                                             &free};
```

Paavo Helde

Jan 22, 2019, 8:56:27 AM
You need to read the docs. std::vector does not call the default
constructor, it calls std::allocator_traits<std::allocator<char>>::construct(),
which ends up calling placement new with the expression 'char()', which
uses value-initialization and initializes the char to 0 (8.5/11: "An
object whose initializer is an empty set of parentheses, i.e., (), shall
be value-initialized").
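
A small sketch of that behaviour (the function names are made up for illustration). A local `char` stands in for one raw byte of the vector's buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Constructs a char over existing storage with empty parentheses, the same
// form of placement new that the vector ends up using for new elements.
char value_initialize_over(char garbage) {
    char storage = garbage;  // stand-in for a raw, non-zero byte
    char* p = ::new (static_cast<void*>(&storage)) char();  // value-initialization
    assert(p == &storage);
    return *p;  // 0, whatever the storage held before
}

// Checks that resize() on a std::vector<char> yields zeroed elements.
bool resized_chars_are_zero(std::size_t n) {
    std::vector<char> v;
    v.resize(n);
    for (const char c : v) {
        if (c != 0) return false;
    }
    return true;
}
```

Writing `char` without the parentheses in the placement new would be default-initialization instead, which leaves the byte alone.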

This also means there is an option to leave the std::vector buffer
uninitialized by using a suitable specialization of std::allocator_traits.

Chris Vine

Jan 22, 2019, 9:15:28 AM
And for the sake of the original poster (I know you know this), an
alternative is the following, because the new[] expression does not
value initialize built-in types:

```cpp
std::unique_ptr<char[]> buf{new char[my-big-size]};
```

However this might allocate trivially more memory because the compiler
might store the size of the array in the allocated memory. I believe
gcc and clang would not, but VS might because its two argument version
of operator delete[]() is not standard-conforming.

David Brown

Jan 22, 2019, 9:18:47 AM
Or you can use a class:

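```cpp
#include <cstdlib>  // std::malloc, std::free

class Buffer {
    char* data;
public:
    Buffer() {
        // malloc() returns void*, which C++ does not convert implicitly.
        data = static_cast<char*>(std::malloc(64ull * 1024 * 1024 * 1024));
    }
    ~Buffer() {
        std::free(data);
    }
    // Copying would free the same block twice, so forbid it.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
};
```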

It is /massively/ simpler for a beginner at C++, and provides a skeleton
that can be built on with more methods. Learning C++ is hard enough,
without suggesting he jumps in at the deep end of the pool wearing lead
shoes.

Juha Nieminen

Jan 22, 2019, 9:43:18 AM
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
> A simple remedy is to just use an ordinary `new[]`-expression, or
> `make_unique`.

Doesn't the ordinary new[] operator default-initialize the array elements?

I have found conflicting information on this online (with some claiming
that it doesn't default-initialize basic integral types, while others
claim it always does), so I really don't know if that's the case.

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Juha Nieminen

Jan 22, 2019, 9:46:20 AM
leigh.v....@googlemail.com wrote:
> char* buf = new char[64 * 1024 * 1024 * 1024];

Does that default-initialize the elements or not? I have found conflicting
information online about that.

Also, 64*1024*1024*1024 is of type int. Wouldn't it overflow in most
typical target architectures? Or does the compiler automatically
promote literals to 64-bit if they would overflow otherwise?

Chris Vine

Jan 22, 2019, 9:48:00 AM
On Tue, 22 Jan 2019 15:18:37 +0100
There is a bit more to it than that. You need to be able to get at the
buffer, such as by having your class support operator[].

Is the two argument constructor of unique_ptr that intimidating? Maybe
it is. In that case, for handling memory I would make a deleter struct
with operator() rather than have the hassle of making a handle class[1].
Then it is just:

```cpp
std::unique_ptr<char[], MyDeleter> buf{(char*)malloc([my-big-size])};
```

Maybe the OP should do both for pedagogical purposes.

[1] Something as simple as this would do:

```cpp
struct MyDeleter {
    void operator()(void* p) { free(p); }
};
```
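
Putting the two ideas together, a handle class that exposes the buffer could be sketched like this. `RawBuffer` and `FreeDeleter` are hypothetical names, and this is one possible shape rather than a definitive design:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Deleter that releases memory obtained from std::malloc.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical handle class: owns an uninitialized malloc'ed block and
// provides access to it through data(), size() and operator[].
class RawBuffer {
public:
    explicit RawBuffer(std::size_t n)
        : data_(static_cast<char*>(std::malloc(n))), size_(n) {
        // malloc signals failure with a null pointer instead of throwing.
        if (!data_ && n != 0) throw std::bad_alloc();
    }

    char* data() noexcept { return data_.get(); }
    const char* data() const noexcept { return data_.get(); }
    std::size_t size() const noexcept { return size_; }
    char& operator[](std::size_t i) { assert(i < size_); return data_[i]; }

private:
    std::unique_ptr<char[], FreeDeleter> data_;  // frees the block automatically
    std::size_t size_;
};
```

Because the `unique_ptr` member is move-only, the class is non-copyable without any extra code, and the contents stay indeterminate until something writes to them.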

Chris Vine

Jan 22, 2019, 10:02:00 AM
On Tue, 22 Jan 2019 14:46:11 +0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
> leigh.v....@googlemail.com wrote:
> > char* buf = new char[64 * 1024 * 1024 * 1024];
>
> Does that default-initialize the elements or not? I have found conflicting
> information online about that.

It default initializes but does not value initialize. Default
initialization of built-in types does nothing. The values are
indeterminate.

David Brown

Jan 22, 2019, 10:13:39 AM
Yes, I know. It was not meant to be complete code - just a starting point.

>
> Is the two argument constructor of unique_ptr that intimidating? Maybe
> it is.

I think it could be, and I think the "decltype" is also an advanced
concept. (I have no idea where the OP is in the path along C++.)

> In that case, for handling memory I would make a deleter struct
> with operator() rather than have the hassle of making a handle class[1].
> Then it is just:
>
> std::unique_ptr<char[], MyDeleter> buf{(char*)malloc([my-big-size])};
>
> Maybe the OP should do both for pedagogical purposes.

Perhaps so.

Of course, he could also re-think the whole implementation - for both
pedagogical and practical reasons!

Öö Tiib

Jan 22, 2019, 10:30:06 AM
What Paavo said, except that in the case of std::vector<std::byte> the
std::allocator<std::byte> uses std::byte(), not char(), for value-initialization.
That results in zero on all platforms that I can reach.
So on what platform does it behave differently? No need for screaming
caps if you didn't actually try it but just assumed that it is logical.
C++ does what its specs say, and that is often considered quite
unexpected and irrational.

james...@alumni.caltech.edu

Jan 22, 2019, 11:12:45 AM
On Tuesday, January 22, 2019 at 9:46:20 AM UTC-5, Juha Nieminen wrote:
> leigh.v....@googlemail.com wrote:
> > char* buf = new char[64 * 1024 * 1024 * 1024];
>
> Does that default-initialize the elements or not? I have found conflicting
> information online about that.
>
> Also, 64*1024*1024*1024 is of type int. Wouldn't it overflow in most
> typical target architectures? Or does the compiler automatically
> promote literals to 64-bit if they would overflow otherwise?

Individual literals are always given a type that is large enough to
represent them (5.13.2p2), as long as there is any such type. However,
there's no such process involved in the evaluations of expressions like
that.

Even the first partial product, 64*1024, is not guaranteed representable
in an int, and the final result is not guaranteed to be representable
even as an unsigned long. The minimal fix that's guaranteed to avoid an
overflow is to change 64 to 64LL - that will require all of the
calculations to be performed as a long long.
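
As a sketch of that fix (the names `kBufferBytes`, `fits_in_size_t` and `checked_size` are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64LL forces every multiplication to be carried out in long long.
constexpr long long kBufferBytes = 64LL * 1024 * 1024 * 1024;
static_assert(kBufferBytes == 68719476736LL, "64 GiB is 2^36 bytes");

// True when the byte count can be represented as std::size_t here,
// which is not the case for 64 GiB on a 32-bit platform.
constexpr bool fits_in_size_t(long long bytes) {
    return bytes >= 0 &&
           static_cast<unsigned long long>(bytes) <= SIZE_MAX;
}

// Converts to std::size_t for use with resize(), new[] or malloc().
std::size_t checked_size(long long bytes) {
    assert(fits_in_size_t(bytes));
    return static_cast<std::size_t>(bytes);
}
```

The `static_assert` documents the intended value, so a missing suffix shows up at compile time instead of as a surprising allocation size.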

james...@alumni.caltech.edu

Jan 22, 2019, 11:23:05 AM
On Tuesday, January 22, 2019 at 12:00:04 AM UTC-5, anhongl...@gmail.com wrote:
...
> Some people warned me the following code is dangerous...
> Because accessing data() outside of range [data, data + size) is undefined:

Yes, such access does indeed have undefined behavior - but it's not
clear that the following code contains any examples of that problem:

> ```cpp
> std::vector<char> buf;
>
> buf.reserve(64 * 1024 * 1024 * 1024);
> fread(buf.data(), buf.size(), 1, fp);
>
> func(buf.data()); // consuming it

Depending upon the definition of func(), you might be accessing beyond
the end of the array. If for example, func() is expecting to receive a
pointer to the first character of a null-terminated array of char, and
if you've done nothing to guarantee that one of the bytes of that array
is null, that would explain the warning you got. However, the code
you've provided us gives no evidence about that possibility.

Alf P. Steinbach

Jan 22, 2019, 12:41:44 PM
On 22.01.2019 15:43, Juha Nieminen wrote:
> Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
>> A simple remedy is to just use an ordinary `new[]`-expression, or
>> `make_unique`.
>
> Doesn't the ordinary new[] operator default-initialize the array elements?

It does, yes, but for POD that means doing nothing.

C++17 §8.3.4/1 “If the new-initializer is omitted, the object is
default-initialized”, which goes on to note that if that results in
no initialization being performed, then the object has an indeterminate value.


> I have found conflicting information on this online (with some claiming
> that it doesn't default-initialize basic integral types, while others
> claim it always does), so I really don't know if that's the case.

Either claim would essentially be correct, because
default-initialization of an object of basic integral type (such as each
item in the array) does nothing.

In C++17 this is covered by the definition of default-initialization in
§11.6/7, where for the case of not a class type and not an array either,
§11.6/7.3 simply says “no initialization is performed”.

So, since

new T[314]

default-initializes, for a POD T it's not guaranteed to zero the memory,
or anything: it just yields indeterminate values. That's the same lack
of guarantee as with calling the allocation function directly.

Since indeterminate value includes a possible value 0 an implementation
might choose to “helpfully” let the allocation function clear the
memory, possibly with some option to specify or avoid that, but I found
no such option for g++ now.

However,

new T[314]()

is guaranteed to value-initialize the array items, which for POD array
items results in a zeroing. In C++17 the value initialization is
specified indirectly by §8.3.4/2, that when there is an initializer “the
new-initializer is interpreted according to the initialization rules of
11.6 for direct-initialization”, where §11.6/17.4 says “If the
initializer is (), the object is value-initialized”.
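
The two forms side by side, as a minimal sketch with made-up function names. Note that the default-initialized array is written before it is read, since its initial values are indeterminate:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// new int[n]() value-initializes: every element is guaranteed to be zero.
bool value_initialized_array_is_zero(std::size_t n) {
    assert(n > 0);
    std::unique_ptr<int[]> a(new int[n]());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != 0) return false;
    }
    return true;
}

// new int[n] default-initializes: no zeroing is promised, so the elements
// must be assigned before any of them is read.
int write_then_read_last(std::size_t n) {
    assert(n > 0);
    std::unique_ptr<int[]> a(new int[n]);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<int>(i);
    }
    return a[n - 1];
}
```

The only textual difference is the trailing `()`, which is easy to miss when reading code.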

Cheers!,

- Alf

Christian Gollwitzer

Jan 22, 2019, 1:02:18 PM
On 22.01.19 at 12:00, fir wrote:
1) NVMe is a new bus/protocol for fast SSDs that are attached directly
to the PCIe bus. 500 MB/s are the limit of the SATA bus, not the drive.

2) 0.5MB/s is pretty bad, but WD is not the best brand. The really fast
SSDs are from Samsung (Evo Pro) and Intel (Optane).

https://www.samsung.com/semiconductor/minisite/ssd/product/consumer/ssd960/

EVO Pro
Sequential read: 3.5 GB/s
Random 4k blocks : 440,000 /s ~ 1.6 GB/s

Christian

Juha Nieminen

Jan 23, 2019, 3:56:33 AM
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
> So, since
>
> new T[314]
>
> default-initializes, for a POD T it's not guaranteed to zero the memory,
> or anything: it just yields indeterminate values. That's the same lack
> of guarantee as with calling the allocation function directly.

So a better answer to the original question would indeed be

```cpp
std::unique_ptr<unsigned char[]> data(new unsigned char[1234]);
```

which is guaranteed to not initialize the allocated array.

I suppose even an old dog can learn new things about C++.