On Monday, 24 June 2019 14:28:50 UTC+3, Juha Nieminen wrote:
> Öö Tiib <oot...@hot.ee> wrote:
> > Indeed. But that does not mean we need shared_ptr to raw array
> > either. We can use shared_ptr to std::array or std::vector like we
> > use just plain std::array or std::vector when we need non-shared
> > array.
>
> The size of an std::array needs to be known at compile time. I think it's
> probably very rare to need a shared pointer to a compile-time array that's
> nevertheless being allocated dynamically at runtime. I suppose it's
> not inconceivable, but rare.
That likelihood perhaps depends on the problem domain / domain of knowledge
for which we write applications. In the domains for which I have written, the
upper limits are often clear and shared ownership of virtually unbounded
arrays feels unusual in the extreme.
> As for allocating an std::vector dynamically, and having a shared pointer
> to it... Unless you *really* need the functions provided by the class,
> why would you want this double indirection? If all you need is just an
> array of values and pretty much nothing else from it (other than being
> able to index it), why add needless overhead by using a dynamically
> allocated std::vector?
Because it would simplify life with a (practically unbounded?) shared array.
If there is potentially a major, multi-megabyte array then I would likely
make the std::vector<Element> a data member of a separate class, and
so what we are talking about here is std::shared_ptr<ThatSeparateClass>
and not shared_ptr<Element[]>.
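
A minimal sketch of what I mean (the SampleBlock name and the double
element type are made up for illustration):

#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical wrapper: the big buffer lives in one class that can later
// carry whatever invariants/metadata the domain needs, and that class as
// a whole is what gets shared.
class SampleBlock {
public:
    explicit SampleBlock(std::size_t n) : data_(n) {}
    double&       operator[](std::size_t i)       { return data_[i]; }
    const double& operator[](std::size_t i) const { return data_[i]; }
    std::size_t   size() const { return data_.size(); }
private:
    std::vector<double> data_;
};

// Shared ownership of the whole block, not of a raw element array:
auto block = std::make_shared<SampleBlock>(1000000);
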
> >> Lately I have myself started thinking, however, that for things like
> >> temporary buffers and other "lightweight" dynamic arrays that do not
> >> need any of the functionality that std::vector provides, nor whose
> >> size can be determined at compile time, whether it would be better
> >> to use an std::unique_ptr<Type[]> instead of std::vector<Type>.
> >> The former can be slightly more efficient if Type is a primitive type,
> >> and you don't burden your code with everything that std::vector
> >> provides. (Of course this is at the cost of everything that it does
> >> provide, such as its various assign() functions, size(), empty(),
> >> and so on and so forth.)
> >
> > In the ultra rare cases (when that matters)
>
> Not so ultra rare. If you are, e.g., reading data from a file using
> std::fread(), you often need a temporary array to
> read into. If you are reading the *entire* file into the array,
> the size of that array is only known at runtime.
The particular use case (unless I misunderstood) is sequential
std::fread() from a potentially large file or from a stream of
unknown length?
My experience is that reading in 4096-byte chunks is close to an
optimal default (so the length does not need to be dynamic) and the
same buffer can be reused for the whole file/stream (so its
allocation does not need to be so dynamic either). IOW std::array is
usually splendid for that. The run-time data structure into which
the file is read can be anything, but it usually has finer
organization than (potentially) large immutable scoped
arrays.
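
To be concrete, a minimal sketch of that pattern (the consume() stub
stands in for whatever per-chunk processing the program actually does):

#include <array>
#include <cstddef>
#include <cstdio>

// Hypothetical per-chunk processing; here it only counts the bytes seen.
static std::size_t total_bytes = 0;
void consume(const char* /*data*/, std::size_t n) { total_bytes += n; }

void read_whole_file(const char* path)
{
    std::array<char, 4096> buffer;   // one fixed-size buffer, reused for the whole file
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return;
    std::size_t used;                // how "full" the buffer is after each fread()
    while ((used = std::fread(buffer.data(), 1, buffer.size(), f)) > 0)
        consume(buffer.data(), used);
    std::fclose(f);
}
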
> In this (not so "ultra rare") case std::unique_ptr<Type[]> might be
> a slightly more lightweight solution than std::vector (especially
> since in this case you are probably using a primitive type, and you
> probably don't need the array initialized because you are filling it
> with data anyway), with little to no drawbacks, if you don't need
> anything that std::vector provides.
My starting point is one 4-8 KB std::array (plus one integer to
indicate how "full" it is) for sequentially reading one file.
It typically stays that way regardless of whether it is a small or a
multi-megabyte file. In the ~5% of cases when that is not performant
enough I go all the way to the best that is available (like mmap), and
for managing that it is better to have a special class (not to hack
unique_ptr<char[]> somehow). OTOH if we are talking about hundreds
of megabytes of data then there are database engines for that.
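
For illustration, a rough sketch of such a special class, assuming POSIX
mmap and with error handling kept minimal:

#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only memory-mapped file.
// Mapping pointer, length and unmapping live together in one class
// instead of being hacked into a unique_ptr<char[]>.
class MappedFile {
public:
    explicit MappedFile(const char* path)
    {
        int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st {};
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                data_ = static_cast<const char*>(p);
                size_ = static_cast<std::size_t>(st.st_size);
            }
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }
    ~MappedFile() { if (data_) ::munmap(const_cast<char*>(data_), size_); }
    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};
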
> > we may want to take a
> > step back and to investigate why we are allocating/releasing arrays
> > in such very busy loop and/or we may want to take a step forward
> > and to consider non-standard things like alloca() or VLA for array.
>
> Besides it being non-standard, you'll be burdening the stack with
> a potentially very large amount of data. It would be bad if you
> run out of stack space.
We were talking about allocations/deallocations in such a tight
sequence that it affects performance. That means there are a lot of
relatively small (not virtually unbounded) buffers involved? Or is it
uncertain for the software designer whether what the program is
dealing with is small or potentially huge data?
Take projectiles as an analogy. Buckshot cartridges are better for
small game and centre-fire cartridges are better for heavy game,
but for a tank we need mortar shells. "Silver bullets" that are optimal
for all cases do not exist, so if the target is potentially a rabbit but
potentially also a tank then we perhaps have to carry all three weapons
and choose dynamically what we shoot with. There is also the bow,
with which small game is hard to hit, heavy game is dangerous to
merely wound, and which is hopeless against a tank.
Same with software: small std::arrays (possibly on the stack) are better
for small data and medium data is better in containers like
std::unordered_map, but for large data we likely need a database
engine (that deals with memory mapping and indexing internally).
I can imagine how software could pick dynamically between those
three choices, but std::unique_ptr<x[]> feels like a bow there. ;)
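
If it helps, a toy sketch of that dynamic picking for the first two
cases (the threshold, the source() stub and the summing task are all
invented; the database-engine case is left out):

#include <array>
#include <cstddef>
#include <unordered_map>

constexpr std::size_t kSmallLimit = 256;  // assumed threshold for "small" data

double source(std::size_t i) { return static_cast<double>(i); }  // stand-in data source

double sum_samples(std::size_t count)
{
    double sum = 0.0;
    if (count <= kSmallLimit) {
        std::array<double, kSmallLimit> buf{};        // small data: array on the stack
        for (std::size_t i = 0; i < count; ++i) buf[i] = source(i);
        for (std::size_t i = 0; i < count; ++i) sum += buf[i];
    } else {
        std::unordered_map<std::size_t, double> buf;  // medium data: heap container
        for (std::size_t i = 0; i < count; ++i) buf[i] = source(i);
        for (const auto& kv : buf) sum += kv.second;
    }
    return sum;
}
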