Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Vector is guaranteed to store all elements in contiguous locations?

51 views
Skip to first unread message

Frederick Gotham

unread,
Oct 18, 2019, 5:50:52 AM10/18/19
to

I read this today:
"Vector is guaranteed to store all elements in contiguous locations"

Is this true?

Up until now, I've been copying vectors into normal C-style arrays to pass to encryption functions. But if the aforementioned is true then I can simply give the address of the first element: v.begin()

Ian Collins

unread,
Oct 18, 2019, 5:57:10 AM10/18/19
to
On 18/10/2019 22:50, Frederick Gotham wrote:
>
> I read this today: "Vector is guaranteed to store all elements in
> contiguous locations"
>
> Is this true?

Yes.

> Up until now, I've been copying vectors into normal C-style arrays to
> pass to encryption functions. But if the aforementioned is true then
> I can simply give the address of the first element: v.begin()

Yes.

--
Ian.

Bonita Montero

unread,
Oct 18, 2019, 10:51:43 AM10/18/19
to
> I read this today:
> "Vector is guaranteed to store all elements in contiguous locations"

"The elements of a vector are stored contiguously, meaning that
if v is a vector where T is some type other than bool, then it
obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size()."

Paavo Helde

unread,
Oct 18, 2019, 11:21:54 AM10/18/19
to
On 18.10.2019 12:50, Frederick Gotham wrote:
>
> I read this today:
> "Vector is guaranteed to store all elements in contiguous locations"
>
> Is this true?

Yes, this has been true formally since 2003. For strings a similar
guarantee was introduced in C++11.

AFAIK in practice both strings and vectors were contiguous in all known
implementations anyway, so the standards just formalized the existing
practice.

Note there is a spatial case of std::vector<bool> which is still
contiguous, but does not have the same memory layout than a bool array[].

Alf P. Steinbach

unread,
Oct 18, 2019, 10:37:22 PM10/18/19
to
On 18.10.2019 17:21, Paavo Helde wrote:
> On 18.10.2019 12:50, Frederick Gotham wrote:
>>
>> I read this today:
>> "Vector is guaranteed to store all elements in contiguous locations"
>>
>> Is this true?
>
> Yes, this has been true formally since 2003.

I am surprised that you know that.

In 2008 Herb Sutter corrected Andrew Koenig, who had implicitly opined
that the contiguous buffer was not part of the vector abstraction, not
something that one should rely on; <url:
https://herbsutter.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/>

It was a bit ironic since Andrew was secretary of the international C++
standardization committee for C++03, so if he didn't write the new
vector requirement letter by letter then at least he copy and pasted it,
and noted it in his "unofficial" list of C++03 changes.


> For strings a similar guarantee was introduced in C++11.

Or marginally less formally, at the Lillehammer meeting in 2005, as I
recall.


> AFAIK in practice both strings and vectors were contiguous in all known
> implementations anyway, so the standards just formalized the existing
> practice.
>
> Note there is a spatial case of std::vector<bool> which is still
> contiguous, but does not have the same memory layout than a bool array[].

s/does not have/does not necessarily have/

or

s/does not have/does not guarantee/

or

s/does not have/does not in practice have/

- Alf

Soviet_Mario

unread,
Oct 19, 2019, 5:16:12 AM10/19/19
to
On 18/10/2019 17:21, Paavo Helde wrote:
> On 18.10.2019 12:50, Frederick Gotham wrote:
>>
>> I read this today:
>> "Vector is guaranteed to store all elements in contiguous
>> locations"
>>
>> Is this true?
>
> Yes, this has been true formally since 2003. For strings a
> similar guarantee was introduced in C++11.

but strings means the "descriptors", not the actual text
data, right ? I mean, there is a further indirection before
accessing the content, comparing to native types, or am I
wrong ? If so, even if the array of pointers is monolythic,
the true text data of each item should still be "sparse", or
not ?

>
> AFAIK in practice both strings and vectors were contiguous
> in all known implementations anyway, so the standards just
> formalized the existing practice.
>
> Note there is a spatial case of std::vector<bool> which is
> still contiguous, but does not have the same memory layout
> than a bool array[].


--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)

Chris Vine

unread,
Oct 19, 2019, 7:40:50 AM10/19/19
to
On Sat, 19 Oct 2019 11:16:01 +0200
Soviet_Mario <Sovie...@CCCP.MIR> wrote:
> On 18/10/2019 17:21, Paavo Helde wrote:
> > On 18.10.2019 12:50, Frederick Gotham wrote:
> >>
> >> I read this today:
> >> "Vector is guaranteed to store all elements in contiguous
> >> locations"
> >>
> >> Is this true?
> >
> > Yes, this has been true formally since 2003. For strings a
> > similar guarantee was introduced in C++11.
>
> but strings means the "descriptors", not the actual text
> data, right ? I mean, there is a further indirection before
> accessing the content, comparing to native types, or am I
> wrong ? If so, even if the array of pointers is monolythic,
> the true text data of each item should still be "sparse", or
> not?

Strings don't have "descriptors" so it is not clear (to me at least)
what you mean.

In any event the requirement for contiguous storage of string data is
in §21.4.1/5 of C++11:

"The char-like objects in a basic_string object shall be stored
contiguously. That is, for any basic_string object s, the identity
&*(s.begin() + n) == &*s.begin() + n shall hold for all values of n
such that 0 <= n < s.size()."

What is meant by "char-like objects" is set out in §21.4/1:

"The class template basic_string describes objects that can store a
sequence consisting of a varying number of arbitrary char-like objects
with the first element of the sequence at position zero. Such a
sequence is also called a “string” if the type of the char-like
objects that it holds is clear from context. In the rest of this
Clause, the type of the char-like objects held in a basic_string
object is designated by charT".

For std::string the char-like objects are of type char; for
std::wstring they are of type wchar_t; and so on for std::u16string and
std::u32string ... .

Paavo Helde

unread,
Oct 19, 2019, 11:02:50 AM10/19/19
to
On 19.10.2019 12:16, Soviet_Mario wrote:
> On 18/10/2019 17:21, Paavo Helde wrote:
>> On 18.10.2019 12:50, Frederick Gotham wrote:
>>>
>>> I read this today:
>>> "Vector is guaranteed to store all elements in contiguous locations"
>>>
>>> Is this true?
>>
>> Yes, this has been true formally since 2003. For strings a similar
>> guarantee was introduced in C++11.
>
> but strings means the "descriptors", not the actual text data, right ? I
> mean, there is a further indirection before accessing the content,
> comparing to native types, or am I wrong ? If so, even if the array of
> pointers is monolythic, the true text data of each item should still be
> "sparse", or not ?

I was comparing std::string to std::vector<char>, not to
std::vector<std::string>. The contiguousness guarantee for a std::string
formalized in 2005 or 2011 just means that &s[0] + i == &s[i] inside a
single string.

Regarding std::vector<std::string>, the std::string objects in a
std::vector are still contiguous, i.e. you can fetch out a std::string*
pointer via data() and just use pointer arithmetic to access other
std::string objects in the contiguous buffer. But you are right, this
does not mean the character data of all those strings were contiguous or
even in order in memory, typically they would be all scattered around.

If the string implementation uses small string optimization and all
strings are small enough, then the character arrays would reside
directly inside the std::string objects, making them to be in order and
somewhat "contiguous" in memory, but not quite.

Regardless of implementation details it is physically impossible to have
the character data of more than one std::string contiguous in memory,
because the character array of each string has to be followed by a zero
byte (to satisfy the const c_str() member function requirements), but
this zero byte is not considered to be a part of the character data of
the string. So the character arrays of two different strings cannot be
contiguous, there must be at least one zero byte between them.


Soviet_Mario

unread,
Oct 19, 2019, 7:24:25 PM10/19/19
to
On 19/10/2019 17:02, Paavo Helde wrote:
> On 19.10.2019 12:16, Soviet_Mario wrote:
>> On 18/10/2019 17:21, Paavo Helde wrote:
>>> On 18.10.2019 12:50, Frederick Gotham wrote:
>>>>
>>>> I read this today:
>>>> "Vector is guaranteed to store all elements in
>>>> contiguous locations"
>>>>
>>>> Is this true?
>>>
>>> Yes, this has been true formally since 2003. For strings
>>> a similar
>>> guarantee was introduced in C++11.
>>
>> but strings means the "descriptors", not the actual text
>> data, right ? I
>> mean, there is a further indirection before accessing the
>> content,
>> comparing to native types, or am I wrong ? If so, even if
>> the array of
>> pointers is monolythic, the true text data of each item
>> should still be
>> "sparse", or not ?
>
> I was comparing std::string to std::vector<char>, not to
> std::vector<std::string>.

oh ! Now I got the point.
I had always taken for granted that the "content" (of a
std::string) could not be broken in slices spread over and
linked somewhat.

for vector<char> I did not know, but later I read the same
features stated here in the 3D


> The contiguousness guarantee for a
> std::string formalized in 2005 or 2011 just means that &s[0]
> + i == &s[i] inside a single string.
>
> Regarding std::vector<std::string>, the std::string objects
> in a std::vector are still contiguous, i.e. you can fetch
> out a std::string* pointer via data() and just use pointer
> arithmetic to access other std::string objects in the
> contiguous buffer. But you are right, this does not mean the
> character data of all those strings were contiguous or even
> in order in memory, typically they would be all scattered
> around.

Okay. It seemed a too memory restrictive approach

>
> If the string implementation uses small string optimization
> and all strings are small enough, then the character arrays
> would reside directly inside the std::string objects, making
> them to be in order and somewhat "contiguous" in memory, but
> not quite.
>
> Regardless of implementation details it is physically
> impossible to have the character data of more than one
> std::string contiguous in memory, because the character
> array of each string has to be followed by a zero byte (to
> satisfy the const c_str() member function requirements), but
> this zero byte is not considered to be a part of the
> character data of the string. So the character arrays of two
> different strings cannot be contiguous, there must be at
> least one zero byte between them.

ah, nice ! So inkoking c_str() does not imply any copying !
Faster, for sure.

in Basic (gambas) passing strings to C routines alas implies
copying (as a basic string is zero-indifferent and stores
pointer and size).

Paavo Helde

unread,
Oct 20, 2019, 3:29:44 AM10/20/19
to
On 20.10.2019 2:24, Soviet_Mario wrote:
> ah, nice ! So inkoking c_str() does not imply any copying ! Faster, for
> sure.

Yes, see e.g. "https://en.cppreference.com/w/cpp/string/basic_string/c_str":

"c_str() and data() perform the same function. (since C++11)"

Initially the language designers wanted to leave room for clever
optimizations where c_str() indeed might have involved some copying and
zero terminator appending, but this did not quite work out. The rules
about when iterators and pointers might get invalidated were getting
unwieldy and with the raise of multithreading era optimizations like COW
started to appear more like pessimizations, so in C++11 they decided to
go the other way, making the invalidation rules simple and unsurprising.
The main change was the requirement that no call of operator[]() may
invalidate any iterators, references and pointers obtained earlier via
e.g. data() or c_str().

This effectively restricted any possible basic_string implementations to
always use a simple unshared contiguous zero-terminated character array.
And thus data() and c_str() became the same noexcept trivial operations.

One can argue that the interface of basic_string is just too broad and
diffuse to allow for certain kind of optimizations like discontiguous
data buffers. There is a non-standard extension std::rope provided by
some implementations to solve such issues, but this is not in the
official C++ standard (I guess it's considered too specialized).


Öö Tiib

unread,
Oct 20, 2019, 5:30:48 AM10/20/19
to
On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
>
> One can argue that the interface of basic_string is just too broad and
> diffuse to allow for certain kind of optimizations like discontiguous
> data buffers. There is a non-standard extension std::rope provided by
> some implementations to solve such issues, but this is not in the
> official C++ standard (I guess it's considered too specialized).

AFAIK the rope was in STL of SGI (but was not in STL of HP) already
before C++ standardization. It wasn't incorporated into C++ standard
library.

Both the name "STL" and usage of "std" namespace by those libraries
has caused confusion forever.
With boost it is lot easier, the differences for example between
boost::unordered_set and std::unordered_set are lot simpler to
discuss than the differences between "std::"hash_set of SGI STL
and std::unordered_set.


Melzzzzz

unread,
Oct 20, 2019, 6:46:01 AM10/20/19
to
On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
> On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
>>
>> One can argue that the interface of basic_string is just too broad and
>> diffuse to allow for certain kind of optimizations like discontiguous
>> data buffers. There is a non-standard extension std::rope provided by
>> some implementations to solve such issues, but this is not in the
>> official C++ standard (I guess it's considered too specialized).
>
> AFAIK the rope was in STL of SGI (but was not in STL of HP) already
> before C++ standardization. It wasn't incorporated into C++ standard
> library.

Why? What was said against it?



--
press any key to continue or any other to quit...
U ničemu ja ne uživam kao u svom statusu INVALIDA -- Zli Zec
Na divljem zapadu i nije bilo tako puno nasilja, upravo zato jer su svi
bili naoruzani. -- Mladen Gogala

Öö Tiib

unread,
Oct 20, 2019, 2:32:36 PM10/20/19
to
On Sunday, 20 October 2019 13:46:01 UTC+3, Melzzzzz wrote:
> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
> > On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
> >>
> >> One can argue that the interface of basic_string is just too broad and
> >> diffuse to allow for certain kind of optimizations like discontiguous
> >> data buffers. There is a non-standard extension std::rope provided by
> >> some implementations to solve such issues, but this is not in the
> >> official C++ standard (I guess it's considered too specialized).
> >
> > AFAIK the rope was in STL of SGI (but was not in STL of HP) already
> > before C++ standardization. It wasn't incorporated into C++ standard
> > library.
>
> Why? What was said against it?

I have not seen any relevant WG21 discussion notes on strings and ropes.

The templates themselves were quite revolutionary back then and so heavily
opposed. The EC++ for example kicked templates out despite it is clear how
good are templates in embedded development. Committee's focus was on standardizing what was most urgently needed and most ready
to ship. Anyway some weak things slipped in (like valarray or
vector<bool>) but overall it was great that C++ got standardized with
library.

Melzzzzz

unread,
Oct 20, 2019, 2:45:02 PM10/20/19
to
On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
> On Sunday, 20 October 2019 13:46:01 UTC+3, Melzzzzz wrote:
>> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
>> > On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
>> >>
>> >> One can argue that the interface of basic_string is just too broad and
>> >> diffuse to allow for certain kind of optimizations like discontiguous
>> >> data buffers. There is a non-standard extension std::rope provided by
>> >> some implementations to solve such issues, but this is not in the
>> >> official C++ standard (I guess it's considered too specialized).
>> >
>> > AFAIK the rope was in STL of SGI (but was not in STL of HP) already
>> > before C++ standardization. It wasn't incorporated into C++ standard
>> > library.
>>
>> Why? What was said against it?
>
> I have not seen any relevant WG21 discussion notes on strings and ropes.
>
> The templates themselves were quite revolutionary back then and so heavily
> opposed. The EC++ for example kicked templates out despite it is clear how
> good are templates in embedded development.

In that time embedded computers where really limiting, so no wander. Now
embedded devices have 512MB+...

Committee's focus was on standardizing what was most urgently needed and most ready
> to ship. Anyway some weak things slipped in (like valarray or
> vector<bool>) but overall it was great that C++ got standardized with
> library.

So how we got queue and deque but not rope? Strange....

Öö Tiib

unread,
Oct 20, 2019, 4:24:21 PM10/20/19
to
On Sunday, 20 October 2019 21:45:02 UTC+3, Melzzzzz wrote:
> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
> > On Sunday, 20 October 2019 13:46:01 UTC+3, Melzzzzz wrote:
> >> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
> >> > On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
> >> >>
> >> >> One can argue that the interface of basic_string is just too broad and
> >> >> diffuse to allow for certain kind of optimizations like discontiguous
> >> >> data buffers. There is a non-standard extension std::rope provided by
> >> >> some implementations to solve such issues, but this is not in the
> >> >> official C++ standard (I guess it's considered too specialized).
> >> >
> >> > AFAIK the rope was in STL of SGI (but was not in STL of HP) already
> >> > before C++ standardization. It wasn't incorporated into C++ standard
> >> > library.
> >>
> >> Why? What was said against it?
> >
> > I have not seen any relevant WG21 discussion notes on strings and ropes.
> >
> > The templates themselves were quite revolutionary back then and so heavily
> > opposed. The EC++ for example kicked templates out despite it is clear how
> > good are templates in embedded development.
>
> In that time embedded computers where really limiting, so no wander. Now
> embedded devices have 512MB+...

So what? I think the primary issue with opponents was (and still is) that
they consider templates something space consuming and inefficient.
Something with shortcomings of containers of Simula or Smalltalk or
with generics of Ada. However that is not the case. Templates of
C++ tend let compilers to generate as time- and space-efficient code
as best specialist will do by hand.


> Committee's focus was on standardizing what was most urgently needed and most ready
> > to ship. Anyway some weak things slipped in (like valarray or
> > vector<bool>) but overall it was great that C++ got standardized with
> > library.
>
> So how we got queue and deque but not rope? Strange....

Back then it felt good that we at least got more-or-less
passable standard string. Otherwise each library had its own astring,
bstring, cstring, ... qstring ... wxstring.

.

Paavo Helde

unread,
Oct 20, 2019, 4:33:26 PM10/20/19
to
On 20.10.2019 21:44, Melzzzzz wrote:
> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
>> On Sunday, 20 October 2019 13:46:01 UTC+3, Melzzzzz wrote:
>>> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
>>>> On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
>>>>>
>>>>> One can argue that the interface of basic_string is just too broad and
>>>>> diffuse to allow for certain kind of optimizations like discontiguous
>>>>> data buffers. There is a non-standard extension std::rope provided by
>>>>> some implementations to solve such issues, but this is not in the
>>>>> official C++ standard (I guess it's considered too specialized).
>>>>
>>>> AFAIK the rope was in STL of SGI (but was not in STL of HP) already
>>>> before C++ standardization. It wasn't incorporated into C++ standard
>>>> library.
>>>
>>> Why? What was said against it?
>>
>> I have not seen any relevant WG21 discussion notes on strings and ropes.
>>
>> The templates themselves were quite revolutionary back then and so heavily
>> opposed. The EC++ for example kicked templates out despite it is clear how
>> good are templates in embedded development.
>
> In that time embedded computers where really limiting, so no wander. Now
> embedded devices have 512MB+...

Templates take a lot of resources only in the compiler, not in the
target environment.

It is true that one can instantiate a template for 10 different types
and get 10x code size with bad optimizers, but if the embedded
developers do not realize how many different templates they instantiate
they do not deserve their paychecks.

>
> Committee's focus was on standardizing what was most urgently needed and most ready
>> to ship. Anyway some weak things slipped in (like valarray or
>> vector<bool>) but overall it was great that C++ got standardized with
>> library.
>
> So how we got queue and deque but not rope? Strange....

Deque is a very fine container, compared to some other like list which
only got standardized because of their name while having almost no usage
cases.


Melzzzzz

unread,
Oct 20, 2019, 5:16:15 PM10/20/19
to
I think that primary reason for ditching templates is that "write once
get errors everywhere" was main problem with them.

>
>
>> Committee's focus was on standardizing what was most urgently needed and most ready
>> > to ship. Anyway some weak things slipped in (like valarray or
>> > vector<bool>) but overall it was great that C++ got standardized with
>> > library.
>>
>> So how we got queue and deque but not rope? Strange....
>
> Back then it felt good that we at least got more-or-less
> passable standard string. Otherwise each library had its own astring,
> bstring, cstring, ... qstring ... wxstring.

Yeah, that's really annoying, having to use this string with that
library....

Melzzzzz

unread,
Oct 20, 2019, 5:18:21 PM10/20/19
to
This is case now. Not when EC++ was postulated. Other thing is that
exceptions were also out of question...

David Brown

unread,
Oct 20, 2019, 6:21:53 PM10/20/19
to
On 20/10/2019 23:18, Melzzzzz wrote:
> On 2019-10-20, Paavo Helde <myfir...@osa.pri.ee> wrote:
>> On 20.10.2019 21:44, Melzzzzz wrote:
>>> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
>>>> On Sunday, 20 October 2019 13:46:01 UTC+3, Melzzzzz wrote:
>>>>> On 2019-10-20, Öö Tiib <oot...@hot.ee> wrote:
>>>>>> On Sunday, 20 October 2019 10:29:44 UTC+3, Paavo Helde wrote:
>>>>>>>
>>>>>>> One can argue that the interface of basic_string is just too broad and
>>>>>>> diffuse to allow for certain kind of optimizations like discontiguous
>>>>>>> data buffers. There is a non-standard extension std::rope provided by
>>>>>>> some implementations to solve such issues, but this is not in the
>>>>>>> official C++ standard (I guess it's considered too specialized).
>>>>>>
>>>>>> AFAIK the rope was in STL of SGI (but was not in STL of HP) already
>>>>>> before C++ standardization. It wasn't incorporated into C++ standard
>>>>>> library.
>>>>>
>>>>> Why? What was said against it?
>>>>
>>>> I have not seen any relevant WG21 discussion notes on strings and ropes.
>>>>
>>>> The templates themselves were quite revolutionary back then and so heavily
>>>> opposed. The EC++ for example kicked templates out despite it is clear how
>>>> good are templates in embedded development.
>>>
>>> In that time embedded computers where really limiting, so no wander. Now
>>> embedded devices have 512MB+...
>>
>> Templates take a lot of resources only in the compiler, not in the
>> target environment.

There are many things about EC++ that made no sense at all. It also
banned namespaces - which take no resources on the target, and
negligible resources on the compiler end. It is not a surprise that it
never really caught on, and is now completely dead.

>>
>> It is true that one can instantiate a template for 10 different types
>> and get 10x code size with bad optimizers, but if the embedded
>> developers do not realize how many different templates they instantiate
>> they do not deserve their paychecks.

Modern tools will often reduce this extra code by combining identical
functions - though this was not the case when EC++ was thrown together.

>
> This is case now. Not when EC++ was postulated. Other thing is that
> exceptions were also out of question...
>

EC++ banned exceptions, and also multiple inheritance - many modern C++
standards (especially for embedded programming) do that. There are
arguments for and against this (unlike many of EC++'s restrictions, for
which there are no good justifications).
0 new messages