On 15.10.2018 22:22,
james...@alumni.caltech.edu wrote:
> On Monday, October 15, 2018 at 3:21:46 PM UTC-4, Paul wrote:
>> I think that, when s is a std::string,
>> s[s.size()] == 0; However, I haven't been able to confirm it.
>> It works like that in my compiler.
>>
>> Is this correct?
>>
>
> [Re: std::basic_string<charT> s; s[pos]]
>
> "Returns: *(begin() + pos) if pos < size(). Otherwise, returns a
> reference to an object of type charT with value charT(), where modifying
> the object to any value other than charT() leads to undefined behavior."
> (20.3.2.5p2)
>
> charT defaults to char, so for typical uses charT() is essentially the
> same as 0. That will also be true, except for the type of the result,
> for the other three types that the standard templates are required to be
> specialized for: char16_t, char32_t, and wchar_t.

Worth noting for those using older C++03 compilers: until C++11 this
guarantee of well-defined behavior was offered only for the `const`
accessor.

C++03 §21.3.4/1:
“Returns: If pos < size(), returns data()[pos]. Otherwise, if pos ==
size(), the const version returns charT(). Otherwise, the behavior is
undefined.”
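
To make the difference concrete, here is a minimal sketch (the two helper names are my own, purely for illustration). Both functions only read the element at index size(); under the C++03 wording quoted above only the first one is well-defined, while from C++11 on both are.

```cpp
#include <cassert>
#include <string>

// Reads s[s.size()] through the const accessor.
// Well-defined in C++03 and later: yields charT(), i.e. '\0' for char.
inline char end_element_const(const std::string& s) {
    return s[s.size()];
}

// Reads s[s.size()] through the non-const accessor.
// Guaranteed only since C++11; undefined behavior under the C++03 wording.
inline char end_element_mutable(std::string& s) {
    // Read only: storing anything other than '\0' here is UB even in C++11.
    return s[s.size()];
}
```

With a C++11 or later compiler, both calls yield '\0' for any string, including an empty one.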

The C++11 guarantees are cleaner. But C++03 supported efficient
reference counted string buffers, called “COW” strings (copy on write),
and used by g++'s standard library implementation, by regarding pointers
and references as invalidated by calls to `data` or `c_str`, and by the
first call to non-`const` `operator[]`, `at`, `begin`, `rbegin`, `end`
and `rend`. In C++11 and later these operations don't invalidate, i.e.
C++11 std::string is easier to use correctly, and furthermore indexing
is required to be O(1), which rules out a COW buffer copying on first
access. C++17 `std::string_view` indexing is correspondingly designed to
return a reference to constant `char` instead of the `char` itself,
relying on the compiler to optimize away that indirection overhead.
What's paid for: that COW strings won't be re-introduced.
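
A small sketch of what the C++11 non-invalidation rule permits (the function name is hypothetical, and this assumes a C++11-conforming library). It holds a reference to the first character across exactly the calls that C++03 allowed to invalidate it, then writes through that reference. With a C++03 COW implementation the first non-`const` access could unshare the buffer and leave the earlier reference dangling.

```cpp
#include <cassert>
#include <string>

// Takes a reference to the first character, calls the members that C++03
// permitted to invalidate references, then writes through the reference.
// Returns true if the write is visible in the string afterwards.
inline bool reference_survives_accessors(std::string& s) {
    assert(!s.empty());
    char& first = s[0];   // reference obtained before the calls below

    (void)s.data();       // none of these calls invalidate `first`
    (void)s.c_str();      // in C++11 and later
    (void)s.begin();
    (void)s.end();
    (void)s.rbegin();
    (void)s.rend();
    (void)s.at(0);
    (void)s[0];

    first = 'X';          // well-defined in C++11: `first` is still valid
    return s[0] == 'X';
}
```

Calling it on a string holding "hello" should leave the string as "Xello" and return true.
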
My impression is that the decision to rule out COW strings was caused by
a bug in the g++ implementation. Rather than admit that it was a bug,
g++ supporters took the position that a safe implementation was just
impossible, and so, implicitly, that the buggy one was a practical
trade-off, the best that one could do. They got downright nasty
about the possibility of implementing COW correctly if the standard
didn't stand actively in the way. As I recall (this was years ago) I
supplied proof in the form of an actual implementation, to no avail.
Hardcore fans of this or that are usually immune to facts and logic, and
I've encountered that a great many times so I should have known, but.
Cheers!,
- Alf