A contradiction in the description of basic

Vlad from Moscow

Apr 17, 2013, 2:00:08 PM

to std-dis...@isocpp.org

Let us start with the description of operator[] of class std::basic_string ([string.access]). The description states only one requirement: pos <= size().

If pos is strictly less than size(), the operators return *(begin() + pos). Otherwise, they return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior" (a quotation from the C++ Standard).

So using these operators does not require the std::basic_string object to be non-empty.

Now let us consider the description of the member functions front(). It says that the effect of calling them is equivalent to operator[](0).

What does this mean? It means that if an object is empty, the functions will return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior".

It seems that all is OK. However, that is not the case. The description of the member functions front() requires that the condition !empty() be satisfied.

So this code

std::string s;

if ( s[0] ) { /*...*/ }

is valid while this code

std::string s;

if ( s.front() ) { /*...*/ }

is wrong.
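To make the difference concrete, here is a minimal sketch, assuming a C++11 library. The helper names `first_char_or_nul` and `first_char_is_nonzero` are hypothetical and not part of the standard. The first only reads `s[0]`, which [string.access] permits even when `s` is empty; the second has to test `empty()` before it may call `front()`.

```cpp
#include <string>

// Hypothetical helper: reads the first character, or charT() for an empty
// string. s[0] is well defined here because pos == size() is allowed.
inline char first_char_or_nul(const std::string& s)
{
    return s[0];
}

// Hypothetical helper: front() may only be called when !s.empty() holds,
// so the emptiness check has to come first.
inline bool first_char_is_nonzero(const std::string& s)
{
    return !s.empty() && s.front() != '\0';
}
```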

The situation is worse with the member functions back(). They are equivalent to operator[](size() - 1). Take note of the expression in the parentheses.

Now let us consider a simple task: check whether the first and the last characters in a string are equal.

If the string is empty you can, on the one hand, write s[0] to denote the first character. But what about the last character? Can you write s[s.size() - 1]? No, you cannot, because if the string 's' is empty this expression is equivalent to s[-1].
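A short sketch of why that index is out of range. `size()` returns the unsigned `size_type`, so for an empty string `size() - 1` does not literally produce -1 but wraps around to the largest `size_type` value, which equals `std::string::npos`. The function name `last_index` is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: computes the index that s[s.size() - 1] would use.
// For an empty string the unsigned subtraction wraps around to npos,
// which is far outside the permitted range pos <= size().
inline std::string::size_type last_index(const std::string& s)
{
    return s.size() - 1;
}
```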

So you need to make the code more complex. You need to check whether the string is empty in order to know whether you may directly compare the first character with the last.

For example you can write

if ( s.empty() || s[0] == s[s.size() - 1] ) { /*...*/ }

This is not very readable. It would be much simpler to write

if ( s.front() == s.back() ) { /*...*/ }

This statement is very clear. But you may not write it that way, because the string can be empty and according to the standard you will again have

if ( s.front() == s[s.size() - 1] ) { /*...*/}

because s.back() is equivalent to operator[](size() - 1).

I do not see much sense in this requirement of the C++ Standard. Moreover, it looks like a defect. In my opinion both functions, front() and back(), should indeed be equivalent to operator[]; that is, for example, back() should be equivalent to "operator[](size() - 1)" if the condition !empty() is true. "Otherwise, return a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior".

So I suggest removing the requirement !empty() from the descriptions of front() and back(). This requirement only confuses users and prevents writing generic, clear and readable code.

operator[](0), front() and back() shall behave the same way when an object of type std::basic_string is empty.
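The requested semantics can be sketched as two free functions. The names `front_or_nul` and `back_or_nul` are hypothetical; they are not part of std::basic_string and only illustrate the behavior being asked for, namely that both calls yield charT() for an empty string.

```cpp
#include <string>

// Hypothetical: what front() would return if the !empty() requirement
// were dropped. pos == size() is allowed for operator[] and yields CharT().
template <class CharT, class Traits, class Alloc>
CharT front_or_nul(const std::basic_string<CharT, Traits, Alloc>& s)
{
    return s[0];
}

// Hypothetical: what back() would return under the same proposal.
// The empty case needs its own branch because size() - 1 is not a valid index.
template <class CharT, class Traits, class Alloc>
CharT back_or_nul(const std::basic_string<CharT, Traits, Alloc>& s)
{
    return s.empty() ? CharT() : s[s.size() - 1];
}
```

Note that `back_or_nul` cannot be expressed as a plain `operator[](size() - 1)`; it needs a special case for the empty string.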

Ville Voutilainen

Apr 17, 2013, 2:07:17 PM

to std-dis...@isocpp.org

On 17 April 2013 21:00, Vlad from Moscow <vlad....@mail.ru> wrote:

Let us start with the description of operator[] of class std::basic_string ([string.access]). The description states only one requirement: pos <= size().

If pos is strictly less than size(), the operators return *(begin() + pos). Otherwise, they return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior" (a quotation from the C++ Standard).

So using these operators does not require the std::basic_string object to be non-empty.

Correct.

Now let us consider the description of the member functions front(). It says that the effect of calling them is equivalent to operator[](0).

What does this mean? It means that if an object is empty, the functions will return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior".

That's not what it says.

It seems that all is OK. However, that is not the case. The description of the member functions front() requires that the condition !empty() be satisfied.

Yes, which means it's not equivalent to operator[](0) if the precondition isn't satisfied. Nothing to see here, move along.

So this code

std::string s;
if ( s[0] ) { /*...*/ }

is valid while this code

std::string s;
if ( s.front() ) { /*...*/ }

is wrong.

The situation is worse with the member functions back(). They are equivalent to operator[](size() - 1). Take note of the expression in the parentheses.

It's equivalent only if the precondition !empty() is satisfied. Otherwise it isn't.

So I suggest removing the requirement !empty() from the descriptions of front() and back(). This requirement only confuses users and prevents writing generic, clear and readable code.
operator[](0), front() and back() shall behave the same way when an object of type std::basic_string is empty.

That would mean that for an empty string, front()/back() return a character which is not even part of the string. While that's just
fine for operator[], it's not fine for front/back.

Vlad from Moscow

Apr 17, 2013, 2:13:15 PM

to std-dis...@isocpp.org

Yes, you are right. And there is no sense in keeping this contradiction. All free kinds of functions, operator[](0), back(), front(), shall behave the same way whether a string is empty or not.

Vlad from Moscow

Apr 17, 2013, 2:14:36 PM

to std-dis...@isocpp.org

I am sorry, I made a typo. It should read:

All three kinds of functions, operator[](0), back(), front(), shall behave the same way whether a string is empty or not.

Nevin Liber

Apr 17, 2013, 2:18:49 PM

to std-dis...@isocpp.org

On 17 April 2013 19:13, Vlad from Moscow <vlad....@mail.ru> wrote:

Yes, you are right. And there is no sense in keeping this contradiction. All free kinds of functions, operator[](0), back(), front(), shall behave the same way whether a string is empty or not.

Even if you wish to remove it from front() (and I agree with Ville that we shouldn't for the reasons he stated), you cannot remove it from back(), as the back of a string is not and has never been the nul-terminator.

--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> (847) 691-1404

Vlad from Moscow

Apr 17, 2013, 2:50:02 PM

to std-dis...@isocpp.org

I am sorry, but I do not see the argument. You are saying that the back of a string "is not and has never been the nul-terminator." So what?

I think that for an empty string there shall be an invariant: operator[](0) == front() == back(). This will allow writing generic code.

Consider the very simple task of finding a pair of adjacent elements in a container of strings where the last character of the previous string is equal to the first character of the next string. In effect you are using the condition

prev.back() == next.front()

But you may not use this condition. You can change it in the following way

prev.back() == next[0]

Now the right operand is a valid expression. But what to do with the left operand?

Take into account also that either the left operand or the right operand or both can be empty. So you have to make the condition more complex and test each string for emptiness. Why should you do all this? I do not see any sense in it. It would be much simpler and clearer to write the original expression

prev.back() == next.front()

and not worry that the strings can be empty.

Nicol Bolas

Apr 17, 2013, 8:33:31 PM

to std-dis...@isocpp.org

On Wednesday, April 17, 2013 11:50:02 AM UTC-7, Vlad from Moscow wrote:

On Wednesday, April 17, 2013 10:18:49 PM UTC+4, Nevin ":-)" Liber wrote:
On 17 April 2013 19:13, Vlad from Moscow <vlad....@mail.ru> wrote:

Yes, you are right. And there is no sense in keeping this contradiction. All free kinds of functions, operator[](0), back(), front(), shall behave the same way whether a string is empty or not.

Even if you wish to remove it from front() (and I agree with Ville that we shouldn't for the reasons he stated), you cannot remove it from back(), as the back of a string is not and has never been the nul-terminator.

--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> (847) 691-1404

I am sorry, but I do not see the argument. You are saying that the back of a string "is not and has never been the nul-terminator." So what?
I think that for an empty string there shall be an invariant: operator[](0) == front() == back(). This will allow writing generic code.
Consider the very simple task of finding a pair of adjacent elements in a container of strings where the last character of the previous string is equal to the first character of the next string. In effect you are using the condition

prev.back() == next.front()

But you may not use this condition.

Um, why not? If this container could contain empty strings, then such a test is very much wrong. Not unless you filter out empty strings and do something different for them.

You can change it in the following way

prev.back() == next[0]

Now the right operand is a valid expression. But what to do with the left operand?

Take into account also that either the left operand or the right operand or both can be empty. So you have to make the condition more complex and test each string for emptiness. Why should you do all this? I do not see any sense in it. It would be much simpler and clearer to write the original expression

prev.back() == next.front()

and not worry that the strings can be empty.

If the code worked the way you wanted, you would get the wrong answer. If you had two empty strings next to each other, they would be considered to match this test. That is wrong, because two empty strings are empty; they contain no characters. Therefore, they cannot fit your description: "the last character of the previous string shall be equal to the first character of the next string".

Indeed, this is precisely why we don't do what you're asking for. It would give the false impression that `front` and `back` always return values no matter the size of the string. It would give the false impression that `begin` and `end` always return valid iterators. And so forth. People such as yourself will start treating the NULL-terminator as part of the string. Which it is not.
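A sketch of the point about adjacent empty strings, assuming the requested behavior is emulated with `s[0]` and a guarded index (the predicate names `links_proposed` and `links_checked` are invented for this example). With the emulated behavior, two neighboring empty strings compare equal on the terminator value; with an explicit emptiness check they do not.

```cpp
#include <string>

// Emulates the requested semantics: an empty string contributes charT().
inline bool links_proposed(const std::string& prev, const std::string& next)
{
    const char last  = prev.empty() ? '\0' : prev[prev.size() - 1];
    const char first = next[0];  // charT() when next is empty
    return last == first;
}

// Current semantics: only non-empty strings have a first or last character,
// so both strings are checked before front() and back() are called.
inline bool links_checked(const std::string& prev, const std::string& next)
{
    return !prev.empty() && !next.empty() && prev.back() == next.front();
}
```

For the pair `""`, `""` the first predicate reports a match and the second does not, which is exactly the disagreement in this exchange.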

Vlad from Moscow

Apr 18, 2013, 4:07:58 AM

to std-dis...@isocpp.org

On Thursday, April 18, 2013 4:33:31 AM UTC+4, Nicol Bolas wrote:

On Wednesday, April 17, 2013 11:50:02 AM UTC-7, Vlad from Moscow wrote:

On Wednesday, April 17, 2013 10:18:49 PM UTC+4, Nevin ":-)" Liber wrote:
On 17 April 2013 19:13, Vlad from Moscow <vlad....@mail.ru> wrote:

Yes, you are right. And there is no sense in keeping this contradiction. All free kinds of functions, operator[](0), back(), front(), shall behave the same way whether a string is empty or not.

Even if you wish to remove it from front() (and I agree with Ville that we shouldn't for the reasons he stated), you cannot remove it from back(), as the back of a string is not and has never been the nul-terminator.

--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> (847) 691-1404

I am sorry, but I do not see the argument. You are saying that the back of a string "is not and has never been the nul-terminator." So what?
I think that for an empty string there shall be an invariant: operator[](0) == front() == back(). This will allow writing generic code.
Consider the very simple task of finding a pair of adjacent elements in a container of strings where the last character of the previous string is equal to the first character of the next string. In effect you are using the condition

prev.back() == next.front()

But you may not use this condition.

Um, why not? If this container could contain empty strings, then such a test is very much wrong. Not unless you filter out empty strings and do something different for them.

You can change it in the following way

prev.back() == next[0]

Now the right operand is a valid expression. But what to do with the left operand?

Take into account also that either the left operand or the right operand or both can be empty. So you have to make the condition more complex and test each string for emptiness. Why should you do all this? I do not see any sense in it. It would be much simpler and clearer to write the original expression

prev.back() == next.front()

and not worry that the strings can be empty.

If the code worked the way you wanted, you would get the wrong answer. If you had two empty strings next to each other, they would be considered to match this test. That is wrong, because two empty strings are empty; they contain no characters. Therefore, they cannot fit your description: "the last character of the previous string shall be equal to the first character of the next string".

It is not wrong. It is predictable behaviour. If you do not want to find adjacent empty strings in this task, you can write your own compound condition. It is a matter of taste.

Indeed, this is precisely why we don't do what you're asking for. It would give the false impression that `front` and `back` always return values no matter the size of the string. It would give the false impression that `begin` and `end` always return valid iterators. And so forth. People such as yourself will start treating the NULL-terminator as part of the string. Which it is not.

You already gave the false impression that operator[](0) always returns a value no matter whether an object is empty.

The problem is deeper than you think. We already have a contradiction: operator[](0) returns a value while front() does not, though front() is described in terms of operator[]. And I have not seen any reasonable argument against removing the artificial and illogical restriction on front() and back() for empty strings. As I said, it only prevents writing generic, safe code. In fact you may not use front() and back() without an additional check. Everywhere you use front() and back() you have to insert something like

s.size() == 0 ? s[0] : s.front();

or

s.size() == 0 ? s[0] : s.back();

So if I need to compare the first character with the last character, the expression will look something like

if ( s[0] == ( s.size() == 0 ? s[0] : s.back() ) )

But even this expression is not satisfactory because, according to your wrong logic, if size() == 0 then I may not compare operator[](0) with operator[](0).

Take into account that you may not even write s.front() instead of s[0] as the left operand of the condition. :)

Olaf van der Spek

Apr 18, 2013, 6:00:44 AM

to std-dis...@isocpp.org

On Thursday, 18 April 2013 10:07:58 UTC+2, Vlad from Moscow wrote:

So if I need to compare the first character with the last character, the expression will look something like

if ( s[0] == ( s.size() == 0 ? s[0] : s.back() ) )

But even this expression is not satisfactory because, according to your wrong logic, if size() == 0 then I may not compare operator[](0) with operator[](0).

Take into account that you may not even write s.front() instead of s[0] as the left operand of the condition. :)

It's quite simple and works for all containers: if (s.empty() || s.front() == s.back())

Or if (!s.empty() && s.front() == s.back())
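Both checks can be written once for any sequence container. A minimal sketch; the function names are hypothetical, and which of the two a caller wants depends on whether an empty sequence should count as a match.

```cpp
#include <string>
#include <vector>

// Treats an empty sequence as a match.
template <class Sequence>
bool empty_or_ends_equal(const Sequence& s)
{
    return s.empty() || s.front() == s.back();
}

// Treats an empty sequence as a non-match.
template <class Sequence>
bool nonempty_and_ends_equal(const Sequence& s)
{
    return !s.empty() && s.front() == s.back();
}
```

The same templates accept `std::string`, `std::vector<int>` and other sequence containers, because front() and back() are only reached when empty() is false.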

Vlad from Moscow

Apr 18, 2013, 6:44:30 AM

to std-dis...@isocpp.org

I simplified the original example. The condition becomes more complex if you compare the last character of the previous string with the first character of the next string.

Something like

if ( ( prev.empty() && next.empty() ) || ( !prev.empty() && !next.empty() && prev.back() == next.front() ) )

It is awful code.

The problem is that s[0] and s.front() are not equivalent. In many cases I would prefer to use s.front() instead of s[0], and s.back() instead of !s.empty() && s.back(), but these two forms are not equivalent. It only confuses users. In my opinion the behavior of s[0], s.front() and s.back() for empty strings shall be the same.

You should take into account that std::basic_string is a special kind of container. It simulates character arrays.

Olaf van der Spek

Apr 18, 2013, 6:47:58 AM

to std-dis...@isocpp.org

The point is that an empty string does NOT have any characters. It
doesn't have a first character (front), it doesn't have a last
character (back).

s[0] for an empty string refers to the terminator.

--
Olaf

Nevin Liber

Apr 18, 2013, 6:53:34 AM

to std-dis...@isocpp.org

On 18 April 2013 11:44, Vlad from Moscow <vlad....@mail.ru> wrote:

The problem is that s[0] and s.front() are not equivalent. In many cases I would prefer to use s.front() instead of s[0], and s.back() instead of !s.empty() && s.back(), but these two forms are not equivalent. It only confuses users.

Not fitting your mental model is not a bug in the standard. front(), back(), empty() and operator[] are consistent for all the sequence containers for the range [0..size()). The only extension for the model behind string is we have to nul-terminate it, as the underlying data needs to get passed to C APIs.

You should take into account that std::basic_string is a special kind of container. It simulates character arrays.

At this point, no one who has commented agrees with you. If you still feel strongly about this issue, your best bet is to write a proposal (since this is a feature request, not a bug) and come and present it at the Chicago meeting.

Unless new information is presented, I'm bowing out of this discussion.

Vlad from Moscow

Apr 18, 2013, 7:00:14 AM

to std-dis...@isocpp.org

I agree with you, but I am sure that front() and back() shall behave the same way as s[0], that is, they shall refer to the terminator for an empty string. That would be logically more consistent and would allow writing generic code without compound conditions.

For example, for ordinary character arrays you can also introduce a function front:

template <size_t N>

inline char front( char ( *s )[N] ) { return ( *s )[0]; }

It would be awful to introduce an exception in such a function when s is empty.

Olaf van der Spek

Apr 18, 2013, 7:02:40 AM

to std-dis...@isocpp.org

On Thu, Apr 18, 2013 at 1:00 PM, Vlad from Moscow <vlad....@mail.ru> wrote:
> I agree with you but I am sure that front() and back() shall behave the same
> way as s[0] that is they shall refer to the terminator for an empty string.

I'm quite sure they shall not. :p

> In this case it would be logically more consistent and allow to write
> generic code without some compound conditions.

You should stop abusing the string terminator.

> For example for usual character arrays you can also introduce function
> front. For example
>
> template <size_t N>
> inline char front( char ( *s )[N] ) { return ( *s )[0]; }
>
> It will be awful to introduce an exception in such a function when s is
> empty.

Why?

--
Olaf

Vlad from Moscow

Apr 18, 2013, 7:05:48 AM

to std-dis...@isocpp.org

You contradict yourself. You already said that "The only extension for the model behind string is we have to nul-terminate it, as the underlying data needs to get passed to C APIs." That is, std::basic_string is a special container with an extension or, in other words, with an exception. If s[0] is an exception, why cannot front() and back() be exceptions from this model too and be consistent with s[0]?

Vlad from Moscow

Apr 18, 2013, 7:08:56 AM

to std-dis...@isocpp.org

Because this contradicts the whole model of using C-string functions. If you want to be sure that s is not NULL or that s is not empty, you can check this before calling the function. You do not check inside strcpy, for example, whether the destination and source are NULL, do you?

Olaf van der Spek

Apr 18, 2013, 7:33:59 AM

to std-dis...@isocpp.org

On Thu, Apr 18, 2013 at 1:08 PM, Vlad from Moscow <vlad....@mail.ru> wrote:
> Because this contradicts the whole model of using C-string functions. If

You should be using the C++ sequences model, not the C strings model.
--
Olaf

Vlad from Moscow

Apr 18, 2013, 2:11:41 PM

to std-dis...@isocpp.org

To make it clearer, let us consider a simple task. There is some sequence of words, among which empty words can be present, and we need to build a second sequence that contains pairs of the first and the last characters of each word.

So how would these characters be obtained?

The natural way is to use the member functions front() and back(). However, the current C++ standard makes their use in this situation impossible because some words can be empty.

So instead of front() we have to use s[0] (the first stupidity). But what to do with back()?

Its substitute can look like this

s[ s.size() == 0 ? 0 : s.size() - 1] (the second stupidity)

This code is error-prone. A user can forget that 1 must not be subtracted when the string is empty.

Now consider how the task could be done in C.

We would write two functions

char front( const char *s )
{
    return s[0];
}

char back( const char *s )
{
    size_t n = strlen( s );

    return ( ( n == 0 ) ? s[0] : s[n - 1] );
}

These functions are written in the spirit of C, that is, in the spirit of the C string functions.

The corresponding code could look like this

#include <string.h>
#include <stdio.h>

char front( const char *s )
{
    return s[0];
}

char back( const char *s )
{
    size_t n = strlen( s );

    return ( ( n == 0 ) ? s[0] : s[n - 1] );
}

int main( void )
{
    const char *words[] = { "first", "second", "", "forth" };
    const size_t N = sizeof( words ) / sizeof( *words );
    struct Pair { char first, second; } char_pairs[N];

    for ( size_t i = 0; i < N; i++ )
    {
        char_pairs[i].first = front( words[i] );
        char_pairs[i].second = back( words[i] );
    }

    for ( size_t i = 0; i < N; i++ )
    {
        if ( char_pairs[i].second ) printf( "<%c, %c> ", char_pairs[i].first, char_pairs[i].second );
    }

    puts( "" );

    return 0;
}

As you see, there is no need for any exceptions.

So I want to get the same simple code in C++. Why should the code in C++ be more complex than in C?

I need simple and safe methods to get the first and the last characters of a character array that is wrapped in std::string. And such simple and safe methods shall be front() and back(). That is, it is not the user who should have to write the code

s[ s.size() == 0 ? 0 : s.size() - 1]

It is std::string that shall provide a correct, simple and safe method instead of this construction. This code shall be inside back(). Of course it will be optimized and will not call size() twice.

So if these methods existed, I could write simple code that does the same task as the C code.

#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <utility>

int main()
{
    std::vector<std::string> words = { "first", "second", "empty", "forth" };
    std::vector<std::pair<char, char>> char_pairs( words.size() );

    std::transform( words.begin(), words.end(), char_pairs.begin(),
                    []( const std::string &w )
                    {
                        return ( std::pair<char, char>( w.front(), w.back() ) );
                    } );

    for ( auto p : char_pairs ) std::cout << '<' << p.first << ", " << p.second << "> ";
    std::cout << std::endl;

    return 0;
}

The restrictions on the methods front() and back() in the current C++ standard are artificial, not compatible with the spirit of C character arrays and the corresponding functions, and totally useless. They only confuse users, because s[0] is not equivalent to front() (why?!), and prevent writing simple and clear code.

David Rodríguez Ibeas

Apr 18, 2013, 3:01:20 PM

to std-dis...@isocpp.org

I think you have an issue with how the problem is defined:

On Thu, Apr 18, 2013 at 2:11 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

To make it clearer, let us consider a simple task. There is some sequence of words, among which empty words can be present, and we need to build a second sequence that contains pairs of the first and the last characters of each word.

The problem is ill defined because an empty word does not have a 'first' or 'last' character. You are assuming a C-programming context in which an empty string does contain a character as an artifact and mixing that with the natural language description. A better description of your problem would be:

Given a sequence of possibly empty words, build a sequence of pairs that contain the first and last characters _or_ two null characters if the word is empty.

With this new definition of the problem, the C++ program becomes a direct translation:

std::transform( words.begin(), words.end(), char_pairs.begin(),

                    []( const std::string &w ) -> std::pair<char,char>
                    {
                        return w.empty() ? std::pair<char,char>{0,0} : std::pair<char,char>{w.front(), w.back()};
                    } );

Which is not much more complex than the original code. Note that in a real world problem the description could very well be:

Given a sequence of possibly empty words, build a sequence of pairs that contain the first and last characters _or_ two SPACE characters if the word is empty.

I have the feeling that there is some confusion as of what the null character is. It is NOT a character in the 'std::string', but an artifact to enable compatibility with C strings. The member functions 'front()' and 'back()' yield characters _in_ the 'std::string', and the terminator is not such a thing.
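For reference, a self-contained sketch of the transform described in this reply. The wrapper function `first_last_pairs` is a name invented here; the pair type is spelled out in both branches because a braced list cannot be an operand of the conditional operator.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Maps each word to (first, last), or to two null characters for an empty word.
inline std::vector<std::pair<char, char>>
first_last_pairs(const std::vector<std::string>& words)
{
    std::vector<std::pair<char, char>> char_pairs(words.size());
    std::transform(words.begin(), words.end(), char_pairs.begin(),
                   [](const std::string& w) {
                       // front() and back() are only evaluated for non-empty words.
                       return w.empty()
                                  ? std::pair<char, char>('\0', '\0')
                                  : std::pair<char, char>(w.front(), w.back());
                   });
    return char_pairs;
}
```

Replacing `'\0'` with `' '` in the empty branch gives the SPACE variant of the task mentioned above.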

Vlad from Moscow

Apr 18, 2013, 4:47:15 PM

to std-dis...@isocpp.org, dib...@ieee.org

The container std::string differs from other containers in that it is a wrapper for C character arrays. So, except for memory management, it should behave the same way as character arrays and C string functions. This is the reason it has so many built-in algorithms that reproduce similar C string functions. I do not like the crutch you are using in the task:

return w.empty() ? std::pair<char,char>{0,0} : std::pair<char,char>{w.front(), w.back()};

For this task the additional condition is not required. It could be required in some other task, for example to count how many words have equal first and last characters. You could write

!w.empty() && w.front() == w.back()

I have the strong conviction that for an empty string operator []( 0 ), front() and back() shall behave the same way.

Vlad from Moscow

Apr 18, 2013, 4:53:51 PM

to std-dis...@isocpp.org, dib...@ieee.org

That is, the container std::string needs simple and safe methods even when a string is empty. I do not want to write code such as

s[ s.size() == 0 ? 0 : s.size() - 1]

or

s[ s.empty() ? 0 : s.size() - 1]

I prefer to write

s.back(), s.front(), s[0] and be sure that these functions are safe and that I can substitute s[0] for s.front()

Vlad from Moscow

Apr 18, 2013, 5:11:52 PM

to std-dis...@isocpp.org, dib...@ieee.org

By the way, if I am not mistaken, the implementation of std::string by MS supports the concept I described.

Vlad from Moscow

Apr 18, 2013, 5:23:14 PM

to std-dis...@isocpp.org, dib...@ieee.org

That is, s[0] and s.front() try to return the same value, though MS VC++ 2010 does not allow getting s[0] for empty strings.

corn...@google.com

Apr 22, 2013, 2:19:45 PM

to std-dis...@isocpp.org

Here's the mental model you *should* have for std::string: it's a type that represents an unterminated string and follows the C++ sequence concept, with a few compatibility features that make it easier to replace existing uses of C-style NUL-terminated strings and interoperate with C APIs. This is a *good* model, because NUL-terminated strings are an unfortunate legacy of C, and while we want to have easy transition and interoperability, we really don't want to depend on this representation.

If you have a different mental model, you will find various aspects of std::string unintuitive.

Now, with this in mind, we can divide the operations on std::string into sequence features (part of the sequence concept), compatibility features (C interoperability and transition), and bloat^H^H^H^H^Hstring-specific features (operator +=, find_first_of, etc.).

Note that *only* compatibility features act as if std::string is NUL-terminated. All others do not subscribe to that idea. In other words, you should *only* treat std::string as NUL-terminated if you're interoperating with or transitioning from C.

On Wednesday, April 17, 2013 8:00:08 PM UTC+2, Vlad from Moscow wrote:

Let us start with the description of operator[] of class std::basic_string ([string.access]). The description states only one requirement: pos <= size().
If pos is strictly less than size(), the operators return *(begin() + pos). Otherwise, they return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior" (a quotation from the C++ Standard).

Sequence feature: Random access via operator [].

Compatibility feature: A C-style string allows str[strlen(str)] which gives the terminator, but overwriting the terminator destroys the NUL-termination property and is effectively undefined. Therefore, define str[str.size()] to return something equivalent to a NUL-terminator that can't be modified. This way, existing char* variables where this property is used can be replaced by std::strings without subtle changes in semantics.
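A hedged sketch of that compatibility property: a loop written against a C-style string that stops when it reads the terminator keeps working when the argument becomes a const std::string, because reading s[s.size()] is defined and yields charT(). The function template `length_until_nul` is hypothetical and exists only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Counts characters up to the first null character by indexing only.
// Works for const char* and for const std::string alike; for the latter
// the final read is s[s.size()], which the standard defines as charT().
template <class S>
std::size_t length_until_nul(const S& s)
{
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}
```

Like its C counterpart, this loop stops at the first null character, so it under-counts a std::string that contains embedded NULs.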

Now let us consider the description of the member functions front(). It says that the effect of calling them is equivalent to operator[](0).

What does this mean? It means that if an object is empty, the functions will return "a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior".

It seems that all is OK. However, that is not the case. The description of the member functions front() requires that the condition !empty() be satisfied.

The problem is that you look at the description piecewise, trying to infer behavior from it, without considering the description as a whole. You should read it as, "Given a non-empty string, returns the first character." For ease of description, the "returns the first character" part was written as "operator [](0)". But that doesn't change the pre-condition.

Sequence feature: Access to the first element of non-empty sequence via front() member.

Compatibility feature: None. C strings don't have member functions. There is no need for front() to act as if a NUL-terminator exists.

So this code

std::string s;
if ( s[0] ) { /*...*/ }

is valid while this code

std::string s;
if ( s.front() ) { /*...*/ }
is wrong.

If your intent in the first snippet is to find out whether the string is empty, that's wrong too. A std::string may have embedded NULs.
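A small sketch of the embedded-NUL case. The function name `first_is_nonzero` is hypothetical and mirrors the test `if ( s[0] )`: it returns false both for an empty string and for a non-empty string whose first character is the null character, so it cannot stand in for `!s.empty()`.

```cpp
#include <string>

// Hypothetical check equivalent to `if ( s[0] )`.
// Reading s[0] is defined even when s is empty (pos == size()).
inline bool first_is_nonzero(const std::string& s)
{
    return s[0] != '\0';
}
```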

The situation is worse with the member functions back(). They are equivalent to operator[](size() - 1). Take note of the expression in the parentheses.

That seems absolutely clear to me. If size() is 0, bad things happen.

By the way, back() is a sequence feature. No compatibility here.

Now let us consider a simple task: check whether the first and the last characters in a string are equal.

What if the string is empty? Then there are no first and last characters. You need to define this case separately in your requirements.

If the string is empty you can write on the one hand s[0] to denote the first character.

There *is* no first character in an empty string! Where do you get this absurd idea from?

But what about the last character?

That doesn't exist either.

Can you write s[s.size() - 1]? No, you cannot, because if the string 's' is empty this expression is equivalent to s[-1].

Which is a big clue that what you're doing doesn't make sense.

So you need to make the code more complicated. You need to check whether the string is empty to know whether you may directly compare the first character with the last.

For example you can write

if ( s.empty() || s[0] == s[s.size() - 1] ) { /*...*/ }

This is not very readable. It would be much simpler to write

if ( s.front() == s.back() ) { /*...*/ }

This statement is very clear.

No, it's not. Why on earth would I expect this to have any meaning for an empty string? As I said above, the requirements need to define what happens for empty strings, and therefore the code should have separate clauses too.

Anyway, you're presenting a false dichotomy. You make it sound as if, because we can't write the second version, we have to use the ugly first version. Have you tried

if (s.empty() || s.front() == s.back()) { /*...*/ }

? Much more readable than the first version, and perfectly well defined.
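As a minimal sketch, the well-defined version can be wrapped in a small function. The name `ends_match` is hypothetical, and treating the empty string as a match simply mirrors the condition above; a different requirement could just as well return false for it.

```cpp
#include <string>

// Hypothetical helper: true if the string is empty or its first and last
// characters are equal. front() and back() are only reached when the
// string is non-empty, so their precondition !empty() always holds.
bool ends_match(const std::string& s) {
    return s.empty() || s.front() == s.back();
}
```

The short-circuit evaluation of `||` is what keeps this well defined: the right-hand side is never evaluated for an empty string.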

I do not see any great sense in this requirement of the C++ Standard. Moreover, it looks like a defect. In my opinion both functions, front() and back(), should indeed be equivalent to operator[]; that is, for example, the member functions back() shall be equivalent to "operator[](size() - 1)" if the condition !empty() is true. "Otherwise, return a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior"

Now you're the one who's not making sense.

On the one hand, you complain that front() is inconsistent because its definition is "operator [](0)", therefore it should be defined for an empty string and drop the precondition to make it consistent with the indexing operator.

On the other hand, you complain that the definition of back(), which is "operator [](size() - 1)", is bad *because* it is perfectly consistent with the indexing operator. Instead you want a special case introduced for empty strings.

So I suggest removing the requirement !empty() from the descriptions of front() and back(). This requirement only confuses users and prevents writing generic

There's nothing generic about special cases for std::string. No other sequence (std::vector, std::list and std::deque in the standard) allows front() or back() to be called when the sequence is empty. Why should std::string?

clear and readable code.

Neither is there anything clear or readable in suggesting that "front() == back()" should be defined and return true for an empty string. That's actually extremely obscure and unintuitive behavior, a true special case.

Sebastian

Vlad from Moscow

Apr 23, 2013, 1:55:14 AM

to std-dis...@isocpp.org, corn...@google.com

I have read this and see no rational argument against front and back returning the terminating zero for empty strings. The terminating zero is not a string character? So what?! That is not an argument. operator []( 0 ) also returns a non-character for an empty string.

I need a safe, simple method to use front and back in generic code. If I need a more exact condition, I can extend the original condition with some other condition, for example !empty().

It is invariant that operator []( 0 ), front() and back() shall return terminating character for an empty string. It is a very artificial, useless, and only confusing requirement that back and front may be applied only for non empty strings. If somebody wants to make his life more complicated and write

s.size() == 0 ? s[0] : s.front();

it is his problem. Why should others suffer from such stupidity?

Class std::string is a wrapper for C strings that provides memory management. Otherwise it would not exist and you would use std::vector<char>. That is the reason that operator []( 0 ) returns 0 for an empty string. But it is very strange and inconsistent that back and front have undefined behavior for an empty string. Not to mention that compilers in fact supply 0 for front for an empty string, because they simply have to return something.

Vlad from Moscow

Apr 23, 2013, 2:06:42 AM

to std-dis...@isocpp.org, corn...@google.com

You should take into account the very important difference between "usual" arrays and character arrays. "Usual" arrays may not be empty while character arrays in fact can be empty. And you can deal with empty character arrays the usual way. This is the mental model, as you put it, behind character arrays. So operator []( 0 ), front and back have an exact meaning for empty strings, while the same member functions have no meaning for a std::vector with empty contents.

Nicol Bolas

Apr 23, 2013, 3:08:03 AM

to std-dis...@isocpp.org, corn...@google.com

On Monday, April 22, 2013 11:06:42 PM UTC-7, Vlad from Moscow wrote:

You should take into account the very important difference between "usual" arrays and character arrays. "Usual" arrays may not be empty while character arrays in fact can be empty.

... since when? Arrays most certainly can be empty. APIs that take arrays will either have an explicit expectation of a size, a heuristic to determine the size, or will be provided a size. In the last two cases, those APIs must check to see if the size is 0.

You seem to be missing the whole point here. C++ does not want a difference between "usual arrays" and "character arrays". C requires there to be a difference, and most C APIs expect this. C++ does not, because we don't have to. std::basic_string only provides a minor difference as a concession to C API compatibility. Nothing more.

When using basic_string as a C++ array (ie: using iterators and such), it will be treated as an array, not as a null-terminated array. When using it for compatibility's sake, you can see that there is a null-terminator, to make it easier to use with C APIs. C++ APIs do their best to make the null-terminator invisible.

You are wrong to think that a string is a null-terminated array. Your proposed suggestion is a horrible idea because it encourages people to think of strings that way. It encourages people to grab the front character, check for 0, and assume that this means that the string is empty (which it most certainly does not).

If you don't like that, if you want your string to expose the null-terminator explicitly, then write your own string class. But C++ is not going to abandon the progress it's made away from null-terminated strings just because someone doesn't like it.

Vlad from Moscow

Apr 23, 2013, 4:32:14 AM

to std-dis...@isocpp.org, corn...@google.com

Once again I do not see any serious arguments. I see only a lack of understanding.

Well, let us return to the implementations of the same functions in C that were already shown above. I will show them again.

char front( const char *s )
{
    return s[0];
}

char back( const char *s )
{
    size_t n = strlen( s );

    return ( ( n == 0 ) ? s[0] : s[n - 1] );
}

Are these functions written in the spirit of C? Yes, they are. Are these functions safe? Yes, they are, except that they do not check whether the parameter is NULL. But no C string function checks this condition.

So what are you proposing?

You are saying: "Let's write these functions unsafe!":) And you are writing

char back( const char *s )
{
    size_t n = strlen( s );

    return ( s[n - 1] );
}

And you are adding that somewhere in the C++ Standard we will write that the function has undefined behavior if a string is empty.

Moreover, this function does not signal that its usage for an empty string is invalid: neither an exception nor a return code signals its bad usage. So it is written neither in the spirit of C nor in the spirit of C++.

And what about users?

As you removed the check ( n == 0 ) ? s[0] : s[n - 1] from the function body, you are now suggesting that users make this check themselves each time they call the function.

Splendid! My applause!:)

The only problem is that users will ask the reasonable question: "Why do we need such a function if in fact we shall do all ourselves? This function is very dangerous and should not be used":)

In my opinion this function is unsafe, provokes errors that are difficult to find, only confuses users, and is useless.

corn...@google.com

Apr 23, 2013, 4:57:14 AM

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 7:55:14 AM UTC+2, Vlad from Moscow wrote:

I have read this and see no rational argument against front and back returning the terminating zero for empty strings.

Read the part about "mental model" and "sequence concept" again. Read it very carefully.

front() and back() are part of the sequence view of std::string. There is no terminating zero for them to return. A terminating zero only exists in the compatibility view, and only operator[], c_str(), and (since C++11, as an accidental side effect of making std::string reading thread-safe) data() offer this compatibility view.

The terminating zero is not a string character? So what?! That is not an argument.

You can't just take my arguments and then discard them without reason. The terminating zero is indeed *not* a part of the string. Read section 21 (lib.strings) of the C++03 standard - there's exactly two places that mention a terminator for std::basic_string, and that's the description of operator[] and c_str.

operator []( 0 ) also returns a non-character for an empty string.

And at(0) throws an exception. operator[] is the odd one out. You need to realize that.
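The difference between the two accessors can be sketched as follows. The helper name `at_throws` is an assumption made for this example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if s.at(pos) throws std::out_of_range.
// at() is range-checked against size(), so at(size()) always throws.
bool at_throws(const std::string& s, std::string::size_type pos) {
    try {
        (void)s.at(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For an empty string `s`, `at_throws(s, 0)` is true, while `s[0]` on the same string is well defined and yields `'\0'`. That asymmetry is what makes operator[] the odd one out.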

I need a safe, simple method to use front and back in generic

You keep using that word. I do not think it means what you think it means.

IOW, if your code uses a specific quirk of std::string when using the sequence operations front() and back(), it is *not* generic. It's specialized.

code.

Here you go:

template <typename Sequence>
typename Sequence::value_type safe_front(const Sequence& s) {
    typedef typename Sequence::value_type vt;
    return s.empty() ? vt() : s.front();
}

template <typename Sequence>
typename Sequence::value_type safe_back(const Sequence& s) {
    typedef typename Sequence::value_type vt;
    return s.empty() ? vt() : s.back();
}

Look, this is truly generic: it works for *any* sequence (std::vector, std::string, std::list, std::deque, and whatever 3rd party sequences you use), not just std::string.

Now please stop trying to make your special requirements part of the standard.

It is invariant that operator []( 0 ), front() and back() shall return terminating character for an empty string.

Where do you get that idea? Seriously, what is your mental model of std::string that produces such a notion? I cannot come up with one good definition (not formal, just a simple explanation in prose) where back() returning a terminator for an empty string makes sense.

It is a very artificial, useless, and only confusing requirement that back and front may be applied only for non empty strings.

No, it's a completely obvious requirement once you stop thinking of std::string as a terminated sequence. It is not. It has never been. It will never be. Remember this, internalize this, or write your own string class. Otherwise, you will keep running into these conceptual disconnects.

If somebody wants to make his life more complicated and write

s.size() == 0 ? s[0] : s.front();

it is his problem. Why should others suffer from such stupidity?

Strawman argument. Why would I, or anyone, ever write that code? It is semantically equivalent to "s[0]". If I need compatibility with this weird C behavior, I will use operator[]. If I want a string, I will special-case empty strings, because that's the right thing to do. Special cases should be special-cased. That leads to readable code. Using quirks does not.

Class std::string is a wrapper for C strings that provides memory management.

See, that's your basic, fundamental mistake! That's not what std::string is. Read the part of my post about "mental model" again. And again. Read it until you understand what I'm saying. What I'm saying is that you're WRONG! std::string is not a wrapper for C strings. std::string is not a wrapper for C strings. std::string is not a wrapper for C strings. This is important.

Seriously, I feel a bit insulted here. I write a long post to explain how std::string actually works, and you simply *ignore* my explanation and then claim that I have not made any arguments against your point. That's like clapping your hands over your ears and going "LA LA LA LA".

Otherwise it would not exist and you would use std::vector<char>.

That doesn't make any sense. std::string is a string type. It supports concatenation. It supports interoperability with the sad thing that is C strings, like construction from a string literal. But that doesn't mean it's a wrapper for C strings, it just provides interoperability.

That is the reason that operator []( 0 ) returns 0 for an empty string.

No, the reason for that is compatibility. Not that std::string is a wrapper for C strings. If that were so, at(0) would return 0 too, instead of throwing an exception. If that were so, you could dereference the end iterator and it would return 0. Neither of these is the case. So it appears that you're wrong.

But it is very strange and inconsistent that back and front have undefined behavior for an empty string.

Again, operator[] is the only thing in std::string that is meaningful for an empty string. front(), back(), at(), begin(), end() and, in C++03, data() all require non-empty strings for any meaningful values to be returned (iterators and data() return something, but you can't dereference it).

Not to mention that compilers in fact supply 0 for front for an empty string, because they simply have to return something.

What part of "undefined behavior" do you not understand?

Vlad from Moscow

Apr 23, 2013, 5:32:24 AM

to std-dis...@isocpp.org, corn...@google.com

Once more I do not see any serious arguments. You are writing, for example, that "operator[] is the only thing in std::string that is meaningful for an empty string". Why?! I do not see any sense in this operator's behaviour without the corresponding behaviour of, for example, front. If you say 'A', then please also say 'B'. Either operator []( 0 ) makes sense for an empty string, and in this case front makes the same sense, or both make no sense and behave the same way as the corresponding functions in any other sequence container.

Again you are showing functions

template <typename Sequence>
typename Sequence::value_type safe_front(const Sequence& s) {
    typedef typename Sequence::value_type vt;
    return s.empty() ? vt() : s.front();
}

template <typename Sequence>
typename Sequence::value_type safe_back(const Sequence& s) {
    typedef typename Sequence::value_type vt;
    return s.empty() ? vt() : s.back();
}

I do not understand why I need such stupid, complicated code. Can you explain it to me? As for me, it is enough to write the function front correctly. As for you, you may build one template function over another template function. It is your problem. Why are you trying to force others to write bad code?!

As for the functions data and c_str, the only requirement in the C++11 standard is the following

The program shall not alter any of the values stored in the character array

So if the functions front and back are correctly described in the Standard, it will not change the current code base.

corn...@google.com

Apr 23, 2013, 5:34:18 AM

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 10:32:14 AM UTC+2, Vlad from Moscow wrote:

Once again I do not see any serious arguments. I see only a lack of understanding.

Well, let us return to the implementations of the same functions in C that were already shown above. I will show them again.

char front( const char *s )
{
    return s[0];
}

char back( const char *s )
{
    size_t n = strlen( s );

    return ( ( n == 0 ) ? s[0] : s[n - 1] );
}

Are these functions written in the spirit of C? Yes, they are. Are these functions safe? Yes, they are, except that they do not check whether the parameter is NULL. But no C string function checks this condition.

And now please provide plain English explanations for what these functions do. Keep in mind the KISS and least surprise principles.

So what are you proposing?

You are saying: "Let's write these functions unsafe!":) And you are writing

char back( const char *s )
{
    size_t n = strlen( s );

    return ( s[n - 1] );
}

And you are adding that somewhere in the C++ Standard we will write that the function has undefined behavior if a string is empty.

Moreover, this function does not signal that its usage for an empty string is invalid: neither an exception nor a return code signals its bad usage. So it is written neither in the spirit of C nor in the spirit of C++.

What is this spirit of C and C++ you're talking about? Read the C and C++ standards. Specifying preconditions is all over the place, not just the library, but the language itself. If you violate preconditions, you get undefined behavior. Again, all over the place. "i << j" - if j is negative or at least the bit width of i's (promoted) type, you have undefined behavior. "a + b" - if these are signed integers and the addition overflows, you have undefined behavior. "*p" - if p is null, you have undefined behavior. Call strlen() on something that doesn't have a terminator, you have undefined behavior. Use operator[](0) on an empty vector, you have undefined behavior.

How is front() or back() on an empty string being undefined behavior not in the spirit of this? An empty sequence does not have a first or last element. Therefore, front() and back() do not make sense. They are undefined. It is simple, logical, and absolutely consistent with how everything in C and C++ works.

Your proposed behavior in the name of "safety" has the following effects:

- Errors become harder to detect. A good implementation has a debug-mode assertion for calling these functions on an empty string. You have replaced this with a return value that is indistinguishable from a valid return value (std::string may contain embedded NULs). It is no longer possible to detect this mistake.

- The functions become harder to explain. The description of the function as it is is, "Returns a reference to the first character." If somebody asks, "But what if the string is empty?" I can say "An empty string does not have a first character. Obviously you can't call this function." With your solution, I have to say, "It returns a reference to a character that has value 0, but you're not allowed to modify it.". Then I will be asked, "Why?" And then I have to get into explaining terminators and confuse the hell out of everybody who's not used to C. (Not all C++ programmers come from C. Many start with C++ directly. Many come from Java, Python, C# or Ruby, and if I say, "0-terminated string", they go, "whu?".)

- The code using the functions becomes harder to read. You may not believe me, but let me assert again that no matter how you specify it, any result of front() on an empty string is just unintuitive. Therefore, any code that relies on such a result is unintuitive. Good, readable code tests for the special case of an empty string explicitly and takes appropriate action.

And what about users?

As you removed the check ( n == 0 ) ? s[0] : s[n - 1] from the function body, you are now suggesting that users make this check themselves each time they call the function.

What? No! They make this check once when they enter their processing, to special-case the empty string. Or the check is implicit in the loop header they use for processing the string. You know, good, readable code that is explicit about what it does.

Splendid! My applause!:)

The only problem is that users will ask the reasonable question: "Why do we need such a function if in fact we shall do all ourselves? This function is very dangerous and should not be used":)

Except that nobody aside from you ever asked this on any programming help forum I read, and I read quite a few.

In my opinion this function is unsafe, provokes errors that are difficult to find, only confuses users, and is useless.

It's very simple, actually.

Empty strings are special. (Empty sequences in general.)

Therefore, you should have special code for empty strings.

Therefore, you never have to worry about empty strings in the code that uses front() and back().

Or let me put this differently.

Code that uses front() or back() on empty strings is wrong. Empty strings do not have a front() or back(). It doesn't matter whether you special-case these functions for empty strings, it's still conceptually wrong.

This code should be found and corrected.

A precondition !empty() can be part of a debug assertion inside front() and back(). This will stop the program and you can immediately find and correct the bad code through the call stack.

If front() and back() don't have a precondition, nothing can be tested inside of them. If I want to find the wrong code, I have to put assertions on every call site. This is bad. Also, I have to assert that the return value isn't 0, which is actually a possible valid return value for non-empty strings, so the assertion may have false positives. In other words, I cannot reliably find the wrong code. This is not just bad, this is just short of disastrous.

Therefore, your definition is *less safe* than mine.
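A minimal sketch of what such a debug assertion could look like, written here as a free function so it can be tried without touching the library. The name `checked_front` is hypothetical; real implementations place an equivalent check inside front() itself in their debug modes.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper showing where the precondition check lives.
// In a debug build (NDEBUG not defined) an empty string aborts here,
// with the offending caller visible in the call stack.
// With NDEBUG defined the assert compiles away and the call costs nothing extra.
inline char checked_front(const std::string& s) {
    assert(!s.empty() && "front() requires a non-empty string");
    return s.front();
}
```

Note that `checked_front(std::string("\0x", 2))` legitimately returns `'\0'`. This is why checking the return value for 0 at the call site cannot replace a precondition check inside the function.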

corn...@google.com

Apr 23, 2013, 5:44:07 AM

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 11:32:24 AM UTC+2, Vlad from Moscow wrote:

Once more I do not see any serious arguments.

Then learn to read!

You are writing, for example, that "operator[] is the only thing in std::string that is meaningful for an empty string". Why?!

I have explained this multiple times. The reason is that

char* cstring = ...;

char c = cstring[i];

is syntactically and semantically valid.

cstring.front() and cstring.at(i) are not. Therefore, there is no need to maintain the illusion of a 0-terminated string for these operations.
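To make the compatibility argument concrete, here is a sketch of C-style code that uses nothing but indexing and therefore compiles unchanged for both a C string and a std::string. The function name `c_style_length` is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts characters up to the first '\0' using only
// operator[]. This is the classic C idiom; it works for std::string because
// s[s.size()] is defined to yield charT() rather than being out of bounds.
template <typename S>
std::size_t c_style_length(const S& s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

For a std::string with an embedded NUL, such as `std::string("a\0b", 3)`, the loop stops at index 1 although size() is 3. The idiom keeps working syntactically, but it only sees the C view of the string.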

I do not see any sense in this operator's behaviour without the corresponding behaviour of, for example, front.

I keep repeating myself here.

If you say 'A', then please also say 'B'. Either operator []( 0 ) makes sense for an empty string, and in this case front makes the same sense, or both make no sense and behave the same way as the corresponding functions in any other sequence container.

What part of "C compatibility feature" was too hard to understand for you? Why is it so hard to accept that std::string only pretends to be a 0-terminated sequence for that single operation that is syntactically valid in C? One! Single! Function! c_str() doesn't even count, because it doesn't actually need to return the buffer that the other functions view.

I do not understand why I need such stupid, complicated code. Can you explain it to me?

You're trying to do something stupid, i.e. get the first and last elements of an empty sequence. This doesn't make any sense. Therefore you need stupid code to do it.

As for me, it is enough to write the function front correctly. As for you, you may build one template function over another template function. It is your problem. Why are you trying to force others to write bad code?!

You're the one who's trying to write bad code. You're trying to write code that doesn't respect the special case of an empty sequence. You're trying to write your code in a way that is most unintuitive and confusing. Tell me, why do you hate your fellow programmers, to inflict such code on them?

As for the functions data and c_str, the only requirement in the C++11 standard is the following

The program shall not alter any of the values stored in the character array
So if the functions front and back are correctly described in the Standard, it will not change the current code base.

It's true that relaxing the precondition on front() and back() won't break any code - that's universally true for relaxing preconditions. That is not an argument for doing so.

Vlad from Moscow

Apr 23, 2013, 6:12:00 AM

to std-dis...@isocpp.org, corn...@google.com

This

"Code that uses front() or back() on empty strings is wrong. "

is a wrong statement. It is the same as to say for C functions that C string functions that use empty strings that is strings that have no characters except the terminating zero are wrong.

I showed C functions front and back and in my opinion the equivalent C++ functions front and back shall be the same.

I need safe simple functions front and back without bothering that strings can be empty. If I need to check that strings are not empty I can do that by using additional condition !empty()

As for the debug assertion then it is not very useful. For example you are reading some file into a vector of std::pair<char, char> where the pair contains the first and the last symbols of a string. And you are not bothering that a string can be empty. For example the file can contain two sequential new line characters. In this case you could use std::transform as I showed here somewhere above with simple function

[]( const std::string &s ) -> std::pair<char, char> { return {s.front(), s.back() }; }

In other words it is natural that this code would be valid for any string

    auto lm = []( std::string &s ) -> std::pair<char, char>
    {
        return { s.front(), s.back() };
    };

I do not want to write in this simple case something as

    auto lm = []( std::string &s ) -> std::pair<char, char>
    {
        if ( s.empty() ) return { 0, 0 };
        else return { s.front(), s.back() };
    };

corn...@google.com

Apr 23, 2013, 6:38:01 AM

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 12:12:00 PM UTC+2, Vlad from Moscow wrote:

This

"Code that uses front() or back() on empty strings is wrong. "

is a wrong statement. It is the same as to say for C functions that C string functions that use empty strings that is strings that have no characters except the terminating zero are wrong.

I'm sorry, I completely fail to parse this last sentence (try using some punctuation). Are you saying that my claim is equivalent to saying that there should be no functions that work on empty strings at all? If so, that's absurd.

I showed C functions front and back and in my opinion the equivalent C++ functions front and back shall be the same.

I consider your C functions to be bad. They have unintuitive specifications and just because they don't crash or misbehave on empty strings doesn't make them safe or good. I stand by my statement. front() and back() on empty strings is wrong. Not just from a code semantics viewpoint, but from a simple, abstract conceptual viewpoint. An empty string, or any empty sequence, does not have a first or last element. You say, "but strings are special, they have a terminator". But that terminator was an implementation detail even in C, and relying on it led to tricky and unintuitive code even there.

I need safe simple functions front and back without bothering that strings can be empty.

I wrote them for you. Use them. Don't try to force your weird semantics on the rest of us.

As for the debug assertion then it is not very useful.

It's extremely useful to find wrong code, such as what you're writing below.

For example you are reading some file into a vector of std::pair<char, char> where the pair contains the first and the last symbols of a string. And you are not bothering that a string can be empty. For example the file can contain two sequential new line characters.

Your problem specification is weird. Why would I want {0, 0} entries for empty lines? Without knowing the exact background I cannot tell. I would consider it much more likely that I want to simply skip empty lines completely.

In this case you could use std::transform as I showed here somewhere above with simple function

[]( const std::string &s ) -> std::pair<char, char> { return {s.front(), s.back() }; }

In other words it is natural that this code would be valid for any string

No, it is not. There is absolutely nothing natural about getting the front and back of an empty sequence.

    auto lm = []( std::string &s ) -> std::pair<char, char>
    {
        return { s.front(), s.back() };
    };

I do not want to write in this simple case something as

    auto lm = []( std::string &s ) -> std::pair<char, char>
    {
        if ( s.empty() ) return { 0, 0 };
        else return { s.front(), s.back() };
    };

This is an incredibly specific requirement you have. Why do you want to represent empty lines as {0, 0}? What if you want them as {-1, -1}? Basically, you have one very specific use case, and you try to change the standard to accommodate that particular use case.

Here's some code that skips empty lines.

std::vector<std::pair<char, char>> ends;

// using Boost.Range for nicer syntax
boost::transform(incoming | boost::adaptors::filtered([](const std::string& s) { return !s.empty(); }),
                 std::back_inserter(ends),
                 [](const std::string& s) { return std::make_pair(s.front(), s.back()); });
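For readers without Boost, roughly the same thing can be sketched with the standard library alone. The function name `line_ends` and its signature are assumptions made for this example; the input is taken to be a vector of already-split lines.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects (first, last) character pairs for every
// non-empty line and skips empty lines entirely.
std::vector<std::pair<char, char>> line_ends(const std::vector<std::string>& lines) {
    std::vector<std::pair<char, char>> ends;
    for (const std::string& s : lines) {
        if (s.empty()) {
            continue;  // an empty line has no first or last character
        }
        ends.emplace_back(s.front(), s.back());
    }
    return ends;
}
```

The explicit `continue` plays the role of the `filtered` adaptor: the decision about what an empty line means is made once, visibly, at the place where the lines are processed.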

But all this arguing is useless. The fact of the C++ standard is that a std::string is not a 0-terminated string. You're welcome to submit a proposal to make it one. But until you do and it is accepted, you need to live with the fact that the functions won't behave the way you think you should do, and that this isn't a defect in the standard but a conscious design decision. You can disagree with it, of course, you can present arguments on why you think it's the wrong decision, and you can eschew std::string in favor of your own string class that works differently. But you cannot put forth claims about individual members of std::string having to work according to your model when you don't understand the overall design of std::string.

Ville Voutilainen

Apr 23, 2013, 6:39:15 AM

to std-dis...@isocpp.org

On 23 April 2013 13:12, Vlad from Moscow <vlad....@mail.ru> wrote:

This

"Code that uses front() or back() on empty strings is wrong. "

is a wrong statement. It is the same as to say for C functions that C string functions that use empty strings that is strings that have

It's a perfectly correct statement. front() returns the first element. For an empty sequence, there is none.

Same for back() and the last element.

no characters except the terminating zero are wrong.

I showed C functions front and back and in my opinion the equivalent C++ functions front and back shall be the same.

Your C function "front" is so incorrect it's brain-damaged.

I need safe simple functions front and back without bothering that strings can be empty. If I need to check that strings are not

Great, write them yourself. The safety you want for front()/back() has a cost the standard has no intention to pay. That's why the behaviour is undefined if the sequence is empty and you perform a front()/back(); separating the safety checking allows front()/back() to avoid it.

Nicol Bolas

Apr 23, 2013, 6:50:03 AM

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 2:32:24 AM UTC-7, Vlad from Moscow wrote:

One more I do not see any serious arguments.

I'm tired of seeing you write that. Saying that you don't see "any serious arguments" doesn't make you right. It means you're just dismissing your opponent's argument out of hand. If you can't argue your case without recognizing that the opposition themselves has a valid perspective, then you're not discussing or arguing: you're ranting.

You believe that the null terminator should be considered part of the string. The rest of us do not. We have explained, repeatedly, why we believe that a string should be considered a sequence of the actual characters, not including the null terminator. We have talked about why the null terminator exists in std::string, and why it's not a valid part of the range, and therefore why it's not a valid return value from front/back.

Please stop continuously restating your position and ignoring all opposing viewpoints. Either answer someone else's position with some form of argument, or accept that we have a difference of opinion, and that you'd have to put forth a proposal to get this into the standard (good luck with that, as I'm fairly sure the committee would reject it without some very convincing argument. And what you've put forth here isn't convincing).

Ville Voutilainen

Apr 23, 2013, 8:57:41 AM

to std-dis...@isocpp.org

On 23 April 2013 13:39, Ville Voutilainen <ville.vo...@gmail.com> wrote:

Your C function "front" is so incorrect it's brain-damaged.

Pardon me. It's precisely what front() should do. Excuse me for reading too hastily. It's also precisely the equivalent of what a front() in C++ should do. And, it's not at all what operator[] for strings should do.

Vlad from Moscow

Apr 23, 2013, 2:19:55 PM

to std-dis...@isocpp.org, corn...@google.com

I simply do not see any serious arguments except unwillingness to confess the wrong approach.

Well I will try to explain.

For any indexed sequence in any programming language the notion of front() is defined as the notion s[0] (let assume that indexes start from 0). If the expression s[0] is valid then the expression front() is also valid and vice versa. These two expressions are interchangeable.

The same way is defined the notion of back(). It is defined as an expression with the maximum index n for which s[n] is valid. If there is no such an index that greater than 0 then it means that back() is equivalent to s[0] provided that the expression s[0] is valid.

Now let consider a regular arrays. They can not be declared as having zero elements. So for an empty array the notions a[0] and front are invalid. Also take into account that for any regular array back() is always corresponds to a[dimension - 1] if dimension is not equal to zero (that may not be).

So similar to the behaviour of regular arrays any empty object of std::vector has undefined expressions v[0] and v.back().

The other situation with character arrays. Again we may not define a character array with the zero dimension. However even if a character array has the dimension that is greater than zero we can say that a character array is empty if its first byte is the zero-terminating byte. So even for so-called empty character arrays expression s[0] is valid.

Class std::string is created that simulate character arrays and string literals. If we are saying that expression s[0] is valid then it means that expressions s.front() and s.back() is valid because they are interchangeable.

Now let return to the simple task of creating of a container of type std::vector<std::pair<char, char>> that is built based on some other container of type std::vector<std::string> where some strings can be empty.

If you are describing some task and the description contains some pre-condition then these pre-conditions should be reflected in the corresponding code.

For example you are saying: "I want for a non-empty string to get the first and the last characters."

In this case the corresponding code could look the following way

std::pair<char, char> p;

if ( !s.empty() ) p = std::make_pair( s.front(), s.back() );

Here !s.empty() is the pre-condition that was declared in the description.

Now let assume you are saying: "I want to get the first and the last characters for sequence of strings."

There is no any pre-conditions. What to do in this case? Let assume that in this case it is not important whether some strings are empty or not. You have the container std::vector<std::string> and have to build the corresponding container std::vector<std::pair<char, char>>. So you even do not know what to do.:)

Well, David Rodriguez Ibeas suggested to use the following functionality

[]( const std::string &w ) -> std::pair<char,char>
{
return w.empty() ? std::pair<char,char>{0,0} : std::pair<char,char>{w.front(), w.back()};
} );

But why did he decide that {0, 0} shall be returned in case of an empty string?! Why not {-1, -1} or even {'$', '$'}?

So one programmer will use {0,0}, the second programmer will use {-1, -1} the third will use {'$', '$'} and so on. The notions of front() and back() for an empty string is not defined in the Standard.

So it looks like this simple task is insoluble.

If s[0], front() and back() would be equivalent for empty strings (and they shall be equivalent) then there is no any problem. The task is being done using standard algorithm std::transform

std::vector<std::string> v;

// filling the vector from some file that can contain empty records.

std::vector<std::pair<char, char>> v2;

v2.reserve( v.size() );

std::transform( v.begin(), v.end(), std::back_inserter( v2 ), [](const std::string &s ) -> std::pair<char, char> { return { s.front(), s.back() }; } );

Why did you decide that I shall skip empty strings? Did I ask you about this? Is there some pre-conditions in my description of the task? No. there is not. I am satisfied with the result.

If I need to skip empty strings I can do that further while processing the new vector using expression

if ( p.first ) { /* some code */ }

This expression is valid because I am sure that I am dealing with a text file and strings can not contain embedded zeroes.

So there are two approaches

On the one hand there are unsafe functions that have confusing semantic because they are not equivalent to s[0] and provide unpredictable behaviour in case when a string is empty because it is not clear what to return for empty strings. They are sources of numerous errors. One programmer does not know that these functions have undefined behavior for empty strings. Other programmer knows that but forgot to insert the check of empty strings. The third simply does not know what to return in case of empty strings.

On the other hand there are safe functions with clear consistent logic and predictable behavior. You need check whether a string empty? No problem! Use expression s.empty().

So either s[0], front() and back() are not defined for an empty container (as for example for std::vector) or if one of them is defined then the others also shall be defined.

Olaf van der Spek

Apr 23, 2013, 2:43:53 PM

to std-dis...@isocpp.org, corn...@google.com

On Tue, Apr 23, 2013 at 8:19 PM, Vlad from Moscow <vlad....@mail.ru> wrote:
> For any indexed sequence in any programming language the notion of front()
> is defined as the notion s[0] (let assume that indexes start from 0). If the
> expression s[0] is valid then the expression front() is also valid and vice
> versa. These two expressions are interchangeable.

What languages define s[0] for empty sequences?
What languages define s.front() for empty sequences?

Chris Jefferson

Apr 23, 2013, 2:45:39 PM

to std-dis...@isocpp.org

On 23/04/13 19:19, Vlad from Moscow wrote:
>
> I simply do not see any serious arguments except unwillingness to
> confess the wrong approach.
> Well I will try to explain.
> For any indexed sequence in any programming language the notion of
> front() is defined as the notion s[0] (let assume that indexes start
> from 0). If the expression s[0] is valid then the expression front()
> is also valid and vice versa. These two expressions are interchangeable.
> The same way is defined the notion of back(). It is defined as an
> expression with the maximum index n for which s[n] is valid. If there
> is no such an index that greater than 0 then it means that back() is
> equivalent to s[0] provided that the expression s[0] is valid.

The standard promises that s[size()] is valid, and always returns
charT(). By this argument, do you want 'back()' to always return
charT()? This does not seem like it would be useful (and would be a
breaking change).

This shows exactly the problem -- std::string tries to provide something
both like a container, where you can access the one-past-the-end value with
s[size()]. However, when the container is empty, front() and back()
still do not make sense.

Chris

Vlad from Moscow

Apr 23, 2013, 2:50:05 PM

to std-dis...@isocpp.org, corn...@google.com

There is such a language as C that considers character arrays as empty if the first character is the terminating zero.:)

If there would not be such conception then C++ would not have container std::string. It would suggest to use std::vector<char> for character arrays.:)

Olaf van der Spek

Apr 23, 2013, 2:51:52 PM

to std-dis...@isocpp.org, corn...@google.com

On Tue, Apr 23, 2013 at 8:50 PM, Vlad from Moscow <vlad....@mail.ru> wrote:
>
> On Tuesday, April 23, 2013 10:43:53 PM UTC+4, Olaf van der Spek wrote:
>>
>> On Tue, Apr 23, 2013 at 8:19 PM, Vlad from Moscow <vlad....@mail.ru>
>> wrote:
>> > For any indexed sequence in any programming language the notion of
>> > front()
>> > is defined as the notion s[0] (let assume that indexes start from 0). If
>> > the
>> > expression s[0] is valid then the expression front() is also valid and
>> > vice
>> > versa. These two expressions are interchangeable.
>>
>> What languages define s[0] for empty sequences?
>> What languages define s.front() for empty sequences?
>
>
> There is such a language as C that considers character arrays as empty if
> the first character is the terminating zero.:)

What about other languages / sequences?

Vlad from Moscow

Apr 23, 2013, 2:55:01 PM

to std-dis...@isocpp.org

On Tuesday, April 23, 2013 10:45:39 PM UTC+4, Chris Jefferson wrote:

On 23/04/13 19:19, Vlad from Moscow wrote:
>
> I simply do not see any serious arguments except unwillingness to
> confess the wrong approach.
> Well I will try to explain.
> For any indexed sequence in any programming language the notion of
> front() is defined as the notion s[0] (let assume that indexes start
> from 0). If the expression s[0] is valid then the expression front()
> is also valid and vice versa. These two expressions are interchangeable.
> The same way is defined the notion of back(). It is defined as an
> expression with the maximum index n for which s[n] is valid. If there
> is no such an index that greater than 0 then it means that back() is
> equivalent to s[0] provided that the expression s[0] is valid.

The standard promises that s[size()] is valid, and always returns
charT(). By this argument, do you want 'back()' to always return
charT()? This does not seem like it would be useful (and would be a
breaking change).

Yes I want that if s[0] is valid for an empty object of std::string then s.front() and s.back() shall be also valid. As I think I demonstrated that in this case we will have a predictable behavior that allow to write generic code.

This shows exactly the problem -- std::string tries to provide something
both like a container, where you can access the one-past-the-end value with
s[size()]. However, when the container is empty, front() and back()
still do not make sense.

I think differently.

Chris

David Rodríguez Ibeas

Apr 23, 2013, 3:04:56 PM

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 2:19 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

I simply do not see any serious arguments except unwillingness to confess the wrong approach.

"Wrong" is your statement, where everyone else seems to consider yours to be the "wrong" approach.

For any indexed sequence in any programming language the notion of front() is defined as the notion s[0] (let assume that indexes start from 0). If the expression s[0] is valid then the expression front() is also valid and vice versa. These two expressions are interchangeable.

Incorrect, 'front()' is the _first_ element in the sequence, 'back()' is the _last_ element in the sequence. In most containers 'front()' is exactly the same as '[0]' and can be implemented as such. But do not confuse the semantics with the implementation. This is what others have been trying to explain. The std::string is designed as a type that stores a sequence of characters (including possible nulls) and can be empty. That is the *design*.

The same way is defined the notion of back(). It is defined as an expression with the maximum index n for which s[n] is valid. If there is no such an index that greater than 0 then it means that back() is equivalent to s[0] provided that the expression s[0] is valid.

Reread your sentence, and try to determine whether it makes sense. "back() is equivalent to s[0] provided that the expression s[0] is valid". This is a very limited definition of what the general notion of 'back()' is considering that in all other situations an empty container does not have a 's[0]'.

The definition of your 'front()' and 'back()' concepts differ from what everyone else considers, and they are not better. In the standard case you have to special case accessing the element at the front and the back (this is really what 'front()' and 'back()' mean) for empty strings because an empty string does not contain any element in the front or the back. It does not contain any element period.

Now let consider a regular arrays. They can not be declared as having zero elements. So for an empty array the notions a[0] and front are invalid. Also take into account that for any regular array back() is always corresponds to a[dimension - 1] if dimension is not equal to zero (that may not be).

Wrong:

int *p = new int[0];

Now try to apply your rationale to this case. There are no elements, accessing p[0] is undefined behavior, as is accessing p[0-1].

So similar to the behaviour of regular arrays any empty object of std::vector has undefined expressions v[0] and v.back().

See code above

The other situation with character arrays. Again we may not define a character array with the zero dimension. However even if a character array has the dimension that is greater than zero we can say that a character array is empty if its first byte is the zero-terminating byte. So even for so-called empty character arrays expression s[0] is valid.

Null terminated strings, which are *not* just character arrays, have this behavior, yes.

Class std::string is created that simulate character arrays and string literals. If we are saying that expression s[0] is valid then it means that expressions s.front() and s.back() is valid because they are interchangeable.

The class std::string has been designed to represent a _string_, not a character array, not a null terminated C string. It has been adapted to provide some level of compatibility, but that is not the purpose of the type.

Now let return to the simple task of creating of a container of type std::vector<std::pair<char, char>> that is built based on some other container of type std::vector<std::string> where some strings can be empty.

If you are describing some task and the description contains some pre-condition then these pre-conditions should be reflected in the corresponding code.

For example you are saying: "I want for a non-empty string to get the first and the last characters."

In this case the corresponding code could look the following way

std::pair<char, char> p;

if ( !s.empty() ) p = std::make_pair( s.front(), s.back() );

Here !s.empty() is the pre-condition that was declared in the description.

Now let assume you are saying: "I want to get the first and the last characters for sequence of strings."

The problem here is that the problem definition is _flawed_. Go ask any non C developer what the first character of an empty string is. The reality is a text, null termination is an artifact of the C implementation. Before you wanted to compare 'front()' and 'back()' with different languages. How many programming languages have null termination, in how many programming languages can you do 's[0]' blindly for a string?

There is no any pre-conditions. What to do in this case? Let assume that in this case it is not important whether some strings are empty or not. You have the container std::vector<std::string> and have to build the corresponding container std::vector<std::pair<char, char>>. So you even do not know what to do.:)

:) This is sooo untrue. There is the implicit precondition that you are not stating here that the strings are null terminated. Just consider giving that same problem to a developer in a different language.

Well, David Rodriguez Ibeas suggested to use the following functionality

[]( const std::string &w ) -> std::pair<char,char>
{
return w.empty() ? std::pair<char,char>{0,0} : std::pair<char,char>{w.front(), w.back()};
} );

But why did he decide that {0, 0} shall be returned in case of an empty string?! Why not {-1, -1} or even {'$', '$'}?

I tried to model what you *wanted* and was not explicit in the problem, exactly because you did not offer an alternative. I believe I even went further and said that your approach would solve this particular definition of the problem but not any other definition in which a different pair was to be generated.

So one programmer will use {0,0}, the second programmer will use {-1, -1} the third will use {'$', '$'} and so on. The notions of front() and back() for an empty string is not defined in the Standard.

Exactly right: each programmer, given a well formed problem will choose to generate code that complies with the requirements. In some cases you might want to use '\0' while in others a different form of separator. Unless you understand that your problem description is incomplete in assuming that strings are null terminated you won't understand this part. Give the same task to someone programming in a different language. Would you state the same problem in a different way in Java?

So it looks like this simple task is insoluble.

Well, I believe I provided a rather simple solution to it, which you might not like, but does it not yield the exact same set of std::pair<char,char> that you wanted (and did not formally state)?
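
A fuller sketch of that solution, assuming a hypothetical helper named first_last_pairs and a caller-chosen sentinel parameter (neither appears in the thread's code); it is one possible shape, not the only one:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps each string to its (first, last) characters,
// using `sentinel` for empty strings. The sentinel is the caller's choice.
std::vector<std::pair<char, char>>
first_last_pairs(const std::vector<std::string>& v, char sentinel = '\0') {
    std::vector<std::pair<char, char>> out;
    out.reserve(v.size());
    std::transform(v.begin(), v.end(), std::back_inserter(out),
                   [sentinel](const std::string& w) {
                       // The emptiness check keeps front()/back() within their contract.
                       return w.empty() ? std::make_pair(sentinel, sentinel)
                                        : std::make_pair(w.front(), w.back());
                   });
    return out;
}
```

Passing '\0' as the sentinel reproduces the {0,0} result discussed above; a different requirement only changes the argument, not the algorithm.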

If s[0], front() and back() would be equivalent for empty strings (and they shall be equivalent) then there is no any problem. The task is being done using standard algorithm std::transform

std::vector<std::string> v;

// filling the vector from some file that can contain empty records.

std::vector<std::pair<char, char>> v2;
v2.reserve( v.size() );

std::transform( v.begin(), v.end(), std::back_inserter( v2 ), [](const std::string &s ) -> std::pair<char, char> { return { s.front(), s.back() }; } );

Why did you decide that I shall skip empty strings? Did I ask you about this? Is there some pre-conditions in my description of the task? No. there is not. I am satisfied with the result.

There *are* preconditions, the precondition that strings are null terminated.

If I need to skip empty strings I can do that further while processing the new vector using expression

if ( p.first ) { /* some code */ }

:) Bug report, you fail to apply 'some code' to my perfectly valid string:

std::string s(10,' '); s[0] = 0;

Why did you not apply 'some code' there! Why did you assume that strings would not contain a null? Did I tell you to ignore this case?
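
To make the bug report concrete, here is a small sketch. The function names are made up for illustration; the point is only that a std::string of size 10 can have '\0' as its first element, so testing the first character cannot distinguish it from an empty string.

```cpp
#include <cassert>
#include <string>

// Builds the string from the post: ten spaces with the first replaced by '\0'.
inline std::string make_string_with_leading_null() {
    std::string s(10, ' ');
    s[0] = '\0';
    return s;
}

// The proposed test: treat a zero first character as "empty".
// Reading s[0] through a const reference is valid even when s.empty().
inline bool looks_empty_by_first_char(const std::string& s) {
    return s[0] == '\0';
}
```

Both an empty string and the ten-character string above satisfy looks_empty_by_first_char, although only one of them satisfies s.empty().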

This expression is valid because I am sure that I am dealing with a text file and strings can not contain embedded zeroes.

Good, so there are even more preconditions that you failed to mention in your problem definition! So much for 'Is there some pre-conditions in my description of the task? No'

So there are two approaches
On the one hand there are unsafe functions that have confusing semantic because they are not equivalent to s[0] and provide unpredictable behaviour in case when a string is empty because it is not clear what to return for empty strings. They are sources of numerous errors. One programmer does not know that these functions have undefined behavior for empty strings. Other programmer knows that but forgot to insert the check of empty strings. The third simply does not know what to return in case of empty strings.

On the other hand there are safe functions with clear consistent logic and predictable behavior. You need check whether a string empty? No problem! Use expression s.empty().

So either s[0], front() and back() are not defined for an empty container (as for example for std::vector) or if one of them is defined then the others also shall be defined.

I see two approaches, you use std::string that is designed to support nulls internally and for which 'front()' and 'back()' (together with '*begin()' and '*(end()-1)') are undefined behavior for an empty string, or you design a different string class and use it.

One of the things you keep saying is that your approach is 'safer', well, it is not necessarily safer. Consider that in my domain empty strings should never be passed to a function, and that with that in mind the function accesses `front()` directly (more probably '*begin()'). My standard library implementation can assert and let me know during debugging that there is an issue in the program that I can fix before going to production. Narrow contracts are not unsafer than wide contracts.
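
As a sketch of what such a narrow contract could look like in user code, assuming a hypothetical function first_char whose precondition is a non-empty string; the assert stands in for the kind of debug check a library implementation may perform:

```cpp
#include <cassert>
#include <string>

// Hypothetical function with a narrow contract: the caller must pass a
// non-empty string. The assert reports violations in debug builds and
// compiles away when NDEBUG is defined.
inline char first_char(const std::string& s) {
    assert(!s.empty() && "first_char requires a non-empty string");
    return s.front();
}
```

A wide-contract version would have to invent a return value for the empty case, which would hide the caller's mistake instead of reporting it.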

Chris Jefferson

Apr 23, 2013, 3:05:47 PM

to std-dis...@isocpp.org

On 23/04/13 19:55, Vlad from Moscow wrote:

On Tuesday, April 23, 2013 10:45:39 PM UTC+4, Chris Jefferson wrote:
On 23/04/13 19:19, Vlad from Moscow wrote:
>
> I simply do not see any serious arguments except unwillingness to
> confess the wrong approach.
> Well I will try to explain.
> For any indexed sequence in any programming language the notion of
> front() is defined as the notion s[0] (let assume that indexes start
> from 0). If the expression s[0] is valid then the expression front()
> is also valid and vice versa. These two expressions are interchangeable.
> The same way is defined the notion of back(). It is defined as an
> expression with the maximum index n for which s[n] is valid. If there
> is no such an index that greater than 0 then it means that back() is
> equivalent to s[0] provided that the expression s[0] is valid.

The standard promises that s[size()] is valid, and always returns
charT(). By this argument, do you want 'back()' to always return
charT()? This does not seem like it would be useful (and would be a
breaking change).

Yes I want that if s[0] is valid for an empty object of std::string then s.front() and s.back() shall be also valid. As I think I demonstrated that in this case we will have a predictable behavior that allow to write generic code.

So to confirm, you want (for std::string), s.back() to always return '\0'? (as that is always the value of s[size()], the last de-referencable value in the string).

For consistency, you would then also have to also include that null terminator when people iterated using 'begin()' and 'end()', so when iterating from 'begin()' to 'end()' you got the null terminator of the std::string.

This would be a complete re-design of all the functions of std::string, to include the C null terminator, which would break almost every program which uses std::string.
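
A small sketch of the current behaviour being described, using a hypothetical helper iterated_length: the range [begin(), end()) covers exactly size() characters, and the terminator is reachable only through s[size()].

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: counts the characters visited when iterating
// from begin() to end(). The terminator is not part of this range.
inline std::size_t iterated_length(const std::string& s) {
    return static_cast<std::size_t>(std::distance(s.begin(), s.end()));
}
```

For a const std::string s = "abc", iterated_length(s) is 3, while s[s.size()] still reads as '\0'.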

Chris

David Rodríguez Ibeas

Apr 23, 2013, 3:08:59 PM

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 2:50 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

There is such a language as C that considers character arrays as empty if the first character is the terminating zero.:)

If there would not be such conception then C++ would not have container std::string. It would suggest to use std::vector<char> for character arrays.:)

This is a rather absurd statement that has been addressed before. 'std::string' has a much richer interface than 'std::vector' it does not exist to provide a null terminator, it exists because it offers operations that make sense inside a string but not a general container.

You seem to be fixated on C. See, one of the features of C++ is that it is almost a superset of C, you can still write the code you want with C strings, but don't try to force your C-null-terminated approach on all other people that don't want to go that way!

David Rodríguez Ibeas

Apr 23, 2013, 3:12:56 PM

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 2:55 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

Yes I want that if s[0] is valid for an empty object of std::string then s.front() and s.back() shall be also valid. As I think I demonstrated that in this case we will have a predictable behavior that allow to write generic code.

And I assume that *s.begin() and *s.end() should also be valid? You can define the contract of 'front' in terms of 'begin' in the same way that it is expressed in terms of 'operator[]'. But now the change means that 'end()' is suddenly a dereferenceable pointer, do you really want that? Or is it perfectly fine to have the inconsistency between 'front()' and '*begin()' but not between 'operator[]' and 'front()'? What about 'at()'?

Daniel Krügler

Apr 23, 2013, 3:14:13 PM

to std-dis...@isocpp.org, corn...@google.com

2013/4/23 Vlad from Moscow <vlad....@mail.ru>:

>
> I simply do not see any serious arguments except unwillingness to confess the
> wrong approach.
>
> Well I will try to explain.
>
> For any indexed sequence in any programming language the notion of front()
> is defined as the notion s[0] (let assume that indexes start from 0). If the
> expression s[0] is valid then the expression front() is also valid and vice
> versa. These two expressions are interchangeable.

Note that s[0] for an empty basic_string still is restricted compared
to other index accesses < size(), since modifying the character is
undefined behaviour. This is really an extreme borderline case and as
others have already noticed, the at() functions would throw an
exception here, so this is clearly not comparable with other indexed
positions. I would argue here that front() is well-defined if at(0)
does not throw an exception. I see no good reason to extend this
special access any further, this would give the wrong signal to the
user that read or potentially write access might be freely allowed.
Given the potential danger of misusing this facility I see not a
considerable win of the suggested extension for the user. Furthermore,
in any generic context, this special rule would not be applicable to
any other container-like type. This functionality might be useful to
you and probably to several other people. Nonetheless the Standard
Library does not standardize *everything* that *can* be useful.
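
The difference between operator[] and at() at position size() can be checked with a sketch like the following. The helper at_throws is hypothetical; the behaviour of at() and of the const operator[] is as specified for basic_string.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
inline bool at_throws(const std::string& s, std::string::size_type pos) {
    try {
        (void)s.at(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For an empty const std::string e, e[0] reads as '\0' while at_throws(e, 0) is true, which is the asymmetry pointed out above.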

> The same way is defined the notion of back(). It is defined as an
> expression with the maximum index n for which s[n] is valid. If there is no
> such an index that greater than 0 then it means that back() is equivalent to
> s[0] provided that the expression s[0] is valid.

See above.

> Now let consider a regular arrays. They can not be declared as having zero
> elements. So for an empty array the notions a[0] and front are invalid. Also
> take into account that for any regular array back() is always corresponds to
> a[dimension - 1] if dimension is not equal to zero (that may not be).

I sympathize with your view, but basic_string is - more or less - a
container with a special end character that is read-only. The actual
elements are the characters before this end character. You can
consider this end-character as a special manifestation of a
past-the-end iterator value. The front()/back()/at functions are
supposed to model the access to the *elements* and the special
past-the-end character does not belong to these elements.

> So similar to the behaviour of regular arrays any empty object of
> std::vector has undefined expressions v[0] and v.back().

Yes.

> The other situation with character arrays. Again we may not define a
> character array with the zero dimension. However even if a character array
> has the dimension that is greater than zero we can say that a character
> array is empty if its first byte is the zero-terminating byte. So even for
> so-called empty character arrays expression s[0] is valid.

But the validity of access is not the same as for other characters,
because it is read-only. This is very similar to the past-the-end
iterator where no guarantee exists that you can dereference it.

> Class std::string is created that simulate character arrays and string
> literals. If we are saying that expression s[0] is valid then it means that
> expressions s.front() and s.back() is valid because they are
> interchangeable.

Keep in mind that we also have at(0) and this is a counter example to
your model.

> Now let assume you are saying: "I want to get the first and the last
> characters for sequence of strings."
>
> There is no any pre-conditions.

There is one, that you shall not modify the value of this past-the-end
character.

> This expression is valid because I am sure that I am dealing with a text
> file and strings can not contain embedded zeroes.

In your example they may not, but basic_string is designed to support
embedded zero characters.

> So either s[0], front() and back() are not defined for an empty container
> (as for example for std::vector) or if one of them is defined then the
> others also shall be defined.

I agree that consistency is useful, but sometimes we have to make
compromises when we have to consider the historic context or when we
take a broader view on this comparing with related concepts.
front()/back() are designed to specify the accessible limits of the
elements of a sequence and therefore the current result for string is
consistent (In addition to the outcome of at), because the final
null-terminator is no element. If we would attempt to enforce the
ideal at all costs, I would vote for making the access s[0] for an
empty string s invalid. But for historic reasons this has been decided
to be valid. Changing this now would break the contract of a large
amount of code and at least to me the win in consistency is not worth
this risk. If you find the inconsistency so repellent, I can only
suggest to (a) attempt to view on basic_string as *if* s[0] is invalid
for an empty string or (b) to provide your own helper functions that
fix the wrong view of basic_string for you.

- Daniel

Vlad from Moscow

Apr 23, 2013, 4:07:35 PM

to std-dis...@isocpp.org, dib...@ieee.org

On Tuesday, April 23, 2013 11:04:56 PM UTC+4, David Rodríguez Ibeas wrote:

On Tue, Apr 23, 2013 at 2:19 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

I simply do not see any serious arguments except unwillingness to confess the wrong approach.

"Wrong" is your statement, where everyone else seems to consider yours to be the "wrong" approach.

For any indexed sequence in any programming language the notion of front() is defined as the notion s[0] (let assume that indexes start from 0). If the expression s[0] is valid then the expression front() is also valid and vice versa. These two expressions are interchangeable.

Incorrect, 'front()' is the _first_ element in the sequence, 'back()' is the _last_ element in the sequence. In most containers 'front()' is exactly the same as '[0]' and can be implemented as such. But do not confuse the semantics with the implementation. This is what others have been trying to explain. The std::string is designed as a type that stores a sequence of characters (including possible nulls) and can be empty. That is the *design*.

s[0] is also the _first_ element in a sequence, is it not? :) Well, what does the _last_ element mean? Is the valid expression s[n] with the maximum index n the _last_ element of an indexed sequence? :) What you are saying is not an argument. It is a play on words. You are trying to introduce some notion of the _last_ element that is not expressed through a valid expression s[n].

If you are speaking about semantics then it is very simple: if expression s[0] is valid for an empty sequence then expressions s.front() and s.back() are also valid. Otherwise you are trying only to confuse users.

Well, in fact function front() can not be used alone in generic code. It always shall be accompanied by the condition !s.empty(), shall it not? So if you want to have safe code you need to substitute s[0] for it everywhere in the code. In this case you will have a predictable behavior: if a string is not empty you will get an actual character, otherwise you will get 0. It means only that member function front is totally useless.

Well you can substitute front() for s[0]. But what about back()?:) I do not see any other way that to write

s[s.size() == 0 ? 0 : s.size() -1]

The same way is defined the notion of back(). It is defined as an expression with the maximum index n for which s[n] is valid. If there is no such an index that greater than 0 then it means that back() is equivalent to s[0] provided that the expression s[0] is valid.

Reread your sentence, and try to determine whether it makes sense. "back() is equivalent to s[0] provided that the expression s[0] is valid". This is a very limited definition of what the general notion of 'back()' is considering that in all other situations an empty container does not have a 's[0]'.

The definition of your 'front()' and 'back()' concepts differ from what everyone else considers, and they are not better. In the standard case you have to special case accessing the element at the front and the back (this is really what 'front()' and 'back()' mean) for empty strings because an empty string does not contain any element in the front or the back. It does not contain any element period.

It looks like you do not understand what you are writing. In fact you are writing what I already said: that the functions front() and back() are useless. :) Think about this.

Now let us consider regular arrays. They cannot be declared with zero elements. So for an empty array the notions a[0] and front are invalid. Also take into account that for any regular array back() always corresponds to a[dimension - 1] if dimension is not equal to zero (which cannot happen).

Wrong:

int *p = new int[0];

You have a pointer here. You have no direct access to the array. You may not place an element in this array. So this example does not contradict what I said.

Now try to apply your rationale to this case. There are no elements, accessing p[0] is undefined behavior, as is accessing p[0-1].

So similar to the behaviour of regular arrays any empty object of std::vector has undefined expressions v[0] and v.back().

See code above

The other situation with character arrays. Again we may not define a character array with the zero dimension. However even if a character array has the dimension that is greater than zero we can say that a character array is empty if its first byte is the zero-terminating byte. So even for so-called empty character arrays expression s[0] is valid.

Null terminated strings, which are *not* just character arrays, have this behavior, yes.

Class std::string was created to simulate character arrays and string literals. If we say that the expression s[0] is valid then the expressions s.front() and s.back() are valid too, because they are interchangeable.

The class std::string has been designed to represent a _string_, not a character array, not a null terminated C string. It has been adapted to provide some level of compatibility, but that is not the purpose of the type.

That is your personal opinion. I think the main purpose was to make the use of character arrays safe in C++.

Now let us return to the simple task of creating a container of type std::vector<std::pair<char, char>> that is built from some other container of type std::vector<std::string> where some strings can be empty.

If you are describing some task and the description contains some pre-condition then these pre-conditions should be reflected in the corresponding code.

For example you are saying: "I want for a non-empty string to get the first and the last characters."

In this case the corresponding code could look the following way

std::pair<char, char> p;

if ( !s.empty() ) p = std::make_pair( s.front(), s.back() );

Here !s.empty() is the pre-condition that was declared in the description.

Now let assume you are saying: "I want to get the first and the last characters for sequence of strings."

The problem here is that the problem definition is _flawed_. Go ask any non C developer what the first character of an empty string is. The reality is a text, null termination is an artifact of the C implementation. Before you wanted to compare 'front()' and 'back()' with different languages. How many programming languages have null termination, in how many programming languages can you do 's[0]' blindly for a string?

It is not important how many languages use this idiom. It is important that s[0] is valid expression for empty string in C and C++.

There are no pre-conditions. What to do in this case? Let us assume that in this case it is not important whether some strings are empty or not. You have the container std::vector<std::string> and have to build the corresponding container std::vector<std::pair<char, char>>. So you do not even know what to do. :)

:) This is sooo untrue. There is the implicit precondition that you are not stating here that the strings are null terminated. Just consider giving that same problem to a developer in a different language.

I never said that the strings are null-terminated. That is your fantasy. I relied on the C++ Standard, which says that s[0] returns '\0' for empty strings. And I suggested that the code would be safe if front() and back() also returned '\0' for empty strings. The main idea of the proposal is that s[0], front() and back() would be equivalent for empty strings. It is not even important what value they return. It is much more important that the behavior be defined and predictable. At present the behaviour is undefined and unpredictable, and there is no exception or other means of signalling an error.

Well, David Rodriguez Ibeas suggested to use the following functionality

[]( const std::string &w ) -> std::pair<char,char>
{
return w.empty() ? std::pair<char,char>{0,0} : std::pair<char,char>{w.front(), w.back()};
} );

But why did he decide that {0, 0} shall be returned in case of an empty string?! Why not {-1, -1} or even {'$', '$'}?

I tried to model what you *wanted* and was not explicit in the problem, exactly because you did not offer an alternative. I believe I even went further and said that your approach would solve this particular definition of the problem but not any other definition in which a different pair was to be generated.

Do not model. Say what you are suggesting. What is your alternative? And are you sure that other programmers will use your alternative? How many such alternatives are there? :)

So one programmer will use {0,0}, the second programmer will use {-1, -1}, the third will use {'$', '$'} and so on. The notions of front() and back() for an empty string are not defined in the Standard.

Exactly right: each programmer, given a well formed problem will choose to generate code that complies with the requirements. In some cases you might want to use '\0' while in others a different form of separator. Unless you understand that your problem description is incomplete in assuming that strings are null terminated you won't understand this part. Give the same task to someone programming in a different language. Would you state the same problem in a different way in Java?

Once more: I want the behavior to be predictable by default. Do not tell me stories about many cases.

So it looks like this simple task is insoluble.

Well, I believe I provided a rather simple solution to it, which you might not like, but does it not yield the exact same set of std::pair<char,char> that you wanted (and did not formally state)?

No, it is not the same pair as you think. My pair is generated from the values of back() and front(). These values are predefined according to my proposal. Your values are arbitrary and nobody can rely on them, because in some other part of the project another programmer can use other values.

If s[0], front() and back() were equivalent for empty strings (and they should be equivalent) then there would be no problem. The task is done using the standard algorithm std::transform

std::vector<std::string> v;

// filling the vector from some file that can contain empty records.

std::vector<std::pair<char, char>> v2;
v2.reserve( v.size() );

std::transform( v.begin(), v.end(), std::back_inserter( v2 ), [](const std::string &s ) -> std::pair<char, char> { return { s.front(), s.back() }; } );

Why did you decide that I shall skip empty strings? Did I ask you about this? Is there some pre-conditions in my description of the task? No. there is not. I am satisfied with the result.

There *are* preconditions, the precondition that strings are null terminated.

It is not a precondition and strings are not null terminated as you think. I am suggesting that s[0], front() and back behave the same way and would return the same value for empty strings.

If I need to skip empty strings I can do that further while processing the new vector using expression

if ( p.first ) { /* some code */ }

:) Bug report, you fail to apply 'some code' to my perfectly valid string:

std::string s(10,' '); s[0] = 0;

I already explained that this string is not perfectly valid because you are dealing with a text file.

Why did you not apply 'some code' there! Why did you assume that strings would not contain a null? Did I tell you to ignore this case?

Yes, I ignore this case. You have invalid data, so you should check the input. It is not the problem of this task. The error occurred somewhere else. For example, the input data was corrupted and this situation was not handled.

This expression is valid because I am sure that I am dealing with a text file and strings can not contain embedded zeroes.

Good, so there are even more preconditions that you failed to mention in your problem definition! So much for 'Is there some pre-conditions in my description of the task? No'

Please do not count "preconditions". It means only that you have nothing to contradict.

So there are two approaches
On the one hand there are unsafe functions with confusing semantics, because they are not equivalent to s[0] and behave unpredictably when a string is empty, since it is not clear what to return for empty strings. They are a source of numerous errors. One programmer does not know that these functions have undefined behavior for empty strings. Another programmer knows that but forgets to insert the check for empty strings. A third simply does not know what to return in the case of empty strings.

On the other hand there are safe functions with clear, consistent logic and predictable behavior. Need to check whether a string is empty? No problem! Use the expression s.empty().

So either s[0], front() and back() are not defined for an empty container (as for example for std::vector) or if one of them is defined then the others also shall be defined.

I see two approaches, you use std::string that is designed to support nulls internally and for which 'front()' and 'back()' (together with '*begin()' and '*(end()-1)') are undefined behavior for an empty string, or you design a different string class and use it.

You are wrong. s[0] is already defined. I only want that back and front would have the same behaviour as s[0] for empty strings. All others will be the same.

One of the things you keep saying is that your approach is 'safer', well, it is not necessarily safer. Consider that in my domain empty strings should never be passed to a function, and that with that in mind the function accesses `front()` directly (more probably '*begin()'). My standard library implementation can assert and let me know during debugging that there is an issue in the program that I can fix before going to production. Narrow contracts are not unsafer than wide contracts.

I do not see any problem. Your library will be unchanged because it will always assert if an empty string is passed. :)

Vlad from Moscow

Apr 23, 2013, 4:09:10 PM4/23/13

to std-dis...@isocpp.org, dib...@ieee.org

I said clearly enough that all I want is that s[0], front() and back() have the same behavior for empty strings. Nothing more.

Ville Voutilainen

Apr 23, 2013, 4:28:55 PM4/23/13

to std-dis...@isocpp.org

Oops, accidental send before.

On 23 April 2013 23:07, Vlad from Moscow <vlad....@mail.ru> wrote:

Well, in fact the function front() cannot be used alone in generic code. It always has to be accompanied by the condition !s.empty(), does it not? So if you want safe code you need to replace it everywhere in the code with s[0]. In this case you will have predictable behavior: if the string is not empty you will get an actual character, otherwise you will get 0. This means only that the member function front() is totally useless.

No. That's not at all what it means. Just because front() doesn't perform the same checking as operator[] doesn't
make front() useless. It makes front() quite the opposite, it makes it very useful.

David Rodríguez Ibeas

Apr 23, 2013, 5:25:54 PM4/23/13

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 4:07 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

If you are speaking about semantics then it is very simple: if the expression s[0] is valid for an empty sequence then the expressions s.front() and s.back() are also valid. Otherwise you are only trying to confuse users.

Experience says otherwise. Try to find questions in C++ dev forums where people have wondered why s[0] is valid and s.front() is not. Having a 'front()' function that has a different behavior in std::string than std::vector or std::list and that does not return an element IN the string but a terminator *after* the string would be confusing. You can also say that yours and mine are just opinions.

Well, in fact the function front() cannot be used alone in generic code. It always has to be accompanied by the condition !s.empty(), does it not? So if you want safe code you need to replace it everywhere in the code with s[0]. In this case you will have predictable behavior: if the string is not empty you will get an actual character, otherwise you will get 0. This means only that the member function front() is totally useless.

Well, you can replace front() with s[0]. But what about back()? :) I do not see any other way than to write

What you say makes no sense whatsoever. You cannot write s[0] in *generic* code ever without checking whether the container 's' is empty or not. You keep calling your very specific use case *generic*. No, the fact that 's[0]' yields a reference is not *generic* it is quite the opposite a weird particularity of std::string for C compatibility.

Why don't you take a few seconds and provide a description of what 'front()' and 'back()' mean to you, in plain simple English, not in terms of the implementation. My definition is: 'front()' yields a reference to the first element in a container, 'back()' to the last element in the sequence. Both of them are built on the precondition that such an element exists.

I imagine you thinking: 'well yes, the null terminator'. That is what you are not quite understanding here. The null terminator is NOT an element of the std::string. It is a weird artifact for C compatibility. In literals, the null terminator is an implementation detail. If you ask a person outside of C to describe what the first character of a word with no characters is, they will look at you funny: what do you mean, the first of none?

You have a pointer here. You have no direct access to the array. You may not place an element in this array. So this example does not contradict what I said.

I have no access to the array? What is *p? What is p[0]? You do realize that in both C and C++ the access operator is defined in terms of pointers, not arrays, right? When you do 'int array[10]' and you later type 'array[1]', the array *decays* to a pointer, no different from 'p' in the example I provided ('int *p = new int[0];'), and then the [] is applied.

The class std::string has been designed to represent a _string_, not a character array, not a null terminated C string. It has been adapted to provide some level of compatibility, but that is not the purpose of the type.

That is your personal opinion. I think the main purpose was to make the use of character arrays safe in C++.

Ok, so I have an opinion: std::string models a possibly empty sequence of characters. Your opinion is that std::string models a null terminated sequence of characters. The template 'std::basic_string<>' predates C++11, in the previous versions of the standard the restriction was stronger than it is now, and you were only allowed to call the 'const' overload of 'operator[]' with an argument of 'size()'. That is:

std::string s;
const std::string &cr = s;
cr[0]; // Ok, returns a reference to a const charT()
s[0]; // Undefined behavior pre C++11

Now can you explain how the original design of 'std::string' modelled null termination when you could not use the non-const operator[] to reach the terminator?
Why, if it models the null terminator, is '*end()' undefined behavior (you claimed before that you don't want to change this)? How can std::string model a null terminated string and still allow for embedded null characters?

The type std::string does NOT model a C null terminated string. It has quirks that enable some form of compatibility, but it is not just a null terminated string with automatic memory management as you seem to believe.
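The point about embedded null characters can be checked with a short sketch. The helper names `with_embedded_null` and `c_length` are made up for this illustration; only `std::string`, `size()`, `c_str()` and `std::strlen` come from the standard library.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: builds a three-character std::string whose middle
// character is '\0'. The length is passed explicitly, so the constructor
// does not stop at the embedded null.
inline std::string with_embedded_null()
{
    return std::string("a\0b", 3);
}

// Hypothetical helper: the length as a C function sees it. std::strlen
// stops counting at the first '\0' it meets.
inline std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

Here size() reports 3 while the C view of the same buffer reports 1, so a '\0' read from a std::string does not by itself mark the end of the string.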

It is not important how many languages use this idiom. It is important that s[0] is valid expression for empty string in C and C++.

Alternatively it is not important that s[0] is a valid expression (and a strange one, since it may yield for the non-const overload a non-const reference that cannot be changed without invoking undefined behavior!). Pointing at other languages is only natural since you started saying that everywhere else your model applies, but it just happens that it does not. Only C behaves as you want.

I never said that the strings are null-terminated. That is your fantasy. I relied on the C++ Standard, which says that s[0] returns '\0' for empty strings. And I suggested that the code would be safe if front() and back() also returned '\0' for empty strings. The main idea of the proposal is that s[0], front() and back() would be equivalent for empty strings. It is not even important what value they return. It is much more important that the behavior be defined and predictable. At present the behaviour is undefined and unpredictable, and there is no exception or other means of signalling an error.

Oh, right you didn't. You just said "I want to get the first and the last characters for sequence of strings." except that there is no such thing as the first and last character of an empty string (in other than C or your personal view). As a matter of fact, the C++ standard is quite clear in that 'size()' returns the number of characters in a string. For an empty string it returns 0, thus an empty string has NO characters. There is no *first* or *last* characters of a set of exactly *no* characters. I insist, your problem definition is *flawed*, it works only under a set of assumptions that you are making but are different from what the standard says.

I tried to model what you *wanted* and was not explicit in the problem, exactly because you did not offer an alternative. I believe I even went further and said that your approach would solve this particular definition of the problem but not any other definition in which a different pair was to be generated.

Do not model. Say what you are suggesting. What is your alternative? And are you sure that other programmers will use your alternative? How many such alternatives are there? :)

Let me rephrase that. I tried to write a piece of code that would generate exactly the same result that you were expecting given, not the problem definition, but your proposed solution. And I did that only to demonstrate that you were wrong in stating that the code has to be much more complex than what it needs to be if your approach was accepted.

Once more: I want the behavior to be predictable by default. Do not tell me stories about many cases.

Making it illegal to access 'front' and 'back' on an empty std::string is VERY predictable. The solution I provided is amazingly predictable, it will produce the exact same result as your solution. How is that code more or less predictable? You don't want predictability, you want to hammer your solution to a problem that most people don't seem to have into the language.

Well, I believe I provided a rather simple solution to it, which you might not like, but does it not yield the exact same set of std::pair<char,char> that you wanted (and did not formally state)?

No, it is not the same pair as you think. My pair is generated from the values of back() and front(). These values are predefined according to my proposal. Your values are arbitrary and nobody can rely on them, because in some other part of the project another programmer can use other values.

How is a charT() read from the string different from a charT() written by the lambda? In neither case does it need to be generated from the *string*. A perfectly fine implementation of a std::string class might not allocate any memory if the string is empty (I actually expect this to be the case!) and just store a null pointer (together with the size, capacity and any other bookkeeping information). In that implementation 'operator[]' might give you '*reinterpret_cast<charT*>(&m_size)'. In both cases the pair contains two characters with the value 0 that are indistinguishable from each other. The only difference is that in one case you want 'std::string' to produce the value and in the other I generated it in the lambda, but where the code lives is irrelevant.
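To make the comparison concrete, here is a minimal sketch of both variants side by side. The names `CharPair`, `ends_guarded` and `ends_indexed` are invented for this example, and the second variant assumes the quoted rule that reading s[s.size()] yields charT().

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

using CharPair = std::pair<char, char>;

// Variant 1: the guarded lambda. The pair for an empty string is written
// out explicitly; front() and back() are only called on non-empty strings.
inline std::vector<CharPair> ends_guarded(const std::vector<std::string>& v)
{
    std::vector<CharPair> out;
    out.reserve(v.size());
    std::transform(v.begin(), v.end(), std::back_inserter(out),
                   [](const std::string& w) -> CharPair {
                       return w.empty() ? CharPair{'\0', '\0'}
                                        : CharPair{w.front(), w.back()};
                   });
    return out;
}

// Variant 2: lets operator[] produce the value for empty strings by
// reading s[s.size()], which is s[0] when the string is empty.
inline std::vector<CharPair> ends_indexed(const std::vector<std::string>& v)
{
    std::vector<CharPair> out;
    out.reserve(v.size());
    std::transform(v.begin(), v.end(), std::back_inserter(out),
                   [](const std::string& s) -> CharPair {
                       return {s[0], s[s.empty() ? 0 : s.size() - 1]};
                   });
    return out;
}
```

For any input both functions produce the same vector of pairs; they differ only in where the zero for an empty string comes from.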

There *are* preconditions, the precondition that strings are null terminated.

It is not a precondition and strings are not null terminated as you think. I am suggesting that s[0], front() and back behave the same way and would return the same value for empty strings.

Again, let me state it in a different way. Your problem definition asks for characters IN the string, and you are assuming that there is a 0 beyond the end of the string, which is not true. What is true is that operator[] must give a 0 in some particular cases, and it is also true that you want that behavior for 'front()' and 'back()', but those would not be the "first" and "last" characters in the string, those are just plain 0s, not characters IN the string.

I already explained that this string is not perfectly valid because you are dealing with a text file.

You claimed a couple of times that the problem was correctly and fully stated, I don't recall reading anywhere that the strings in the std::vector<std::string> did not have any null characters.

Yes, I ignore this case. You have invalid data, so you should check the input. It is not the problem of this task. The error occurred somewhere else. For example, the input data was corrupted and this situation was not handled.

Correct, as with 'front()': if you call it with an empty string, "You have invalid data, so you should check the input. It is not the problem of this task."

One of the things you keep saying is that your approach is 'safer', well, it is not necessarily safer. Consider that in my domain empty strings should never be passed to a function, and that with that in mind the function accesses `front()` directly (more probably '*begin()'). My standard library implementation can assert and let me know during debugging that there is an issue in the program that I can fix before going to production. Narrow contracts are not unsafer than wide contracts.

I do not see any problem. Your library will be unchanged because it will always assert if an empty string is passed. :)

No, if the standard mandates a behavior for 'front' and 'back', my library is NOT allowed to assert if 'front' and 'back' are called on an empty string. Unless what you intend is the standard changing to cater to your particular use case and me having to implement my own std::string to be able to assert :) Not that we don't have our own implementation of string in which we do assert... but why would the standard cater to your need and not my assert?
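As an illustration of what a narrow contract permits, here is a sketch of a debug-checked accessor. The free function `checked_front` is hypothetical and stands in for what a library implementation could do inside front() itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical accessor with a narrow contract: calling it on an empty
// string is a programming error. In a debug build the assert reports the
// error; when NDEBUG is defined the check is compiled out.
inline char checked_front(const std::string& s)
{
    assert(!s.empty() && "checked_front requires a non-empty string");
    return s.front();
}
```

If the standard instead required front() to return charT() for an empty string, a check like this could no longer live inside front(), because the call would be valid.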

If we ignore the details, the reality is that:

- std::string does not model a null terminated string, it never did (allows internal nulls, size() does not count the terminator, *begin() does not yield the null terminator in an empty string, end() cannot be dereferenced)

- std::string has a couple of quirks to simplify substitution of null terminated strings with the safer std::string. In particular in case you have an algorithm that operates on a C string and loops until s[i] is NULL, it does fake null termination. This is there to ease the transition, not to make std::string null terminated. See the point above.

David Rodríguez Ibeas

Apr 23, 2013, 5:31:09 PM4/23/13

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 4:09 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

And I assume that *s.begin() and *s.end() should also be valid? You can define the contract of 'front' in terms of 'begin' in the same way that it is expressed in terms of 'operator[]'. But now the change means that 'end()' is suddenly a dereferenceable pointer, do you really want that? Or is it perfectly fine to have the inconsistency between 'front()' and '*begin()' but not between 'operator[]' and 'front()'? What about 'at()'?

I said clearly enough that all I want is that s[0], front() and back() have the same behavior for empty strings. Nothing more.

And you do realize that while you consider it inconsistent that s[0] is not equivalent to s.front(), others (me included) would consider it inconsistent if s.front() was not equivalent to *s.begin() or even s.at(0). If you are willing to accept an inconsistency, there is a current choice. Please, don't try to defend that your change fixes an inconsistency when it just moves it around. The way to make this completely consistent would be dropping the requirement that s[size()] yields a charT(), but that would break a good amount of existing code now. In the beginning it would have made the transition from C to C++ much more painful, and hindered the adoption of the language.

There are many similar quirks in the language to ease adoption that are not the *design* of the component, but rather a necessary evil to get C++ running back in the day.

Vlad from Moscow

Apr 23, 2013, 7:04:03 PM4/23/13

to std-dis...@isocpp.org

The general case is the case when you cannot guarantee that a sequence will not contain empty strings. So because front() and back() cannot be used in the general case they are totally useless. Moreover they cannot be used in generic code, again for the reason pointed out above.

All I propose is that the standard say

For front()

"9 Effects: Equivalent to operator[](0)" (the C++ Standard).

For back()

"11 Effects: Equivalent to operator[](size() - 1)." (the C++ Standard )

if size() != 0 (my insertion). "Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior." (the C++ Standard, from the description of operator [])

Vlad from Moscow

Apr 23, 2013, 7:11:00 PM4/23/13

to std-dis...@isocpp.org, dib...@ieee.org

I advise you to read the Standard where these functions are declared. For example, read the description of operator []

2 Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior
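A short sketch of how the quoted wording plays out next to at(), which is discussed elsewhere in this thread. The helper names `read_past_last` and `at_throws` are hypothetical; the behavior of at() relied on here is that it throws std::out_of_range when pos >= size().

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: reads s[s.size()]. Per the quoted wording this
// yields a reference to an object with value charT(); it is only read here,
// never modified.
inline char read_past_last(const std::string& s)
{
    return s[s.size()];
}

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
inline bool at_throws(const std::string& s, std::string::size_type pos)
{
    try {
        (void)s.at(pos);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

So the same position, pos == size(), gives '\0' through operator[] and an exception through at(), while front() and back() on an empty string are left undefined.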

The description is clear enough. There is only one inconsistency: the current description of the functions, which can be considered a serious standard defect.

Ville Voutilainen

Apr 23, 2013, 7:13:15 PM4/23/13

to std-dis...@isocpp.org

On 24 April 2013 02:04, Vlad from Moscow <vlad....@mail.ru> wrote:

No. That's not at all what it means. Just because front() doesn't perform the same checking as operator[] doesn't

make front() useless. It makes front() quite the opposite, it makes it very useful.

General case is the case when you can not guarantee that a sequence will not contain empty strings. So because front and back can not be used in the general case they are totally useless. Moreover they can not be used in generic code. Again due to the

Thanks for sharing. This is not a defect, so we're not going to change the standard. Feel free to find
the next thing you want to talk about.

Vlad from Moscow

Apr 23, 2013, 7:21:30 PM4/23/13

to std-dis...@isocpp.org

As I said in the very beginning, I do not see any serious counter-evidence, and you confirmed that indeed there is no counter-evidence.

I demonstrated that the functions front() and back() are totally useless and unsafe, and that their usage should be avoided.

Ville Voutilainen

Apr 23, 2013, 7:24:29 PM4/23/13

to std-dis...@isocpp.org

On 24 April 2013 02:21, Vlad from Moscow <vlad....@mail.ru> wrote:

As I said in the very beginning, I do not see any serious counter-evidence, and you confirmed that indeed there is no counter-evidence.

I demonstrated that the functions front() and back() are totally useless and unsafe, and that their usage should be avoided.

I agree completely. We still won't change the standard, because there's no defect. front() and back() are
specified exactly as intended, as is op[]. No defect in any of them.

Vlad from Moscow

Apr 24, 2013, 3:47:50 AM4/24/13

to std-dis...@isocpp.org, corn...@google.com

On Tuesday, April 23, 2013 11:14:13 PM UTC+4, Daniel Krügler wrote:

2013/4/23 Vlad from Moscow <vlad....@mail.ru>:
>
> I simply do not see any serious arguments except unwillingness to confess the
> wrong approach.
>
> Well I will try to explain.
>
> For any indexed sequence in any programming language the notion of front()
> is defined as the notion s[0] (let assume that indexes start from 0). If the
> expression s[0] is valid then the expression front() is also valid and vice
> versa. These two expressions are interchangeable.

Note that s[0] for an empty basic_string still is restricted compared
to other index accesses < size(), since modifying the character is
undefined behaviour. This is really an extreme borderline case and as
others have already noticed, the at() functions would throw an
exception here, so this is clearly not comparable with other indexed
positions. I would argue here that front() are well-defined, if at(0)
does not throw an exception. I see no good reason to extend this
special access any further, this would give the wrong signal to the
user that read or potentially write access might be freely allowed.
Given the potential danger of misusing this facility I see not a
considerable win of the suggested extension for the user. Furthermore,
in any generic context, this special rule would not be applicable to
any other container-like type. This functionality might be useful to
you and probably to several other people. Nonetheless the Standard
Library does not standardize *everything* that *can* be useful.

In a generic context the function at() is also useless, because it throws an exception that only prevents processing a sequence in the normal way. So neither the function at() nor the functions front() and back() can be used in a generic context.

> The same way is defined the notion of back(). It is defined as an
> expression with the maximum index n for which s[n] is valid. If there is no
> such an index that greater than 0 then it means that back() is equivalent to
> s[0] provided that the expression s[0] is valid.

See above.

> Now let consider a regular arrays. They can not be declared as having zero
> elements. So for an empty array the notions a[0] and front are invalid. Also
> take into account that for any regular array back() is always corresponds to
> a[dimension - 1] if dimension is not equal to zero (that may not be).

I sympathize with your view, but basic_string is - more or less - a
container with a special end character that is read-only. The actual
elements are the characters before this end character. You can
consider this end-character as a special manifestation of a
past-the-end iterator value. The front()/back()/at functions are
supposed to model the access to the *elements* and the special
past-the-end character does not belong to these elements.

An empty string is a special case. So all other functions except front() and back() have special behavior in this case. For example, the function at() throws an exception. The operator function [] returns the terminating zero, which may not be changed. And only the functions front() and back() have undefined behaviour. Any function that has undefined behavior is useless in the generic case. That is obvious.

> So similar to the behaviour of regular arrays any empty object of
> std::vector has undefined expressions v[0] and v.back().

Yes.

> The situation is different with character arrays. Again, we may not define a
> character array with zero dimension. However, even if a character array
> has a dimension greater than zero, we can say that the character
> array is empty if its first byte is the zero-terminating byte. So even for
> so-called empty character arrays the expression s[0] is valid.

But the validity of access is not the same as for other characters,
because it is read-only. This is very similar to the past-the-end
iterator where no guarantee exists that you can dereference it.

I do not see any problem here.

> Class std::string was created to simulate character arrays and string
> literals. If we are saying that the expression s[0] is valid, then it means
> that the expressions s.front() and s.back() are valid, because they are
> interchangeable.

Keep in mind that we also have at(0) and this is a counter example to
your model.

No, the function at() is useless in the generic case because it throws an exception. It can be useful in a single operation or in a special case where we assume that empty strings cannot be present. But that is a very partial case.

> Now let us assume you are saying: "I want to get the first and the last
> characters for a sequence of strings."
>
> There are no preconditions.

There is one, that you shall not modify the value of this past-the-end
character.

You are mistaken. It is not a precondition of the task. It is the behavior of the function or functions.

> This expression is valid because I am sure that I am dealing with a text
> file and strings can not contain embedded zeroes.

In your example they may not, but basic_string is designed to support
embedded zero characters.

I do not see any problem. Well, basic_string can have embedded zeroes. So what? What is the problem? The idea is different: we will have predictable behavior of the functions and of sequence processing. We will not have to guess which character is used in the case of an empty string. If a zero character is encountered, we can always check whether the string is indeed empty. It would be much worse if there were no such predetermined character and every programmer invented his own, such as -1, '$', '#' or something else.

> So either s[0], front() and back() are not defined for an empty container
> (as for example for std::vector) or if one of them is defined then the
> others also shall be defined.

I agree that consistency is useful, but sometimes we have to make
compromises when we have to consider the historic context or when we
take a broader view by comparing with related concepts.
front()/back() are designed to specify the accessible limits of the
elements of a sequence and therefore the current result for string is
consistent (in addition to the outcome of at()), because the final
null terminator is not an element. If we attempted to enforce the
ideal at all costs, I would vote for making the access s[0] for an
empty string s invalid. But for historic reasons it has been decided
that this is valid. Changing this now would break the contract of a large
amount of code and, at least to me, the win in consistency is not worth
this risk. If you find the inconsistency so repellent, I can only
suggest (a) attempting to view basic_string as *if* s[0] were invalid
for an empty string or (b) providing your own helper functions that
fix the wrong view of basic_string for you.

I do not see any "historic context". If we are indeed to speak about historic context, then back() and front() should return the zero character in the case of an empty string. I have already shown how such functions could be written in C. That is the historic context.

As I said earlier several times, front() and back() are unsafe compared to at() or operator[] and simply useless. They cannot be used on their own. They always require a check that the string is not empty.

For the example I described, the only correct way to do the task is to replace front() with s[0] and back() with s[s.size() == 0 ? 0 : s.size() - 1]. But why should this code have to be written by hand instead of living inside front() and back()?

So currently, for this example, you have to write something like

[]( std::string &s ) -> std::pair<char, char>
{
    if ( s.empty() ) return std::make_pair( s[0], s[0] );
    else return std::make_pair( s.front(), s.back() );
}

Or you can write it as a single statement, for example

[]( std::string &s ) -> std::pair<char, char>
{
    return s.empty() ? std::make_pair( s[0], s[0] ) : std::make_pair( s.front(), s.back() );
}

But in any case this code looks bad.

It would be much better if the code could look like this

[]( std::string &s ) -> std::pair<char, char>
{
    return std::make_pair( s.front(), s.back() );
}

All this demonstrates the uselessness of back() and front().

And making the minor change to the standard that I am suggesting will not break any code, contrary to what you are saying.

- Daniel

Daniel Krügler

Apr 24, 2013, 4:13:20 AM4/24/13

to std-dis...@isocpp.org, corn...@google.com

2013/4/24 Vlad from Moscow <vlad....@mail.ru>:

>
>> > The situation is different with character arrays. Again, we may not define
>> > a character array with zero dimension. However, even if a character array
>> > has a dimension greater than zero, we can say that the character array is
>> > empty if its first byte is the zero-terminating byte. So even for
>> > so-called empty character arrays the expression s[0] is valid.
>>
>> But the validity of access is not the same as for other characters,
>> because it is read-only. This is very similar to the past-the-end
>> iterator where no guarantee exists that you can dereference it.
>>
> I do not see any problem here.

It depends on what you mean by "problem". All I'm trying to say here is
that the access to the past-the-end character of basic_string is not
equivalent to that of the elements of the string. This is a kind of
past-the-end value, not a conceptual element.

>> > Class std::string was created to simulate character arrays and string
>> > literals. If we are saying that the expression s[0] is valid, then it means
>> > that the expressions s.front() and s.back() are valid, because they are
>> > interchangeable.
>>
>> Keep in mind that we also have at(0) and this is a counter example to
>> your model.
>>
> No, the function at() is useless in the generic case because it throws an
> exception. It can be useful in a single operation or in a special case where
> we assume that empty strings cannot be present. But that is a very partial
> case.

Fundamentally, at() is the checking variant of operator[]. The only
difference is that you can rely on the exception, which makes this
well-defined. I don't consider this a partial case, and it is
consistent with the current specification of front() and back().
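
For illustration, the access paths on an empty string can be compared side by side. This is only a minimal sketch: the helper names `at_throws` and `demo_empty_access` are made up for this example and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
bool at_throws(const std::string& s, std::size_t pos) {
    try {
        (void)s.at(pos);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

void demo_empty_access() {
    const std::string empty;
    assert(at_throws(empty, 0));   // checked access: well-defined, throws
    assert(empty[0] == '\0');      // s[size()] yields charT(), read-only
    // empty.front() and empty.back() are deliberately not called here:
    // their precondition !empty() is violated, so the behavior is undefined.

    const std::string one = "a";
    assert(!at_throws(one, 0));
    assert(one.front() == 'a' && one.back() == 'a');
}
```

Only operator[] treats the terminator position as readable; at() and front()/back() both restrict themselves to the actual elements, one by throwing and the other by precondition.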

>> > Now let us assume you are saying: "I want to get the first and the last
>> > characters for a sequence of strings."
>> >
>> > There are no preconditions.
>>
>> There is one, that you shall not modify the value of this past-the-end
>> character.
>
> You are mistaken. It is not a precondition of the task. It is the behavior of
> the function or functions.

It is a requirement on the caller of these functions; in
this sense the function has a narrow contract. In theory an
implementation could detect this and could produce an "access
violation". This is very different from the actual elements of
a non-constant basic_string object, which can be freely changed.

> I do not see any "historic context". If we are indeed to speak about historic
> context, then back() and front() should return the zero character in the case
> of an empty string. I have already shown how such functions could be written
> in C. That is the historic context.
>
> As I said earlier several times, front() and back() are unsafe compared
> to at() or operator[] and simply useless. They cannot be used on their own.
> They always require a check that the string is not empty.
> For the example I described, the only correct way to do the task is to
> replace front() with s[0] and back() with
> s[s.size() == 0 ? 0 : s.size() - 1]. But why should this code have to be
> written by hand instead of living inside front() and back()?
> So currently, for this example, you have to write something like
>
> []( std::string &s ) -> std::pair<char, char>
> {
>     if ( s.empty() ) return std::make_pair( s[0], s[0] );
>     else return std::make_pair( s.front(), s.back() );
> }
>
> Or you can write it as a single statement, for example
>
> []( std::string &s ) -> std::pair<char, char>
> {
>     return s.empty() ? std::make_pair( s[0], s[0] )
>                      : std::make_pair( s.front(), s.back() );
> }
>
> But in any case this code looks bad.
>
> It would be much better if the code could look like this
>
> []( std::string &s ) -> std::pair<char, char>
> {
>     return std::make_pair( s.front(), s.back() );
> }
>
> All this demonstrates the uselessness of back() and front().

A special scenario like yours does not demonstrate a general
usefulness. The Library is not intended to support every useful
scenario, especially not, if the effects would cause much larger
inconsistencies as has been argued by several participants of this
discussion.

> And making this minor standard changing that I am suggesting will not break
> any code as you are saying.

It would cause several other inconsistencies and therefore I have not
heard any other participant of this thread voting for your change
suggestion. Change suggestions to the Standard are decisions based on
consensus. I don't see even the slightest consensus here. This means I
don't see even the slightest chance for a change at the moment.

Again, I'm emphasizing that I understand that your model is a valid
one in some scope, I'm not denying this. But please also understand
that the Standard won't change something just because there is a valid
model somewhere. In such cases where your model is important, the
general recommendation is to define your own type or to wrap the
standard type and fix this in the wrapper.

- Daniel

Chris Jefferson

Apr 24, 2013, 4:25:15 AM4/24/13

to std-dis...@isocpp.org

On 24/04/13 08:47, Vlad from Moscow wrote:
>
> ....

It seems clear to me that no-one on this mailing list other than you
wants this change, and your arguments are not persuading people, and our
arguments against are not convincing you. (Please do not reply saying
there are "no good arguments against", I believe my arguments are good,
and yours are bad. We have clearly reached an impassable position).

However, persuading people on this mailing list will make no difference
one way or the other to whether this change ends up in the standard.

If you really care about this, then instead of arguing on this mailing
list, you should write up a clear paper for the standards committee.

This thread has outlived its usefulness at this point, I feel.

Chris

Vlad from Moscow

Apr 24, 2013, 4:49:02 AM4/24/13

to std-dis...@isocpp.org, corn...@google.com

I see only one problem: all of you are unable to understand that my example is not a "special scenario". It is the general case, where a container can contain any strings, including empty ones. Even if you initially have a container without empty strings, empty strings can appear in it as a result of some processing. So the use of front() and back() is unsafe and useless. And I have demonstrated this.

Daniel Krügler

Apr 24, 2013, 4:53:33 AM4/24/13

to std-dis...@isocpp.org, corn...@google.com

Yes, I agree that this is just a problem of mine. Therefore I think it
is best, if I'm no longer trying to cause more confusion within this
thread.

- Daniel

David Rodríguez Ibeas

Apr 24, 2013, 9:54:37 AM4/24/13

to std-dis...@isocpp.org

On Tue, Apr 23, 2013 at 7:21 PM, Vlad from Moscow <vlad....@mail.ru> wrote:

As I said in the very beginning, I do not see any serious counter-evidence, and you confirmed that indeed there is no counter-evidence.

I demonstrated that functions front() and back() are totally useless, unsafe and their usage should be avoided.

This is an opinion and can easily be refuted by counterexample. I use 'front()' and 'back()' and they serve their purpose. Don't confuse 'useless' with 'does not solve the particular problem at hand'.

You insist that there is no good rationale against your change, but you fail to acknowledge that the exception is 'operator[]', and that the exception exists only to ease the transition from C-style strings, like:

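#include <cctype>

void str_to_upper(char* str) {
    for (int i = 0; str[i]; ++i) {
        // cast first: std::toupper is undefined for negative char values
        str[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(str[i])));
    }
}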

With the special casing of 'operator[]' the refactoring to work with 'std::string' is as simple as changing the type of the argument. That is the reason for that particular special behavior.
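
As a sketch of that refactoring, here is the same loop with a `std::string&` parameter. The index type is also switched to `std::size_t`, which is optional, and `demo_str_to_upper` is a name invented for this example. Note that the loop still stops at the first null character, so it only covers the whole string when there are no embedded nulls.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Same loop shape as the C version. The loop ends when str[i] is '\0'.
// For a string without embedded nulls that is str[str.size()], which
// operator[] is specified to return as a readable charT().
void str_to_upper(std::string& str) {
    for (std::size_t i = 0; str[i]; ++i) {
        str[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(str[i])));
    }
}

void demo_str_to_upper() {
    std::string word = "abc";
    str_to_upper(word);
    assert(word == "ABC");

    std::string empty;      // str[0] is the terminator, so the body never runs
    str_to_upper(empty);
    assert(empty.empty());
}
```

The terminator is only ever read here, never written, which is the one use the special case was meant to allow.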

Now the behavior of 'operator[]' is *strange* and might be surprising. Other than in your particular use case, there are no guarantees that a string won't contain null characters, so when the user of your proposed modified version calls 's.front()' and gets a null character, there is no information as to whether modifying that character will lead to undefined behavior. This in itself is problematic, as you are designing a function for everyone's use where it is not obvious whether it yields perfectly fine or undefined behavior. Yes, you can require that anyone who wants to modify the character must test for 'empty()' first, but the fact is that by guaranteeing that 'front()' will yield a correct value, the distinction between what is allowed and what is not becomes less clear. Yes, that is exactly the case with 'operator[]', with the difference that there it enabled a simpler transition from C strings, and on balance the latter became more important than the former. But that does not mean you want to keep sprinkling the interface with peculiarities: this yields a reference to either a character in the string or else a different character that is somewhere else and cannot be modified, so you are free to call it, but not to use the result as you please...

It is more consistent and simpler for most people to just place the preconditions on the interface of the function rather than on the value returned. On the function interface a precondition can easily be tested and asserted; on the returned value it cannot. Consider:

void change( char& ch );

std::string s;                    // may or may not be empty at this point
char a = 'a';
char * pch = &a;                  // perfectly legal in your proposal
if (condition) pch = &s.front();  // also perfectly legal in your proposal
... bunch of code ...
if (another_condition)
    change(*pch); // undefined behavior depending on the values of 's', 'condition' and 'another_condition'

The point where you might want to use the variable can be arbitrarily detached from the original string from which the character came. In the current standard, the library can assert when you call 's.front()' if the string is empty, and that will yield a clear error report in exactly the location where the issue arose. Something went wrong and you caught it. With your proposal, the two assignments to 'pch' won't trigger any alarm in a code review, as the behavior is guaranteed by the standard, and by the time you try to modify the character it is not clear where it came from. Furthermore, since it is a plain modification, the undefined behavior might not be easy to catch at this point and it might only surface even later.
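
What "the library can assert" might look like, as a sketch only: `checked_front` is a hypothetical free function standing in for a debug-mode front(), not how any particular standard library implements it. It just shows the precondition being checked at the call site.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a debug-mode front(): the precondition
// !empty() is checked where the call happens, not where the result is used.
char& checked_front(std::string& s) {
    assert(!s.empty() && "front() requires a non-empty string");
    return s.front();
}

void demo_checked_front() {
    std::string s = "hello";
    char& c = checked_front(s);   // fine: s is non-empty
    c = 'H';                      // modifying a real element is allowed
    assert(s == "Hello");
    // std::string e; checked_front(e);  // would stop right here in a debug build
}
```

Every reference that comes out of such a function refers to a real, modifiable element, so later code does not need to know where the reference came from.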

Consider for example an implementation (I already mentioned this) that stores a pointer and two integers, for the size and the capacity. A string is empty if size() == 0. If memory has already been acquired, then the pointer and capacity values will be non-zero, but if the string was initially empty, no memory has been allocated. This is a perfectly valid implementation and has the advantage that it skips a dynamic memory allocation if the string never gets a non-empty value. Now, to implement `operator[]` you need to return a reference to a `char` somehow. Well, simple hack: since the 'size' field is 0 and 'sizeof(char) < sizeof(std::size_t)', just return '*reinterpret_cast<char*>(&size)'; after all, the caller cannot change the value, right?

Now the problem is that in the code above the undefined behavior would manifest itself as a change in the 'size' of the string, and nothing else. Only much later, when the string is used, will it become an issue ('size>0', oh well, then the pointer must be valid!).
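
To make the failure mode concrete, here is a toy model of such an implementation. `ToyString` is entirely hypothetical and heavily simplified (no allocation, no copying); it is not taken from any real library. It assumes the proposed rule, under which front() behaves like (*this)[0].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy class: pointer, size and capacity, nothing allocated
// while the string is empty.
class ToyString {
public:
    ToyString() : data_(nullptr), size_(0), capacity_(0) {}

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    bool empty() const { return size_ == 0; }

    // For an empty string the "terminator" is borrowed from the
    // all-zero bytes of size_ itself.
    char& operator[](std::size_t pos) {
        if (size_ == 0) return *reinterpret_cast<char*>(&size_);
        return data_[pos];
    }

    // Under the proposed rule front() would be equivalent to (*this)[0].
    char& front() { return (*this)[0]; }

private:
    char* data_;
    std::size_t size_;
    std::size_t capacity_;
};

void demo_toy_string() {
    ToyString s;
    char& c = s.front();
    assert(c == '\0');    // reading looks perfectly fine
    c = 'x';              // the forbidden write lands inside size_
    assert(!s.empty());   // the string now claims to be non-empty
}
```

Nothing fails at the write itself. The damage only shows up when later code trusts size() and follows the still-null pointer.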

Again, you can claim that this is already the case with 'operator[]', but that does not make this less of a problem. We have an issue with 'operator[]' and your proposal wants to extend the problems to other functions in the interface. That is not making the class better, but actually worse.

David Rodríguez Ibeas

Apr 24, 2013, 11:16:12 AM4/24/13

to std-dis...@isocpp.org

On Wed, Apr 24, 2013 at 4:49 AM, Vlad from Moscow <vlad....@mail.ru> wrote:

I see only one problem: all of you are unable to understand that my example is not a "special scenario". It is the general case, where a container can contain any strings, including empty ones. Even if you initially have a container without empty strings, empty strings can appear in it as a result of some processing. So the use of front() and back() is unsafe and useless. And I have demonstrated this.

It *is* a special scenario in several respects. First, how often does the problem of extracting the first and last character of a string come up in real life? (Infrequent implies somehow special.) In how many cases would you want to generate a value for an empty string rather than, say, skip it? (This is just one of the options, so it is only a subset of the previous.) How many times, if you want to produce a value, does that value need to be 'std::pair<char,char>(0,0)'? (Why not 'std::pair<char,char>('^','$')'?) You are not aiming to solve a *general* problem, but a very particular one, and you want a change in the library that will make your *particular* use case simpler by just a couple of characters.

If the pair to be stored were not (0,0), then you would have to fall back to the code I produced. If empty strings were not to be processed, then you would have a similar check (if (!s.empty())...), if...

Vlad from Moscow

Apr 24, 2013, 11:47:15 AM4/24/13

to std-dis...@isocpp.org, dib...@ieee.org

On Wednesday, April 24, 2013 7:16:12 PM UTC+4, David Rodríguez Ibeas wrote:

On Wed, Apr 24, 2013 at 4:49 AM, Vlad from Moscow <vlad....@mail.ru> wrote:

I see only one problem: all of you are unable to understand that my example is not a "special scenario". It is the general case, where a container can contain any strings, including empty ones. Even if you initially have a container without empty strings, empty strings can appear in it as a result of some processing. So the use of front() and back() is unsafe and useless. And I have demonstrated this.

It *is* a special scenario in several respects. First, how often does the problem of extracting the first and last character of a string come up in real life? (Infrequent implies somehow special.)

It is totally unimportant "how often the problem of extracting the first and last character of a string comes up in real life", because any use of front() and back() is the extraction of either the first or the last character. So it is a global problem of these functions. It is not "a special scenario" for these functions.:)

In how many cases would you want to generate a value for an empty string rather than, say, skip it?

To process only non-empty strings is indeed the special scenario, because, as I already said, you cannot guarantee in the general case that a processed container that initially has no empty strings will not contain them as the result of some processing.

So using the unsafe functions front() and back(), with their undefined behavior in the case of an empty string, leads to errors.

(This is just one of the options, so it is only a subset of the previous.) How many times, if you want to produce a value, does that value need to be 'std::pair<char,char>(0,0)'? (Why not 'std::pair<char,char>('^','$')'?) You are not aiming to solve a *general* problem, but a very particular one, and you want a change in the library that will make your *particular* use case simpler by just a couple of characters.

Indeed, in some special scenario you can use some special characters, but if you want predictable behavior, the functions must provide it. '\0' is chosen because front() and back() should be equivalent to operator[] for empty strings. The empty string is a special case that is fixed in the standard.

If the pair to be stored were not (0,0), then you would have to fall back to the code I produced. If empty strings were not to be processed, then you would have a similar check (if (!s.empty())...), if...

See above.

David Rodríguez Ibeas

Apr 24, 2013, 1:20:06 PM4/24/13

to std-dis...@isocpp.org

We agree to disagree. Your definitions of *generic* and *safe* are different from mine. To me 'front()' and 'back()' are perfectly safe: they have preconditions you need to check, but they won't *seem to work* only to cause undefined behavior later on. Your approach makes the call to 'front()' safe, but the use of the returned object unsafe. It is not *safer*, it is *less safe*, since now the undefined behavior on misuse will be displaced from its cause.

It is perfectly *generic* to require testing preconditions. I already mentioned 'begin()', which you don't want to change. Is dereferencing the iterator returned by 'begin()' safe? Just as much as 'front()': it has preconditions that you must meet before using it.

What is not *safe* is blindly calling functions that might have preconditions. If you want *generic* code you must cater for that genericity, and that means testing preconditions. The solution I provided, which needs just a couple more characters on the user side than your solution, is more *generic* than what you want: it enables the use of different tokens, and it does not confuse strings that start or end with a null character with empty strings.
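
A sketch of how such a precondition-testing helper could look for a whole sequence of strings. The function name `ends_of` and its `if_empty` parameter are hypothetical, chosen only for this example; the caller picks the tokens reported for empty strings.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: first and last character of every string in v.
// Empty strings are reported as if_empty instead of touching front()/back().
std::vector<std::pair<char, char>>
ends_of(const std::vector<std::string>& v, std::pair<char, char> if_empty) {
    std::vector<std::pair<char, char>> result;
    result.reserve(v.size());
    for (const std::string& s : v) {
        result.push_back(s.empty() ? if_empty
                                   : std::make_pair(s.front(), s.back()));
    }
    return result;
}

void demo_ends_of() {
    // The third string starts with an embedded null but is not empty.
    const std::vector<std::string> v = {"abc", "", std::string("\0x", 2)};
    const auto r = ends_of(v, std::make_pair('^', '$'));
    assert(r[0] == std::make_pair('a', 'c'));
    assert(r[1] == std::make_pair('^', '$'));   // empty string: caller's tokens
    assert(r[2] == std::make_pair('\0', 'x'));  // not mistaken for an empty string
}
```

Passing `std::make_pair('\0', '\0')` as `if_empty` reproduces the behavior asked for in this thread, without changing what front() and back() guarantee for everyone else.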

My opinion being given, you can just ignore it as you did with all other opinions given before. I agree with Chris that this thread has outlived its usefulness.

Cheers,
David


Vlad from Moscow

Apr 24, 2013, 2:02:25 PM4/24/13

to std-dis...@isocpp.org, dib...@ieee.org

If you look through section "21.4.5 basic_string element access" you will see that operator[] has defined behavior when a string is empty. It returns the terminating zero. The member function at() also has defined behavior. It throws an exception. Only these two functions, front() and back(), have undefined behavior.

You should always avoid writing functions with undefined behavior. It is simply bad design of a function and very bad programming style. And it does not matter how many papers you write listing preconditions.:) Functions should be simple and safe and have predictable, defined default behavior in exceptional situations.

You are saying that you need to check the preconditions. In fact you have confirmed that these functions are useless, because it is better to use operator[]( 0 ) instead of front() and operator[]( size() == 0 ? 0 : size() - 1 ) instead of back(). So any programmer, seeing this stupidity, will as a result write his own functions. For example

// returned by value: a reference to the temporary '\0' would dangle
charT safe_front() const { return size() == 0 ? charT() : ( *this )[0]; }

charT safe_back() const { return size() == 0 ? charT() : ( *this )[size() - 1]; }

because in the C++ Standard Committee there is no understanding that writing a function with undefined behavior is a very bad idea.:)

As for begin(), the purpose of iterators is to be used in pairs, begin() and end(). So there is no problem: if begin() is equal to end(), nobody will dereference begin().

But std::string is a special type that inherited its behavior from C. That is why we have two kinds of containers with random access iterators and indexing: std::vector and std::string.

Ville Voutilainen

Apr 24, 2013, 2:03:36 PM4/24/13

to std-dis...@isocpp.org

On 24 April 2013 21:02, Vlad from Moscow <vlad....@mail.ru> wrote:

You should always avoid writing functions with undefined behavior. It is simply bad design of a function and very bad style of

You seriously don't have any idea what you're talking about.

Vlad from Moscow

Apr 24, 2013, 2:37:17 PM4/24/13

to std-dis...@isocpp.org

I am sure that in the case of containers it is a bad idea to write functions with undefined behavior. Otherwise, for example, the function at() would not have appeared. Containers are not low-level programming.
