atof(string_view) and others


Matthew Fioravante

Sep 30, 2014, 11:47:40 AM
to std-pr...@isocpp.org
Now that we have string_view, it might be a good idea to have overloads of atoX and strtoX which take string_view arguments.

double atof(string_view str);

int atoi(string_view str);
long atol(string_view str);
long long atoll(string_view str);

long strtol(string_view str, string_view& tail, int base);
long strtol(string_view str, int base);

long long strtoll(string_view str, string_view& tail, int base);
long long strtoll(string_view str, int base);

unsigned long strtoul(string_view str, string_view& tail, int base);
unsigned long strtoul(string_view str, int base);

unsigned long long strtoull(string_view str, string_view& tail, int base);
unsigned long long strtoull(string_view str, int base);


std::intmax_t strtoimax(string_view str, string_view& tail, int base);
std::intmax_t strtoimax(string_view str, int base);

std::uintmax_t strtoumax(string_view str, string_view& tail, int base);
std::uintmax_t strtoumax(string_view str, int base);


float strtof(string_view str, string_view& tail);
float strtof(string_view str);

double strtod(string_view str);
double strtod(string_view str, string_view& tail);

long double strtold(string_view str);
long double strtold(string_view str, string_view& tail);



Why?
  1. strtod(const char*, char**) assumes null termination, making it unusable with an arbitrary string_view
  2. Many implementations of strtod(const char*, char**) call strlen(). In some of my applications which do a lot of string -> float conversion, I have found this to have a measurable impact on performance. By passing in a string_view, we can avoid these stupid redundant calls to strlen().
  3. Converting std::string to int / float becomes more natural (no calling .c_str()) and efficient (no ridiculous strlen calls inside) because it converts to string_view.
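
To make points 1 and 2 concrete, here is a minimal sketch of the workaround a caller needs with the existing API. The name parse_double_via_copy is hypothetical and not part of any proposal; it only illustrates that a string_view must be copied into a null-terminated buffer before strtod can be used safely.

```cpp
#include <cstdlib>
#include <string>
#include <string_view>

// Status-quo workaround (hypothetical helper name): strtod() needs a
// terminating '\0', so the view is copied into a std::string first.
// The copy, and a possible allocation, exist only to add the terminator.
double parse_double_via_copy(std::string_view sv) {
    std::string tmp(sv);
    return std::strtod(tmp.c_str(), nullptr);
}
```

For a view covering only the first three characters of "12345", this returns 123, whereas passing sv.data() straight to strtod would read past the end of the view and yield 12345.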

The alternative is redesigning a completely new str to number API and there was a big discussion about this a while ago with debates about return values, out parameters, optional, expected, and exceptions. That's probably the best solution in the long run but it might take a long time to get it right. In the meantime, we could at least maintain status quo and have some way of doing str to number conversions using string_view.

These could also be built on top of lower level primitives like this:


double atof_l(const char* str, size_t len);

The above could also be added to the C standard, allowing anyone building a higher level language on top of C to use non-null terminated strings.

Bo Persson

Sep 30, 2014, 12:33:59 PM
to std-pr...@isocpp.org
Yes, Why?

The counter argument being that there are WAY too many overloads
already. Is adding even more of them an improvement?


Bo Persson



Brent Friedman

Sep 30, 2014, 12:47:46 PM
to std-pr...@isocpp.org
Yes, it is an improvement, because string_view is semantically different (not null-terminated) from all of the existing overloads. Shall I incur the significant cost of std::string conversion instead?

All string manipulation functions that can be plausibly supported should have a string_view version.







--

--- You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

Nevin Liber

Sep 30, 2014, 1:26:16 PM
to std-pr...@isocpp.org
On 30 September 2014 10:47, Matthew Fioravante <fmatth...@gmail.com> wrote:
Now that we have string_view, it might be a good idea to have overloads of atoX and strtoX which take string_view arguments.

If you want to add something like this, please propose a templated function instead, as in

template<typename N>
N to_number(string_view str);

That way it works with all the typedefs (such as uint32_t) for the numeric types.
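
A minimal sketch of what such a template could look like, assuming std::from_chars from C++17 (which postdates this thread) as the underlying parser. Returning N{} on failure is an assumption made here only because the one-argument signature has no channel for reporting errors.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <type_traits>

// Sketch of a single templated entry point covering every integer type,
// including typedefs such as std::uint32_t.
// Assumption: any failure (no digits, out of range) yields N{}.
template <typename N>
N to_number(std::string_view str) {
    static_assert(std::is_integral_v<N>, "this sketch only covers integer types");
    N value{};
    const char* first = str.data();
    const char* last = first + str.size();
    auto res = std::from_chars(first, last, value);  // base 10, no whitespace skipping
    if (res.ec != std::errc{}) return N{};
    return value;
}
```

With this, to_number<std::uint32_t>("4000000000") works without the caller having to know which of unsigned int or unsigned long the typedef maps to.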
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Matthew Fioravante

Sep 30, 2014, 1:32:25 PM
to std-pr...@isocpp.org

strto<T>() might be a better name, but that's getting into bikeshed territory.


string_view s = /* something */;
auto i = strto<int>(s);
auto f = strto<float>(s);


If we go this route, then all of the questions about return values, out parameters, how to return an error status and a value, exceptions, etc... all come back. I guess we can revisit that discussion again. I'll write up a basic proposal sketch to start the discussion.
 

Andy Prowl

Sep 30, 2014, 1:43:09 PM
to std-pr...@isocpp.org
That would be a useful addition in my opinion.

Regarding the name, considering that we have std::to_string() which converts a number into a string, I believe calling the operation which converts a string into a number std::to_number() would be more appropriate.

Kind regards,

Andy

Matthew Fioravante

Sep 30, 2014, 2:35:04 PM
to std-pr...@isocpp.org
I'll write up a formal draft with the details, but here's a basic sketch of what I'm thinking:

//Parses str as a base base integer and stores the result in target.
//tail will point to the tail of the string after the last character
//If successful, returns a default constructed error_code.
//If an error occurs, the error will be stored in the error code (Specification TBD) and target will be left unmodified.
template <typename Integral>
error_code to_number(Integral& target, string_view& tail, string_view str, int base);

template <typename Integral>
error_code to_number(Integral& target, string_view str, int base) {
  string_view tail;
  return to_number(target, tail, str, base);
}

template <typename Integral>
Integral to_number(error_code& ec, string_view& tail, string_view str, int base, Integral ret_on_error = Integral{}) {
  Integral val = ret_on_error;
  ec = to_number(val, tail, str, base);
  return val;
}

template <typename Integral>
Integral to_number(string_view& tail, string_view str, int base, Integral ret_on_error = Integral{}) {
  error_code ec;
  return to_number(ec, tail, str, base, ret_on_error);
}

template <typename Integral>
Integral to_number(error_code& ec, string_view str, int base, Integral ret_on_error = Integral{}) {
  string_view tail;
  return to_number(ec, tail, str, base, ret_on_error);
}

template <typename Integral>
Integral to_number(string_view str, int base, Integral ret_on_error = Integral{}) {
  string_view tail;
  error_code ec;
  return to_number(ec, tail, str, base, ret_on_error);
}

The out parameter version is useful when you want to use overloading to deduce the type. It's also very natural for error checking.

string_view s = "1234,5678";

int x, y;
if(auto ec = to_number(x, s, s, 10)) {
  //handle error
}
s.remove_prefix(1); //skip the ,
if(auto ec = to_number(y, s, s, 10)) {
  //handle error
}
assert(s.empty());

Returning the value instead of the error code lets you use auto. It's also handier if you don't care so much about checking errors and just want to use the default value if parsing failed.


//x is set to the parsed integer on success, or -1 on failure
auto x = to_number<int>(s, 10, -1);

The floating point variants would look identical, just without the base argument.
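
The primary overload in the sketch above is declared but not defined, and its error specification is marked TBD. The following is one possible body, assuming std::from_chars from C++17 (not available when this thread was written) and assuming the error is reported as a generic-category std::error_code; both choices are illustrative, not part of the proposal.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// One possible body for the primary overload sketched above.
// Assumptions: std::from_chars does the parsing, and errors map to
// std::errc values in the generic category.
template <typename Integral>
std::error_code to_number(Integral& target, std::string_view& tail,
                          std::string_view str, int base) {
    Integral parsed{};
    const char* first = str.data();
    const char* last = first + str.size();
    auto res = std::from_chars(first, last, parsed, base);
    if (res.ec != std::errc{})
        return std::make_error_code(res.ec);  // target and tail stay unmodified
    target = parsed;
    tail = str.substr(static_cast<std::size_t>(res.ptr - first));
    return {};                                // default-constructed: success
}
```

Because str is taken by value, passing the same string_view as both tail and str, as in the "1234,5678" example, is safe. Note that std::from_chars does not skip leading whitespace and does not accept a leading '+' or a "0x" prefix, so this differs from strtol in those details.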


Miro Knejp

Sep 30, 2014, 3:03:06 PM
to std-pr...@isocpp.org
Then why not go the whole route and make it to_string/from_string and not limit it just to numbers?

Btw "it might take a long time to get it right": the next standard is at least another three years out.

Zhihao Yuan

Sep 30, 2014, 3:56:28 PM
to std-pr...@isocpp.org
On Tue, Sep 30, 2014 at 3:03 PM, Miro Knejp <miro....@gmail.com> wrote:
Then why not go the whole route and make it to_string/from_string and not limit it just to numbers?

`to_string` is not really an efficient interface to me.  Writing to an output iterator
should be the generic interface for formatting individual objects to strings.

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://bit.ly/blog4bsd

Matthew Fioravante

Sep 30, 2014, 4:15:10 PM
to std-pr...@isocpp.org
On Tuesday, September 30, 2014 3:56:28 PM UTC-4, Zhihao Yuan wrote:
On Tue, Sep 30, 2014 at 3:03 PM, Miro Knejp <miro....@gmail.com> wrote:
Then why not go the whole route and make it to_string/from_string and not limit it just to numbers?

`to_string` is not really an efficient interface to me.  Writing to an output iterator
should be the generic interface for formatting individual objects to strings.


I agree. I don't use this function very often at all. I usually wind up doing an snprintf() into a buffer on the stack or something else.

Even with move semantics it's not efficient to return a newly created string because you are forced to allocate new memory for the returned object every time.
Using an out parameter is better, similar to std::getline() because you can reuse one string object for multiple conversions and only do the memory allocation once. This issue has been discussed several times before at C++ talks and on blog posts.

Using a std::string& out parameter is still not good enough though because you are forced to use a std::string as your destination buffer to store the characters. Many times you don't want to use a std::string.

There are several common cases:
  • Fixed size character array (char[N], std::array<char,N>, std::array_view<char>, (char*, char*), (char*, size_t)) with truncate semantics, possibly living on the stack
  • Dynamically growing buffer (std::string, std::vector<char>, user defined string class)
  • A container which pre-allocates a fixed size buffer on the stack and falls back to growing heap allocation after that.
Building an interface based on output iterators could allow for and abstract all of these possibilities.
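
A minimal sketch of the output-iterator idea, assuming std::to_chars from C++17 for the digit generation. The name format_int is hypothetical. The same function writes into a growing std::string or into a fixed array, depending only on the iterator passed in.

```cpp
#include <algorithm>
#include <charconv>
#include <iterator>
#include <string>

// Hypothetical formatter: writes the base-10 digits of value to any
// output iterator and returns the iterator past the last character.
template <typename OutputIt>
OutputIt format_int(OutputIt out, long long value) {
    char buf[24];  // large enough for any 64-bit value plus a sign
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    return std::copy(buf, res.ptr, out);
}
```

Called with std::back_inserter(some_string) it appends to a dynamically growing buffer; called with a plain char* it fills a stack array. This sketch does no bounds checking for the fixed-size case, so truncate semantics would need an iterator that knows its own end.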

Miro Knejp

Sep 30, 2014, 4:53:14 PM
to std-pr...@isocpp.org

On 30.09.2014 at 22:15, Matthew Fioravante wrote:
> Building an interface based on output iterators could allow for and
> abstract all of these possibilities.
>
The out iterator approach has been discussed a lot in the formatting
thread and it boiled down to character encodings and not having to
provide the entire implementation in header files and still keep it
efficient. At some point you have to start thinking about locale support
(in both directions).

Zhihao Yuan

Sep 30, 2014, 6:02:37 PM
to std-pr...@isocpp.org
We can have two sets of overloads, one locale-unaware and one
locale-aware, where the latter converts the output iterator
into an any_iterator for use with non-inlined implementations.
To speed up output iterators known to the library, like char*,
string::iterator, etc., an implementation can bypass that conversion by
adding more overloads.  It does not look like
a blocking issue to me.

Thiago Macieira

Sep 30, 2014, 8:41:50 PM
to std-pr...@isocpp.org
On Tuesday 30 September 2014 12:25:34 Nevin Liber wrote:
> On 30 September 2014 10:47, Matthew Fioravante <fmatth...@gmail.com>
>
> wrote:
> > Now that we have string_view, it might be a good idea to have overloads of
> > atoX and strtoX which take string_view arguments.
>
> If you want to add something like this, please propose a templated function
> instead, as in
>
> template<typename N>
> N to_number(string_view str);
>
> That way it works with all the typedefs (such as uint32_t) for the numeric
> types.

Make that:

template <typename N>
optional<N> to_integral(string_view str, int base = 10,
const char **endptr = nullptr);
template <typename N>
optional<N> to_floating_point(string_view str, const char **endptr = nullptr);

The conversion can fail and bases don't make sense for floating-point.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Miro Knejp

Sep 30, 2014, 9:12:26 PM
to std-pr...@isocpp.org

On 01.10.2014 at 02:41, Thiago Macieira wrote:
> On Tuesday 30 September 2014 12:25:34 Nevin Liber wrote:
>> On 30 September 2014 10:47, Matthew Fioravante <fmatth...@gmail.com>
>>
>> wrote:
>>> Now that we have string_view, it might be a good idea to have overloads of
>>> atoX and strtoX which take string_view arguments.
>> If you want to add something like this, please propose a templated function
>> instead, as in
>>
>> template<typename N>
>> N to_number(string_view str);
>>
>> That way it works with all the typedefs (such as uint32_t) for the numeric
>> types.
> Make that:
>
> template <typename N>
> optional<N> to_integral(string_view str, int base = 10,
> const char **endptr = nullptr);
> template <typename N>
> optional<N> to_floating_point(string_view str, const char **endptr = nullptr);
>
> The conversion can fail and bases don't make sense for floating-point.
>
Actually, at least 16 does. With printf("%a") one can produce numbers in
the form [-]0xh.hhhp±d which is better suited as exact lossless
round-tripping representation.
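
A small sketch of the round trip described above. It relies on two properties of the C99 library as carried into C++11: "%a" without a precision prints an exact representation of a binary floating-point value, and strtod is specified to accept hexadecimal floating-point input. The helper name hex_round_trips is hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>

// Formats v with "%a" (hex float, exact for binary doubles) and parses
// it back with strtod; returns true when the value survives unchanged.
bool hex_round_trips(double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", v);
    return std::strtod(buf, nullptr) == v;
}
```

Values such as 0.1, which have no exact decimal representation of reasonable length, pass through this format without loss.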

Thiago Macieira

Sep 30, 2014, 10:12:28 PM
to std-pr...@isocpp.org
On Wednesday 01 October 2014 03:12:31 Miro Knejp wrote:
> Actually, at least 16 does. With printf("%a") one can produce numbers in
> the form [-]0xh.hhhp±d which is better suited as exact lossless
> round-tripping representation.

That's very uncommon. Whoever is doing that probably has specialised functions
anyway.

Matthew Fioravante

Sep 30, 2014, 10:13:40 PM
to std-pr...@isocpp.org


On Tuesday, September 30, 2014 8:41:50 PM UTC-4, Thiago Macieira wrote:
template <typename N>
optional<N> to_integral(string_view str, int base = 10,
        const char **endptr = nullptr);
template <typename N>
optional<N> to_floating_point(string_view str, const char **endptr = nullptr);

I have a few issues with optional:
  1. You lose the ability to check specific error conditions like ERANGE. I'm not sure how useful this is as I've never done an ERANGE check on a number parse but that functionality is available with the legacy functions.
  2. If you want to enable a throwing interface by just calling optional::value() and letting it throw on error, you'll be throwing a std::bad_optional_access. It would be better to throw an exception class which is more specifically related to parsing numbers and has information on what went wrong. 
Also I think it's better to use a string_view& for the tail instead of a char**. The tail string itself is not null terminated, so using a character pointer, with which we usually associate null termination, is a bit misleading. Also, converting the char** endptr back into a string_view is clumsy and error prone, as it will call strlen() on the non-null terminated buffer unless the programmer carefully computes and specifies the new length.

I'd also like to try to make the integer conversions constexpr so setting errno is out.
 
The conversion can fail and bases don't make sense for floating-point.


Bases don't make sense but parsing a hex float might.

Matthew Fioravante

Sep 30, 2014, 10:24:11 PM
to std-pr...@isocpp.org


On Tuesday, September 30, 2014 3:03:06 PM UTC-4, Miro Knejp wrote:
Then why not go the whole route and make it to_string/from_string and not limit it just to numbers?

Instead of from_string<T>, I like the name template <typename T> string_to<T>, as it reads directly as "string to T" when invoked in the code.

Matheus Izvekov

Sep 30, 2014, 10:26:42 PM
to std-pr...@isocpp.org
On Tuesday, September 30, 2014 11:13:40 PM UTC-3, Matthew Fioravante wrote:
I have a few issues with optional:
  1. You lose the ability to check specific error conditions like ERANGE. I'm not sure how useful this is as I've never done an ERANGE check on a number parse but that functionality is available with the legacy functions.
  2. If you want to enable a throwing interface by just calling optional::value() and letting it throw on error, you'll be throwing a std::bad_optional_access. It would be better to throw an exception class which is more specifically related to parsing numbers and has information on what went wrong. 
You can use expected instead of optional, if that gets accepted.

Thiago Macieira

Sep 30, 2014, 10:40:50 PM
to std-pr...@isocpp.org
On Tuesday 30 September 2014 19:13:40 Matthew Fioravante wrote:
> 1. You lose the ability to check specific error conditions like ERANGE.
> I'm not sure how useful this is as I've never done an ERANGE check on a
> number parse but that functionality is available with the legacy
> functions.

I don't see how that relates to the optional. In fact, I don't see how you
came to this conclusion.

If the parsing failed, the function will return a disengaged optional and set
errno to ERANGE or EINVAL. If it worked, it will return an engaged optional
and errno is unspecified.

That's much better than strtoul, which returns zero when it fails and that's
undistinguishable from a successful parsing of a number zero.

> 2. If you want to enable a throwing interface by just calling
> optional::value() and letting it throw on error, you'll be throwing a
> std::bad_optional_access. It would be better to throw an exception class
> which is more specifically related to parsing numbers and has information
> on what went wrong.

I again don't understand why that is a problem. What's stopping you from
throwing an exception if the returned optional is disengaged?

> Also I think its better to use a string_view& for the tail instead of a
> char**. The tail string itself is not null terminated so using a character
> pointer with which we usually associate null termination is a big
> misleading. Also converting the char** endptr back into a string_view is
> clumsy and error prone as it will call strlen() on the non-null terminated
> buffer unless the programmer carefully computes and specifies the new
> length.

Yeah, that makes sense. I was pondering that when I wrote the suggestion, but
didn't go through with it.

> I'd also like to try to make the integer conversions constexpr so setting
> errno is out.

Forget it. Non-inline functions can't be constexpr and those functions
shouldn't be inline.

At best, they could be an inline wrapper to the real parser, such as:

template <typename T> optional<T>
to_integral(string_view str, int base = 10, string_view *tail = nullptr)
{
    typedef typename conditional<is_unsigned<T>::value,
        unsigned long long, long long>::type U;
    U min = numeric_limits<T>::min();
    U max = numeric_limits<T>::max();
    optional<U> result =
        to_integral_helper(str, base, tail, min, max);
    return result ? optional<T>(result.value()) : optional<T>();
}

This will require only two out-of-line functions, one for long long (for
signed conversions) and one for unsigned long long (for unsigned ones).
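
The wrapper above leaves to_integral_helper undefined. A possible complete version is sketched below, assuming C++17 (std::optional, std::string_view, std::from_chars) and assuming the helper sets errno to EINVAL or ERANGE as described earlier in this message. The detail::parse_in_range function is a hypothetical implementation detail shared by the two helpers.

```cpp
#include <cerrno>
#include <charconv>
#include <cstddef>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

namespace detail {
// Hypothetical shared parser: range-checks against [min, max] and
// reports the failure reason through errno.
template <typename U>
std::optional<U> parse_in_range(std::string_view str, int base,
                                std::string_view* tail, U min, U max) {
    U value{};
    const char* first = str.data();
    const char* last = first + str.size();
    auto res = std::from_chars(first, last, value, base);
    if (res.ec == std::errc::invalid_argument) { errno = EINVAL; return std::nullopt; }
    if (res.ec == std::errc::result_out_of_range || value < min || value > max) {
        errno = ERANGE;
        return std::nullopt;
    }
    if (tail) *tail = str.substr(static_cast<std::size_t>(res.ptr - first));
    return value;
}
}  // namespace detail

// The two helpers that would live out of line in the library.
std::optional<long long> to_integral_helper(std::string_view str, int base,
        std::string_view* tail, long long min, long long max) {
    return detail::parse_in_range(str, base, tail, min, max);
}
std::optional<unsigned long long> to_integral_helper(std::string_view str, int base,
        std::string_view* tail, unsigned long long min, unsigned long long max) {
    return detail::parse_in_range(str, base, tail, min, max);
}

// Thin inline wrapper: picks the wide type, forwards the target range.
template <typename T>
std::optional<T> to_integral(std::string_view str, int base = 10,
                             std::string_view* tail = nullptr) {
    using U = std::conditional_t<std::is_unsigned_v<T>, unsigned long long, long long>;
    U min = std::numeric_limits<T>::min();
    U max = std::numeric_limits<T>::max();
    auto result = to_integral_helper(str, base, tail, min, max);
    if (!result) return std::nullopt;
    return static_cast<T>(*result);
}
```

Only the wrapper is instantiated per type; a narrow type such as signed char reuses the long long parser and is rejected by the range check, so to_integral<signed char>("300") returns a disengaged optional with errno set to ERANGE.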

Matthew Fioravante

Sep 30, 2014, 10:57:22 PM
to std-pr...@isocpp.org


On Tuesday, September 30, 2014 10:40:50 PM UTC-4, Thiago Macieira wrote:
On Tuesday 30 September 2014 19:13:40 Matthew Fioravante wrote:
>    1. You lose the ability to check specific error conditions like ERANGE.
>    I'm not sure how useful this is as I've never done an ERANGE check on a
>    number parse but that functionality is available with the legacy
> functions.

I don't see how that relates to the optional. In fact, I don't see how you
came to this conclusion.

Optional only gives you a boolean yes/no answer as to whether or not parsing failed. You have to get the reason for failure using some other mechanism.
 

If the parsing failed, the function will return a disengaged optional and set
errno to ERANGE or EINVAL. If it worked, it will return an engaged optional
and errno is unspecified.

Are we sure we want to continue using global errno in new interfaces? Is this still a good way to specify errors? Maybe it's ok for checking return codes, but definitely not for exceptions.

If the optional object throws an exception, and then during the unwinding process something resets errno you lose the error information. The error code must be embedded in the exception object itself.
 

That's much better than strtoul, which returns zero when it fails and that's
undistinguishable from a successful parsing of a number zero.

Agree completely. Trying to indicate an error by hijacking a valid value such as 0, INT_MIN, INT_MAX, or anything else is a really terrible design. 

> 2. If you want to enable a throwing interface by just calling
>    optional::value() and letting it throw on error, you'll be throwing a
>    std::bad_optional_access. It would be better to throw an exception class
>    which is more specifically related to parsing numbers and has information
> on what went wrong.

I again don't understand why that is a problem. What's stopping you from
throwing an exception if the returned optional is disengaged?

Why impose the boilerplate of throwing onto the user? Isn't that one of the things optional and expected are supposed to do for us? It would be better if such an exception is baked into the interface, with error codes / error messages already available.

try {
  auto x = string_to<int>(s);
  foo(x.value());
}
catch(std::range_error& e) { /* do something */ }
catch(std::invalid_argument& e) { /*do something else*/ }
 

Or like this:

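try {
  auto x = string_to<int>(s);
  foo(x.value());
}
catch(std::system_error& e) {
  //e.code() is a std::error_code, so compare against std::errc rather than the raw errno macros
  if(e.code() == std::errc::result_out_of_range) { /* ERANGE: do something */ }
  else if(e.code() == std::errc::invalid_argument) { /* EINVAL: do something else */ }
}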


 

> Also I think its better to use a string_view& for the tail instead of a
> char**. The tail string itself is not null terminated so using a character
> pointer with which we usually associate null termination is a big
> misleading. Also converting the char** endptr back into a string_view is
> clumsy and error prone as it will call strlen() on the non-null terminated
> buffer unless the programmer carefully computes and specifies the new
> length.

Yeah, that makes sense. I was pondering that when I wrote the suggestion, but
didn't go through with it.

> I'd also like to try to make the integer conversions constexpr so setting
> errno is out.

Forget it. Non-inline functions can't be constexpr and those functions
shouldn't be inline.

I agree about the floating point conversions. But does it really have to be this way for integer conversions? Parsing ints is not that complex.

Matthew Fioravante

Sep 30, 2014, 11:06:52 PM
to std-pr...@isocpp.org


On Tuesday, September 30, 2014 10:57:22 PM UTC-4, Matthew Fioravante wrote:


On Tuesday, September 30, 2014 10:40:50 PM UTC-4, Thiago Macieira wrote:

Forget it. Non-inline functions can't be constexpr and those functions
shouldn't be inline.

I agree about the floating point conversions. But does it really have to be this way for integer conversions? Parsing ints is not that complex.
 

One use case for constexpr is that it would enable these functions to be used for creating user defined literals. 
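
As an illustration of that use case, here is a sketch of a base-3 literal built on a constexpr digit parser. The names parse_digits and _b3 are made up for this example, and the parser is a hand-written stand-in for the proposed functions, handling only bases up to 10.

```cpp
// Hypothetical constexpr digit parser (C++14 or later).
// A digit outside the base makes constant evaluation fail at the throw.
constexpr unsigned long long parse_digits(const char* s, unsigned base) {
    unsigned long long value = 0;
    for (; *s != '\0'; ++s) {
        unsigned digit = static_cast<unsigned>(*s - '0');
        if (digit >= base) throw "digit out of range for base";
        value = value * base + digit;
    }
    return value;
}

// Raw literal operator: receives the literal's spelling as a C string.
constexpr unsigned long long operator""_b3(const char* digits) {
    return parse_digits(digits, 3);
}

static_assert(21_b3 == 7, "2*3 + 1 evaluated at compile time");
```

Because the parser is constexpr, an invalid literal such as 23_b3 is rejected when it appears in a constant expression instead of failing at run time.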

Thiago Macieira

Sep 30, 2014, 11:31:40 PM
to std-pr...@isocpp.org
On Tuesday 30 September 2014 19:57:22 Matthew Fioravante wrote:
> > I don't see how that relates to the optional. In fact, I don't see how you
> > came to this conclusion.
>
> Optional only gives you a boolean yes/no answer as to whether or not
> parsing failed. You have to get the reason for failure using some other
> mechanism.

Right, but:

> Are we sure we want to continue using global errno in new interfaces? Is
> this still a good way to specify errors? Maybe its ok for checking return
> codes, but definitely not for exceptions.

Why not? I don't see the problem with errno (which is actually thread-specific,
not global).

> If the optional object throws an exception, and then during the unwinding
> process something resets errno you lose the error information. The error
> code must be embedded in the exception object itself.

If something threw an exception, whether it was by consumption of the
disengaged optional or something else, the number conversion is long
forgotten. How useful is the "number out of range" information three frames up
the stack?

No, if there was a failure in converting, the code that may want to throw
needs to inspect errno and decide which meaningful exception it will throw.

> > > 2. If you want to enable a throwing interface by just calling
> > >
> > > optional::value() and letting it throw on error, you'll be throwing a
> > > std::bad_optional_access. It would be better to throw an exception
> >
> > class
> >
> > > which is more specifically related to parsing numbers and has
> >
> > information
> >
> > > on what went wrong.
> >
> > I again don't understand why that is a problem. What's stopping you from
> > throwing an exception if the returned optional is disengaged?
>
> Why impose the boilerplate of throwing onto the user?

Simple: so that people who don't use exceptions can still use this function.

Despite the direction the standard committee wants to go in, practice is that
there are large and well-known C++ projects that don't use exceptions (see the
Google coding guidelines, applying to Chrome/Chromium, Blink, V8, the Mozilla
coding guidelines, Qt, etc.).

You can easily turn code that uses errno into one that throws in case of
failure, but you can't compile code reporting errors via exceptions with
-fno-exceptions and still report the errors.

> > Forget it. Non-inline functions can't be constexpr and those functions
> > shouldn't be inline.
>
> I agree about the floating point conversions. But does it really have to be
> this way for integer conversions? Parsing ints is not that complex.

It's not just about complexity. And yes, it is complex. Have you looked at the
source code for your strtoul?

And even if it weren't about the complexity, the point is that this is a
performance-critical function. It should be hand-tuned by the implementation
for most performance (given space availability), which may make use of
processor or compiler features that violate the rules of constexprness.

Another reason is code bloat: this function should not be inline, even if it
were constexpr'able. Converting non-literal strings to integers happens
everywhere, and we don't need the source for this function duplicated
everywhere, either via inlining or via out-of-line copies in each object file
and each shared library.

I think that trying to make them constexpr is optimising for the corner-case
to the detriment of the vast majority of the common-case.

> One use case for constexpr is that it would enable these functions to be
> used for creating user defined literals.

That's an extreme corner-case. If someone wants to do a UDL for ternary or
duodecimal, they can write the function for that by themselves. As I said,
let's optimise for the vast majority of the common case.

Miro Knejp

Sep 30, 2014, 11:47:58 PM
to std-pr...@isocpp.org

On 01.10.2014 at 05:31, Thiago Macieira wrote:
> Simple: so that people who don't use exceptions can still use this function.
>
> Despite the direction the standard committee wants to go in, practice is that
> there are large and well-known C++ projects that don't use exceptions (see the
> Google coding guidelines, applying to Chrome/Chromium, Blink, V8, the Mozilla
> coding guidelines, Qt, etc.).
>
> You can easily turn code that uses errno into one that throws in case of
> failure, but you can't compile code reporting errors via exceptions with
> -fno-exceptions and still report the errors.
As far as I remember the latest expected<> wasn't hardwired to
exceptions but templated on the type to hold in case of errors.

Matthew Fioravante

Oct 1, 2014, 9:22:09 AM
to std-pr...@isocpp.org


On Tuesday, September 30, 2014 11:31:40 PM UTC-4, Thiago Macieira wrote:

> Are we sure we want to continue using global errno in new interfaces? Is
> this still a good way to specify errors? Maybe its ok for checking return
> codes, but definitely not for exceptions.

Why not? I don't see the problem with errno (which is actually thread-specific,
not global).

The alternative to errno is to use std::error_code and return it somehow either via an out parameter, return value (using out param for the int), or packaged as part of something like optional / expected / pair / tuple. 

There are so many possibilities here. I have not yet figured out which one is best.

error_code string_to(int& i, string_view& tail, string_view s);
int string_to(error_code& ec, string_view& tail, string_view s);
expected<int,error_code> string_to(string_view& tail, string_view s);

bool string_to(int& i, string_view& tail, string_view s); //sets errno on failure
int string_to(string_view& tail, string_view s); //sets errno on failure
optional<int> string_to(string_view& tail, string_view s); //sets errno on failure



> If the optional object throws an exception, and then during the unwinding
> process something resets errno you lose the error information. The error
> code must be embedded in the exception object itself.

If something threw an exception, whether it was by consumption of the
disengaged optional or something else, the number conversion is long
forgotten. How useful is the "number out of range" information three frames up
the stack?

If my program crashes due to an uncaught exception, "Number out of range" is a lot more useful to me than "disengaged optional". 

The ERANGE may not be as useful three frames up in catch blocks, but a parsing API which uses exceptions should throw something related to parsing, not something generic. If we're returning something like optional which can throw, then we are already introducing exceptions into the entire API as a whole and at that point we must spend some effort to provide a quality throwing interface. The return type optional is part of the API. Its behavior must be considered as an essential part of the whole package.

No, if there was a failure in converting, the code that may want to throw
needs to inspect errno and decide which meaningful exception it will throw.

At least by default throwing something like std::conversion_error or std::system_error with ERANGE error_code embedded into it would be a great improvement over std::bad_optional_access. These small changes make the interface a little bit easier to use and more expressive for free. They also make debugging (my uncaught exception example above) easier. The user can still inspect the error code manually and choose to throw something else if they want.

> Why impose the boilerplate of throwing onto the user?

Simple: so that people who don't use exceptions can still use this function.

Despite the direction the standard committee wants to go in, practice is that
there are large and well-known C++ projects that don't use exceptions (see the
Google coding guidelines, applying to Chrome/Chromium, Blink, V8, the Mozilla
coding guidelines, Qt, etc.).

You can easily turn code that uses errno into one that throws in case of
failure, but you can't compile code reporting errors via exceptions with
-fno-exceptions and still report the errors.

This is the beauty of optional and expected. It makes exception handling optional. I'm aware of many projects who decide not to use exception handling for different reasons. I have a few of my own.

If you use something like std::expected and always check for errors before retrieving the value, it will never throw and even better all of the exception throwing logic can be optimized away. You should be able to use it in any project with -fno-exceptions. A compiler could even generate a warning if there exists a code path which might make optional/expected throw when -fno-exceptions is enabled. This essentially makes the compiler check that you handled all of the possible errors. For projects which do want to use exceptions, you get them for free and don't have to write the throwing code yourself unless you don't like the exceptions provided by the library.
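
A minimal sketch of that idea, using a small hand-rolled result type in place of the proposed std::expected. The class name parse_result, the use of std::from_chars (C++17) and the choice of std::system_error are all assumptions for illustration. The point is that the error code travels with the result, and value() throws only when the caller skips the check.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for expected<T, error_code>.
template <typename T>
class parse_result {
public:
    parse_result(T value) : value_(value) {}
    parse_result(std::error_code ec) : ec_(ec) {}

    explicit operator bool() const noexcept { return !ec_; }
    std::error_code error() const noexcept { return ec_; }

    // Checked access: throws a parsing-related error carrying the code.
    const T& value() const {
        if (ec_) throw std::system_error(ec_, "string_to");
        return value_;
    }
    // Unchecked access for callers that tested the result first.
    const T& operator*() const noexcept { return value_; }

private:
    T value_{};
    std::error_code ec_{};
};

template <typename T>
parse_result<T> string_to(std::string_view s) {
    T value{};
    const char* first = s.data();
    auto res = std::from_chars(first, first + s.size(), value);
    if (res.ec != std::errc{}) return std::make_error_code(res.ec);
    return value;
}
```

A caller that writes `if (auto r = string_to<int>(s)) use(*r);` never reaches the throw, while a caller that prefers exceptions calls r.value() and can catch std::system_error with the original error code attached.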

I think that trying to make them constexpr is optimising for the corner-case
to the detriment of the vast majority of the common-case.

We can drop constexpr. If later someone decides it's good to have, they can always write a follow-up proposal.

Matthew Woehlke

Oct 1, 2014, 11:19:41 AM10/1/14
to std-pr...@isocpp.org
On 2014-09-30 22:12, Thiago Macieira wrote:
> On Wednesday 01 October 2014 03:12:31 Miro Knejp wrote:
>> Actually, at least 16 does. With printf("%a") one can produce numbers in
>> the form [-]0xh.hhhp±d which is better suited as exact lossless
>> round-tripping representation.
>
> That's very uncommon. Whoever is doing that probably has specialised functions
> anyway.

Of course they are... because there *are* no standard functions (that
I'm aware of) to convert base-16 floating point strings back into
numbers. Therefore they don't have a choice.

I would also use / like to see this.

--
Matthew

Matthew Fioravante

Oct 1, 2014, 11:21:00 AM10/1/14
to std-pr...@isocpp.org, mw_t...@users.sourceforge.net

I am in this camp as well. Hex float parsing should be supported with the new string_to functions.
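For reference, a small sketch of the round trip under discussion. It assumes a C99-conforming C library, where strtod is specified to accept the hexadecimal form that "%a" produces. It still needs a null-terminated buffer, so it does not cover the string_view case. The helper name hex_round_trip is made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: format a double as hexadecimal floating point with
// "%a", then parse the text back with strtod.
double hex_round_trip(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", x);  // text of the form [-]0xh.hhhp±d
    return std::strtod(buf, nullptr);         // relies on the trailing '\0'
}
```

Since "%a" without a precision prints the exact binary value, the result should compare equal to the input for finite doubles.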

Thiago Macieira

Oct 1, 2014, 11:30:10 AM10/1/14
to std-pr...@isocpp.org
On Wednesday 01 October 2014 06:22:08 Matthew Fioravante wrote:
> > Why not? I don't see the problem with errno (which is actually
> > thread-specific,
> > not global).
>
> The alternative to errno is to use std::error_code and return it somehow
> either via an out parameter, return value (using out param for the int), or
> packaged as part of something like optional / expected / pair / tuple.

Does this std::error_code exist already? If it doesn't exist, please don't
create it. Everyone and their mom already knows about errno codes and there
are entire operating systems based around them. Inventing a new error list
sounds counter-productive.

Though I think that having a class enum so we can properly separate errors
from other uses of integers would be nice. Just as long as the error codes
remain 1:1 with errno.

> error_code string_to(int& i, string_view& tail, string_view s);
> int string_to(error_code& ec, string_view& tail, string_view s);
> expected<int,error_code> string_to(string_view& tail, string_view s);
>
> bool string_to(int& i, string_view& tail, string_view s); //sets errno on
> failure
> int string_to(string_view& tail, string_view s); //sets errno on failure
> optional<int> string_to(string_view& tail, string_view s); //sets errno on
> failure

The tail should be optional. Therefore, it can't be a reference. All those
options are out in my book.
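One way to read "the tail should be optional" is a pointer out-parameter that defaults to null. A minimal sketch follows; the name string_to_int and the bool return are assumptions made for illustration, and std::from_chars is used only as a stand-in parser.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical signature: `tail` is optional because it is a pointer.
// Returns true on success; on failure `out` and `*tail` are left untouched.
bool string_to_int(std::string_view s, int& out, std::string_view* tail = nullptr) {
    int value = 0;
    const char* last = s.data() + s.size();
    const auto res = std::from_chars(s.data(), last, value);
    if (res.ec != std::errc{}) return false;
    out = value;
    // Report the unparsed remainder only if the caller asked for it.
    if (tail) *tail = s.substr(static_cast<std::size_t>(res.ptr - s.data()));
    return true;
}
```

Callers that do not care about the remainder simply omit the third argument.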

> > If something threw an exception, whether it was by consumption of the
> > disengaged optional or something else, the number conversion is long
> > forgotten. How useful is the "number out of range" information three
> > frames up
> > the stack?
>
> If my program crashes due to an uncaught exception, "Number out of range"
> is a lot more useful to me than "disengaged optional".

That assumes that you did not attempt to correct the error. That is, your code
has a bug: either you did not validate your input prior to the parsing or you
failed to handle the parser errors.

I was thinking of normal error conditions, when you did catch the possible
errors and report them properly. If the input is allowed to fail parsing to
integral value, then you need to report that in the right way to your caller.

> The ERANGE may not be as useful three frames up in catch blocks, but a
> parsing API which uses exceptions should throw something related to
> parsing, not something generic. If we're returning something like optional
> which can throw, then we are already introducing exceptions into the entire
> API as a whole and at that point we must spend some effort to provide a
> quality throwing interface. The return type optional is part of the API.
> It's behavior must be considered as an essential part of the whole package.

Agreed, but see what I wrote above: if your code can reasonably be expected to
fail the parsing, you should check whether it succeeded before proceeding
further. If you find out it failed parsing, then you throw your appropriate
exception.

Don't depend on to_integral doing your job for you. The exceptions it throws
and the conditions it finds may not match your API's requirements.

> > > Why impose the boilerplate of throwing onto the user?
> >
> > Simple: so that people who don't use exceptions can still use this
> > function.
> >
> > Despite the direction the standard committee wants to go in, practice is
> > that
> > there are large and well-known C++ projects that don't use exceptions (see
> > the
> > Google coding guidelines, applying to Chrome/Chromium, Blink, V8, the
> > Mozilla
> > coding guidelines, Qt, etc.).
> >
> > You can easily turn code that uses errno into one that throws in case of
> > failure, but you can't compile code reporting errors via exceptions with
> > -fno-exceptions and still report the errors.
>
> This is the beauty of optional and expected. It makes exception handling
> optional. I'm aware of many projects who decide not to use exception
> handling for different reasons. I have a few of my own.

I was not aware of expected. If it solves both problems, then let's use it.

Thiago Macieira

Oct 1, 2014, 11:31:43 AM10/1/14
to std-pr...@isocpp.org
How is that different from std::optional?

germinolegrand

Oct 1, 2014, 11:34:53 AM10/1/14
to std-pr...@isocpp.org
Le 01/10/2014 17:28, Thiago Macieira a écrit :
> On Wednesday 01 October 2014 06:22:08 Matthew Fioravante wrote:
>>> Why not? I don't see the problem with errno (which is actually
>>> thread-specific,
>>> not global).
>> The alternative to errno is to use std::error_code and return it somehow
>> either via an out parameter, return value (using out param for the int), or
>> packaged as part of something like optional / expected / pair / tuple.
> Does this std::error_code exist already? If it doesn't exist, please don't
> create it. Everyone and their mom already knows about errno codes and there
> are entire operating systems based around them. Inventing a new error list
> sounds counter-productive.
>
> Though I think that having a class enum so we can properly separate errors
> from other uses of integers would be nice. Just as long as the error codes
> remain 1:1 with errno.
You might want to read the five articles about it :
http://blog.think-async.com/2010/04/system-error-support-in-c0x-part-1.html


Miro Knejp

Oct 1, 2014, 11:36:50 AM10/1/14
to std-pr...@isocpp.org
On 01 Oct 2014, at 17:16 , Thiago Macieira <thi...@macieira.org> wrote:

> On Wednesday 01 October 2014 05:48:04 Miro Knejp wrote:
>> Am 01.10.2014 um 05:31 schrieb Thiago Macieira:
>>> Simple: so that people who don't use exceptions can still use this
>>> function.
>>>
>>> Despite the direction the standard committee wants to go in, practice is
>>> that there are large and well-known C++ projects that don't use
>>> exceptions (see the Google coding guidelines, applying to
>>> Chrome/Chromium, Blink, V8, the Mozilla coding guidelines, Qt, etc.).
>>>
>>> You can easily turn code that uses errno into one that throws in case of
>>> failure, but you can't compile code reporting errors via exceptions with
> >>> -fno-exceptions and still report the errors.
>>
>> As far as I remember the latest expected<> wasn't hardwired to
>> exceptions but templated on the type to hold in case of errors.
>
> How is that different from std::optional?

optional<T> either has a value of type T or it doesn't have anything.
expected<T, E> either has a value of type T or a value of type E to describe why the former isn't there (and it happens that E defaults to std::exception_ptr).

Maybe the details have changed again, I don't remember how long it's been that I read a proposal/draft/whatever.


Matthew Fioravante

Oct 1, 2014, 11:41:47 AM10/1/14
to std-pr...@isocpp.org


On Wednesday, October 1, 2014 11:30:10 AM UTC-4, Thiago Macieira wrote:
On Wednesday 01 October 2014 06:22:08 Matthew Fioravante wrote:
> > Why not? I don't see the problem with errno (which is actually
> > thread-specific,
> > not global).
>
> The alternative to errno is to use std::error_code and return it somehow
> either via an out parameter, return value (using out param for the int), or
> packaged as part of something like optional / expected / pair / tuple.

Does this std::error_code exist already?

It's part of C++11 <system_error>.


Though I think that having a class enum so we can properly separate errors
from other uses of integers would be nice. Just as long as the error codes
remain 1:1 with errno.

Something like this would work fine too. Also, if it's an enum with only the error tags which are possible, then the set of possible errors to check is enforced by the type system. For example, you can get a compiler warning if you don't check them all in a switch statement.
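A sketch of what such an enum could look like. The name conversion_errc and its enumerators are hypothetical. The values are kept equal to the corresponding errno constants, as requested above.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical error enum: only the failures a number parser can report,
// with values kept 1:1 with the matching errno constants.
enum class conversion_errc : int {
    ok = 0,
    invalid_input = EINVAL,
    out_of_range = ERANGE,
};

// No default label on purpose: GCC and Clang can then warn (-Wswitch) when
// an enumerator is added later and not handled here.
const char* describe(conversion_errc e) {
    switch (e) {
        case conversion_errc::ok:            return "ok";
        case conversion_errc::invalid_input: return "invalid input";
        case conversion_errc::out_of_range:  return "out of range";
    }
    return "unknown";  // only reachable for values outside the enum
}
```

C++11's std::errc is already a scoped enum whose values match the POSIX errno constants, so a dedicated enum like this one would mainly serve to narrow the set of possible errors.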
 

> error_code string_to(int& i, string_view& tail, string_view s);
> int string_to(error_code& ec, string_view& tail, string_view s);
> expected<int,error_code> string_to(string_view& tail, string_view s);
>
> bool string_to(int& i, string_view& tail, string_view s); //sets errno on
> failure
> int string_to(string_view& tail, string_view s); //sets errno on failure
> optional<int> string_to(string_view& tail, string_view s); //sets errno on
> failure

The tail should be optional. Therefore, it can't be a reference. All those
options are out in my book.

That's very easy to do with additional overloads:

template <typename T> T string_to(string_view& tail, string_view s);
template <typename T> T string_to(string_view s) { string_view tail; return string_to<T>(tail, s); }
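A compilable sketch of the overload pair, assuming C++17 and using std::from_chars purely as a placeholder implementation. Error reporting is deliberately left out here, since the return type is still under debate.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical primary overload: writes the unparsed remainder to `tail`.
// Returns T{} on failure to keep the sketch short.
template <typename T>
T string_to(std::string_view& tail, std::string_view s) {
    T value{};
    const auto res = std::from_chars(s.data(), s.data() + s.size(), value);
    tail = s.substr(static_cast<std::size_t>(res.ptr - s.data()));
    return value;
}

// Convenience overload for callers that do not care about the tail.
template <typename T>
T string_to(std::string_view s) {
    std::string_view tail;
    return string_to<T>(tail, s);
}
```

The two overloads differ in arity, so string_to<int>(s) and string_to<int>(tail, s) resolve without ambiguity.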



> > If something threw an exception, whether it was by consumption of the
> > disengaged optional or something else, the number conversion is long
> > forgotten. How useful is the "number out of range" information three
> > frames up
> > the stack?
>
> If my program crashes due to an uncaught exception, "Number out of range"
> is a lot more useful to me than "disengaged optional".

That assumes that you did not attempt to correct the error. That is, your code
has a bug: either you did not validate your input prior to the parsing or you
failed to handle the parser errors.

Exactly, and it would be nice if the system would help me debug and fix these bugs when they occur.
 

I was thinking of normal error conditions, when you did catch the possible
errors and report them properly. If the input is allowed to fail parsing to
integral value, then you need to report that in the right way to your caller.

A reasonable default is not a bad thing and we're already accepting a default exception policy if we adopt optional or expected. The user can always check and throw their own exceptions specific to their API if needed.

Matthew Fioravante

Oct 1, 2014, 11:54:06 AM10/1/14
to std-pr...@isocpp.org


On Wednesday, October 1, 2014 11:36:50 AM UTC-4, Miro Knejp wrote:

optional<T> either has a value of type T or it doesn't have anything.
expected<T, E> either has a value of type T or a value of type E to describe why the former isn't there (and it happens that E defaults to std::exception_ptr).

Maybe the details have changed again, I don't remember how long it's been that I read a proposal/draft/whatever.


This is the biggest problem I have with using expected. It's not finished yet and I don't know what it will look like when done, or if it will even be accepted. I also don't know what the performance implications will be. It's difficult to write a new proposal based on such shaky foundations.

I'd also like to see the object thrown not necessarily be the same as the error state. An exception object has a vtable and possibly other cruft. Why do we need to return all of those bytes?

Something like this might be better:

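template <typename T, typename Error, typename Exception>
struct expected {
public:
  // Throws Exception, built from the stored error, if no value is present.
  T& value() { if (has_error()) throw Exception(_err); return _val; }
  T value_or(T def) const { return has_error() ? def : _val; }

  //...

private:
  T _val;
  Error _err;
};

// Must be constructible from the Error type so that value() can throw it.
class conversion_error : public exception {
public:
  explicit conversion_error(error_code ec) : _ec(ec) {}
  error_code code() const { return _ec; }
private:
  error_code _ec;
};

template <typename T> using stoiret = expected<T, error_code, conversion_error>;

template <typename T>
stoiret<T> string_to(string_view& tail, string_view s);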

Of course, inventing a whole new expected / optional-like thing just for these conversion functions is pretty heavyweight. Sticking to bools and error_codes as return values / out parameters sidesteps this whole question but may result in a more primitive interface.

Miro Knejp

Oct 1, 2014, 12:04:52 PM10/1/14
to std-pr...@isocpp.org
It's at least 3 years to the next standard, so no need to speedrun this.

In terms of performance: expected<int, std::error_code> has basically exactly the same layout as your example + one bool (_val and _err even share space in a union) and it's up to the user what to do with the error_code. optional<T> doesn't tell you what went wrong at all. You could write your own thin wrapper that throws it on error etc. It's certainly easier to add exceptions to an exception-free design than the reverse.

My suggestion is to *carefully anticipate* the existence of expected and optional in 3 years. If it turns out not to be the case halfway down the road one can still design a helper type for the return values.

Matthew Fioravante

Oct 1, 2014, 12:14:34 PM10/1/14
to std-pr...@isocpp.org


On Wednesday, October 1, 2014 12:04:52 PM UTC-4, Miro Knejp wrote:

My suggestion is to *carefully anticipate* the existence of expected and optional in 3 years. If it turns out not to be the case halfway down the road one can still design a helper type for the return values.


I'll sketch a hypothetical return value class in my proposal and mention that it can be implemented in terms of expected. This might also inform the future design of the expected proposal as this will be a first class example.

I believe Alexandrescu, when he gave his talk about expected, also used a parsing API as an example.

Anyway I think at this point something needs to be written down and formalized. Then we can debate and fine tune it.

wi...@schinmeyer.de

Nov 3, 2015, 3:30:03 AM11/3/15
to ISO C++ Standard - Future Proposals
Doesn't std::(i)strstream already handle these conversions quite nicely? How about just un-deprecating that and maybe adding a (c)string_view constructor?

Nicol Bolas

Nov 3, 2015, 8:02:08 AM11/3/15
to ISO C++ Standard - Future Proposals, wi...@schinmeyer.de
On Tuesday, November 3, 2015 at 3:30:03 AM UTC-5, wi...@schinmeyer.de wrote:
Doesn't std::(i)strstream already handle these conversions quite nicely? How about just un-deprecating that and maybe adding a (c)string_view constructor?

We shouldn't undeprecate strstreams; it's deprecated for a good reason. Instead, we should create viewstream classes that are safe versions of the strstreams.

Even so, the basic `stof` and so forth functions are far too useful to not make versions of for string_view.

Tony V E

Nov 3, 2015, 11:36:15 PM11/3/15
to Standard Proposals
You could also use variant<T, Error>, which is basically expected<>; variant is almost certainly going into the library.

I would use expected<>, and then fall-back to variant<> if expected doesn't move forward.
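A sketch of the variant fallback, assuming the std::variant that later shipped in C++17. The function name parse_int and the choice of std::errc as the error type are illustrative assumptions, and std::from_chars stands in for the real conversion routine.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>
#include <variant>

// Hypothetical parser returning either the value or the reason for failure.
std::variant<int, std::errc> parse_int(std::string_view s) {
    int value = 0;
    const char* last = s.data() + s.size();
    const auto res = std::from_chars(s.data(), last, value);
    if (res.ec != std::errc{}) return res.ec;                 // parse error
    if (res.ptr != last) return std::errc::invalid_argument;  // trailing junk
    return value;
}
```

The caller can test std::holds_alternative<int>(r) before calling std::get<int>(r). Compared with expected<>, the variant is symmetric and does not say which alternative is the "normal" one.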
