
maybe a stupid question


alessandro volturno

Apr 6, 2021, 10:57:16 AM
Hello group,

as the title of this message says, this is going to be a silly question,
but I don't know how to solve it by myself.

I'm looking for a way to parse the day of the month from a date string
in the format dd.mm.yyyy via scanf.

Where the day of the month is given as an integer number of two digits
like the following ones:

01, 02, 03, 04, 05, 06, 07, 08, 09 (that should obviously be interpreted
as 1, 2, 3, 4, 5, 6, 7, 8, 9)

if I try with the format "%2d"

I get an erroneous parsing of 0.

thank you for your help,
alessandro
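
A side note on where a 0 can come from. The thread never pins down the cause, so this is only one possibility: with %d the input is always read as decimal, but with %i a leading 0 selects octal, and the scan stops at the first digit that is not valid in that base. The sketch below contrasts the two conversions; the helper names day_decimal and day_auto_base are made up for illustration.

```cpp
#include <cstdio>

// Parse the leading day field with %2d: always decimal, so "08" yields 8.
// Returns -1 if no conversion happened.
int day_decimal(const char* date) {
    int day = 0;
    return std::sscanf(date, "%2d", &day) == 1 ? day : -1;
}

// Same field with %2i: the base is inferred from the prefix, so a leading
// 0 selects octal and the scan stops at the first non-octal digit.
// "08" and "09" therefore come back as 0, while "07" comes back as 7.
int day_auto_base(const char* date) {
    int day = 0;
    return std::sscanf(date, "%2i", &day) == 1 ? day : -1;
}
```

Calling day_decimal("08.04.2021") gives 8, while day_auto_base("08.04.2021") gives 0, which would look exactly like the symptom described.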

Manfred

Apr 6, 2021, 1:42:26 PM
Hi,

This is comp.lang.c++; better to send this question to comp.lang.c, with a
compilable piece of code showing what you are trying to achieve.

Bonita Montero

Apr 6, 2021, 1:59:27 PM
> This is comp.lang.c++, better send this question to comp.lang.c, with a
> compiling piece of code of what you are trying to achieve.

I don't see any need to ask this in comp.lang.c only.
scanf and sscanf aren't idiomatic C++ functions, but they are
still included in the standard.

Paavo Helde

Apr 6, 2021, 2:03:48 PM
#include <cstdlib>   // for EXIT_FAILURE
#include <iostream>
#include <string>
#include <sstream>

int main()
{
    // test data
    std::string s = "06.04.2021";
    std::istringstream is(s);

    // parse
    int day, mon, year;
    char dot1, dot2;

    if ((is >> day >> dot1 >> mon >> dot2 >> year) &&
        dot1=='.' && dot2=='.')
    {
        std::cout << "day=" << day << ", mon=" << mon << ", year=" << year;
        std::cout << "\n";
    }
    else
    {
        std::cerr << "Date parsing failed on: " << s << "\n";
        return EXIT_FAILURE;
    }
}

HTH

Scott Lurndal

Apr 6, 2021, 2:20:08 PM
Paavo Helde <myfir...@osa.pri.ee> writes:
>06.04.2021 17:57 alessandro volturno kirjutas:
>> Hello group,
>>
>> as the title of this message says, this is going to be a silly question,
>> but I don't know how to make it by myself.
>>
>> I'm looking for a way to parse the day of the month in a string data
>> with the format of dd.mm.yyyy via scanf.

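#define _XOPEN_SOURCE 700   /* strptime() is POSIX, not ISO C */
#include <time.h>

char *cp;
struct tm tm = {0};
int day, month, year;

cp = strptime("10.07.1925", "%d.%m.%Y", &tm);
if (cp == NULL || *cp != '\0') {
    /* Error: parse failure (NULL) or unconsumed input */
} else {
    day = tm.tm_mday;
    month = tm.tm_mon + 1;       /* tm_mon counts from 0 */
    year = tm.tm_year + 1900;    /* tm_year counts from 1900 */
}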
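
For readers who want the same idea without POSIX, here is a minimal sketch using std::get_time from the C++ standard library. The helper name parse_date_tm is hypothetical, and the fields follow the usual struct tm conventions (tm_mon counts from 0, tm_year counts from 1900).

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: parse "dd.mm.yyyy" into a std::tm with std::get_time.
// Returns false on a parse failure or if characters are left over.
bool parse_date_tm(const std::string& text, std::tm& out) {
    std::istringstream in(text);
    out = std::tm{};                          // start from a zeroed struct
    in >> std::get_time(&out, "%d.%m.%Y");
    if (in.fail()) return false;              // format did not match
    // Nothing may follow the year.
    return in.peek() == std::char_traits<char>::eof();
}
```

After a successful call, the calendar values are out.tm_mday, out.tm_mon + 1 and out.tm_year + 1900.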

Rud1ger Sch1erz

Apr 6, 2021, 3:20:13 PM
There are many ways; this one uses sscanf, as asked:

#include <stdio.h>

int main(int argc, char** argv)
{
    const char* dateStr = "24.04.2020";

    int n, day, month, year;

    n = sscanf(dateStr, "%d.%d.%d", &day, &month, &year);

    printf("%d conversions: %d-%d-%d\n", n, day, month, year);

    return 0;
}

--
Tschau
Rüdiger

alessandro volturno

Apr 6, 2021, 4:05:59 PM
Thank you all,

yes, this is not a strictly C++ question, but your help gave me new
points of view on the available possibilities (especially Scott's). The
strange thing is that my code (that this afternoon gave me a zero
instead of an eight) now behaves correctly even if I didn't make any change.

There are two possibilities:

1) my eyesight is getting worse, since I can no longer distinguish a 0
from an 8 when there is too much light

2) the resolution of my screen is too high, numbers are too small and in
the GRX program I'm writing, I cannot distinguish between those symbols.

maybe both the factors are involved : )

In fact it is strange, but I was sure that my code was not working.

Just to give the full picture:
I'm playing with an Arduino board and one sensor. Every 20 minutes I
read the temperature, pressure and humidity of a room and write those
data to a Micro SD card. Once a month I plot the variation of those
parameters in a program I am writing using the GRX graphics library.

I chose GRX because I didn't know it and I wanted to give it a
try. The Allegro library would probably have been better, but I had
already started down that path, so I am continuing on it.

Thank you and sorry if I made you waste some time.

alessandro

Keith Thompson

Apr 6, 2021, 5:30:03 PM
Is there some reason you need to use scanf?

Note that scanf has undefined behavior on numeric input if the value
can't be represented in the target type. For example, this:

scanf("%d", &n);

has undefined behavior if the input is
"100000000000000000000000000000000000000000000000000" (unless int is
surprisingly large in your implementation).
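
A minimal sketch of one way to avoid that undefined behavior, assuming the text is already in memory: strtol() has defined behavior on overflow (it sets errno to ERANGE), so the range can be checked before narrowing to int. The helper name parse_int is hypothetical.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: convert the whole string to an int, rejecting
// empty input, trailing characters and out-of-range values.
bool parse_int(const char* text, int& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (end == text) return false;                         // no digits at all
    if (*end != '\0') return false;                        // trailing characters
    if (errno == ERANGE) return false;                     // does not fit in long
    if (value < INT_MIN || value > INT_MAX) return false;  // does not fit in int
    out = static_cast<int>(value);
    return true;
}
```

With this, the 51-digit input from the example is rejected instead of producing an arbitrary value.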

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

alessandro volturno

Apr 7, 2021, 2:48:12 AM
Il 06/04/2021 23:29, Keith Thompson ha scritto:
> alessandro volturno <alessandr...@libero.it> writes:
>> as the title of this message says, this is going to be a silly
>> question, but I don't know how to make it by myself.
>>
>> I'm looking for a way to parse the day of the month in a string data
>> with the format of dd.mm.yyyy via scanf.
>>
>> Where the day of the month is given as an integer number of two digits
>> like the following ones:
>>
>> 01, 02, 03, 04, 05, 06, 07, 08, 09 (that should obviously be
>> interpreted as 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>
>> if I try with the format "%2d"
>>
>> I get an erroneous parsing of 0.
>
> Is there some reason you need to use scanf?
>
> Note that scanf has undefined behavior on numeric input if the value
> can't be represented in the target type. For example, this:
>
> scanf("%d", &n);
>
> has undefined behavior if the input is
> "100000000000000000000000000000000000000000000000000" (unless int is
> surprisingly large in your implementation).
>

Yes, there is.

That's because codeblocks (which I use as a graphical frontend to gdb) says:

"Setting breakpoints
Debugger name and version: GNU gdb (GDB) 8.1
Child process PID: 6704
In ?? () ()
Cannot find bounds of current function
"

and itoa() function is not part of ANSI C standard as it is reported here:

https://www.cplusplus.com/reference/cstdlib/itoa/?kw=itoa

Keith Thompson

Apr 7, 2021, 5:03:07 AM
alessandro volturno <alessandr...@libero.it> writes:
> Il 06/04/2021 23:29, Keith Thompson ha scritto:
[...]
>> Is there some reason you need to use scanf?
>>
>> Note that scanf has undefined behavior on numeric input if the value
>> can't be represented in the target type. For example, this:
>>
>> scanf("%d", &n);
>>
>> has undefined behavior if the input is
>> "100000000000000000000000000000000000000000000000000" (unless int is
>> surprisingly large in your implementation).
>>
>
> Yes, there is.
>
> That's because codeblocks (which I use a graphical frontend to gdb) says:
>
> "Setting breakpoints
> Debugger name and version: GNU gdb (GDB) 8.1
> Child process PID: 6704
> In ?? () ()
> Cannot find bounds of current function
> "

Um, how is that relevant?

> and itoa() function is not part of ANSI C standard as it is reported here:
>
> https://www.cplusplus.com/reference/cstdlib/itoa/?kw=itoa

Why does ANSI C matter if you're writing C++? (If you're writing C, you
want comp.lang.c.)

itoa() isn't the only alternative to scanf(). In C, there's strtol()
and friends. And there are a number of C++ alternatives, some of which
have already been mentioned.

You can use C library functions from C++, but it's usually better to use
C++-specific functions unless there's some specific reason not to.

alessandro volturno

Apr 7, 2021, 7:42:20 AM
Il 07/04/2021 11:02, Keith Thompson ha scritto:
> alessandro volturno <alessandr...@libero.it> writes:
>> Il 06/04/2021 23:29, Keith Thompson ha scritto:
> [...]
>>> Is there some reason you need to use scanf?
>>>
>>> Note that scanf has undefined behavior on numeric input if the value
>>> can't be represented in the target type. For example, this:
>>>
>>> scanf("%d", &n);
>>>
>>> has undefined behavior if the input is
>>> "100000000000000000000000000000000000000000000000000" (unless int is
>>> surprisingly large in your implementation).
>>>
>>
>> Yes, there is.
>>
>> That's because codeblocks (which I use a graphical frontend to gdb) says:
>>
>> "Setting breakpoints
>> Debugger name and version: GNU gdb (GDB) 8.1
>> Child process PID: 6704
>> In ?? () ()
>> Cannot find bounds of current function
>> "
>
> Um, how is that relevant?

It is because if I cannot debug, I need a way to print the state of a
variable on screen. And since GRX can only print strings of text,
I needed a function to convert from
numerical format to text.

Do you think that this debugging message is due to the library built
from source (probably) without the debugging info?

>> and itoa() function is not part of ANSI C standard as it is reported here:
>>
>> https://www.cplusplus.com/reference/cstdlib/itoa/?kw=itoa
>
> Why does ANSI C matter if you're writing C++? (If you're writing C, you
> want comp.lang.c.)
>
> itoa() isn't the only alternative to scanf(). In C, there's strtol()
> and friends. And there are a number of C++ alternatives, some of which
> have already been mentioned.

Thank you for this alternative. I use just a few of the C++ or C standard
libraries; I had heard of it but didn't think of it, because the
first things that came to mind were atoi() and its reverse, itoa().

> You can use C library functions from C++, but it's usually better to use
> C++-specific functions unless there's some specific reason not to.

I posted here on comp.lang.C++ because when I write C++ code this is the
place where I search for help. So this time, even if the project is
written in C, I didn't pay attention to which newsgroup to post into.

Paavo Helde

Apr 7, 2021, 8:12:52 AM
07.04.2021 14:42 alessandro volturno kirjutas:
> Il 07/04/2021 11:02, Keith Thompson ha scritto:
>> alessandro volturno <alessandr...@libero.it> writes:
>>> Il 06/04/2021 23:29, Keith Thompson ha scritto:
>> [...]
>>>> Is there some reason you need to use scanf?
>
> It is because if I cannot debug I need a way to print the status of a
> variable on screen. And since GRX doesn't allow to print other things
> other than strings of text, I needed a function to convert from
> numerical format to text.

That's strange, because scanf() does not convert numbers to text, rather
it does the opposite.

For converting numbers to text there are functions like sprintf() and
std::to_string().
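
A short sketch of both routes, for the case described above where a display can only show text. The helper names format_reading and format_count are hypothetical; snprintf() is used rather than sprintf() so the buffer size is always respected.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: build a label such as "T=21.5" from a name and a
// floating point reading, using snprintf() into a bounded buffer.
std::string format_reading(const char* name, double value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%s=%.1f", name, value);
    return std::string(buf);
}

// Hypothetical helper: the C++ route for integers via std::to_string().
std::string format_count(int n) {
    return "n=" + std::to_string(n);
}
```

The resulting std::string can be handed to any text-drawing function through its c_str() member.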

alessandro volturno

Apr 7, 2021, 8:57:01 AM
That was my mistake... I was describing what I needed but I wrote the
wrong function name.

I only realize it now. So Mr. K. Thompson's question about why I
wanted to use scanf is now clearer. I was still thinking of sprintf...

Anyway the problem is now solved. I have done what I was trying to do.

Here on the newsgroup the level of the discussion is really high; I have
just rookie questions, which I express even worse than that.

But I have to thank you all, because I've always found the right help.

alessandro

Juha Nieminen

Apr 8, 2021, 2:12:00 AM
Manfred <non...@add.invalid> wrote:
> This is comp.lang.c++, better send this question to comp.lang.c

Why? std::scanf() is a 100% C++ standard library function.

What makes you think otherwise?

Juha Nieminen

Apr 8, 2021, 2:14:20 AM
Keith Thompson <Keith.S.T...@gmail.com> wrote:
> Is there some reason you need to use scanf?
>
> Note that scanf has undefined behavior on numeric input if the value
> can't be represented in the target type.

What's the better alternative in standard C++?

Manfred

Apr 8, 2021, 7:55:08 AM
I thought this was trivial, but since you are asking...

Because, even if the C standard library (or most of it) has been
included in the C++ standard, this function originates in the C language
and is most used in that context.
C++ provides and actually encourages other alternatives to scanf, as
Paavo gave examples of.

Thus, there's a good chance that there are more users in comp.lang.c who
are experienced with scanf than there are here.
In fact you can see that all replies about scanf are from regular
posters in comp.lang.c.

Paavo Helde

Apr 8, 2021, 9:14:19 AM
As scanf() can invoke UB (read: incorrect results) very easily if the
input stream does not match the expected results, it is unusable with
any external content (i.e. basically always). On top of that, it can
easily cause buffer overruns if one is not extra careful. Buffer
overruns are a major source of security bugs in C (and undeservedly also
in C++, thanks to the people who claim there is nothing wrong with using
unsafe C functions like scanf() in C++).

In C++ we have better means, so scanf() should have been considered
obsolete in C++ for 30 years already, and is best left unused. I don't
know or care how they are dealing with it in C.

A little demo: should the program read in incorrect numbers happily, or
should it report an error?


#define _CRT_SECURE_NO_WARNINGS // needed for VS2019 to accept scanf
#include <iostream>
#include <string>
#include <sstream>
#include <stdio.h>

int main() {
    const char* buffer = "12345678912345678";
    int x;
    int k = sscanf(buffer, "%d", &x);
    if (k == 1) {
        std::cout << "scanf() succeeded and produced: " << x << "\n";
    } else {
        std::cout << "scanf() failed\n";
    }
    std::istringstream is(buffer);
    if (is >> x) {
        std::cout << "istream succeeded and produced: " << x << "\n";
    } else {
        std::cout << "istream failed.\n";
    }
}

Output:
scanf() succeeded and produced: 1578423886
istream failed.

Juha Nieminen

Apr 8, 2021, 1:40:30 PM
Paavo Helde <myfir...@osa.pri.ee> wrote:
> A little demo: should the program read in incorrect numbers happily, or
> should it report an error?

I think it's not bad that the standard library offers the choice.

Paavo Helde

Apr 8, 2021, 2:42:01 PM
You don't need scanf() to get incorrect numbers, you can easily have
them also with C++ streams (a design bug IMO, stream exceptions should
be on by default, but that's just me):


#include <iostream>
#include <string>
#include <sstream>

int main() {
    const char* buffer = "12345678912345678";
    int x;
    std::istringstream is(buffer);
    is >> x;
    std::cout << "istream produced: " << x << "\n";
}

Output:
istream produced: 2147483647
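
The two ways of noticing the failure can be sketched as follows: test the stream state after the extraction, or enable exceptions for failbit so that the extraction throws std::ios_base::failure. The helper names read_int_checked and read_int_throws are hypothetical. Only failbit is put in the exception mask here, so merely reaching the end of the input does not throw.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns true only if the extraction succeeded.
bool read_int_checked(const std::string& text, int& out) {
    std::istringstream in(text);
    return static_cast<bool>(in >> out);
}

// Hypothetical helper: same extraction with exceptions enabled for
// failbit. Returns true if the extraction threw.
bool read_int_throws(const std::string& text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit);
    int value = 0;
    try {
        in >> value;
    } catch (const std::ios_base::failure&) {
        return true;    // the stream reported the failed conversion
    }
    return false;
}
```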

Real Troll

Apr 8, 2021, 4:32:25 PM
On 08/04/2021 19:41, Paavo Helde wrote:
>
>
> Output:
> istream produced: 2147483647

That's because you are reading into an int. However, try this:

#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main() {
    const char* buffer = "1234567890123456789";
    long long x;
    istringstream is(buffer);
    is >> x;
    cout << "istream produced: " << x << "\n";
    return 0;
}

Paavo Helde

Apr 8, 2021, 4:52:55 PM
Sorry, my bad. Here is a corrected version:

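#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main() {
    const char* buffer =
        "123456789012345678912345678901234567891234567890123456789";
    long long x;
    istringstream is(buffer);
    is >> x;
    cout << "istream produced: " << x << "\n";
    return 0;
}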

Real Troll

Apr 8, 2021, 5:02:52 PM
On 08/04/2021 21:52, Paavo Helde wrote:
>
>>
> Sorry, my bad. Here is a corrected version:
>
>
>     long long x;


Change that line to:

string x;

Then your program will return the correct results.

You need to master the variables and their limits.



Real Troll

Apr 8, 2021, 5:18:40 PM
On 08/04/2021 21:52, Paavo Helde wrote:
>
>>
> Sorry, my bad. Here is a corrected version:


You can also use this example if you want to stick with numbers:

#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>

using namespace std;

int main() {
    const char* buffer =
        "123456789012345678912345678901234567891234567890123456789";
    double x;
    istringstream is(buffer);
    is >> x;
    cout << setprecision(80);
    cout << "istream produced: " << x << "\n";
    return 0;
}

Paavo Helde

Apr 8, 2021, 5:36:15 PM
09.04.2021 00:00 Real Troll kirjutas:
> On 08/04/2021 21:52, Paavo Helde wrote:
>>
>>>
>> Sorry, my bad. Here is a corrected version:
>>
>>
>>     long long x;
>
>
> Change that line to:
>
> string x;
>
> Then your program will return the correct results.

Sure, but Juha was keen to have a choice to get incorrect results, how
could I possibly disappoint him now?

Rud1ger Sch1erz

Apr 8, 2021, 5:44:59 PM
Real Troll <real....@trolls.com> writes:

> You need to master the variables and their limits.

Same applies to scanf (just saying).

(While I didn't miss scanf the last 30 years or so...)

--
Tschau
Rüdiger

Real Troll

Apr 8, 2021, 5:48:55 PM
On 08/04/2021 22:35, Paavo Helde wrote:
>
> Sure, but Juha was keen to have a choice to get uncorrect results, how
> could I possibly disappoint him now?


I don't normally spend time trying to find incorrect results; I just
use what is already in the standard libraries of the programming languages.

I find it less interesting when intelligent people like Bonita, Bart
and others try to write their own functions and compilers when C++, C
and C# already provide most of them, working out of the box. Some
other features are developed by Boost <https://www.boost.org/> and
people can also use them.






Real Troll

Apr 8, 2021, 6:03:04 PM
On 08/04/2021 22:44, Rud1ger Sch1erz wrote:
> Same applies to scanf (just saying).
>
> (While I didn't miss scanf the last 30 years or so...)
>

In Visual Studio you can use sscanf() like so:

> sscanf( dtm, "%s %s %d  %d", weekday, month, &day, &year );
> total_line = sscanf(buffer, "%s" , store_value);



James Kuyper

Apr 8, 2021, 10:34:22 PM
The C++ routines for doing numeric conversions described in section
21.3.4 have well-defined behavior in such situations - they throw an
exception that can be caught, if desired.

The std::num_get::do_get() function also has well-defined behavior when
performing such conversions, namely assigning ios_base::failbit to err
(28.4.2.1.2).

Either of those is preferable to undefined behavior.
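
As a minimal sketch of the first option, std::stoi() throws std::out_of_range when the value does not fit in an int and std::invalid_argument when no conversion can be performed. The wrapper name try_stoi is hypothetical; it additionally rejects trailing characters by checking how many characters were consumed.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical wrapper: true only if the whole string is a valid int.
bool try_stoi(const std::string& text, int& out) {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);
        if (used != text.size()) return false;   // trailing characters
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;                            // no digits to convert
    } catch (const std::out_of_range&) {
        return false;                            // does not fit in int
    }
}
```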

Tim Rentsch

Apr 9, 2021, 2:22:33 AM
Keith Thompson <Keith.S.T...@gmail.com> writes:

> alessandro volturno <alessandr...@libero.it> writes:
>
>> as the title of this message says, this is going to be a silly
>> question, but I don't know how to make it by myself.
>>
>> I'm looking for a way to parse the day of the month in a string
>> data with the format of dd.mm.yyyy via scanf.
>>
>> Where the day of the month is given as an integer number of two
>> digits like the following ones:
>>
>> 01, 02, 03, 04, 05, 06, 07, 08, 09 (that should obviously be
>> interpreted as 1, 2, 3, 4, 5, 6, 7, 8, 9)
>>
>> if I try with the format "%2d"
>>
>> I get an erroneous parsing of 0.
>
> Is there some reason you need to use scanf?
>
> Note that scanf has undefined behavior on numeric input if the
> value can't be represented in the target type. [...]

The possibility of undefined behavior is easily avoided in most
cases, and certainly in this case, simply by using a maximum
field width as part of each conversion specifier:

int day, month, year;
int items;
...
items = sscanf( input, "%2d.%2d.%4d", &day, &month, &year );
if( items == 3 ) ... etc ...

The given field widths guarantee the input values can be
represented in the respective target types, even if int
is only 16 bits.
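
A fuller sketch of the same approach, with two additions that are assumptions of this example rather than part of the post: a %n conversion to detect trailing characters, and a plausibility check on the parsed values. The struct Date and the function parse_date are hypothetical names.

```cpp
#include <cstdio>

struct Date { int day; int month; int year; };

// Hypothetical helper: parse "dd.mm.yyyy" with bounded field widths.
// The widths (2, 2, 4) keep every value small enough for any int.
bool parse_date(const char* text, Date& out) {
    Date d{};
    int used = 0;   // set by %n to the number of characters consumed
    const int items = std::sscanf(text, "%2d.%2d.%4d%n",
                                  &d.day, &d.month, &d.year, &used);
    if (items != 3) return false;            // %n is not counted in items
    if (text[used] != '\0') return false;    // trailing characters
    if (d.day < 1 || d.day > 31) return false;
    if (d.month < 1 || d.month > 12) return false;
    if (d.year < 0) return false;
    out = d;
    return true;
}
```

Note that %d still accepts leading whitespace and a sign, which is why the range check is included.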

Tim Rentsch

Apr 9, 2021, 2:30:34 AM
There may be more potential responders in comp.lang.c, but the
best ones also read comp.lang.c++. So asking here is likely to
give a higher average quality of answer. :)

Only half tongue-in-cheek...

Tim Rentsch

Apr 9, 2021, 2:38:42 AM
Paavo Helde <myfir...@osa.pri.ee> writes:

> 08.04.2021 09:11 Juha Nieminen kirjutas:
>
>> Manfred <non...@add.invalid> wrote:
>>
>>> This is comp.lang.c++, better send this question to comp.lang.c
>>
>> Why? std::scanf() is a 100% C++ standard library function.
>>
>> What makes you think otherwise?
>
> As scanf() can invoke UB (read: incorrect results) very easily if
> the input stream does not match the expected results, it is
> unusable with any external content (i.e. basically always). [...]

Nonsense. The undefined behavior you refer to can always be
avoided by using an appropriate maximum field width in each
conversion specifier.

Keith Thompson

Apr 9, 2021, 3:30:09 AM
Not always (though as you said it can be in the case that started this
thread).

For example, if you're reading a 16-bit integer, limiting the field
width means you can't read values over 9999 -- or, if I'm not mistaken,
values less than -999 for a 16-bit signed integer.

Alf P. Steinbach

Apr 9, 2021, 5:30:22 AM
On 08.04.2021 20:41, Paavo Helde wrote:
> 08.04.2021 20:40 Juha Nieminen kirjutas:
>> Paavo Helde <myfir...@osa.pri.ee> wrote:
>>> A little demo: should the program read in incorrect numbers happily, or
>>> should it report an error?
>>
>> I think it's not bad that the standard library offers the choice.
>>
>
> You don't need scanf() to get incorrect numbers, you can easily have
> them also with C++ streams (a design bug IMO, stream exceptions should
> be on by default, but that's just me):

The design of EOF handling in iostreams is practically incompatible with
exception throwing mode.

I guess that's a main reason why nobody uses that mode.

It's just totally impractical, due to the iostreams design.


> #include <iostream>
> #include <string>
> #include <sstream>
>
> int main() {
>     const char* buffer = "12345678912345678";
>     int x;
>     std::istringstream is(buffer);
>     is >> x;
>     std::cout << "istream produced: " << x << "\n";
> }
>
> Output:
> istream produced: 2147483647

- Alf

Juha Nieminen

Apr 10, 2021, 3:32:14 AM
Paavo Helde <myfir...@osa.pri.ee> wrote:
> You don't need scanf() to get incorrect numbers, you can easily have
> them also with C++ streams (a design bug IMO, stream exceptions should
> be on by default, but that's just me):

But the thing is, std::sscanf() is probably going to be more efficient
than std::istringstream. Notice in your very own example:

> const char* buffer = "12345678912345678";
> std::istringstream is(buffer);

That's probably going to cause an extra memory allocation and
deallocation (not to mention the needless copying of the string
contents into the allocated memory).

I know that the problem may not be as bad when reading directly
from a file or standard input, but notice how in this particular
case, when parsing a value from a string in memory, C++ doesn't
really offer a good alternative. C++17 did introduce the
std::from_chars() family of functions, which are designed to
read integers and floating point values from strings as
efficiently as possible (probably even surpassing the efficiency
of std::sscanf()), in-place, without any allocation or copying.
That is a great addition to the standard library, but they can only
be used to parse a single value at a time.

That being said, the std::scanf() functions are quite limited in
their usefulness, and I personally have almost never used any of
them for anything. Just pointing out that it's not automatically a
bad thing that C++ supports them.
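
To make the "one value at a time" point concrete, here is a sketch (C++17) that parses the dd.mm.yyyy date from the start of the thread by calling std::from_chars() three times and checking the separators by hand. The function name parse_date_from_chars is hypothetical.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse "dd.mm.yyyy" in place, with no allocation
// and no copy. Each std::from_chars() call converts exactly one integer.
bool parse_date_from_chars(std::string_view text,
                           int& day, int& month, int& year) {
    const char* p = text.data();
    const char* end = text.data() + text.size();
    int* fields[] = {&day, &month, &year};
    for (int i = 0; i < 3; ++i) {
        const auto [next, ec] = std::from_chars(p, end, *fields[i]);
        if (ec != std::errc{}) return false;   // no digits or out of range
        p = next;
        if (i < 2) {                           // expect '.' between fields
            if (p == end || *p != '.') return false;
            ++p;
        }
    }
    return p == end;                           // nothing may follow the year
}
```

Unlike scanf(), std::from_chars() skips no whitespace and reports an out-of-range value through the returned error code.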

Keith Thompson

Apr 10, 2021, 5:37:10 AM
Juha Nieminen <nos...@thanks.invalid> writes:
> Paavo Helde <myfir...@osa.pri.ee> wrote:
>> You don't need scanf() to get incorrect numbers, you can easily have
>> them also with C++ streams (a design bug IMO, stream exceptions should
>> be on by default, but that's just me):
>
> But the thing is, std::sscanf() is probably going to be more efficient
> than std::istringstream. Notice in your very own example:
>
>> const char* buffer = "12345678912345678";
>> std::istringstream is(buffer);
>
> That's probably going to cause an extra memory allocation and
> deallocation (not to talk about the needless copying of the string
> contents into the allocated memory).
>
> I know that the problem may be not as bad when reading directly
> from a file or standard input, but notice how in this particular
> case, when parsing a value from a string in memory, C++ doesn't
> really offer a good alternative. C++17 did introduce the
> std::from_chars() family of functions that are designed to
> read integers and floating point values from strings as
> efficiently as possible (probably even surpassing the efficiency
> of std::sscanf()), in-place, without any allocations nor copying,
> which is a great addition to the standard library, but they can only
> be used to parse one single value at a time.

There's always strtol() and friends.

> That being said, the std::scanf() functions are quite limited in
> their usefulness, and I personally have almost never used any of
> them for anything. Just pointing out that it's not automatically a
> bad thing that C++ supports them.

Ian Collins

Apr 10, 2021, 5:50:07 AM
On 10/04/2021 21:36, Keith Thompson wrote:
> Juha Nieminen <nos...@thanks.invalid> writes:
>> Paavo Helde <myfir...@osa.pri.ee> wrote:
>>> You don't need scanf() to get incorrect numbers, you can easily have
>>> them also with C++ streams (a design bug IMO, stream exceptions should
>>> be on by default, but that's just me):
>>
>> But the thing is, std::sscanf() is probably going to be more efficient
>> than std::istringstream. Notice in your very own example:
>>
>>> const char* buffer = "12345678912345678";
>>> std::istringstream is(buffer);
>>
>> That's probably going to cause an extra memory allocation and
>> deallocation (not to talk about the needless copying of the string
>> contents into the allocated memory).
>>
>> I know that the problem may be not as bad when reading directly
>> from a file or standard input, but notice how in this particular
>> case, when parsing a value from a string in memory, C++ doesn't
>> really offer a good alternative. C++17 did introduce the
>> std::from_chars() family of functions that are designed to
>> read integers and floating point values from strings as
>> efficiently as possible (probably even surpassing the efficiency
>> of std::sscanf()), in-place, without any allocations nor copying,
>> which is a great addition to the standard library, but they can only
>> be used to parse one single value at a time.
>
> There's always strtol() and friends.

Which are usually the best option!

Our code base has a couple of variadic function templates that replace
sscanf with type safe conversions and no annoying format specifiers. I
wrote them to clean up a bunch of equally annoying clang-tidy
cert-err34-c warnings.

--
Ian.

Paavo Helde

Apr 10, 2021, 9:07:16 AM
10.04.2021 10:31 Juha Nieminen kirjutas:
> Paavo Helde <myfir...@osa.pri.ee> wrote:
>> You don't need scanf() to get incorrect numbers, you can easily have
>> them also with C++ streams (a design bug IMO, stream exceptions should
>> be on by default, but that's just me):
>
> But the thing is, std::sscanf() is probably going to be more efficient
> than std::istringstream. Notice in your very own example:
>
>> const char* buffer = "12345678912345678";
>> std::istringstream is(buffer);
>
> That's probably going to cause an extra memory allocation and
> deallocation (not to talk about the needless copying of the string
> contents into the allocated memory).

iostreams are slow anyway, so I would not worry about such details too
much. For me, both the scanf() and iostreams are convenience interfaces
which can be used when the speed is not so critical. And scanf() is
losing here as it is not so easy or convenient to use it safely.

If speed is critical, one often has to turn to special libraries
optimized for that purpose, like Google's double_conversion for example.
Hopefully std::to_chars()/std::from_chars() become a viable alternative
here (or maybe they already are, I have not had a chance to try them out yet).

> C++17 did introduce the std::from_chars() [...] but they can only
> be used to parse one single value at a time.

It ought to be relatively easy to build a stream-like interface on top
of std::from_chars/std::to_chars. For keeping up with the performance it
should also be locale-independent, non-allocating, and should not
involve virtual functions.

Juha Nieminen

Apr 12, 2021, 1:50:40 AM
Keith Thompson <Keith.S.T...@gmail.com> wrote:
> There's always strtol() and friends.

This sub-discussion spawned from someone suggesting that since the
original question was about scanf() that he go to the C group, and
me asking why, given that scanf() is a 100% C++ standard function,
and then someone responding with using the "C++ equivalent" of
scanf(), ie. std::istream (&co) as an alternative to the originally-C
function.

But yes, there's absolutely nothing wrong in using the standard library
functions that were "inherited" from C, if they suit the task at hand.

(On that note, one could argue that if scanf() is being used because of
its somewhat-syntax-parsing capability, kind of, which none of those
other suggestions support, it would be better to use a combination of
std::regex and those string-to-integer functions, especially if absolute
maximum efficiency is not required, as std::regex allows for a *significantly*
higher degree of syntax parsing than scanf(). Not that it completely
replaces a full-fledged parser, but much more so than scanf().)
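
A sketch of that combination for the date format from the start of the thread: std::regex validates the shape of the input, and std::stoi() converts the captured groups. The function name parse_date_regex is hypothetical. Because the pattern limits each group to two or four digits, the conversions cannot go out of range.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: accept exactly "dd.mm.yyyy" and extract the fields.
bool parse_date_regex(const std::string& text,
                      int& day, int& month, int& year) {
    // Built once; constructing a std::regex is comparatively expensive.
    static const std::regex pattern(R"((\d{2})\.(\d{2})\.(\d{4}))");
    std::smatch m;
    if (!std::regex_match(text, m, pattern)) return false;
    day = std::stoi(m[1].str());
    month = std::stoi(m[2].str());
    year = std::stoi(m[3].str());
    return true;
}
```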

Ian Collins

Apr 12, 2021, 4:13:26 AM
Rather than going down that route, with my variadic function templates I
just break down the format string and mix it with the variables to be
loaded, for example:


unsigned a {}, b {}, c {};
const auto read = fromString( "V1.2.3", "V", a, ".", b, ".", c );

I find it easier to read when not having to mentally map format
specification to the variables.
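
The post does not show how its fromString is implemented, so the following is only a guess at the general shape, written as a C++17 sketch: a variadic template that walks the arguments in order, matching string literals verbatim and parsing integers with std::from_chars(). The names from_string and consume are hypothetical, and this version returns a bool rather than whatever the original returns.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>
#include <type_traits>

// Match a literal piece of the expected layout, e.g. "V" or ".".
inline bool consume(std::string_view& in, const char* literal) {
    const std::string_view lit(literal);
    if (in.substr(0, lit.size()) != lit) return false;
    in.remove_prefix(lit.size());
    return true;
}

// Parse one integer into the caller's variable.
template <class T, std::enable_if_t<std::is_integral_v<T>, int> = 0>
bool consume(std::string_view& in, T& out) {
    const char* first = in.data();
    const char* last = first + in.size();
    const auto [next, ec] = std::from_chars(first, last, out);
    if (ec != std::errc{}) return false;
    in.remove_prefix(static_cast<std::size_t>(next - first));
    return true;
}

// Process the arguments left to right; stop at the first mismatch.
// Succeeds only if every piece matched and the input is used up.
template <class... Args>
bool from_string(std::string_view in, Args&&... args) {
    const bool matched = (consume(in, args) && ...);
    return matched && in.empty();
}
```

A call then reads much like the example in the post: from_string("V1.2.3", "V", a, ".", b, ".", c) with three unsigned variables.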

--
Ian.

Tim Rentsch

Apr 18, 2021, 11:04:55 PM
Keith Thompson <Keith.S.T...@gmail.com> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>> Paavo Helde <myfir...@osa.pri.ee> writes:
>>
>>> 08.04.2021 09:11 Juha Nieminen kirjutas:
>>>
>>>> Manfred <non...@add.invalid> wrote:
>>>>
>>>>> This is comp.lang.c++, better send this question to comp.lang.c
>>>>
>>>> Why? std::scanf() is a 100% C++ standard library function.
>>>>
>>>> What makes you think otherwise?
>>>
>>> As scanf() can invoke UB (read: incorrect results) very easily if
>>> the input stream does not match the expected results, it is
>>> unusable with any external content (i.e. basically always). [...]
>>
>> Nonsense. The undefined behavior you refer to can always be
>> avoided by using an appropriate maximum field width in each
>> conversion specifier.
>
> Not always (though as you said it can be in the case that started this
> thread).
>
> For example, if you're reading a 16-bit integer, limiting the field
> width means you can't read values over 9999 -- or, if I'm not mistaken,
> values less than -999 for a 16-bit signed integer.

True, but the value could be read into a long long, which in most
cases will suffice to read a desired input value safely.

If it's important to read values near the boundaries of what, for
example, integer types can represent, it is still always possible
to read the input safely using scanf, even if not as conveniently
as one might like.

Of course usually it's easier to read in lines and use sscanf()
on those, but the principle is the same.

(I will note for the record that these comments don't apply in
the case of a %p conversion specification. I assume no one was
talking about those.)
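
A sketch of the "read into a wider type" idea for a 16-bit target, under the assumption that the text is already in a buffer so sscanf() can be used. Eighteen characters always fit in a long long, so the conversion itself cannot overflow; the range check then decides whether the value fits in 16 bits. The function name read_int16 is hypothetical, and the %n check for leftover characters is an addition of this example.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: read a 16-bit value without risking overflow in
// the conversion itself.
bool read_int16(const char* text, std::int16_t& out) {
    long long wide = 0;
    int used = 0;   // set by %n to the number of characters consumed
    if (std::sscanf(text, "%18lld%n", &wide, &used) != 1) return false;
    if (text[used] != '\0') return false;       // more input than expected
    if (wide < INT16_MIN || wide > INT16_MAX) return false;
    out = static_cast<std::int16_t>(wide);
    return true;
}
```

This accepts the boundary values -32768 and 32767, which a plain %5d or %4d field width could not handle safely.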