
reading unsigned numbers


Neil Butterworth

Apr 24, 2009, 6:24:09 PM

This is based on a question posted on StackOverflow:
http://stackoverflow.com/questions/786951/stringstream-unsigned-conversion-broken
I've modified the code somewhat, but the question is the same:

#include <iostream>
#include <sstream>
using namespace std;

int main()
{
    std::istringstream stream( "-1" );
    unsigned short n = 0;
    stream >> n;    // extract negative input into an unsigned type
    if ( stream.fail() ) {
        cerr << "Conversion failed\n";
    }
    else {
        cout << n << endl;
    }
}

With g++ (several versions), the conversion fails.
With VC++ (several versions), it prints 65535.

Which is correct? Or are both? Please quote from the standard when
replying.


Neil Butterworth


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Pavel Minaev

Apr 25, 2009, 2:07:05 AM

On Apr 24, 3:24 pm, Neil Butterworth <nbutterworth1...@googlemail.com>
wrote:
> This is based on a question posted on StackOverflow
> http://stackoverflow.com/questions/786951/stringstream-unsigned-conve...
> roken.
> I've modified the code somewhat, but the question is given:
>
> #include <iostream>
> #include <sstream>
> using namespace std;;
>
> int main()
> {
> std::istringstream stream( "-1" );
> unsigned short n = 0;
> stream >> n;
> if ( stream.fail() ) {
> cerr << "Conversion failed\n";
> }
> else {
> cout << n << endl;
> }
>
> }
>
> g++ (several versions) the conversion fails
> vc++ (several versions) prints 65535
>
> Which is correct? Or are both? Please quote from the standard when
> replying.

istream's operator>> is defined in terms of the num_get facet and, at
the end of the call chain, its do_get member function; for your case
it's the overload taking unsigned short&. Its behavior is described in
22.2.2.1.2 [lib.facet.num.get.virtuals]. It's lengthy, but the gist of
what applies to your case is that the parsing behavior is essentially
defined in terms of scanf specifiers (i.e. it should read characters
for as long as they match the input field, and then parse the
resulting sequence of characters as if by scanf).
Specifically:

"The details of this operation occur in three stages
— Stage 1: Determine a conversion specifier
— Stage 2: Extract characters from in and determine a corresponding
char value for the format expected by the conversion specification
determined in stage 1.
— Stage 3: Store results"

"For conversion to an integral type, the function determines the
integral conversion specifier as indicated in Table 55
...
unsigned integral type - %u"

"A length specifier is added to the conversion specification, if
needed, as indicated in Table 56
...
unsigned short - h"

So it boils down to what scanf should do for %hu. This is C territory
already, not C++. So, ISO/IEC 9899:TC3, 7.19.6.2/12 ("The fscanf
function", whose conversion rules scanf shares):

"The conversion specifiers and their meanings are:

u - Matches an optionally signed decimal integer, whose format is the
same as expected for the subject sequence of the strtoul function with
the value 10 for the base argument."

Finally, looking at strtoul (7.20.1.4 in the same spec):

"If the value of base is zero, the expected form of the subject
sequence is that of an integer constant as described in 6.4.4.1,
optionally preceded by a plus or minus sign, but not including an
integer suffix. If the value of base is between 2 and 36 (inclusive),
the expected form of the subject sequence is a sequence of letters and
digits representing an integer with the radix specified by base,
optionally preceded by a plus or minus sign, but not including an
integer suffix."

Note that in our case base==10. Reading further:

"If the subject sequence has the expected form and the value of base
is zero, the sequence of characters starting with the first digit is
interpreted as an integer constant according to the rules of 6.4.4.1.
If the subject sequence has the expected form and the value of base is
between 2 and 36, it is used as the base for conversion, ascribing to
each letter its value as given above. If the subject sequence begins
with a minus sign, the value resulting from the conversion is negated
(in the return type)."

And there we go. It would seem that operator>>(unsigned short&) should
not fail on input "-1", but should read it as 1 ("the value resulting
from the conversion is negated"). So the answer is - neither
implementation is correct :) Unless I've missed something along the
line while unraveling this...

litb

Apr 25, 2009, 12:08:53 PM

On Apr 25, 8:07 am, Pavel Minaev <int...@gmail.com> wrote:
> ....

> Note that in our case base==10. Reading further:
>
> "If the subject sequence has the expected form and the value of base
> is zero, the sequence of characters starting with the first digit is
> interpreted as an integer constant according to the rules of 6.4.4.1.
> If the subject sequence has the expected form and the value of base is
> between 2 and 36, it is used as the base for conversion, ascribing to
> each letter its value as given above. If the subject sequence begins
> with a minus sign, the value resulting from the conversion is negated
> (in the return type)."
>
> And there we go. It would seem that operator>>(unsigned short&) should
> not fail on input "-1", but should read it as 1 ("the value resulting
> from the conversion is negated"). So the answer is - neither
> implementation is correct :) Unless I've missed something along the
> line while unraveling this...
>

Good analysis there, but I'm not sure I agree with the last part. It
says "the value resulting from the conversion is negated". Negating 1
gives -1, which in modulo arithmetic is USHRT_MAX for an unsigned
short. Reading it as negating a value whose conversion had already
taken the '-' into account would not make sense: under that reading,
the conversion could never yield a negative value.
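
The modulo arithmetic can be checked directly. This is only a minimal sketch of the arithmetic, not of what the stream does; negate_as_ushort is a hypothetical helper name made up for illustration.

```cpp
// Hypothetical helper (not part of any library): negate a value and
// store the result in an unsigned short.
// Unsigned arithmetic and conversion to an unsigned type are defined
// modulo 2^N, so negating 1 yields the largest unsigned short (USHRT_MAX).
unsigned short negate_as_ushort(unsigned short v)
{
    // v is promoted and then converted to unsigned int for the
    // subtraction; the cast narrows the result back to unsigned short.
    return static_cast<unsigned short>(0u - v);
}
```

Calling negate_as_ushort(1) gives the all-ones value of unsigned short, and adding 1 to that wraps back to 0.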

Pavel Minaev

Apr 26, 2009, 12:03:50 AM

On Apr 25, 9:08 am, litb <Schaub-Johan...@web.de> wrote:
> > "If the subject sequence has the expected form and the value of base
> > is zero, the sequence of characters starting with the first digit is
> > interpreted as an integer constant according to the rules of 6.4.4.1.
> > If the subject sequence has the expected form and the value of base is
> > between 2 and 36, it is used as the base for conversion, ascribing to
> > each letter its value as given above. If the subject sequence begins
> > with a minus sign, the value resulting from the conversion is negated
> > (in the return type)."
>
> > And there we go. It would seem that operator>>(unsigned short&) should
> > not fail on input "-1", but should read it as 1 ("the value resulting
> > from the conversion is negated"). So the answer is - neither
> > implementation is correct :) Unless I've missed something along the
> > line while unraveling this...
>
> Good analysis there. But i'm not sure whether i agree with the last
> part. It says "the value resulting value from the conversion is
> negated". Negating "1" is -1, which is in modulo arithmetic USHRT_MAX
> then. Saying that the value resulting from the conversion that already
> respected '-' is negated would not make sense. It could never yield
> any negative value.

Yes, now that I read it again, it does indeed make sense that
"negated" applies to the value of the number without the sign (i.e. it
simply defines the semantics of that sign when it's present). So the
correct answer is then (unsigned short)-1, which is what VC++ gives.
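
The strtoul half of that reading can be tried on its own. The sketch below only exercises the C rule quoted above (a leading minus sign negates the converted value in the return type) and then narrows the result; it does not settle what operator>> itself is required to do. parse_as_ushort is a hypothetical name used for illustration.

```cpp
#include <cstdlib>

// Hypothetical helper (not from the standard library): convert a decimal
// string with strtoul, then narrow the result to unsigned short.
unsigned short parse_as_ushort(const char* text)
{
    // strtoul accepts an optional leading sign; for "-1" it converts the
    // digits to 1 and negates that in unsigned long.
    unsigned long wide = std::strtoul(text, nullptr, 10);

    // Narrowing to unsigned short is again modulo 2^N.
    return static_cast<unsigned short>(wide);
}
```

With this helper, parse_as_ushort("-1") compares equal to (unsigned short)-1, and parse_as_ushort("1") is 1.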

travis...@hotmail.com

Apr 26, 2009, 7:49:04 PM

On Apr 25, 9:03 pm, Pavel Minaev <int...@gmail.com> wrote:
> On Apr 25, 9:08 am, litb <Schaub-Johan...@web.de> wrote:
>
> [...]

>
> Yes, now that I read it again, it does indeed make sense that
> "negated" applies to the value of the number without the sign (i.e. it
> simply defines the semantics of that sign when it's present). So the
> correct answer is then (unsigned short)-1, which is what VC++ gives.
>

A co-worker of mine filed a bug on this last week. See

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39802

Travis
