int i = ch - '0'; // We assume that ch is a digit ranging from 0 to 9.
How does this work? Why/how does subtraction of
two chars result in an int?
Primitive data types. A char is an int is a char :)
Ah, ok. Thanks !
Any arithmetic on bytes, shorts, chars and ints is always done by first
widening the operands to int. For char the unsigned 16-bit Unicode value
is used. Implicitly converting a character to a number was probably a
mistake in the language design, but it's not about to change now.
Anyway, in most Western languages the characters representing '0' to '9'
have the values 48, 49, 50, ... 57. So if you do, say, '1' - '0' then
that is the equivalent of 49 - 48, i.e. 1.
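To make that arithmetic concrete, here is a minimal sketch. The class name CharArithmetic and the helper digitValue are made up for illustration; only the subtraction itself comes from the original question.

```java
public class CharArithmetic {
    // Both chars are promoted to int before the subtraction,
    // so the result is an int. Only meaningful for '0'..'9'.
    public static int digitValue(char ch) {
        return ch - '0';
    }

    public static void main(String[] args) {
        System.out.println((int) '0');       // 48
        System.out.println((int) '9');       // 57
        System.out.println(digitValue('1')); // 49 - 48 = 1
        System.out.println(digitValue('7')); // 55 - 48 = 7
    }
}
```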
The code isn't the most reliable way of doing it. Unicode has a number of
ranges of digits. Character.digit is better.
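For comparison, a small sketch of the Character.digit approach. The wrapper class DigitLookup and its method names are hypothetical; Character.digit(char, int) is the standard library call and returns -1 when the character is not a valid digit in the given radix.

```java
public class DigitLookup {
    // Decimal digit value of ch, or -1 if ch is not a decimal digit.
    public static int toDecimalDigit(char ch) {
        return Character.digit(ch, 10);
    }

    // Hexadecimal digit value of ch, or -1 if ch is not a hex digit.
    public static int toHexDigit(char ch) {
        return Character.digit(ch, 16);
    }

    public static void main(String[] args) {
        System.out.println(toDecimalDigit('7')); // 7
        System.out.println(toDecimalDigit('x')); // -1
        System.out.println(toHexDigit('f'));     // 15
    }
}
```

Unlike plain subtraction, a non-digit gives you a clear -1 rather than some arbitrary number.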
Unemployed English Java programmer
Ok, thanks for the insight.
chars are automatically promoted to ints before doing arithmetic. So
are bytes. So are shorts. The JVM has a 32 bit stack and 32 bit
so '2' - '0'
50 - 48 = 2
This is a fast way of converting a single char digit to binary int.
He is computing the relative difference in the codes for "2" and "0",
which conveniently is the binary for 2 because of the logical pattern
of code assignment. See http://mindprod.com/jgloss/unicode.html
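A quick way to check the promotion for yourself, as a rough sketch. The class PromotionDemo and its methods are invented for this illustration.

```java
public class PromotionDemo {
    // Returns the runtime type name of the boxed result of char - char.
    public static String typeOfCharDifference(char a, char b) {
        Object boxed = a - b; // a - b has type int, so it boxes to Integer
        return boxed.getClass().getSimpleName();
    }

    // byte + byte is also an int; a cast is needed to narrow it back.
    public static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        System.out.println(typeOfCharDifference('2', '0')); // Integer
        System.out.println('2' - '0');                      // 2
        System.out.println(addBytes((byte) 3, (byte) 4));   // 7
    }
}
```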
Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
BTW, I think this is a bad way to convert characters to the numbers they
represent. I've written about a safer alternative on my blog at
http://nebupookins.net/entry.php?id=260 which correctly converts the Unicode
characters for Roman numerals and Chinese/Japanese characters to the integer
they represent, for example.
I see from the book title that this is for a game, and one could argue that
since this is a game, the code should be very fast. My counterargument to
that is that I seriously doubt that converting chars to integers is going to
be the bottleneck in your game.
> I see from the book title that this is for a game, and one argue that
>since this is game, the code should be very fast. My counter argument to
>that is that I seriously doubt that converting chars to integers is going to
>be the bottleneck in your game.
On the other hand, your game is not defined to work with Roman
numerals. That would be considered an error.
I think there is room for both. The strongest argument for using your
way is it leaves programs open to easier internationalisation. English
speaking programmers tend to forget their code, if successful, will be
If the design document doesn't specify a behaviour for roman numeral
input one way or another, I think actually parsing those roman numerals
would be "good" in the sense of "least surprising for the user" and "more
robust", as opposed to say, crashing, or returning an undefined value (and
then later crashing).
If the design document DOES say that upon detecting a roman numeral, an
error should be reported (or more likely "On any value other than 0, 1, 2,
3, 4, 5, 6, 7, 8 or 9, an error should be reported"), then obviously my
solution would be violating the requirements of the program.
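If the requirement really is "anything other than 0 to 9 is an error", one possible sketch of a strict version follows. The class StrictDigit, the method parse, and the choice of IllegalArgumentException are all assumptions for illustration, not something taken from the book or the design document.

```java
public class StrictDigit {
    // Accepts only ASCII '0'..'9'; everything else is reported as an error.
    public static int parse(char ch) {
        if (ch < '0' || ch > '9') {
            throw new IllegalArgumentException(
                "Not an ASCII digit: U+" + String.format("%04X", (int) ch));
        }
        return ch - '0';
    }

    public static void main(String[] args) {
        System.out.println(parse('5')); // 5
        try {
            parse('\u2167'); // ROMAN NUMERAL EIGHT
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Not an ASCII digit: U+2167
        }
    }
}
```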
> If the design document doesn't specify a behaviour for roman numeral
>input one way or another, I think actually parsing those roman numerals
>would be "good" in the sense of "least surprising for the user" and "more
>robust", as opposed to say, crashing, or returning an undefined value (and
>then later crashing).
> If the design document DOES say that upon detecting a roman numeral, an
>error should be reported (or more likely "On any value other than 0, 1, 2,
>3, 4, 5, 6, 7, 8 or 9, an error should be reported"), then obviously my
>solution would be violating the requirements of the program.
On the other paw, perhaps one in 10,000 people entering a roman
numeral into your program would do it on purpose. So the principle of
least astonishment suggests the best thing to do is reject it.
When I say that Character.getNumericValue() parses Roman numerals, I
don't mean the string "VIII", but the actual Unicode character whose
code point in hexadecimal is 0x2167. So I personally think it'd be unlikely
that someone would "accidentally" enter that character.
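A short sketch of that difference, for the curious. The class name NumericValueDemo and its method names are made up; Character.getNumericValue and Character.digit are the real library methods.

```java
public class NumericValueDemo {
    // Numeric value Unicode assigns to ch, or -1 if it has none.
    public static int numericValue(char ch) {
        return Character.getNumericValue(ch);
    }

    // Decimal digit value of ch, or -1 if ch is not a decimal digit.
    public static int decimalDigit(char ch) {
        return Character.digit(ch, 10);
    }

    public static void main(String[] args) {
        char romanEight = '\u2167'; // ROMAN NUMERAL EIGHT, a single character
        System.out.println(numericValue(romanEight)); // 8
        System.out.println(decimalDigit(romanEight)); // -1, not a decimal digit
        System.out.println(numericValue('8'));        // 8
    }
}
```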
Also, I don't know if this is the case for digits, but there are
distinct alphabetic characters in Unicode which, in every font I've seen,
look identical. The Cyrillic character \u0430 and Latin character \u0061
both look like 'a' in most fonts. If this ever happens for digits as well,
the user's keyboard might be mapped to a locale in which the character
that the key labelled '9' generates looks identical to '9', but that
character minus '0' equals 400 or something. This would be an example of
accidental usage of an international character, but one which should be
accepted to generate the least astonishment.
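As one example of the kind of thing described above, Unicode does contain a fullwidth digit nine (U+FF19) that looks much like '9' in many fonts. The sketch below compares subtraction with Character.digit for it. The class LookalikeDigit and its method names are hypothetical.

```java
public class LookalikeDigit {
    // The subtraction trick: only correct for ASCII '0'..'9'.
    public static int bySubtraction(char ch) {
        return ch - '0';
    }

    // Character.digit understands decimal digits from other Unicode ranges.
    public static int byCharacterDigit(char ch) {
        return Character.digit(ch, 10);
    }

    public static void main(String[] args) {
        char fullwidthNine = '\uFF19'; // FULLWIDTH DIGIT NINE
        System.out.println(bySubtraction(fullwidthNine));    // 65257
        System.out.println(byCharacterDigit(fullwidthNine)); // 9
    }
}
```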