For example, the integers 1, 2, ... can be represented as the letters
A, B, ... but this system doesn't extend beyond a limited range of
integers. Ordinary Hindu-Arabic notation works for rational numbers
in the interval [0, 10), but "10" < "2" alphabetically. A more
general system would extend to all real numbers.
For the sake of discussion, let's use English characters, ordered by
US-ASCII code. The numerals come first, followed by upper case
letters and then lower case letters, with punctuation interspersed.
It's encouraging to know alphabetical order is a total order, and
dense over some infinite sets of character strings. An example is a
set of strings that don't end with the first letter of the alphabet,
such as strings of letters that don't end with A.
A simple scheme that comes to mind is the Elias Gamma code,
rather popular in data compression. There are many versions
of that code; here's one adjusted to have the desired sorting
property. (Elias Gamma code operates on the positive integers.
After describing it, I'll show how to then get the reals.)
Express the positive integer in binary, drop the leading one, count
the bits remaining (call the count N), prefix N 1's followed by zero.
For example, to express 25, follow the above instructions to
get
25 -> 11001 -> 1001 (N=4) -> (11110)1001 -> 111101001
Since 0 cannot be coded (and 1 will cause trouble, see below),
increase the integer by 2 before using the above code.
Elias' code has an "instaneous decoding" property; you don't
need that but it has the benefit that you won't need a decimal
point to separate the fractional part.
For a positive real number, say 10111.000111 in binary, your code
will be
111101001000111
Starting with the integer part, 10111 = 23, I added 2 as mentioned
above, took the Elias code for 25, concatenated the fractional part.
For negative number (-x), simply use the inverse (x), but change all
0's to 1's and vice versa. Zero will be no problem (e.g. use "1").
This all uses a 2-sized alphabet, instead of 26-sized.
Adjust to suit. YMMV.
* * * * * * * *
Another almost-example that comes to mind is the (normalized)
floating-point number format used on certain old computers like CDC's.
They *did* have your desired property for the numbers they could
represent. However, they couldn't represent *all* real numbers.
The Hackenbush number system also comes to mind.
James Dow Allen
Use fixed width padded with spaces.
In MS-Access, I use the function format$([number],"@@@@@@@@") which
returns
" 9"
" 10"
" 934567"
whose alphabetical sort matches its numerical sort.
>
> For example, the integers 1, 2, ... can be represented as the letters
> A, B, ... but this system doesn't extend beyond a limited range of
> integers. Ordinary Hindu-Arabic notation works for rational numbers
> in the interval [0, 10), but "10" < "2" alphabetically. A more
> general system would extend to all real numbers.
Split the characteristic from the mantissa, use padded fixed width
for the characteristic and append the mantissa. This will align the
decimal points and again, alphabetical sort matches numerical sort.
Examples:
" 1.12345"
" 933.011"
" 1000.3"
" 1000.534567"
Cheers
Mike
Quite easily: e.g. "A", "AA", "AAA", etc. It is easy to devise more
compact encodings.
However, you cannot use all the possible strings to represent natural
numbers, since between any two distinct strings there are infinitely
many other strings in their ordering: a property not true of natural
numbers.
> For the sake of discussion, let's use English characters, ordered by
> US-ASCII code. The numerals come first, followed by upper case
> letters and then lower case letters, with punctuation interspersed.
For example: Let N be the number of initial '~' characters (the last
printable US-ASCII character). They are followed by 2^N characters
(not beginning with '~') representing a number in base 95. Only the
shortest representation for a given number shall be considered valid.
There are a great many variations on this sort of rule.
- Tim
Is that of any practical concern? Certainly not in my work.
That works for non-negative real numbers. James Dow Allen pointed out
a way to get the negative integers; that is to construct the string
for the absolute value (a positive integer), but with the alphabet
reversed. For example, 2, 1, 0, -1, -2 would be represented as "a2",
"a1", "a0", "98", "97". Negative zero is "9a". But I can't think how
to do the negative non-integers. -1 > -1.1, but "98" <
"98.8" (different alphabetical order).
_______________________________________
Computer floating points can also only represent a limitted interval. Just
make sure that the string range is larger than the double point range.
You can fairly easily map all of Q into strings.
First, use any of the usual order-preserving bijections of Q with
Q /\ (0,1), e.g. piecewise linear mappings from [-(n+1), -n] to
[1/(n+3), 1/(n+2)] and [n, n+1] to [1 - 1/(n+2), 1 - 1/(n+3)].
Then use a "factorial" fraction representation. All rationals
terminate in this form, though large prime factors in the denominator
will entail a long representation. A simple example:
4/7 = 1/2! + 0/3! + 1/4! + 3/5! + 3/6! + 3/7!.
Finally you can encode each digit: e.g. (in letters) "BABDDD" for 4/7.
If you get to 96! you will need character pairs to represent all the
possible digits in printable US-ASCII, triples at (95^2+1)!, etc.
Some of the earlier digits could be combined to slightly better fill
the available character set, e.g. 4/7 = 13/4! + 21/6! + 24/8!.
- Tim