Portable and exact bi-directional float to shortest string conversion

CHK

May 11, 2004, 6:42:43 PM
I was wondering if there is a portable way to convert float to a string
whose length is less than 9 and then back.
The conversion must be exact. That is:
value==Str2Float(Float2Str(value))

If sizeof(float)==4 then 8-char string (at most) must be enough (+ one more
char for null terminator)
because any 32bit value can be represented by 8-digit hex number.

If I extend alphabet used for encoding then I probably can shorten the
string representation of "float" to 6 or 7 characters.

The question is portability (for IEEE standard conforming "float").

CHK

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Michiel Salters

May 12, 2004, 2:09:01 PM
"CHK" <b...@b.biz> wrote in message news:<10843082...@news1.pubnix.net>...

> I was wondering if there is a portable way to convert float to a string
> whose length is less than 9 and then back.
> The conversion must be exact. That is:
> value==Str2Float(Float2Str(value))
>
> If sizeof(float)==4 then 8-char string (at most) must be enough (+ one more
> char for null terminator)
> because any 32bit value can be represented by 8-digit hex number.
>
> If I extend alphabet used for encoding then I probably can shorten the
> string representation of "float" to 6 or 7 characters.
>
> The question is portability (for IEEE standard conforming "float").

I don't think this is a C++ question, especially if you are restricting
yourself to IEEE 754. In addition, C++ doesn't require null terminators.

Regards,
Michiel Salters

Jens Kilian

May 12, 2004, 8:08:54 PM
Have a look at http://www.netlib.org/fp/, it might be what you need.
HTH,
Jens.
--
mailto:j...@acm.org phone:+49-7031-464-7698 (TELNET 778-7698)
http://www.bawue.de/~jjk/ fax:+49-7031-464-7351
As the air to a bird, or the sea to a fish,
so is contempt to the contemptible. [Blake]

CHK

May 13, 2004, 6:44:29 AM
> I don't think this is a C++ question, especially if you are restricting
> yourself to IEEE 754. In addition, C++ doesn't require null terminators.

I am writing a C++ program and I want to save my data (C++ primitive types
such as float or double) to a text file or to a database (in text fields) in
a reversible way while saving space.
C++ has some functionality for FP number<->string conversion.
For these reasons I believe that the question is not irrelevant here.


I'd like to know what are my options in C++. Can I achieve my goal using
standard C++ functions?

From the purist's point of view I suspect this cannot be done.
But I'd like to know if there are practical workarounds.
Such as:
std::string FloatToStr(float fpVal)
{
    if (sizeof(float)==4 && sizeof(unsigned int)==4)
    {
        unsigned int iVal;
        memcpy(&iVal, &fpVal, 4); // copy the float's bit image into the integer
        reverseBytesIfNecessary(&iVal);
        return convertToTextualRepresentation(iVal); // it's much easier to
                                                     // deal with integers
    }
    return std::string();
}

ka...@gabi-soft.fr

May 13, 2004, 9:15:04 AM
"CHK" <b...@b.biz> wrote in message
news:<10843082...@news1.pubnix.net>...

> I was wondering if there is a portable way to convert float to a


> string whose length is less than 9 and then back.

No. There is no guarantee that 9 characters will be sufficient.

> The conversion must be exact. That is:
> value==Str2Float(Float2Str(value))

> If sizeof(float)==4 then 8-char string (at most) must be enough (+ one
> more char for null terminator)

Maybe. Not if you use a readable, portable representation.

> because any 32bit value can be represented by 8-digit hex number.

You can do a memory dump of 32 bits in 8 characters. You can even do it
in less, if you use something like uuencode coding.

I'm not sure that doing this is a good idea. First, of course, it
restricts your file to machines using exactly the same representation of
floating point values. And it is totally unreadable, and the slightest
change in the file could result in a trapping representation, but be
undetectable in the file itself.

> If I extend alphabet used for encoding then I probably can shorten the
> string representation of "float" to 6 or 7 characters.

Uuencoding and base64 both pack 4 eight bit bytes into 6 characters.

Just how good you can get depends on the character code used. Standard
ASCII has 95 printable characters (including space), so you can pack
into ceil( log( 2^32 ) / log( 95 ) ) (== 5) characters. You can even
reserve a few for formatting purposes -- you only need 85 different
characters to get down to a length of 5 characters (and 256, obviously,
to reduce that to 4).
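
The arithmetic above can be checked with a small sketch. The function name minChars is hypothetical (it does not appear in the thread); it simply counts how many characters from an alphabet of a given size are needed to name every 32-bit pattern.

```cpp
#include <cstdint>

// Smallest n such that alphabetSize^n >= 2^32, i.e. the shortest string
// length that can distinguish all 32-bit patterns.
// Precondition (assumed): 2 <= alphabetSize <= 65536.
int minChars(std::uint64_t alphabetSize)
{
    const std::uint64_t needed = std::uint64_t(1) << 32;
    std::uint64_t combinations = 1;
    int n = 0;
    while (combinations < needed) {
        combinations *= alphabetSize;
        ++n;
    }
    return n;
}
```

With this, minChars(16) gives 8 (hex), minChars(64) gives 6 (base64), minChars(85) and minChars(95) give 5, and minChars(256) gives 4. An alphabet of 84 characters still needs 6, which is why 85 is the threshold.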

> The question is portability (for IEEE standard conforming "float").

Excuse me: the question is portability, or restrict to IEEE standard
floating point. The real question, of course, is what you are going to
do with this string.

Note that formatting an IEEE float using %.6e will only require twelve
characters, and is guaranteed (by IEEE, not by C++) to be reversible.
And if the characters don't have to be printable, you can do it in 4
eight-bit characters (obviously). For most purposes, I would
definitely go with the %.6e option, since even non-IEEE machines will be
able to read it, and restore approximately the same value. And a
human can read it, which sure helps when debugging. Once you've given
up portability and human readability, why not go all the way and just
use 4 byte binary?

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

May 14, 2004, 11:24:35 AM
"CHK" <b...@b.biz> wrote in message
news:<10843877...@news1.pubnix.net>...

> > I don't think this is a C++ question, especially if you are
> > restricting yourself to IEEE 754. In addition, C++ doesn't require
> > null terminators.

> I am writing C++ program and I want to save my data (C++ primitive
> types such as float or double) to a text file or to a database (in
> text fields) in a reversable way and preserving space.

> C++ has some functionality for FP number<->string conversion. For
> these reasons I believe that the question is not irrelevant here.

It seems totally on topic to me.

Note that the only 100% portable solution is to use a full textual
representation of the value, something like you'd get with "%.6e".
Which will take 12 characters per value. Not all C++ must be 100%
portable, however. For many of us, it is sufficient to suppose that 1)
ints and floats have the same size (or even that they are both 32 bits),
and 2) that unsigned int has no padding bits and no trapping
representations. This covers a fair range of machines: PC's, Mac's, all
of the common Unix platforms, and IBM mainframes. (I don't think it
covers Crays or the Unisys mainframes, and it definitely isn't true on
some older Unisys machines.)

Given that, it is relatively simple to pack a bit image of the float in
as little as five characters, provided you can assign at least 85
characters to the encoding. Something like the following should do the
job:

std::string
asString( float f )
{
    unsigned work = reinterpret_cast< unsigned& >( f ) ;
    char result[ 5 ] ;
    for ( int i = 5 ; i > 0 ; -- i ) {
        result[ i - 1 ] = charTab[ work % 85 ] ;
        work /= 85 ;
    }
    return std::string( result, result + 5 ) ;
}

Input is somewhat more difficult, since you have to handle error
conditions, but doesn't pose any conceptual problems: find the index of
the character in charTab, then work = 85 * work + index.
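
As a rough illustration of both directions, here is a minimal sketch. It makes several assumptions that are not in the post: the names encodeFloat85 and decodeFloat85 are hypothetical, the alphabet is taken to be the 85 consecutive ASCII characters from '!' to 'u' (the actual charTab is not shown), std::memcpy replaces the reinterpret_cast to sidestep aliasing concerns, and a C++11 compiler is assumed.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Encode the bit image of f as 5 characters, most significant digit first.
// Assumed alphabet: the 85 ASCII characters '!' (33) through 'u' (117).
std::string encodeFloat85(float f)
{
    std::uint32_t work;
    std::memcpy(&work, &f, sizeof work);   // copy the bits, no type punning
    char result[5];
    for (int i = 5; i > 0; --i) {
        result[i - 1] = static_cast<char>('!' + work % 85);
        work /= 85;
    }
    return std::string(result, result + 5);
}

// Decode 5 characters back into a float. Returns false on malformed input:
// wrong length, a character outside the alphabet, or a value above 2^32 - 1.
bool decodeFloat85(const std::string& s, float& out)
{
    if (s.size() != 5) return false;
    std::uint64_t work = 0;
    for (char c : s) {
        if (c < '!' || c > '!' + 84) return false;
        work = 85 * work + static_cast<std::uint64_t>(c - '!');
    }
    if (work > 0xFFFFFFFFu) return false;  // 85^5 exceeds 2^32, so check range
    const std::uint32_t bits = static_cast<std::uint32_t>(work);
    std::memcpy(&out, &bits, sizeof out);
    return true;
}
```

The range check matters because 85^5 is larger than 2^32, so some 5-character strings (for example "uuuuu") do not correspond to any 32-bit value.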

If you go this route, however, be aware of the potential problems:

- a random string of 5 characters may result in a trapping NaN,

- not all machines use IEEE (IBM mainframes don't) -- although the
code will compile and run on such machines, the results of encoding
any specific value will be very different, and

- it's going to be very difficult to read the data, even if it is in
ASCII.

> I'd like to know what are my options in C++. Can I achieve my goal
> using standard C++ functions?

Using only standard C++ library functions, the best you can do is 12
characters :

std::string
asString( float f )
{
    std::ostringstream s ;
    s.setf( std::ios::scientific, std::ios::floatfield ) ;
    s.precision( 6 ) ;
    s << f ;
    return s.str() ;
}

This format has several advantages over the one above:

- it is guaranteed to work on all C++ implementations, even the most
exotic ones,
- even when writing from one machine, and reading on another, the
results will be close to correct, and
- you can easily read it.

> From the purist's point of view I suspect this cannot be done. But
> I'd like to know if there are practical workarounds.

> Such as:
> std::string FloatToStr(float fpVal)
> {
> if (sizeof(float)==4 && sizeof(unsigned int)==4)

I'd use assert here. Or I'd even do some sort of preprocessor tests:

#if UINT_MAX != 0xFFFFFFFF || FLT_RADIX != 2 \
|| FLT_MANT_DIG != 24 || FLT_MAX_EXP != 128 || FLT_MIN_EXP != -125
#error Not IEEE float or not 4 byte int
#endif

(If the compiler also supports C99, you can simply test if
__STDC_IEC_559__ is defined, instead of all of the tests on the
FLT_... values.)

There's no point in compiling successfully if the code can't work.

If an assert is good enough, std::numeric_limits<> has a field
is_iec559. Regretfully, the members of this structure are not
accessible from the preprocessor; even more regretfully, some older but
still widespread compilers (g++ pre 3.0) don't support it at all.
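
With a compiler that supports C++11 (which postdates this discussion), the same checks can be made at compile time without the preprocessor. This is only a sketch under that assumption; the messages are illustrative.

```cpp
#include <climits>
#include <limits>

// Refuse to compile if the assumptions behind the bit-image encoding fail.
static_assert(std::numeric_limits<float>::is_iec559,
              "float is not IEC 559 / IEEE 754");
static_assert(sizeof(float) * CHAR_BIT == 32,
              "float is not 32 bits wide");
static_assert(std::numeric_limits<unsigned int>::digits == 32,
              "unsigned int does not have exactly 32 value bits");
static_assert(sizeof(unsigned int) == sizeof(float),
              "unsigned int has padding bits or a different size than float");
```

Like the #error version, this stops the build early instead of producing code that cannot work.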

> {
> unsigned int iVal; memcpy(&iVal, &fpVal,4);

The reinterpret_cast that I used should do just as well on all usual
machines. (And since exotic machines don't have IEEE, or may have
trapping values of unsigned int, it should be portable enough.)

> reverseBytesIfNecessary(&iVal);

Why? What should this function do? There's never any reason to reverse
bytes when doing IO. See my code for a simpler way of handling this.

> return convertToTextualRepresentation(iVal); // it's much easier to
> // deal with integers
> }
> return std::string();
> }

--


James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Allan W

May 18, 2004, 1:10:52 PM
"CHK" <b...@b.biz> wrote

> I was wondering if there is a portable way to convert float to a string
> whose length is less than 9 and then back.
> The conversion must be exact. That is:
> value==Str2Float(Float2Str(value))
>
> If sizeof(float)==4 then 8-char string (at most) must be enough (+ one more
> char for null terminator)
> because any 32bit value can be represented by 8-digit hex number.

Simple. Use fixed-length strings so you don't need the null terminator.

> If I extend alphabet used for encoding then I probably can shorten the
> string representation of "float" to 6 or 7 characters.

> The question is portability (for IEEE standard conforming "float").

As others have explained better than I can, any attempt to use the bits
in the floating point value (including writing the whole number in binary)
suffers from problems with portability.

Any scheme based on printable digits does not have portability problems,
but you'll lose precision instead. You stated that you wanted the result
to be bit-for-bit identical.

A good compromise might be a scheme where you pick out the 23-bit
mantissa (a value from 0 to 8388607), the 8-bit exponent (a value
from -125 to +128), and the sign (0 or 1). Then encode them any way
you want... as others have pointed out, you should be able to encode
these 32 bits into 5 printable characters.

The code to do this would *NOT* be portable; it would be system-dependent.
(AFAIK there's no portable way to pick out mantissa bits or exponent bits).
But the values you store would be in a format that allows you to rewrite
the packing and unpacking routines for any IEEE-compliant system without
losing precision. Other than extreme values, you could probably
interoperate with most non-IEEE systems as well, again by rewriting the
pack and unpack routines.

ka...@gabi-soft.fr

May 19, 2004, 10:26:15 PM
all...@my-dejanews.com (Allan W) wrote in message
news:<7f2735a5.04051...@posting.google.com>...
> "CHK" <b...@b.biz> wrote

> > I was wondering if there is a portable way to convert float to a
> > string whose length is less than 9 and then back. The conversion
> > must be exact. That is: value==Str2Float(Float2Str(value))

> > If sizeof(float)==4 then 8-char string (at most) must be enough (+
> > one more char for null terminator) because any 32bit value can be
> > represented by 8-digit hex number.

> Simple. Used fixed-length strings so you don't need the null
> terminator.

I do this, but reading fixed-length strings without a terminator isn't
trivial using iostreams. The width parameter is not taken into account
on input. The way I did this recently was to mmap the file, and then
use istrstream's initialized with a pointer and a length. But mmap
isn't standard, and istrstream is officially deprecated:-). The only
100% standard solution I know of would be to read the text into a
string, and then use istringstream's on substrings of the string.

> > If I extend alphabet used for encoding then I probably can shorten
> > the string representation of "float" to 6 or 7 characters.

> > The question is portability (for IEEE standard conforming "float").

> As others have explained better than I can, any attempt to use the
> bits in the floating point value (including writing the whole number
> in binary) suffers from problems with portability.

> Any scheme based on printable digits does not have portability
> problems, but you'll lose precision instead. You stated that you
> wanted the result to be bit-for-bit identical.

IEEE guarantees that the conversion is reversible if there are at least
7 digits precision. So if you have IEEE format, and you output in
format "%.6e", you should be able to recover a bit-for-bit identical
value when reading. If you don't, I would consider it an error in the
implementation.

Obviously, if you write IEEE floats, and reread using the IBM format on
an IBM mainframe, you won't get bit-for-bit identical values (which
wouldn't make sense anyway), but you will get the closest possible
approximation. (Note that some IEEE float values will not be exactly
representable in IBM floating point format, and vice versa.)

> A good compromise might be a scheme where you pick out the 23-bit
> mantissa (a value from 0 to 8388607), the 8-bit exponent (a value from
> -125 to +128), and the sign (0 or 1). Then encode them any way you
> want... as others have pointed out, you should be able to encode these
> 32 bits into 5 printable characters.

> The code to do this would *NOT* be portable; it would be
> system-dependant. (AFAIK there's no portable way to pick out mantissa
> bits or exponent bits). But the values you store would be in a format
> that allows you to rewrite the packing and unpacking routines for any
> IEEE-compliant system without losing precision. Other than extreme
> values, you could probably interoperate with most non-IEEE systems as
> well, again by rewriting the pack and unpack routines.

In practice, if you limit yourself to IEEE systems, the only variation
you are likely to run into is byte order. And this shouldn't be a
problem if the input and output are done correctly, by aliasing (type
punning) to a comparably sized unsigned int. (This could cause problems
if unsigned int had padding bits, or used a different byte order than
the float. In practice, I don't think that any systems modern enough to
support IEEE have padding bits in their unsigned int's, but early
Microsoft compilers for 16 bit Intels DID use a different byte order for
unsigned long and for float.)
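
To illustrate why byte order need not appear in the file, here is a small sketch that writes the bit image as 8 hex digits using only arithmetic on the integer value. The name floatToHex is hypothetical, std::memcpy stands in for the type punning mentioned above, and the sketch assumes float and the 32-bit unsigned type share the same byte order in memory.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Write the 32-bit image of f as 8 hex digits, most significant digit first.
// Shifts and masks work on the integer's value, so the text is the same on
// little-endian and big-endian hosts.
std::string floatToHex(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    static const char digits[] = "0123456789abcdef";
    std::string text(8, '0');
    for (int i = 7; i >= 0; --i) {
        text[i] = digits[bits & 0xF];
        bits >>= 4;
    }
    return text;
}
```

On an IEEE 754 system, floatToHex(1.0f) yields "3f800000" regardless of the host's byte order.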

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Allan W

May 21, 2004, 5:56:00 AM
> > "CHK" <b...@b.biz> wrote
> > > I was wondering if there is a portable way to convert float to a
> > > string whose length is less than 9 and then back. The conversion
> > > must be exact. That is: value==Str2Float(Float2Str(value))
>
> > > If sizeof(float)==4 then 8-char string (at most) must be enough (+
> > > one more char for null terminator) because any 32bit value can be
> > > represented by 8-digit hex number.

> all...@my-dejanews.com (Allan W) wrote


> > Simple. Used fixed-length strings so you don't need the null
> > terminator.

ka...@gabi-soft.fr wrote


> I do this, but reading fixed-length strings without a terminator isn't
> trivial using iostreams. The width parameter is not taken into account
> on input. The way I did this recently was to mmap the file, and then
> use istrstream's initialized with a pointer and a length. But mmap
> isn't standard, and istrstream is officially deprecated:-). The only
> 100% standard solution I know of would be to read the text into a
> string, and then use istringstream's on substrings of the string.

To read an 8-byte string from an iostream:
    char buffer[8];
    for (int i=0; i<8; ++i)
        // Read one character at a time
        mystream>>buffer[i];

This isn't as simple or pretty as mystream>>buffer, but it works.

John Potter

May 21, 2004, 8:42:23 PM
On 21 May 2004 05:56:00 -0400, all...@my-dejanews.com (Allan W) wrote:

> To read an 8-byte string from an iostream:
> char buffer[8];
> for (int i=0; i<8; ++i)
> // Read one character at a time
> mystream>>buffer[i];

> This isn't as simple or pretty as mystream>>buffer, but it works.

Did I miss a mystream >> noskipws? Why use formatted input style to
do raw input? mystream.get(buffer[i]);

John

James Kanze

May 22, 2004, 9:33:30 AM
all...@my-dejanews.com (Allan W) writes:

|> To read an 8-byte string from an iostream:
|> char buffer[8];
|> for (int i=0; i<8; ++i)
|> // Read one character at a time
|> mystream>>buffer[i];

|> This isn't as simple or pretty as mystream>>buffer, but it works.

It doesn't do the same thing, either. Although both could very easily
extract more than 8 characters from the input stream.

The context, don't forget, is fixed width text fields. Let's suppose
that we've standardized on 8 char's for integer values, and our file
looks something like this:

        |     314      4912345678|
field:   111111112222222233333333

Using >> directly into an int will result in two fields, not three, and
you cannot change this by using setw.

Using >> directly into a string or a char[] will result in only two
fields ("314" and "4912345678"), unless you specify setw.

Your solution will also result in two fields: "31449123" and "4567????".
Not really what was wanted either.

In my case, I mmap'ed the file, and then used:
istrstream s( buffer, 8 ) ;
istrstream s( buffer + 8, 8 ) ;
etc. to read the individual streams.

An alternative solution would be to read the data into a vector<char>,
and use the istrstream on that. (That's what I did when writing. With
an ostrstream, of course. In my application, I wrote the stream record
by record, but read it in one go.)

You could also read the records into a string, say with getline, then
extract substrings and use istringstream on them. That's going to
involve a lot of copying, but for most applications, it is probably
adequate.
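
A sketch of that last approach, assuming every field in the record has the same width and holds an int. The function name parseFixedWidthInts is hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split `record` into consecutive fields of `width` characters and parse each
// one as an int. Returns false (leaving `out` untouched) if the record length
// is not a multiple of `width` or if any field fails to parse.
bool parseFixedWidthInts(const std::string& record, std::size_t width,
                         std::vector<int>& out)
{
    if (width == 0 || record.size() % width != 0) return false;
    std::vector<int> values;
    for (std::size_t pos = 0; pos < record.size(); pos += width) {
        std::istringstream field(record.substr(pos, width));
        int value;
        if (!(field >> value)) return false;
        values.push_back(value);
    }
    out.swap(values);
    return true;
}
```

For a 24-character record holding 314, 49 and 12345678 right-aligned in 8-character fields, this yields three values, since each substring is parsed on its own and digits cannot run together across a field boundary.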

In my case, portability was not a concern, and the C API on my platform
specifies that structs consisting of only char[] will not be padded. So
I defined my records as structs, e.g.:

struct Header
{
    char id    [  6 ] ;
    char length[ 11 ] ;
    char state [  2 ] ;
    // ...
} ;

I then used reinterpret_cast to "overlay" the struct's on the buffer,
so I could write things like:

istrstream s( header.id, sizeof( header.id ) ) ;

In fact, I used macros to generate most of this sort of boilerplate
code. It could probably be done with templates, but not with the
compilers I have to use. So the struct actually contained something
like:

getsetInt( id ) ;

which would expand to all of the necessary code for the getters and
setters (including the use of [io]strstream).

--
James Kanze


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

Stephen Howe

May 23, 2004, 7:41:06 AM
> Using >> directly into an int will result in two fields, not three, and
> you cannot change this by using setw.

Which IMO you should be able to. The scanf family has no problem reading this
in, with its maximum number of characters to read. setw() should be made to
apply to input types other than arrays of chars/wide chars.

Stephen Howe

John Potter

May 23, 2004, 7:45:57 AM
On 22 May 2004 09:33:30 -0400, James Kanze <ka...@gabi-soft.fr> wrote:

>         |     314      4912345678|
> field:   111111112222222233333333

> Using >> directly into an int will result in two fields, not three, and
> you cannot change this by using setw.
>
> Using >> directly into a string or a char[] will result in only two
> fields ("314" and "4912345678"), unless you specify setw.

If you do specify setw (8 for string, 9 for char[>=9]), you get three
fields ("314", "49123456" and "78"). If you also set noskipws, the
stream fails to extract anything for the first field and sets fail.
That could be solved by including leading zeroes rather than spaces in
the file.

> You could also read the records into a string, say with get line, then
> extract substrings and use istringstream on them. That's going to
> involve a lot of copying, but for most applications, it is probably
> adequate.

Yes, this works. Boost::lexical_cast allows initialization.

int i(boost::lexical_cast<int>(data.substr(0,8)));
int j(boost::lexical_cast<int>(data.substr(8,8)));
int k(boost::lexical_cast<int>(data.substr(16,8)));

John

Carl Barron

May 23, 2004, 9:22:32 PM
James Kanze <ka...@gabi-soft.fr> wrote:

> In my case, portability was not a concern, and the C API on my platform
> specifies that structs consisting of only char[] will not be padded. So
> I defined my records as structs, e.g.:
>
> struct Header
> {
> char id [ 6 ] ;
> char length[ 11 ] ;
> char state [ 2 ] ;
> // ...
> } ;
>
> I then used reinterpret_cast to "overlay" the struct's on the buffer,
> so I could write things like:
>
> istrstream s( header.id, sizeof( header.id ) ) ;
>
> In fact, I used macros to generate most of this sort of boilerplate
> code. It could probably be done with templates, but not with the
> compilers I have to use. So the struct actually contained something
> like:
>
> getsetInt( id ) ;
>
> which would expand to all of the necessary code for the getters and
> setters (including the use of [io]strstream).

I don't like deprecated code, so I use a SimpleReader class with a very
simple stream buffer class:

class no_copy
{
    // don't implement these!!!
    no_copy(const no_copy &);
    no_copy & operator = (const no_copy &);
protected:
    no_copy() {}    // needed so that derived classes can still be constructed
};

struct SimpleBuf:public std::streambuf,no_copy
{
    SimpleBuf(char *begin,char *end) {setg(begin,begin,end);}
};

This provides enough input for following stream class to extract an
item from the buffer provided.

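class SimpleReader:public std::istream  // public, so that operator>> is usable
{
    SimpleBuf buf;
public:
    SimpleReader(char *begin,char *end):std::istream(0),buf(begin,end)
    {
        rdbuf(&buf);    // attach the buffer once it has been constructed
    }
};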

Should be as efficient as istrstream if not more so.

The reinterpret_cast's can be removed with some template class,
provided that std::size_t is a valid template argument on your 'older'
compiler. This is what your reinterpret casting does, correct?

template <typename T,std::size_t S>
struct read_one
{
    typedef T type;
    static const std::size_t size = S;
    static std::istream & read(std::istream &is,type &out)
    {
        char buf[size];
        if(is.read(buf,size))
        {
            SimpleReader r(buf,buf+size);
            if(r >> out)
                return is;
        }
        is.setstate(std::ios_base::failbit);
        return is;
    }
};

For char arrays we have:
template <std::size_t N,std::size_t S> struct read_one<char[N],S>
{
    typedef char type[N];
    static const std::size_t size = S;
    static std::istream & read(std::istream &is,type &out)
    {
        return is.read(out,size);
    }
};

If this partial specialization is a problem, then just do what it
does in the extractor for your class.
struct user_reader
    :read_one<long,6>,read_one<char[11],11>,read_one<short,2>
{
};

struct user_data:user_reader
{
    long a;
    char b[11];
    short c;
};

std::istream & operator >> (std::istream &is,user_data &user)
{
    std::istream::sentry s(is,true);    // true: keep leading whitespace,
                                        // the fields are fixed width
    if(s)
    {
        // qualify each call: user_reader inherits three different read()'s
        if(read_one<long,6>::read(is,user.a) &&
           read_one<char[11],11>::read(is,user.b))
            read_one<short,2>::read(is,user.c);
    }
    return is;
}

For the three longs it's simpler:)
struct three_longs:read_one<long,8>
{
    long a,b,c;
};

std::istream & operator >> (std::istream &is,three_longs &out)
{
    std::istream::sentry s(is,true);    // true: keep leading whitespace
    if(s)
    {
        if(out.read(is,out.a) && out.read(is,out.b))
            out.read(is,out.c);
    }
    return is;
}

"Daniel Krügler (ne Spangenberg)"

May 25, 2004, 8:50:39 AM
Hello James Kanze,

ka...@gabi-soft.fr schrieb:

>IEEE guarantees that the conversion if reversible if there are at least
>7 digits precision. So if you have IEEE format, and you output in
>format "%.6e", you should be able to recover a bit-for-bit identical
>value when reading. If you don't, I would consider it an error in the
>implementation.
>
>Obviously, if you write IEEE floats, and reread using the IBM format on
>an IBM mainframe, you won't get bit-for-bit identical values (which
>wouldn't make sense anyway), but you will get the closest possible
>approximation. (Note that some IEEE float values will not be exactly
>representable in IBM floating point format, and vice versa.)
>
>

Are you sure? If I take an IEEE float with p=24 mantissa bits (implied
leading bit + fraction bits) and
base b=2 and then again applying the generalized formula as for
DECIMAL_DIG in the C99 standard
(§5.2.4.2.2/p. 8), which is (for base != 10)

decimal_dig(p,b) = Ceil(1 + p* log10(b))

I get decimal 9 digits for rounding the IEEE float to decimal digits and
back without change to the value.

Did I misunderstand something? (Note: Even if I should have used the
pure fraction bits, the above result
would change to 8 and not to 7 decimal digits)
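
The formula is easy to evaluate directly. The sketch below uses a hypothetical function name, roundTripDigits, and assumes b is not a power of 10, as in the formula quoted from C99.

```cpp
#include <cmath>

// Decimal digits that suffice to round-trip a floating-point value whose
// significand has p digits in base b (b not a power of 10):
//     ceil(1 + p * log10(b))
int roundTripDigits(int p, int b)
{
    return static_cast<int>(
        std::ceil(1.0 + p * std::log10(static_cast<double>(b))));
}
```

For p = 24, b = 2 this gives ceil(1 + 7.22...) = 9, and for p = 23 it gives 8. For an IEEE double (p = 53) it gives 17. In C++11 and later the same figures are available as std::numeric_limits<float>::max_digits10 and std::numeric_limits<double>::max_digits10.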

Greetings from Bremen,

Daniel

ka...@gabi-soft.fr

May 26, 2004, 10:08:53 AM
"Daniel Krügler (ne Spangenberg)" <d...@bdal.de> wrote in message
news:<40B31A9...@bdal.de>...

> ka...@gabi-soft.fr schrieb:

> >IEEE guarantees that the conversion if reversible if there are at
> >least 7 digits precision. So if you have IEEE format, and you
> >output in format "%.6e", you should be able to recover a bit-for-bit
> >identical value when reading. If you don't, I would consider it an
> >error in the implementation.

> >Obviously, if you write IEEE floats, and reread using the IBM format
> >on an IBM mainframe, you won't get bit-for-bit identical values
> >(which wouldn't make sense anyway), but you will get the closest
> >possible approximation. (Note that some IEEE float values will not
> >be exactly representable in IBM floating point format, and vice
> >versa.)

> Are you sure?

No:-).

> If I take an IEEE float with p=24 mantissa bits (implied leading bit +
> fraction bits) and base b=2 and then again applying the generalized

> formula as for DECIMAL_DIG in the C99 standard (§5.2.4.2.2/p. 8),


> which is (for base != 10)

> decimal_dig(p,b) = Ceil(1 + p* log10(b))

> I get decimal 9 digits for rounding the IEEE float to decimal digits
> and back without change to the value.

> Did I misunderstand something? (Note: Even if I should have used the
> pure fraction bits, the above result would change to 8 and not to 7
> decimal digits)

I know that beyond a certain number of decimal digits, a bit exact round
trip is guaranteed (and that the number decimal digits is significantly
less than the number of bits in the mantissa). I seem to recall having
read that this value is 7, but I wouldn't swear to it.

Given the doubt, I went to the net. According to David Goldberg ("What
Every Computer Scientist Should Know about Floating-Point Arithmetic",
ACM Computing Surveys, vol. 23#1, 1991, but widely available on the
net), "When a binary IEEE single-precision number is converted to the
closest eight digit decimal number, it is not always possible to recover
the binary number uniquely from the decimal one. If nine decimal digits
are used, however, then converting the decimal number to the closest
binary number will recover the original floating-point number." This is
followed by a rigorous proof of the fact.

So the correct format to guarantee accurate round-trip conversion is
"%.8e", and we need 15 characters, not 13.

Thanks for pointing the error out.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Steve Downey

May 28, 2004, 7:30:22 AM
ka...@gabi-soft.fr wrote:
> "Daniel Krügler (ne Spangenberg)" <d...@bdal.de> wrote in message
> news:<40B31A9...@bdal.de>...
>
>
>>ka...@gabi-soft.fr schrieb:
>
>
>>>IEEE guarantees that the conversion if reversible if there are at
>>>least 7 digits precision. So if you have IEEE format, and you
>>>output in format "%.6e", you should be able to recover a bit-for-bit
>>>identical value when reading. If you don't, I would consider it an
>>>error in the implementation.
>
>

It depends on which form is the starting point of your round trip. A
string of decimal digits, converted to IEEE floating point, and then
back exactly, or, an IEEE floating point, converted to a string of
decimal digits, and then back to IEEE floating point.

Starting with an IEEE float with an exact decimal representation and
round tripping that maintains more precision than starting with an
arbitrary decimal string. From recollection, those are 8 and 6 decimal
places.

-SMD
