An strange point

80 views
Skip to first unread message

Fatemeh

unread,
Apr 1, 2009, 10:08:01 AM4/1/09
to
Dear all ;

I wrote a program which has a bug when the value of u is greater than
1.4.
when I tried to find the bug for values of y greater than 1.4 I saw an
strange point :

u=0.D0

do while (u.lt.20.D0)
.
.
.
print*,u, y
if (u==1.4D0) print*,"ok"
.
.
.
u=u+0.1D0
end do
------------------------
the output:
0.00000000000000 -0.900316316157106
0.100000000000000 -0.896557866215700
0.200000000000000 -0.892752298842229
0.300000000000000 -0.889009961947742
0.400000000000000 -0.885300844120634
0.500000000000000 -0.881608066131983
0.600000000000000 -0.877943572939470
0.700000000000000 -0.874308949879819
0.800000000000000 -0.870674660655724
0.900000000000000 -0.867096152952160
1.00000000000000 -0.863546597445969
1.10000000000000 -0.860024078786788
1.20000000000000 -0.856526008002337
1.30000000000000 -0.853103843918795
1.40000000000000 -0.849649287329888

but the program cannot print "ok" while it prints u=1.40000000000000 .
why???

I have got confused. Is there anyone can help me?

Thanks

Gordon Sande

unread,
Apr 1, 2009, 10:27:54 AM4/1/09
to

The detailed immediate problem is that 1.4d0 and 14.0d0 * 0.1d0 are different
numbers. They do not compare equal as you have tested then.

The real error is that you do not understand the technical issues of floating
point computation, otherwise know as finite precision computation. The problem
is that 0.1d0 is a notational fiction as there is no such number in binary
floating point. Ask google for "What every computer scientist should know about
floating point arithmetic" by Goldberg as supplied by Sun. Excellent paper.

0.1 would be a nonterminating repeating fraction in binary as so is only
approximated in finite precision.

One quick fix is to multiply every thing by 10 to get integers if you are
only interested in multiples of 0.1. Depending upon what you real intent
is there are other fixes.


Jugoslav Dujic

unread,
Apr 1, 2009, 10:27:13 AM4/1/09
to
Fatemeh wrote:
> Dear all ;
>
> I wrote a program which has a bug when the value of u is greater than
> 1.4.
> when I tried to find the bug for values of y greater than 1.4 I saw an
> strange point :
>
> u=0.D0
>
> do while (u.lt.20.D0)
> .
> .
> .
> print*,u, y
> if (u==1.4D0) print*,"ok"
>
> but the program cannot print "ok" while it prints u=1.40000000000000 .
> why???
>
> I have got confused. Is there anyone can help me?

This is probably the single most frequently asked question concerning
computing in general: the short answer is "because computer's floating-
point arithmetics is NOT exact, and not the same as mathematics real
number arithmetics".

For a longer answer, please read, for example,

http://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/41911/reply/12570/

There are many similar tutorials on the web, and even several books
on the subject. Please get familiar with the concept, if you plan
to do engineering computing for living.

--
Jugoslav
www.xeffort.com
Please reply to the newsgroup.
You can find my real e-mail on my home page above.

Arjen Markus

unread,
Apr 1, 2009, 10:28:10 AM4/1/09
to

The classic problem of floating-point data: a finite binary
representation
is not capable of exactly representing 1.4, just a finite decimal
representation
can not represent 1/3 exactly.

You need to define a small margin, like:

if (abs(u-1.4D0) < 2.0*epsilon(u)*1.4d0) print*,"ok"

where epsilon() is a standard function returning the smallest number E
such that 1 and 1+E can be distinguished.

It has everything to do with the finite precision of computers.

Regards,

Arjen

mecej4

unread,
Apr 1, 2009, 11:46:45 AM4/1/09
to
Fatemeh wrote:

> Dear all ;
>
> I wrote a program which has a bug when the value of u is greater
> than 1.4.
> when I tried to find the bug for values of y greater than 1.4 I
> saw an strange point :
>
> u=0.D0
>
> do while (u.lt.20.D0)
> .
> .
> .
> print*,u, y
> if (u==1.4D0) print*,"ok"
> .
> .
> .
> u=u+0.1D0
> end do
> ------------------------
> the output:

<--CUT-->

> but the program cannot print "ok" while it prints
> u=1.40000000000000 . why???
>
> I have got confused. Is there anyone can help me?
>
> Thanks

Try this simpler example:

program testFloat
real*8 X
character*3 C
data C/'1.4'/
read(C,*)X
if(X.eq.1.4)then
write(*,*)' Equal'
else
write(*,*)' Not Equal'
endif
end

Run it and see what it prints and compare to what you expect. Change the '1.4' on line-6 to '1.4d0' and try again.Then, do enough reading to make your expections more often in agreement with what the computer gives.

-- mecej4

Ron Shepard

unread,
Apr 1, 2009, 11:51:11 AM4/1/09
to
In article
<bd511e22-1b33-4f3c...@j39g2000yqn.googlegroups.com>,
Fatemeh <fateme....@gmail.com> wrote:

> u=0.D0
>
> do while (u.lt.20.D0)
> .
> .
> .
> print*,u, y
> if (u==1.4D0) print*,"ok"
> .
> .
> .
> u=u+0.1D0
> end do

Some other posters have explained your original problem. I wanted
to point out something else related to the floating point precision
problem.

Your above loop structure is probably intended to do 200 passes.
but if you actually count the passes, it might do 199, or 200, or
201 (depending on exactly how to test "u" to terminate the loop).
This is related to the fact that 0.1D0 is a decimal number that is
not represented exactly as a floating point value. The actual
floating point number might be either a little larger or a little
smaller than the decimal number. So when the value of "u" gets
incremented, it will always be incremented with a number that is a
little too large or too small, and those errors accumulate for every
addition. After the value of "u" starts to build up, the increment
operation is also subject to roundoff error, making the result of
the addition either a little larger or a little smaller than it
should be. This cumulative error will affect not only your
calculation (because "u" is used elsewhere), but also the number of
passes through the loop. So, in the end, the actual number of
passes might be right (200), or it could be off a little. In your
case, it might be off only by 1, most likely by being a little too
large, but in other situations it might be off by even more than
that.

So, what should you do to eliminate this cumulative error? If "u"
is changed only in that one place, and you know the number of passes
that are supposed to occur (i.e. in exact arithmetic), then you can
write the loop as, for example,

do iter = 0, 199
u = (iter) * 0.1D0
...
enddo

(I suggest also that you parameterize your double precision constant
with a real type, 0.1_dp, but this is a separate issue.)

With this loop, there is still a little error associated with each
value of u, but it is only a local error (i.e. probably no more than
the last bit only), it does not accumulate with the number of passes
through the loop. Furthermore, you will always get the right number
of passes because the test on the integer "iter" is always correct.
Finally, one other benefit is that compilers can optimize this loop
structure better than the do while form (e.g. with loop unrolling,
use of registers, etc.).

Some of the above comments are personal style issues, others are
practical issues that affect the results.

$.02 -Ron Shepard

e p chandler

unread,
Apr 1, 2009, 1:28:29 PM4/1/09
to
On Apr 1, 10:08 am, Fatemeh <fateme.mirj...@gmail.com> wrote:
> Dear all ;
>
> I wrote a program which has a bug when the value of u is greater than
> 1.4.
> when I tried to find the bug for values of y greater than 1.4 I saw an
> strange point :
>
> u=0.D0
>
> do while (u.lt.20.D0)
>     .
>     .
>     .
>    print*,u, y
>    if (u==1.4D0) print*,"ok"
>    .
>    .
>    .
>   u=u+0.1D0
> end do

> but the program cannot print "ok" while it prints u=1.40000000000000 .
> why???

1. Others have explained why comparing floating point numbers for
equality is almost always a problem.

2. It is traditional to use an integer variable in a DO loop, then
generate U or X or whatever from that.

For example

integer i
real u

do i=0,200
u=i/10.0
...
if(i == 14) print *,'OK'
...
end do

3. There are several advanatage to this approach.

a. calculating U = U0 + i * delta_U or U = U0 + (i-1) * delta_U
avoids accumulating errors unlike

U=U0
do

U=U + delta U
if ( ) exit
end do

b. It is easy to construct an array as a look-up table of values for
later use instead of re-computing them.

real aqz(100)
integer i
real x,x0,delta_x

do i=1,100
x=x0 + (i-1)*delta_x
aqz(i) = <some complicated function of x>
end do
....

given an X, finding which aqz()[s] to use for interpolation is easy!

--- e

Paul van Delst

unread,
Apr 1, 2009, 2:25:26 PM4/1/09
to
Gordon Sande wrote:

> The problem is that 0.1d0 is a notational fiction

I really like that phrase to refer to the issue of finite precision computation. Succinct
and imbibed with some mild shock value to make one think about the underlying mechanism(s).


cheers,

paulv


Gib Bogle

unread,
Apr 1, 2009, 4:12:08 PM4/1/09
to

Imbued, unless you've been drinking. ;-)

Paul van Delst

unread,
Apr 1, 2009, 4:33:17 PM4/1/09
to

Oops, yes. A, uh, Lagerian slip? I could go for a cold one right now. :o)

cheers,

paulv

Arjen Markus

unread,
Apr 2, 2009, 2:59:13 AM4/2/09
to
> Arjen- Tekst uit oorspronkelijk bericht niet weergeven -
>
> - Tekst uit oorspronkelijk bericht weergeven -

Actually, it is easier than what I wrote:

if ( abs(u-1.4D0) < 0.05D0 ) ...

where 0.05 is half the step (0.1).

u can not assume a value further off the intended value than 1/2 the
step,
unless you accumulate a lot of rounding errors. (It is a trick that is
especially useful if you need to compare u to zero: epsilon would be
tricky to use in that region!)

And read all the other postings for a better understanding too ;)

Regards,

Arjen

robin

unread,
Apr 2, 2009, 9:46:35 AM4/2/09
to
"Ron Shepard" <ron-s...@NOSPAM.comcast.net> wrote in message
news:ron-shepard-4B09...@forte.easynews.com...

> In article
> <bd511e22-1b33-4f3c...@j39g2000yqn.googlegroups.com>,
> Fatemeh <fateme....@gmail.com> wrote:
>
> > u=0.D0
> >
> > do while (u.lt.20.D0)
> > .
> > print*,u, y
> > if (u==1.4D0) print*,"ok"
> > .

Definitely not. This multiplies the error in 0.1 by iter,
and as iter approaches 199, you have multiplied this
error by 199.
A sure way to minimise the error is to divide by 10.
Thus,
u = iter/10.0d0
has a small error, while iter*0.1d0 has a large error.

> ...
> enddo
>
> (I suggest also that you parameterize your double precision constant
> with a real type, 0.1_dp, but this is a separate issue.)
>
> With this loop, there is still a little error associated with each
> value of u, but it is only a local error (i.e. probably no more than
> the last bit only),

The error is magnified as u increases in value.

Ron Shepard

unread,
Apr 2, 2009, 12:52:33 PM4/2/09
to
In article <%k3Bl.397$y61...@news-server.bigpond.net.au>,
"robin" <rob...@bigpond.com> wrote:

> > So, what should you do to eliminate this cumulative error? If "u"
> > is changed only in that one place, and you know the number of passes
> > that are supposed to occur (i.e. in exact arithmetic), then you can
> > write the loop as, for example,
> >
> > do iter = 0, 199
> > u = (iter) * 0.1D0
>
> Definitely not. This multiplies the error in 0.1 by iter,
> and as iter approaches 199, you have multiplied this
> error by 199.

Of course, but that is the best that can be done with floating point
arithmetic.

> A sure way to minimise the error is to divide by 10.
> Thus,
> u = iter/10.0d0
> has a small error, while iter*0.1d0 has a large error.

This is just drivel.

When most compilers see the constant in the denominator, they will
compute the reciprocal and use that in the expression. That is,
instead of doing the slow division each pass, they would do the
faster multiply. There can be differences in the two results, but
it should always be just in the last binary digit. Even when iter
is a multiple of 10, you cannot expect the result of (iter/10.0d0)
to be accurate to all bits. It might be correct, but you cannot
depend on it (and the fortran standard does not require it).

> > ...
> > enddo
> >
> > (I suggest also that you parameterize your double precision constant
> > with a real type, 0.1_dp, but this is a separate issue.)
> >
> > With this loop, there is still a little error associated with each
> > value of u, but it is only a local error (i.e. probably no more than
> > the last bit only),
>
> The error is magnified as u increases in value.

Of course, that is the way floating point arithmetic works. The
error in the last bit is always the same relative error, but its
value changes with the magnitude of the value. That error, relative
to exact arithmetic, cannot be eliminated.

It is useful to experiment with something like the following program.

program divide
implicit none
integer, parameter :: wp = selected_real_kind(12)
integer :: i, i1, i2
real(wp) :: r1, r2
write(*,*) 'input range:'
read(5,*) i1, i2
do i = i1, i2
r1 = i * 0.1_wp
r2 = i / 10._wp
if ( r1 .ne. r2 ) write(*,'(i0,2(f30.20,z20))') i, r1, r1, r2,
r2
enddo
end program divide

You will get different results with different compilers, or even
with different options on a given compiler, but it will be clear
that what I said above is true. The differences in the two values,
when they occur, will probably be only in the last binary digit (as
seen in the z20 formats), even for large values of "i".

This is in contrast to the original code in which the value was the
result of repeated accumulations. I that case, the error really
does grow beyond the last bit.

$.02 -Ron Shepard

glen herrmannsfeldt

unread,
Apr 2, 2009, 6:29:39 PM4/2/09
to
Ron Shepard <ron-s...@nospam.comcast.net> wrote:
(someone wrote)

>> Definitely not. This multiplies the error in 0.1 by iter,
>> and as iter approaches 199, you have multiplied this
>> error by 199.

> Of course, but that is the best that can be done with
> floating point arithmetic.

I agree.

-- glen

Reply all
Reply to author
Forward
0 new messages