Android Floating Point Determinism


Andre Posch

unread,
Jul 3, 2013, 5:59:31 AM7/3/13
to andro...@googlegroups.com
Hello,

I'm doing an RTS-style game and at the moment I'm working on network multiplayer support. It is basically already working in my Windows version. The standard approach is to run identical simulations on the different devices and only send the commands across the network. For that it is necessary that both simulations deliver exactly matching results. My engine data is mostly floating point, and therefore the question of floating point determinism across Android devices arises. Does anyone have experience with that topic on Android? I see problems due to the many different kinds of CPUs used.

Thanks,

André

jeff shanab

unread,
Jul 4, 2013, 6:53:49 AM7/4/13
to andro...@googlegroups.com
I have used fixed point on embedded microcontrollers that do not have floating point types.



--
You received this message because you are subscribed to the Google Groups "android-ndk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to android-ndk...@googlegroups.com.
To post to this group, send email to andro...@googlegroups.com.
Visit this group at http://groups.google.com/group/android-ndk.
For more options, visit https://groups.google.com/groups/opt_out.

David Turner

unread,
Jul 4, 2013, 7:54:16 AM7/4/13
to andro...@googlegroups.com
Yes, you should not rely on deterministic FPU operations, because such things do not exist as long as you use different CPU designs, and this includes different ARM-based CPU designs.

As an example, the Skia team recently found a bug that was triggered by a computation which generated 0.999999.... on a Nexus 4, and 1.0 on previous Nexus devices. The end result was that this forced the renderer to use a slow path, killing performance :-)

The only way to deal with that is to use carefully-chosen epsilon values and range-based comparisons, instead of direct equalities.




Andre Posch

unread,
Jul 4, 2013, 1:05:28 PM7/4/13
to andro...@googlegroups.com
Hm, isn't there a way to force the ARM CPUs to use the same floating point behaviour?

e.g. on x86 you can use:

    _controlfp(_PC_24, _MCW_PC);
    _controlfp(_RC_NEAR, _MCW_RC);

to force the internal precision to single and the rounding mode to round-to-nearest, which works pretty well (though you have to check that it is still set all the time, as some libraries change it as well...)

andrew_esh

unread,
Jul 4, 2013, 8:01:02 PM7/4/13
to andro...@googlegroups.com
One thing to look out for is the storage size of the numbers being used. If they do not match, truncation and rounding will give different results. I ran into this when I was porting a lease accounting program from the IBM PC to the Macintosh about 25 years ago. The results did not match until I increased the storage size of most of the variables. Everything matched exactly after that.

Ray Donnelly

unread,
Jul 4, 2013, 8:03:39 PM7/4/13
to andro...@googlegroups.com

I'd recommend using an existing multiplayer networking solution like RakNet and a shared memory model.


Andre Posch

unread,
Jul 5, 2013, 5:59:21 AM7/5/13
to andro...@googlegroups.com
Hm, but you have to decrease the storage size on all CPUs to match the storage size of the CPU with the smallest one, right?

Andre Posch

unread,
Jul 5, 2013, 6:03:08 AM7/5/13
to andro...@googlegroups.com
Yes, I'm using RakNet at the moment. What do you mean by a shared memory model? What data do you want to share? I think it's impossible to share all state variables for a decent RTS game, because that would be a lot of data; imagine a few hundred units moving around on the map at 60 Hz. The thing is, I already have a working solution on Windows: I only exchange data when the user issues a command, and for the rest the simulations run synchronously. We are talking about this game: https://play.google.com/store/apps/details?id=com.luminencelabs.ProjectY With the command approach I only have to send a few bytes per second; syncing the complete state would go into the hundreds of kilobytes per second, I guess. But maybe I got you wrong with the shared memory model?
André

Ray Donnelly

unread,
Jul 5, 2013, 6:21:58 AM7/5/13
to andro...@googlegroups.com
> But maybe I got you wrong with the shared memory model?

One typical way is to have a scheme for deciding who "owns" each unit
(a fairly obvious way is that the owner owns them ;-)). The owner is then
responsible for sending out only the changes that have happened, as
abstract, absolute commands (MovedTo X,Y,Z). Alternatively, on each
frame, each client could serialise the units it is responsible for
into a block of opaque memory and then use a traditional delta-ing
algorithm to make an update packet (which could be compressed before
sending) which changes the state on the remote end.

Note, I've never written a network multiplayer title, so my advice
should be taken with huge amounts of salt. There are dead reckoning and
synchronisation / rollback issues that need to be carefully considered;
I think having an undo/redo system backing up your unit logic could
make this stuff manageable.

Hmm, we've drifted quite far from NDK discussion now, sorry everyone.

Glenn Kasten

unread,
Jul 15, 2013, 12:04:48 PM7/15/13
to andro...@googlegroups.com
Sorry I'm coming late to this discussion, but a co-worker just shared these links with me, which may be relevant:

John McFarlane

unread,
Jul 18, 2013, 3:36:55 PM7/18/13
to andro...@googlegroups.com
Additionally, make sure that -ffast-math is turned off. Otherwise, certain optimizations performed between reads and writes of those stored values may introduce subtle inconsistencies. Note that this only has a chance of fixing things if all compilers/CPUs which intercommunicate strictly adhere to the same FP standard, e.g. IEEE 754.

Ken Turkowski

unread,
Oct 30, 2013, 2:56:57 AM10/30/13
to andro...@googlegroups.com, spamm...@gmx.de
Andre,

There is a standard predefined macro,
    FLT_EVAL_METHOD
a.k.a.
    __FLT_EVAL_METHOD__
which dictates how floating-point expressions are evaluated.
0: operations are evaluated in the range and precision of their operand types.
1: float and double operations are evaluated in double precision; long double operations in long double precision.
2: all operations are evaluated in long double precision.
If you can control this, you can guarantee identical results.
Mac and ARM use FLT_EVAL_METHOD=0.
Windows uses FLT_EVAL_METHOD=2.

But there are ways of getting around this, by using simpler expressions and/or by using the keyword "volatile".

Suppose you are doing

    float a, b, c, d;
    d = a * b + c;

If you instead write

    d = c;
    float e = a * b;
    d += e;

that will sometimes coerce the computation to be done in single precision.
But sometimes the compiler tries to be smart, and keeps some of the intermediate variables in long double registers. If you can't get around its optimizations, you can assign the intermediate results to a volatile variable of the desired precision.
With some compilers, you can use casts, e.g.

    d = (float)((float)a * (float)b) + (float)c;

If the compiler respects the casts (not all do), then you can get consistent results. If you prefer, you could also use doubles, i.e.

    d = (float)((double)((double)a * (double)b) + (double)c);

and get consistent results.

But beware the use of long double.
On some machines, it is identical to double (53-bit significand).
On other machines it is implemented as 80-bit x87 extended precision (64-bit significand).
On other machines it is implemented as two doubles ("double-double", 106-bit significand).
These can produce different results when rounding down to lower precision.

-Ken