64 bit system: Force Float default to Float32


Hubert Soyer

Oct 8, 2014, 8:58:19 PM10/8/14
to julia...@googlegroups.com
Hello everybody,

I have just implemented my first project in Julia on a 64-bit system.
I re-implemented a C program, and even after optimizing my code with the profiler I can only reach roughly half of its speed.
Most of the CPU time is spent in methods from the Base.LinAlg.BLAS module.
Playing around with these functions, I noticed that there is a huge difference (roughly a factor of 2) depending on whether my input to
BLAS is Float32 or Float64 (the latter obviously being the slower one). In particular, I am referring to the axpy! function.
Since I am on a 64-bit system, every float that I specify directly (x = 1.0) will be a Float64 by default,
and therefore converting every float variable to Float32 would clutter my code considerably.
From reading the mailing list and the GitHub issues I have the impression that I can't simply set a flag on my 64-bit system to compile
Julia in 32-bit mode.

Is there any way to make floats in Julia default to 32 bits on a 64-bit system?
At least for a lot of my work 32-bit precision is just fine, and the speed I could potentially gain by simply switching is enormous.
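To give a rough idea, the comparison I am timing looks something like this (a simplified sketch, not my actual code):

n = 10^7
x32, y32 = rand(Float32, n), rand(Float32, n)
x64, y64 = rand(Float64, n), rand(Float64, n)

# warm up once, then time y <- a*x + y via BLAS
Base.LinAlg.BLAS.axpy!(2.0f0, x32, y32)
Base.LinAlg.BLAS.axpy!(2.0, x64, y64)
@time for i in 1:100; Base.LinAlg.BLAS.axpy!(2.0f0, x32, y32); end
@time for i in 1:100; Base.LinAlg.BLAS.axpy!(2.0, x64, y64); end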

Thank you in advance.

Best,

Hubert

Patrick O'Leary

Oct 8, 2014, 9:06:15 PM10/8/14
to julia...@googlegroups.com
On Wednesday, October 8, 2014 7:58:19 PM UTC-5, Hubert Soyer wrote:
> Since I am on a 64-bit system, every float that I specify directly (x = 1.0) will be a Float64 by default,
> and therefore converting every float variable to Float32 would clutter my code considerably.
> From reading the mailing list and the GitHub issues I have the impression that I can't simply set a flag on my 64-bit system to compile
> Julia in 32-bit mode.

To briefly clarify a common misconception: *even on a 32-bit system* floating-point math is done with double-precision (64-bit) numbers by default. Maintaining precision in floating point is often more important than raw performance, and floating-point units handle double-precision numbers natively even on 32-bit systems, so this is a reasonable default. It is a choice shared with other technical computing environments, as well as with other dynamic languages such as Python and JavaScript, the latter of which represents all numbers (even integers) as double-precision floats.

Pontus Stenetorp

Oct 8, 2014, 10:51:43 PM10/8/14
to julia...@googlegroups.com
On 9 October 2014 09:58, Hubert Soyer <hubert...@gmail.com> wrote:
>
> Is there any way to make floats in Julia default to 32 bits on a 64-bit
> system?
> At least for a lot of my work 32-bit precision is just fine, and the speed I
> could potentially gain by simply switching is enormous.

I think Patrick has already handled the 32-bit vs 64-bit version confusion
nicely, so I will take a stab at the rest of the question. Avoiding
64-bit floats is not uncommon: 32-bit is the standard for most games,
and I have even heard of 16-bit floats being used for real-time
imaging systems.

However, as far as I know, there is no way to switch the whole inner
workings of Julia to default to 32-bit floats. I am far from a
seasoned Julia programmer, though, so I may be wrong. The best
approach I can come up with is to use a type alias:

typealias Float Float32

Then, wherever you need to assign a float, you simply annotate the
variable as `Float`:

x::Float = 1.0

This should make switching between 64/32/16-bit floats easy and keep
your code reasonably clean. Just keep in mind that you cannot do
this at the top level of a module (I cannot remember the current
technical reason); there you would need something along the lines of
`x = convert(Float, 1.0)` instead.
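Putting it together, a small self-contained sketch (hypothetical function and
variable names, just to illustrate the idea):

typealias Float Float32   # flip to Float64 when you need the precision

# a toy kernel; everything stays Float32 as long as the inputs are
function scaled_sum(a::Float, x::Vector{Float}, y::Vector{Float})
    s::Float = 0.0
    for i in 1:length(x)
        s += a*x[i] + y[i]
    end
    return s
end

x = rand(Float, 1000)
y = rand(Float, 1000)
scaled_sum(convert(Float, 2.0), x, y)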

Pontus

Elliot Saba

Oct 9, 2014, 2:46:04 AM10/9/14
to julia...@googlegroups.com
If I understand you correctly, your problem boils down to the fact that you want a way to enter Float32 literals.

Note that if x and y are both Float32s, z = x + y is a Float32 as well. Therefore, the challenge is just in ensuring everything is a Float32 at the get-go. You can use Float32 literal syntax to do this:

julia> typeof(1.0f0)
Float32


This is a Float32 literal, documented in the manual, and hinted at by Julia when it prints Float32s:

julia> float32(2.5)
2.5f0


Once you ensure all inputs to your algorithms are Float32s, you should be able to do what you want without overriding the default Float or anything so drastic.
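For example, at the REPL:

julia> x = rand(Float32, 3); y = rand(Float32, 3);

julia> typeof(x + y)
Array{Float32,1}

julia> typeof(2.0f0 * x[1] + y[1])
Float32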

-E

Stefan Karpinski

Oct 9, 2014, 10:21:42 AM10/9/14
to Julia Users
Note also that while Float64 * Float32 (and other arithmetic ops) produces a Float64, for other "less precise" numeric types Float32 wins out, making it fairly straightforward to write type-stable generic code:

julia> 1.5 * 1.5f0
2.25

julia> 2 * 1.5f0
3.0f0

julia> pi * 1.5f0
4.712389f0

julia> 2//3 * 1.5f0
1.0f0

Hubert Soyer

Oct 15, 2014, 4:45:34 AM10/15/14
to julia...@googlegroups.com
Sorry for coming back to this so late. I forgot to subscribe to get updates via email and thought nobody had replied yet.
Thank you for all the comments; I think I get the point.

I am coming from a Python background and am still working with a module called Theano, which provides automatic differentiation and is used a lot for neural networks.
This module offers an environment variable (floatX) that lets me specify whether I want to use Float64 or Float32 all the way.

My workflow would then be:
Prototype with Float32 to get the speed benefit. 
When I want to have a more serious look at my results, I switch to Float64 to be safe.

So I thought I'd ask this question, and if it turns out that there is a switch like that in Julia, I could just use it.
I do understand that this was just a convenient "hack" for me, and my code will definitely work without that functionality.
But in case it existed and just wasn't documented, I thought it was worth asking.
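For what it is worth, I could probably mimic Theano's floatX with something like this (just a sketch; the environment variable name is made up):

# pick the element type from an environment variable once, at startup
const FloatX = get(ENV, "JULIA_FLOATX", "Float64") == "Float32" ? Float32 : Float64

x = rand(FloatX, 1000)   # then use FloatX wherever a float type is needed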

Thanks a lot for your comments. Pontus's solution seems to cover my use case just fine, so I think I'll go with that.

Best,

Hubert

Tamas Papp

Oct 15, 2014, 4:57:09 AM10/15/14
to julia...@googlegroups.com
Just out of curiosity, could you please post some benchmarks of Float32 vs.
Float64 in Julia for your algorithm when you finish what you are working
on?

My experience on modern x86 architectures is that the CPU handles both at
approximately the same speed; when I have big matrices, the speed
benefit of single precision comes from using less memory, but that is not
worth dealing with the subtle problems that come from loss of precision
(in particular, it is very easy to run into conditioning problems with
single precision).
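As a small illustration of how quickly single precision runs out of digits:

julia> 1.0f0 + 1.0f-8 == 1.0f0   # the small term is lost entirely in Float32
true

julia> 1.0 + 1.0e-8 == 1.0       # but not in Float64
false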

Best,

Tamas

Erik Schnetter

Oct 15, 2014, 8:59:38 AM10/15/14
to julia...@googlegroups.com
Modern x86 CPUs handle floats at about twice the speed of doubles. A
floating-point instruction usually takes one cycle, and each
instruction can execute multiple operations due to vectorization. With
doubles, you can have 4 operations per instruction, and with floats,
you can have 8 operations per instruction. The L1 cache bandwidth is
nicely matched to this, so that both CPU speed and L1 cache bandwidth
peak at the same throughput.
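A rough sketch of the kind of kernel this applies to (whether the compiler
actually vectorizes it is something you would want to verify on your own
machine):

# the same axpy-style kernel over Float32 and Float64 arrays
function my_axpy!(a, x, y)
    @simd for i in 1:length(x)
        @inbounds y[i] += a * x[i]
    end
    return y
end

x32, y32 = rand(Float32, 10^7), rand(Float32, 10^7)
x64, y64 = rand(Float64, 10^7), rand(Float64, 10^7)
my_axpy!(2.0f0, x32, y32); my_axpy!(2.0, x64, y64)   # warm up / compile
@time my_axpy!(2.0f0, x32, y32)
@time my_axpy!(2.0, x64, y64)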

-erik
--
Erik Schnetter <schn...@cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/

Steven G. Johnson

Oct 15, 2014, 11:08:35 AM10/15/14
to julia...@googlegroups.com


On Wednesday, October 15, 2014 8:59:38 AM UTC-4, Erik Schnetter wrote:
> Modern x86 CPUs handle floats at about twice the speed of doubles. A
> floating-point instruction usually takes one cycle, and each
> instruction can execute multiple operations due to vectorization. With
> doubles, you can have 4 operations per instruction, and with floats,
> you can have 8 operations per instruction.

That assumes that everything obtains optimal SIMD vectorization, which is usually false. 

Erik Schnetter

Oct 15, 2014, 11:14:28 AM10/15/14
to julia...@googlegroups.com
The original question stated "most time is spent in BLAS", in
particular in axpy. We can safely assume that axpy is vectorized.

-erik

Jiahao Chen

Oct 15, 2014, 11:43:58 AM10/15/14
to julia...@googlegroups.com
> We can safely assume that axpy is vectorized

I think it would depend on how closely the BLAS implementations follow the letter of the law when it comes to floating-point semantics. Arch Robison has written and spoken (recently at JuliaCon, but also elsewhere) about how the nonassociativity of floating point operations means that computations that require reordering of operations to vectorize cannot be vectorized without compromising strict IEEE-754-compliant rounding behavior.
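A tiny illustration of the reordering issue:

julia> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)
false

so a compiler that must preserve strict IEEE rounding cannot freely reorder a reduction in order to vectorize it.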

Erik Schnetter

Oct 15, 2014, 11:47:06 AM10/15/14
to julia...@googlegroups.com
axpy is not affected by this; it has sufficiently many independent
floating-point operations.