I'm compiling a large program with the new 15.0.2 compiler, where we previously used 12.0.4. Since we need to ensure our results are identical between the optimized and non-optimized versions, we previously used the following compile arguments:
...and added in -O1 or -O2 for the nodebug/optimized version. All was well in the world, and our results matched perfectly between both executables. However, now that we're moving on to version 15, this is not so much the case anymore. We're matching with the majority of our results, however some executions that involve multiple inter-communicating processes are no longer matching. Our first hiccup was we had to change the "-fp-model precise" argument to "-fp-model source", according to the following warning:
Sorry, but I'm having an extremely difficult time digesting that error message. Upon examination of the manpage, it states that option fp-model precise is converted to fp-model source, however, just a few lines higher up in the manual it outlines that "precise" and "source" appear to do two completely different things. Is "precise" also setting "source"? It also seems to suggest you can select multiple options from the three groups listed, but doesn't say how. "-fp-model precise,source" throws an error, "-fp-model precise -fp-model source" issues the same warning above, implying it is being ignored.
Thanks, Steve. Actually, we're not trying to match the results from 12.0.4 to 15.0.2 -- I don't enjoy self-flagellation. We're just trying to get optimized and non-optimized results from one compiler version (15.0.2) to match. "Similar" results to 12.0.4 is our version-to-version metric. The core issue is if we have a strange condition appear in our code when running a large job (in optimized mode), we want to be able to go back and reproduce that result in the debug version so we can figure out what's going on. Due to the butterfly effect of our code, numerical differences early on could prevent the strange condition from even appearing at all in the debug version, despite all the inputs being the same.
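To illustrate the "butterfly effect" concern, here is a minimal Python sketch. It is not the poster's code: the logistic map and the helper name `logistic_orbit` are stand-ins chosen for illustration, and the 1e-15 perturbation is an assumed stand-in for a last-few-bits difference such as a reordered sum might produce. It shows that reassociating a sum can change the result, and that two runs differing only in the last few bits of their starting value can end up far apart.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x), a standard chaotic toy map (hypothetical stand-in)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Floating-point addition is not associative: regrouping changes the last bit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
reassociation_changes_result = left != right

# Two runs whose starting values differ only in the last few bits.
a = logistic_orbit(0.3, 200)
b = logistic_orbit(0.3 + 1e-15, 200)
gaps = [abs(x - y) for x, y in zip(a, b)]

# Index of the first step where the two runs differ by more than 0.1.
first_large_gap = next((i for i, g in enumerate(gaps) if g > 0.1), None)

print(reassociation_changes_result)
print(gaps[0], max(gaps), first_large_gap)
```

In a real application the amplification rate depends on the model, but the mechanism is the same: once an early intermediate differs by a bit, a rare condition seen in the optimized run may never occur in the debug run.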
If you're concerned about the effects of data alignment, the '-align array32byte' option in recent compilers may be useful, although your choices of -fp-model should have disabled optimizations where alignment would affect numerics (at least if you compile for intel64).
I don't know whether ifort would recognize the fltconsistency option the way you spelled it, but I would hope that fp-model options would take precedence anyway. fp-port also is an obsolete option, as the recent compilers should not use x87 code with the options you gave.
The fpe0 flag is not for numerical consistency; it's so we intentionally throw an error when we underflow or overflow. We prefer to know the locations in code where this happens, because it is usually indicative of an error. Locations that are known to gradually underflow are trapped in code before underflowing.
So I have been doing some research into the floating point model options and based on the various bits of documentation, I am very confused about what combinations are really recommended. For example:
"Use /fp:precise /fp:source (Windows) or -fp-model precise -fp-model source (Linux or OS X) to improve the consistency and reproducibility of floating-point results while limiting the impact on performance"
So my question is this: if using fp:source also gets you the behavior of setting fp:precise, then why is the recommendation to use both of those settings on the command line? Isn't it enough to just set fp:source and be done with it?
but later on says that engaging fp:consistent sets "all of the above options", which presumably includes fp:source. So if that is indeed true, then why does the document go to the trouble of mentioning that both -fp-model consistent and -fp-model source were used together? This makes me question whether fp:source is adding some extra behavior beyond what fp:consistent gives you.
One additional question: to maximize consistency between platforms/chip architectures is the current recommendation to simply set /fp:consistent for Intel 17 and Intel 18 compilers or are there additional settings I should also be looking at?
In principle, /fp:source, /fp:double and /fp:extended control the intermediate precision for expression evaluation and are orthogonal to /fp:fast, precise, except, strict and consistent. However, as Steve indicates, this is only relevant to C or C++ (and is more important for IA-32 than Intel64). For technical reasons, only "source" is supported for expression evaluation in Fortran; therefore, you can safely ignore /fp:source and use only /fp:precise or consistent. The article you quote applies to both C/C++ and Fortran, so needs to mention source, double and extended. We try to make the switches and article accurate and logically consistent, but I understand that may be a bit confusing for Fortran where /fp:source doesn't do anything (except imply -fp-model precise if that wasn't set separately).
The context in which setting /fp:source as well as /fp:precise is important is for C/C++ on IA-32. Here, the default is "double" on Windows and "extended" on Linux, which gives best performance, but "source" gives the best consistency. On Intel64, "source" is the default and normally also gives best performance, so the other options, though supported for C/C++, are relatively unimportant. The architects like to encourage thinking of the expression evaluation method separately from reproducibility (value safety etc.), so like to use an explicit switch for each. But it's true that /fp:source isn't really adding anything for Fortran (or even for C/C++ on Intel64).
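As a rough illustration of why the expression evaluation method matters, the following Python sketch emulates single-precision arithmetic with the standard `struct` module. The helper `to_f32` and the chosen values are assumptions made for the example; it does not run any compiler option, it only mimics "round every intermediate to the declared type" versus "keep intermediates in double".

```python
import struct

def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Three values that are exactly representable in single precision.
a = to_f32(16777216.0)   # 2**24
b = to_f32(1.0)
c = to_f32(-16777216.0)  # -2**24

# "source"-style evaluation: each intermediate is rounded to single precision.
# 2**24 + 1 is not representable in binary32, so the 1 is lost.
source_style = to_f32(to_f32(a + b) + c)

# "double"-style evaluation: intermediates stay in double, one final rounding.
double_style = to_f32((a + b) + c)

print(source_style, double_style)  # 0.0 1.0
```

Both answers are "correct" for their evaluation method, which is why mixing methods between builds can break bit-for-bit agreement.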
Yes, /fp:consistent is the recommendation for best consistency between different processors (microarchitectures) or different optimization levels. It is equivalent to /fp:precise /Qfma- /Qimf-arch-consistency:true, but was introduced in the version 17 compiler as a "one stop" option for getting reproducible results that would be clearer to document and easier to find.
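To see why disabling FMA is part of that recipe, here is a small Python sketch that emulates a fused multiply-add with exact rational arithmetic. The function name `fused_multiply_add` and the input values are hypothetical and picked to make the effect visible; real hardware FMA is not involved.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate an FMA: compute a*b + c exactly, then round once to double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Separate multiply and add: a*b = 1 - 2**-60 is rounded to 1.0 first,
# so the final result is exactly 0.0.
unfused = a * b + c

# Fused: the exact product is kept, and only the final sum is rounded,
# which gives -2**-60.
fused = fused_multiply_add(a, b, c)

print(unfused, fused)
```

Code compiled with FMA contraction on one processor and without it on another can therefore differ even when every other setting is identical.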
This should give consistency between different microarchitectures and instruction sets, such as SSE and AVX, but not necessarily between different architectures such as IA-32 and Intel64. It also does not necessarily ensure consistency between different major compiler versions - there could be improved versions of some math functions giving results that were more accurate, but different.
IMHO source and precise are not equivalent. While both sequence the expressions the same (with some latitude for scalar to vector), source is free to use the faster forms of intrinsic functions (sqrt, etc...) sacrificing precision for speed.