Have you ever noticed ...

Bonita Montero

Jun 1, 2021, 1:07:01 AM

If you call ...
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
... and then ...
clang --version
... then you get a nearly up-to-date clang (V11) that ships with MSVC?
But unfortunately, for most benchmarks I wrote, MSVC generates
better code than clang.
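When comparing builds from cl and the bundled clang, it can help to have each benchmark binary report which compiler produced it. The following is a minimal sketch, not something from the original post; the function name `compiler_id` is hypothetical, and it relies only on the predefined macros `__clang__`, `_MSC_VER` and `__GNUC__`.

```cpp
#include <string>

// Returns a short description of the compiler that built this translation unit.
// The order of the checks matters: clang also defines __GNUC__, and clang
// targeting the MSVC environment also defines _MSC_VER, so __clang__ is
// tested first.
std::string compiler_id() {
#if defined(__clang__)
    return "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__) + "." +
           std::to_string(__clang_patchlevel__);
#elif defined(_MSC_VER)
    return "MSVC " + std::to_string(_MSC_VER);
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__);
#else
    return "unknown compiler";
#endif
}
```

Printing the result of `compiler_id()` at the start of a benchmark run makes it harder to mix up timings from different toolchains.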

Juha Nieminen

Jun 1, 2021, 3:11:33 AM

What optimization flags did you use?

Bonita Montero

Jun 1, 2021, 5:06:18 AM

>> If you call ...
>> C:\Program Files (x86)\Microsoft Visual
>> Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat
>> ... and then ...
>> clang --version
>> ... then you get the nearly most recent clang (V11) with MSVC ?
>> But unfortunately for most benchmarks I wrote MSVC generates
>> better code than clang.

> What optimization flags did you use?

-O3
There are also benchmarks which run faster with clang,
but most run faster with cl.

David Brown

Jun 1, 2021, 5:54:30 AM

-O3 is sometimes faster than -O2, and often slower. Don't use -O3
unless you know what you are doing, and have tested it to see that it is
helpful for your particular requirements. (The same applies to gcc, to
a slightly lesser extent.)

clang also has a tendency to be enthusiastic about vectorisation and
loop unrolling that can help for big data sets, but be noticeably bigger
and slower on small sets. You see that especially on -O3, but also on
-O2. Optimising is not just a matter of "bigger number means faster for
all tests".

(That said, MSVC can be quite efficient for some kinds of code - it's
entirely possible that it is simply better for the examples you were using.)
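One way to check the point about small versus large data sets is to time the same loop at several sizes and compare the per-element cost under each flag. The sketch below is only an illustration under assumptions: the names `sum` and `ns_per_element` are hypothetical, the workload is a plain integer sum, and a serious benchmark would need warm-up runs and repeated measurements.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// A simple reduction loop that a compiler may vectorise or unroll.
std::uint64_t sum(const std::vector<std::uint32_t>& v) {
    std::uint64_t total = 0;
    for (std::uint32_t x : v) total += x;
    return total;
}

// Returns the average time in nanoseconds per element for `reps` passes
// over a vector of `n` elements. Both n and reps must be greater than zero.
double ns_per_element(std::size_t n, std::size_t reps) {
    std::vector<std::uint32_t> data(n);
    std::iota(data.begin(), data.end(), 1u);
    volatile std::uint64_t sink = 0;  // keeps the results observable
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < reps; ++r) {
        // Change the input on every pass so the sum is not loop-invariant.
        data[0] = static_cast<std::uint32_t>(r);
        sink = sink + sum(data);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() /
           (static_cast<double>(n) * static_cast<double>(reps));
}
```

Calling, for example, `ns_per_element(16, 1000000)` and `ns_per_element(1 << 20, 100)` in builds made with -O2 and with -O3 gives a rough idea of whether the extra unrolling pays off at either size.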

Bonita Montero

Jun 1, 2021, 6:26:14 AM

> -O3 is sometimes faster than -O2, and often slower. ...

For my benchmarks it's always faster than -O2. For example, I've got
an improved version of CRC64 and FNV64, which benefits a lot from
loop unrolling.
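For readers unfamiliar with the hash mentioned here, the sketch below shows the standard 64-bit FNV-1a algorithm next to a hand-unrolled variant. It is not the improved version referred to in the post, which was not shown; the function names are hypothetical, and whether unrolling actually helps depends on the compiler and the CPU.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t kFnvOffset = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
constexpr std::uint64_t kFnvPrime  = 0x100000001b3ULL;       // FNV 64-bit prime

// Straightforward FNV-1a: xor in one byte, then multiply by the prime.
std::uint64_t fnv1a64(const unsigned char* p, std::size_t n) {
    std::uint64_t h = kFnvOffset;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= kFnvPrime;
    }
    return h;
}

// Same computation with the loop body unrolled four times by hand.
// The result is identical; only the loop overhead per byte changes.
std::uint64_t fnv1a64_unrolled(const unsigned char* p, std::size_t n) {
    std::uint64_t h = kFnvOffset;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        h = (h ^ p[i])     * kFnvPrime;
        h = (h ^ p[i + 1]) * kFnvPrime;
        h = (h ^ p[i + 2]) * kFnvPrime;
        h = (h ^ p[i + 3]) * kFnvPrime;
    }
    for (; i < n; ++i) {  // handle the remaining 0 to 3 bytes
        h = (h ^ p[i]) * kFnvPrime;
    }
    return h;
}
```

Because each step depends on the previous hash value, unrolling here mainly removes loop overhead; an optimizer at -O2 or -O3 may or may not perform the same transformation on the simple version.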

Michael S

Jun 1, 2021, 4:23:32 PM

I was under the impression that in clang -O3 is the same as -O2: a flag that exists for compatibility with gcc and nothing else.

Real Troll

Jun 1, 2021, 5:50:26 PM

On 01/06/2021 21:23, Michael S wrote:
> I was under impression that in clang -O3 is the same as -O2. A flag that exists for compatibility with gcc and nothing else.

Just checked clang, latest version 12.0.0-win64, and there is nothing in it about -O3. Not sure what they are talking about.

Keith Thompson

Jun 1, 2021, 9:24:31 PM

As of version 12.0.0, "man clang" says:

    -O0, -O1, -O2, -O3, -Ofast, -Os, -Oz, -Og, -O, -O4
        Specify which optimization level to use:

        -O0 Means “no optimization”: this level compiles the
            fastest and generates the most debuggable code.

        -O1 Somewhere between -O0 and -O2.

        -O2 Moderate level of optimization which enables most
            optimizations.

        -O3 Like -O2, except that it enables optimizations that
            take longer to perform or that may generate larger
            code (in an attempt to make the program run faster).

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Bonita Montero

Jun 1, 2021, 10:51:16 PM

I've just installed clang12 and now my lock-free LRU-cache algorithm
is nearly twice as fast as with clang11, and faster than with MSVC!

Doctor Who

Jun 1, 2021, 11:12:51 PM

good catch!

The Doctor

Jun 2, 2021, 12:10:47 AM

In article <qotdbg9mkvkhe0obb...@4ax.com>,

Oi! Why are you not in rec.arts.drwho?

--
Member - Liberal International This is doctor@@nl2k.ab.ca Ici doctor@@nl2k.ab.ca
Yahweh, Queen & country!Never Satan President Republic!Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
If one could not qualify for the ark, then how for heaven? -unknown

Juha Nieminen

Jun 2, 2021, 1:07:24 AM

I also recommend using -march=native.
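To see what -march=native actually enabled on a given machine, a build can report the instruction-set macros the compiler predefined. This is a minimal sketch with a hypothetical function name, `enabled_simd`; it checks only a few common x86 macros and says nothing about other architectures. Note that a binary built with -march=native may not run on CPUs that lack the instructions of the build machine.

```cpp
#include <string>

// Lists a few x86 SIMD extensions that the compiler was allowed to use
// for this translation unit, based on predefined feature macros.
std::string enabled_simd() {
    std::string out;
#if defined(__SSE4_2__)
    out += "SSE4.2 ";
#endif
#if defined(__AVX__)
    out += "AVX ";
#endif
#if defined(__AVX2__)
    out += "AVX2 ";
#endif
#if defined(__AVX512F__)
    out += "AVX-512F ";
#endif
    // None of the checked macros was defined.
    return out.empty() ? "baseline only" : out;
}
```

Comparing the output of a build with and without -march=native shows which of these extensions the flag turned on.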

Michael S

Jun 2, 2021, 5:02:30 PM

It sounds like a generic statement that was likely written at the very beginning of LLVM development and does not mean much in practice.
Same statement 6 years ago:
https://web.archive.org/web/20151027095342/https://clang.llvm.org/docs/CommandGuide/clang.html

In gcc, there is a list of sub-optimizations enabled by various -O flags.
https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Optimize-Options.html#Optimize-Options
I have never seen a similar list for clang/LLVM.

Also, when looking at asm code generated by clang, I have never encountered different results between -O2 and -O3.
To be fair, I typically looked at relatively small snippets, so if the differences are related to interprocedural optimizations I could have overlooked them.
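The comparison described above can be repeated on any small function by compiling it to assembly at both levels and diffing the output. The snippet below is a hypothetical test subject, not code from the thread; assuming it is saved as snippet.cpp, it could be compiled with `clang++ -O2 -S snippet.cpp -o o2.s` and `clang++ -O3 -S snippet.cpp -o o3.s`.

```cpp
#include <cstddef>
#include <cstdint>

// y[i] += a * x[i] for i in [0, n). Unsigned arithmetic is used so that
// wrap-around is well defined and the loop is a candidate for
// vectorisation and unrolling.
void scale_add(std::uint32_t* y, const std::uint32_t* x,
               std::uint32_t a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Whether the two assembly files differ depends on the clang version and the target, so no particular result is implied here.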

Vir Campestris

Jun 6, 2021, 4:31:08 PM

On 01/06/2021 10:06, Bonita Montero wrote:
> -O3
> There are also benchmarks which run faster with clang,
> but most run faster with cl.

I found some code the other day that was fastest with -O3 on GCC but with
-Ofast on clang (or it could have been the other way around).

Always benchmark if performance matters that much to you.

Andy