Warning about llvm 10 on Mac


Larry Gritz

Apr 27, 2020, 3:11:20 PM
to OSL Developers List
Ever since homebrew upgraded llvm to v10, I've been having some testsuite failures on Mac (both my laptop and GitHub CI) that basically boil down to the fact that in the testsuite, the oslc commands launched by runtest.py mysteriously only recognize the first directory in the include search path list, and therefore can fail to find some include files. This only happens when launched via runtest.py; the identical oslc command line run from the shell is fine.

Only happens on Mac. Only with llvm 10. Only when launched via python subprocess.call().

Here's the email I sent to the cfe-users list (dev mail list for the clang libraries, which we use for the preprocessor), just in case anybody here has a hypothesis about what's going on:

---

Excuse if this is a tricky explanation; I'm not sure I understand what's going on.

I have a C-like language and compiler for which I use clang libraries to do the preprocessing. My compiler lets users specify `-I` directories for searchpaths for includes, per usual convention. I'm doing something like this:

    clang::HeaderSearchOptions &headerOpts = compilerinst.getHeaderSearchOpts();
    headerOpts.UseBuiltinIncludes = 0;
    headerOpts.UseStandardSystemIncludes = 0;
    headerOpts.UseStandardCXXIncludes = 0;
    for (auto&& inc : includepaths) {
        headerOpts.AddPath (inc, clang::frontend::Quoted,
                            false /* not a framework */,
                            true /* ignore sys root */);
    }


For the sake of a simple failure case, I have header a.h in directory incA/, and header b.h in incB/, and my test program just consists of

#include "a.h"
#include "b.h"

Also, I have set this to turn on some debugging:
headerOpts.Verbose = 1; // DEBUGGING

Now, when I invoke my compiler from the command line,

oslc -IincA -IincB test.osl

I get this output:

#include "..." search starts here:
#include <...> search starts here:
incA
incB
End of search list.

and my compile succeeds. As expected, and as it has for many many years.

But, as part of my compiler's test suite, there is a python script involved that boils down to:

#!/usr/bin/env python
import subprocess
subprocess.call ('oslc -IincA -IincB test.osl', shell=True)

When I run the python program,

python mytest.py

then I get this output:

#include "..." search starts here:
#include <...> search starts here:
incA
incB
End of search list.
error: test.osl:3:10: fatal error: cannot open file 'incA/b.h': No such file or directory
#include "b.h"
         ^
FAILED test.osl

Wha? So I've poked around a bit with the behavior, and near as I can tell, even though the diagnostics say that both incA and incB are in the search list, it's only actually searching the first directory listed.

Now, this only happens on OSX, and only when I'm using clang 10 libraries (installed via Homebrew, though also when I build clang from scratch). Works fine on Linux. Works fine on all platforms for clang 9, 8, 7, 6, and I've been using this since back to 3.3 or so. Only had this problem after upgrading to clang/llvm 10, and only on OSX. Fails the same way for python 2.7 and 3.7.

If I change the subprocess.call to:

subprocess.call (['oslc', '-IincA', '-IincB', 'blah.osl'], shell=False)

it succeeds. (But in real life, this isn't an adequate workaround, because I want to use shell=True and keep the whole command line together, because it's really an arbitrary shell command that has output redirect.)
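
If the only shell feature needed is output redirection, one possible middle ground is to keep the argument list and let Python do the redirect. This is a minimal sketch, not what runtest.py does: `run_redirected` is a hypothetical helper name, and it does not cover arbitrary shell commands such as pipelines.

```python
import subprocess
import sys


def run_redirected(argv, out_path):
    """Run argv without a shell, sending stdout and stderr to out_path.

    Returns the child's exit code. `run_redirected` is a hypothetical
    helper for illustration, not part of runtest.py.
    """
    with open(out_path, "w") as out:
        # Equivalent in spirit to the shell form: cmd > out_path 2>&1
        return subprocess.call(argv, stdout=out, stderr=subprocess.STDOUT)
```

For the case above, the call would look like `run_redirected(['oslc', '-IincA', '-IincB', 'test.osl'], 'out.txt')`.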

Does any of this ring a bell for anybody? Or does anyone have suggestions for what to try next?


--
Larry Gritz
l...@larrygritz.com




Solomon Boulos

Apr 27, 2020, 3:58:12 PM
to osl...@googlegroups.com
Do you have stuff in your bashrc/similar? Assuming you upgraded to Catalina, there are a number of “ha ha, we break you” like bash => zsh, weird file permissions that control what programs can see what on your computer, etc.

As a simple test, instead of fixing it by running oslc as the binary with manual args, what happens with running bash explicitly and giving it those args instead?


Larry Gritz

Apr 27, 2020, 4:04:15 PM
to OSL Developers List
I'm not on Catalina yet.

I don't think it's my bashrc/etc, because I see the same problem on GitHub Actions CI for the Mac case.

I'm not quite sure I understand what you are suggesting in your second paragraph.

-- lg


Solomon Boulos

Apr 28, 2020, 2:50:34 PM
to osl...@googlegroups.com
If you're not on Catalina, it's likely a false alarm. But for debugging, I meant to turn:

   subprocess.call (['oslc', '-IincA', '-IincB', 'blah.osl'], shell=False)

into

   subprocess.call (['/bin/bash', '-c', 'oslc -IincA -IincB blah.osl'], shell=False)



Larry Gritz

Apr 28, 2020, 5:00:07 PM
to OSL Developers List
Ooh, you might be on to something!


$ oslc -IincA -IincB blah.osl
adding 'incA'        <-- my debug, shows that the '-IincA' was parsed on the command line
adding 'incB'        <-- my debug, shows that the '-IincB' was parsed on the command line
#include "..." search starts here:
#include <...> search starts here:
 incA             <-- LLVM diagnostics, shows I passed incA in includedirs list
 incB             <-- LLVM diagnostics, shows I passed incB in includedirs list
End of search list.
Compiled blah.osl -> blah.oso


$ /bin/bash -c "oslc -IincA -IincB blah.osl"
adding 'incA'
adding 'incB'
#include "..." search starts here:
#include <...> search starts here:
 incA
 incB
End of search list.
error: blah.osl:3:10: fatal error: cannot open file 'incA/b.h': No such file or directory
#include "b.h"
         ^
FAILED blah.osl


Now let's try rebuilding all of OSL with llvm@9 from homebrew instead of llvm (which is llvm 10.0):

$ /bin/bash -c "oslc -IincA -IincB blah.osl"
adding 'incA'
adding 'incB'
#include "..." search starts here:
#include <...> search starts here:
 incA
 incB
End of search list.
Compiled blah.osl -> blah.oso


The mind reels. At least it's not python per se. But I don't have a working theory for what's going on here.



Larry Gritz

Apr 28, 2020, 5:24:54 PM
to OSL Developers List
Same behavior for csh, tcsh, zsh.

I suppose I'm about to start digging through libclang source code, but I'm not looking forward to it.


Solomon Boulos

Apr 28, 2020, 5:25:16 PM
to osl...@googlegroups.com
What's in your .bashrc and .bash_profile or .profile (or whatever your default shell reads from)? What's curious is that, assuming you ran /bin/bash manually like that, it should inherit your environment variables from the interactive shell (while a direct invocation via subprocess.call would result in just executing the !login variants).

http://hayne.net/MacDev/Notes/unixFAQ.html#shellStartup has a good discussion, *except* it just dumps you out to the man page for bash for non-interactive shells. Which ... wow, has stuff I didn't know:

> When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable. The --noprofile option may be used when the shell is started to inhibit this behavior.
>
> When an interactive login shell exits, or a non-interactive login shell executes the exit builtin command, bash reads and executes commands from the file ~/.bash_logout, if it exists.
>
> When an interactive shell that is not a login shell is started, bash reads and executes commands from ~/.bashrc, if that file exists. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of ~/.bashrc.
>
> When bash is started non-interactively, to run a shell script, for example, it looks for the variable BASH_ENV in the environment, expands its value if it appears there, and uses the expanded value as the name of a file to read and execute. Bash behaves as if the following command were executed:
>        if [ -n "$BASH_ENV" ]; then . "$BASH_ENV"; fi
> but the value of the PATH variable is not used to search for the filename.
>
> If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well. When invoked as an interactive login shell, or a non-interactive shell with the --login option, it first attempts to read and execute commands from /etc/profile and ~/.profile, in that order. The --noprofile option may be used to inhibit this behavior. When invoked as an interactive shell with the name sh, bash looks for the variable ENV, expands its value if it is defined, and uses the expanded value as the name of a file to read and execute. Since a shell invoked as sh does not attempt to read and execute commands from any other startup files, the --rcfile option has no effect. A non-interactive shell invoked with the name sh does not attempt to read any other startup files. When invoked as sh, bash enters posix mode after the startup files are read.
>
> When bash is started in posix mode, as with the --posix command line option, it follows the POSIX standard for startup files. In this mode, interactive shells expand the ENV variable and commands are read and executed from the file whose name is the expanded value. No other startup files are read.
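
The rules in that excerpt can be condensed into a small lookup. This is only a sketch of my reading of the quoted text: the function name `bash_startup_files` is made up, it models bash invoked under its own name (not as sh, not in posix mode), and it does not try to model whether BASH_ENV is also consulted for a non-interactive --login shell.

```python
import os


def bash_startup_files(interactive, login, env=None):
    """Return the startup files bash would consult, per the man page excerpt.

    Where the man page says "the first one that exists", the candidates
    are returned as a tuple in search order. Hypothetical helper, for
    illustration only.
    """
    env = os.environ if env is None else env
    files = []
    if login:
        # Interactive login shell, or non-interactive shell with --login.
        files.append("/etc/profile")
        files.append(("~/.bash_profile", "~/.bash_login", "~/.profile"))
    elif interactive:
        # Interactive shell that is not a login shell.
        files.append("~/.bashrc")
    else:
        # Non-interactive, e.g. `bash -c "..."` or a script: only $BASH_ENV.
        bash_env = env.get("BASH_ENV")
        if bash_env:
            files.append(bash_env)
    return files
```

Under this reading, `/bin/bash -c "oslc ..."` is the non-interactive, non-login case, so nothing from ~/.bashrc is read unless BASH_ENV points at it; the child only gets whatever environment it inherits.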

Larry Gritz

Apr 28, 2020, 5:31:22 PM
to OSL Developers List
Aha!?

If I make sure my .bashrc contents are skipped, I can get the same behavior from a bare /bin/bash when I do 'oslc ...', no secondary shell needed. So... something I'm setting up is somehow making this all work in my ordinary shell?

OK, will narrow it to the line. Stay tuned.


Larry Gritz

Apr 28, 2020, 6:55:32 PM
to OSL Developers List
If /usr/local/Cellar/llvm/10.0.0_3 is in my DYLD_LIBRARY_PATH, it works. If not, it doesn't.

What I don't understand yet is why. If it can't find a library, why does it not fail, but instead have this incredibly weird behavior?
Still digging.
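
A quick way to check whether a variable such as DYLD_LIBRARY_PATH actually reaches the child in each launch style is to have a trivial child print it. This is a sketch under assumptions: `env_seen_by_child` is a made-up helper, it assumes a POSIX shell, and it uses the Python interpreter as the child rather than oslc. Nothing in this thread establishes why the value would differ between the two paths; one hypothesis to test (unverified here) is that macOS does not pass DYLD_* variables through system-provided binaries such as /bin/sh.

```python
import os
import shlex
import subprocess
import sys

# Child program: print the value of the variable named in argv[1].
_CHILD = "import os, sys; print(os.environ.get(sys.argv[1], '<unset>'))"


def env_seen_by_child(var, via_shell):
    """Return the value of `var` as seen by a freshly spawned child process."""
    argv = [sys.executable, "-c", _CHILD, var]
    if via_shell:
        # Same command, but passed as one string through the default shell.
        cmd = " ".join(shlex.quote(a) for a in argv)
        out = subprocess.check_output(cmd, shell=True)
    else:
        out = subprocess.check_output(argv)
    return out.decode().strip()


if __name__ == "__main__":
    name = "DYLD_LIBRARY_PATH"
    print("parent   :", os.environ.get(name, "<unset>"))
    print("direct   :", env_seen_by_child(name, via_shell=False))
    print("via shell:", env_seen_by_child(name, via_shell=True))
```

If the "direct" and "via shell" lines disagree, something between the parent and the shell-launched child is altering the environment.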


Larry Gritz

Apr 28, 2020, 9:43:16 PM
to OSL Developers List
works: oslc -IincA -IincB blah.osl
broken: DYLD_LIBRARY_PATH= oslc -IincA -IincB blah.osl

The only thing in my DYLD_LIBRARY_PATH was /usr/local/opt/llvm

DYLD_PRINT_LIBRARIES=1 is helpful

The broken one seems to pull in TWO different copies of libc++.1.dylib, one from /usr/local/opt/llvm (as expected) but also a second one from /usr/lib.
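
Spotting that kind of double load by eye is tedious, so here is a rough sketch for scanning a captured DYLD_PRINT_LIBRARIES log. Assumptions: `duplicate_dylibs` is a made-up helper, it just pulls out anything that looks like an absolute path ending in .dylib (so it does not depend on the exact log prefix, and paths containing spaces would be missed), and the sample lines are invented for illustration rather than copied from real output.

```python
import os
import re
from collections import defaultdict

# Any absolute path ending in .dylib; the exact log prefix is not relied on.
_DYLIB_RE = re.compile(r"(/\S+\.dylib)")


def duplicate_dylibs(log_text):
    """Map library basename -> sorted list of distinct paths, for basenames
    that were loaded from more than one location."""
    seen = defaultdict(set)
    for line in log_text.splitlines():
        match = _DYLIB_RE.search(line)
        if match:
            path = match.group(1)
            seen[os.path.basename(path)].add(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}


if __name__ == "__main__":
    # Made-up sample lines for illustration, not real output from this thread.
    sample = """\
dyld: loaded: /usr/local/opt/llvm/lib/libc++.1.dylib
dyld: loaded: /usr/lib/libSystem.B.dylib
dyld: loaded: /usr/lib/libc++.1.dylib
"""
    for name, paths in duplicate_dylibs(sample).items():
        print(name, "->", ", ".join(paths))
```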

So I think... that maybe this is an rpath botch somewhere, or perhaps a sign that I should always try hard to link statically against the llvm components?


Solomon Boulos

Apr 28, 2020, 10:00:02 PM
to osl...@googlegroups.com
This terrifying github issue:

  

says that libc++ is/was fairly broken (hardcoded rpath in the cmake build). I can’t see from a simple skim if homebrew gave up on it permanently, but obviously a lot can change in five-ish years.

Either way, I think that this implies you’re getting an old libc++ from your system when running with an empty DYLD path, and that whatever you need for this code to work is only in the newer one. (Which is a surprising behavioral difference, maybe an initializer of some sort that doesn’t break the ABI but does change behavior?)



Solomon Boulos

Apr 28, 2020, 10:27:08 PM
to osl...@googlegroups.com
And the follow up in the -core repo:

 

seems like they just gave up :). This recent and open one suggests badness for your LLVM 10 upgrade:


So I dunno, don’t install LLVM via homebrew? (Meaning just build it yourself locally)

Larry Gritz

Apr 29, 2020, 1:30:55 AM
to OSL Developers List
What a mess!

But after some experimentation, it seems that it can all be solved (or at least be asymptomatic?) if I link to the llvm libs statically rather than dynamically. Luckily, I already had a way to tell my FindLLVM.cmake to prefer the static libs. So I think the solution I'll use for now is just to always set that to true on OSX and force static linkage of libclang.
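
One way to sanity-check that a rebuilt binary no longer depends on LLVM/clang dylibs is to look at the output of `otool -L` for it. The following is a sketch only: the helper names `dynamic_deps` and `llvm_related` are made up, the parsing assumes the usual `otool -L` layout (a first line naming the binary, then one indented dependency per line followed by a parenthesized version note), and the name match is a crude substring test.

```python
import os


def dynamic_deps(otool_output):
    """Extract dependency paths from `otool -L`-style text."""
    deps = []
    for line in otool_output.splitlines()[1:]:  # first line names the binary
        line = line.strip()
        if line:
            # Drop the trailing "(compatibility version ..., current version ...)".
            deps.append(line.split(" (", 1)[0])
    return deps


def llvm_related(deps):
    """Keep only dependencies whose file name mentions llvm or clang."""
    keys = ("llvm", "clang")
    return [d for d in deps if any(k in os.path.basename(d).lower() for k in keys)]
```

An empty list from `llvm_related(dynamic_deps(text))` would suggest that the LLVM and clang pieces were linked statically, though it says nothing about which libc++ gets loaded.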


Larry Gritz

Apr 29, 2020, 1:31:42 AM
to OSL Developers List
Thanks, Solomon! Your prods definitely pointed me in a fruitful direction. This has been a low level annoyance to me for at least a couple weeks now.

-- lg
--
Larry Gritz




Larry Gritz

Apr 29, 2020, 2:38:55 AM
to OSL Developers List