Greetings, all!
I have a somewhat unusual first project with Emscripten. I need to get a domain-specific C-like language generating WebAssembly. This is not too bad, because the DSL compiles into C code, which I then feed to the C compiler for the relevant platform. At this level, WebAssembly is "just another platform" but as always, the details are more complicated.
I need to provide declarations for the platform's C run-time library functions, types, constants, and so on to the DSL. This is a fairly routine task, given the platform's C headers, but there are a few things about the Emscripten headers that are puzzling me.
To forestall the obvious question, no, I can't just use the provided headers. The DSL is C-like, but has some syntax differences. Unlike the case with C++, many C programs are not valid programs in the DSL, which gives much more freedom in the language design. It's a separate development that split from normal C in the mid-eighties and is still very much worth using for its specialised role. Yes, new hires have to learn it, which takes about two days for someone who knows C or C++. Learning about the specialised application area it is used for takes much longer.
I'm running Emscripten on Linux. I started by looking at Emscripten 3.1.41, since that's the version used by a couple of other product teams that work on my site. Finding the headers is fairly simple there: running a few emcc compiles with -H reports which files are referenced. The top-level headers come from emsdk/upstream/emscripten/cache/sysroot/include. A few Clang-associated ones come from emsdk/upstream/lib/clang/17/include. That's all fine with me.
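As a rough illustration of how a -H report can be summarised by directory, here is a minimal Python sketch. It assumes the usual Clang/GCC -H format, where each included file is printed on stderr as a run of dots (the include depth), a space, and then the path. The function name and the /opt/emsdk sample paths are hypothetical, made up for the example.

```python
import os
from collections import Counter


def summarise_includes(report: str) -> Counter:
    """Count included headers per directory from `-H` style output."""
    dirs = Counter()
    for line in report.splitlines():
        dots, sep, path = line.partition(" ")
        # Keep only lines of the form "<dots> <path>"; skip other diagnostics.
        if not sep or not dots or set(dots) != {"."}:
            continue
        dirs[os.path.dirname(os.path.normpath(path.strip()))] += 1
    return dirs


if __name__ == "__main__":
    # Hypothetical sample; real input would be the stderr of an `emcc -H` compile.
    sample = (
        ". /opt/emsdk/upstream/emscripten/cache/sysroot/include/stdio.h\n"
        ".. /opt/emsdk/upstream/emscripten/cache/sysroot/include/features.h\n"
        ". /opt/emsdk/upstream/lib/clang/17/include/stddef.h\n"
    )
    for directory, count in summarise_includes(sample).most_common():
        print(count, directory)
```

Grouping by directory makes it easy to see at a glance whether the headers come from the emsdk tree or from somewhere else.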
Then I decided to check the latest emsdk, and found that with 3.1.69, the headers come from a different place. It builds a cache of the headers and other files it uses under ~/.emscripten_cache. The problems with that are:
Storage: It will take 36MB in the user directory of everyone who ever compiles with Emscripten. We keep all our user directories on a server disk, because that's enormously convenient in many ways. But we really don't want to burn space with duplicates of that cache.
Version lock: We need to be able to have several Emscripten versions in use simultaneously, by the same account, without conflicts. Our reason for this is that we plan to release products on WebAssembly, and from time to time, update the version of Emscripten we use, to get access to new C and C++ standards, compiler bug fixes, and so on. But we will not update the tools used to build a product version that's been released and is under maintenance, because we'd have to re-do a lot of the QA that we do at a release. So the service accounts that run our builds can't have caches of version-specific headers in their user directories.
I need a way to tell Emscripten to put that cache somewhere else. I did some grep'ing of the Emscripten scripts, and found this line, in both 3.1.41 and 3.1.69:
upstream/emscripten/tools/config.py: CACHE = os.path.expanduser(os.path.join('~', '.emscripten_cache'))
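For reference, that expression can be evaluated on its own, outside Emscripten. This is only a sketch of what the single line computes; it says nothing about whether other parts of config.py override the value later. The helper name is hypothetical. On POSIX systems, os.path.expanduser takes the home directory from HOME when that variable is set.

```python
import os


def default_cache_path() -> str:
    """Evaluate the same expression as the line found in tools/config.py."""
    return os.path.expanduser(os.path.join('~', '.emscripten_cache'))


if __name__ == "__main__":
    # Prints a path under the current user's home directory.
    print(default_cache_path())
```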
I have not yet attempted to read the Python code and learn how it all works, because that could take ages; I don't know Python well at all and am not keen on it. I'm a naturally low-level programmer, much happier with assembly code than object orientation.
Is there an environment variable I can use to relocate that cache?
Thanks very much,
John