[distcc-patches] Accept relative compiler path

110 views
Skip to first unread message

Atte Peltomaki

unread,
Nov 2, 2011, 7:16:27 AM11/2/11
to distcc-...@googlegroups.com
Hi,

I did this little hack to allow distcc client call for compiler using
relative path, eg:

distcc toolchain/target/gcc source.c

This is useful in an environment where toolchain location is a subject
to change. This way when toolchain location changes in the source tree,
it can be automatically updated in server side instead of changing
distcc configuration.

Patch against distcc 3.1.

--- a/source/src/exec.c 2008-12-02 23:50:24.000000000 +0200
+++ b/source/src/exec.c 2011-08-30 19:25:20.377696605 +0300
@@ -246,6 +246,11 @@
static void dcc_execvp(char **argv)
{
char *slash;
+ const char *envpath, *p, *n;
+ char *buf = NULL;
+ struct stat sb;
+ int len;
+ char linkbuf[MAXPATHLEN];

execvp(argv[0], argv);

@@ -263,10 +268,36 @@
if (slash)
execvp(slash + 1, argv);

- /* shouldn't be reached */
- rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));
+ /* Try if a relative path to compiler was given
+ * XXX: Duplicates code from dcc_check_compiler_masq() */

- dcc_exit(EXIT_COMPILER_MISSING); /* a generalization, i know */
+ if (!(envpath = getenv("PATH"))) {
+ /* shouldn't be reached */
+ rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));
+ dcc_exit(EXIT_COMPILER_MISSING); /* a generalization, i know */
+ }
+
+ for (n = p = envpath; *n; p = n) {
+ n = strchr(p, ':');
+ if (n)
+ len = n++ - p;
+ else {
+ len = strlen(p);
+ n = p + len;
+ }
+ /* if (asprintf(&buf, "%.*s/%s", len, p, compiler_name) == -1) {
+ rs_log_crit("asprintf failed");
+ return EXIT_DISTCC_FAILED;
+ } */
+
+ if (lstat(buf, &sb) == -1)
+ continue; /* ENOENT, EACCESS, etc */
+ else
+ //argv[0] = buf;
+ execvp(buf, argv);
+ }
+
+ free(buf);
}


--
Atte Peltomäki
NVIDIA Corporation
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information. Any unauthorized review, use, disclosure or distribution
is prohibited. If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

Fergus Henderson

unread,
Nov 2, 2011, 7:36:30 AM11/2/11
to distcc-...@googlegroups.com
On Wed, Nov 2, 2011 at 10:16 PM, Atte Peltomaki <apelt...@nvidia.com> wrote:
Hi,

I did this little hack to allow distcc client call for compiler using
relative path, eg:

distcc toolchain/target/gcc source.c

What exactly is the intended semantics in such a case?

For example, should distcc send the compiler to the remote machine?
Why/why not?

I don't really understand your use case; please explain it in more detail.
 
This is useful in an environment where toolchain location is a subject
to change. This way when toolchain location changes in the source tree,
it can be automatically updated in server side instead of changing
distcc configuration.

Patch against distcc 3.1.

--- a/source/src/exec.c 2008-12-02 23:50:24.000000000 +0200
+++ b/source/src/exec.c 2011-08-30 19:25:20.377696605 +0300
@@ -246,6 +246,11 @@
 static void dcc_execvp(char **argv)
 {
    char *slash;
+    const char *envpath, *p, *n;

Please use more meaningful variable names than 'p' and 'n'.
 
+    char linkbuf[MAXPATHLEN];

This is unused.
 
+        /* if (asprintf(&buf, "%.*s/%s", len, p, compiler_name) == -1) {
+            rs_log_crit("asprintf failed");
+            return EXIT_DISTCC_FAILED;
+        } */

Why is this code commented out?

--
Fergus Henderson <fer...@google.com>

"Defend the user, exclude no one, and create magic." -- Eric Schmidt.


Atte Peltomaki

unread,
Nov 2, 2011, 12:35:00 PM11/2/11
to distcc-...@googlegroups.com, apelt...@nvidia.com
(Please keep me Cc'ed, since I'm not joined in the list)

On Nov 2, 1:36 pm, Fergus Henderson <fer...@google.com> wrote:


> On Wed, Nov 2, 2011 at 10:16 PM, Atte Peltomaki <apeltom...@nvidia.com>wrote:
>
> > I did this little hack to allow distcc client call for compiler using
> > relative path, eg:
> >
> > distcc toolchain/target/gcc source.c
>
> What exactly is the intended semantics in such a case?

This concerns a project that delivers toolchain binaries along with
source. There are a bunch of compilers which are subject to change and
they are called from the build system by using relative path.

Client side is slightly modified to include distcc:

distcc path/to/toolchain/some-funky-compiler-gcc-version source.c <args>

Server side is configured to automatically keep the toolchain
repositories in sync and DISTCCD_PATH is set to this path.

> > + const char *envpath, *p, *n;
>
> Please use more meaningful variable names than 'p' and 'n'.

That part of the code is copied directly from dcc_check_compiler_masq().

That patch was fairly old, which I never had time to properly finish. I
put some effort into it, so now it at least works. However if this
functionality should be officially supported, the logic should be moved
elsewhere and add some debug logging.

--- a/source/src/exec.c 2008-12-02 23:50:24.000000000 +0200

+++ b/source/src/exec.c 2011-11-02 18:29:52.962662151 +0200
@@ -246,6 +246,10 @@


static void dcc_execvp(char **argv)
{
char *slash;
+ const char *envpath, *p, *n;

+ char *buf = NULL;
+ struct stat sb;
+ int len;

execvp(argv[0], argv);

@@ -263,10 +267,35 @@


if (slash)
execvp(slash + 1, argv);

- /* shouldn't be reached */
- rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));

+ if (!(envpath = getenv("PATH"))) {
+ /* shouldn't be reached */
+ rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));
+ dcc_exit(EXIT_COMPILER_MISSING);

+ }
+
+ /* Try if a relative path to compiler was given */


+ for (n = p = envpath; *n; p = n) {
+ n = strchr(p, ':');
+ if (n)
+ len = n++ - p;
+ else {
+ len = strlen(p);
+ n = p + len;
+ }

+ if (asprintf(&buf, "%.*s/%s", len, p, argv[0]) == -1) {


+ rs_log_crit("asprintf failed");
+ return EXIT_DISTCC_FAILED;
+ }

+
+ if (lstat(buf, &sb) == -1)

+ dcc_exit(EXIT_COMPILER_MISSING); /* ENOENT, EACCESS, etc */
+ else {
+ argv[0] = buf; /* this should be safe now */
+ execvp(argv[0], argv);
+ }
+ }



- dcc_exit(EXIT_COMPILER_MISSING); /* a generalization, i know */

Fergus Henderson

unread,
Nov 2, 2011, 7:19:27 PM11/2/11
to distcc-...@googlegroups.com, apelt...@nvidia.com
On Thu, Nov 3, 2011 at 3:35 AM, Atte Peltomaki <apelt...@nvidia.com> wrote:
(Please keep me Cc'ed, since I'm not joined in the list)

On Nov 2, 1:36 pm, Fergus Henderson <fer...@google.com> wrote:
> On Wed, Nov 2, 2011 at 10:16 PM, Atte Peltomaki <apeltom...@nvidia.com>wrote:
>
> > I did this little hack to allow distcc client call for compiler using
> > relative path, eg:
> >
> > distcc toolchain/target/gcc source.c
>
> What exactly is the intended semantics in such a case?

This concerns a project that delivers toolchain binaries along with
source. There are a bunch of compilers which are subject to change and
they are called from the build system by using relative path.

Client side is slightly modified to include distcc:

 distcc path/to/toolchain/some-funky-compiler-gcc-version source.c <args>

Server side is configured to automatically keep the toolchain
repositories in sync and DISTCCD_PATH is set to this path.

I think your patch is making some assumptions about how files are set up on the server, without documenting these assumptions in the distcc documentation (man page, etc.).  This is likely to create problems for future users who don't know what assumptions you are making and whose server configuration might not satisfy those assumptions.  Any change to add a new feature needs to be accompanied by appropriate documentation, especially if that feature is only going to work in certain situations and might do the wrong thing in other situations.

Also, your patch below refers to PATH rather than DISTCCD_PATH, which confuses me.

By the way, wouldn't it make more sense to name the environment variable DISTCCD_COMPILER_PATH, since this would be the path that distccd uses to find the compiler, not the path for distccd itself?

Why is this a colon-separated path rather than a single directory?
You need to handle the case when execvp() returns.
 
+       }
+    }

-    dcc_exit(EXIT_COMPILER_MISSING); /* a generalization, i know */
+    free(buf);
 }

--
Atte Peltomäki
NVIDIA Corporation
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

--
You received this message because you are subscribed to the "distcc-patches" list.
To post to this list, send email to <distcc-...@googlegroups.com>.
To unsubscribe from this list, send email to <distcc-patche...@googlegroups.com>.
For archives and more options, see <http://groups.google.com/group/distcc-patches>.

Atte Peltomaki

unread,
Nov 3, 2011, 3:55:58 AM11/3/11
to Fergus Henderson, distcc-...@googlegroups.com
On Thu, Nov 03, 2011 at 12:19:27AM +0100, Fergus Henderson wrote:
> On Thu, Nov 3, 2011 at 3:35 AM, Atte Peltomaki <[1]apelt...@nvidia.com>

>> (Please keep me Cc'ed, since I'm not joined in the list)
>>
>>> What exactly is the intended semantics in such a case?
>>
>> This concerns a project that delivers toolchain binaries along with
>> source. There are a bunch of compilers which are subject to change and
>> they are called from the build system by using relative path.
>>
>> Client side is slightly modified to include distcc:
>>
>>  distcc path/to/toolchain/some-funky-compiler-gcc-version source.c
>> <args>
>>
>> Server side is configured to automatically keep the toolchain
>> repositories in sync and DISTCCD_PATH is set to this path.
>
> I think your patch is making some assumptions about how files are set up
> on the server, without documenting these assumptions in the distcc
> documentation (man page, etc.).  This is likely to create problems for
> future users who don't know what assumptions you are making and whose
> server configuration might not satisfy those assumptions.  Any change to
> add a new feature needs to be accompanied by appropriate documentation,
> especially if that feature is only going to work in certain situations and
> might do the wrong thing in other situations.

I agree, this patch is undocumented and what's worse, it's operation is
not even visible in debug log when running distccd in wizard mode.

However I can't outright think of any situation where this would
outright do something unexpected; it simply allows distccd to execute
daemon when called with relative path if it finds it from it's own
search path. In fact, the dcc_check_compiler_masq() function in util.c
already does find the correct compiler in this case, but the information
is not stored anywhere.

> Also, your patch below refers to PATH rather than DISTCCD_PATH, which
> confuses me.
> By the way, wouldn't it make more sense to name the environment variable
> DISTCCD_COMPILER_PATH, since this would be the path that distccd uses to
> find the compiler, not the path for distccd itself?
> Why is this a colon-separated path rather than a single directory?

I haven't confirmed this from source, but the way things seem to work
is that DISTCCD_PATH simply rewrites system PATH somewhere during
distccd startup. Since it's a colon-separated list, it should be parsed
as such so multiple search paths can be defined.

I'm hoping to be able to schedule another day to use on this in order to
move this logic into a proper place and have all the debug logging show
correct information. Writing accompanying documentation should be
simple, after all writing documentation is my main job description :)

Fergus Henderson

unread,
Nov 3, 2011, 6:23:03 PM11/3/11
to Atte Peltomaki, distcc-...@googlegroups.com
I had a closer look at this patch.
The code is already calling execvp() which already searches using the PATH environment variable.
So I don't understand why there is a need to explicitly search using PATH after execvp() fails.

Atte Peltomaki

unread,
Nov 4, 2011, 2:44:21 AM11/4/11
to Fergus Henderson, distcc-...@googlegroups.com
On Thu, Nov 03, 2011 at 11:23:03PM +0100, Fergus Henderson wrote:
> I had a closer look at this patch.
> The code is already calling execvp() which already searches using the PATH
> environment variable.
> So I don't understand why there is a need to explicitly search using PATH
> after execvp() fails.

execvp() doesn't support executing a binary with a relative path, even
if that path would be in PATH. My patch rewrites argv[0] to include
absolute path for execvp(), if it finds the binary.

Atte Peltomaki

unread,
Nov 4, 2011, 3:50:35 AM11/4/11
to Fergus Henderson, distcc-...@googlegroups.com

I spent some more time reading distcc sources and came to the conclusion
that this patch is in fact pretty much good to go as it is. Potential
alternative would be rewriting argv[0] somewhere earlier in the code
path (eg. in dcc_check_compiler_masq()), but that doesn't in fact seem
to provide any added value.

I fixed a few minor issues in the patch and wrote a more descriptive
comment above the code block. I also added a line about DISTCCD_PATH in
the INSTALL file, since it had no reference to it.

Please consider including this in the upcoming distcc 3.2, it applies
cleanly against latest version from SVN.

Patch attached.

distcc_use_relative_path.patch

Fergus Henderson

unread,
Nov 4, 2011, 8:25:22 AM11/4/11
to Atte Peltomaki, distcc-...@googlegroups.com

My understanding is that execvp handles relative paths iff you have "." in your PATH.

Atte Peltomaki

unread,
Nov 4, 2011, 8:53:46 AM11/4/11
to Fergus Henderson, distcc-...@googlegroups.com
That appears to be correct, but it's of no use in this case. Setting
distcc daemon PATH to either "." or "/path/to/toolchain/root/." does
make distccd find the compiler from relative path, but something goes
wonky and distcc exits with COMPILER_ERROR.

On Fri, Nov 04, 2011 at 01:25:22PM +0100, Fergus Henderson wrote:
> My understanding is that execvp handles relative paths iff you have "." in
> your PATH.
>

Atte Peltomaki

unread,
Dec 15, 2011, 5:21:39 AM12/15/11
to Fergus Henderson, distcc-...@googlegroups.com
Hi,

Any progress on getting the patch below merged?

On Fri, Nov 04, 2011 at 09:50:35AM +0200, Atte Peltomäki wrote:
> I spent some more time reading distcc sources and came to the conclusion
> that this patch is in fact pretty much good to go as it is. Potential
> alternative would be rewriting argv[0] somewhere earlier in the code
> path (eg. in dcc_check_compiler_masq()), but that doesn't in fact seem
> to provide any added value.
>
> I fixed a few minor issues in the patch and wrote a more descriptive
> comment above the code block. I also added a line about DISTCCD_PATH in
> the INSTALL file, since it had no reference to it.
>
> Please consider including this in the upcoming distcc 3.2, it applies
> cleanly against latest version from SVN.
>
> Patch attached.
>
> --
> Atte Peltomäki
> NVIDIA Corporation

> --- a/src/exec.c 2008-12-02 23:50:24.000000000 +0200
> +++ b/src/exec.c 2011-11-04 09:35:34.698661658 +0200


> @@ -246,6 +246,10 @@
> static void dcc_execvp(char **argv)
> {
> char *slash;
> + const char *envpath, *p, *n;
> + char *buf = NULL;
> + struct stat sb;
> + int len;
>
> execvp(argv[0], argv);
>

> @@ -263,9 +267,37 @@


> if (slash)
> execvp(slash + 1, argv);
>
> - /* shouldn't be reached */
> - rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));
> + if (!(envpath = getenv("PATH"))) {

> + rs_log_error("failed to exec %s: %s", argv[0], strerror(errno));
> + dcc_exit(EXIT_COMPILER_MISSING);
> + }
> +

> + /* If PATH environment variable is set, see if argv[0] is a relative path
> + * to the compiler, since execvp() doesn't do this on it's own. If compiler
> + * is found, rewrite argv[0] and execute. */


> + for (n = p = envpath; *n; p = n) {
> + n = strchr(p, ':');
> + if (n)
> + len = n++ - p;
> + else {
> + len = strlen(p);
> + n = p + len;
> + }
> + if (asprintf(&buf, "%.*s/%s", len, p, argv[0]) == -1) {
> + rs_log_crit("asprintf failed");

> + free(buf);


> + return EXIT_DISTCC_FAILED;
> + }
> +
> + if (lstat(buf, &sb) == -1)

> + continue;


> + else {
> + argv[0] = buf; /* this should be safe now */
> + execvp(argv[0], argv);

> + }
> + }
>
> + free(buf);


> dcc_exit(EXIT_COMPILER_MISSING); /* a generalization, i know */
> }
>

> --- a/INSTALL 2011-11-04 09:48:09.532705250 +0200
> +++ b/INSTALL 2011-11-04 09:47:52.117693611 +0200
> @@ -188,7 +188,8 @@
>
> If the daemon is started from an rc script, then make sure that it
> sees a PATH setting which can find any compilers installed in
> -nonstandard directories.
> +nonstandard directories. If DISTCCD_PATH is set, it replaces local
> +PATH for the daemon.
>
> You should create a "distcc" account on server machines so that distcc
> can run with minimal privilege. It is not necessary for this account

Fergus Henderson

unread,
Dec 15, 2011, 12:54:38 PM12/15/11
to Atte Peltomaki, distcc-...@googlegroups.com
On Fri, Nov 4, 2011 at 8:53 AM, Atte Peltomaki <apelt...@nvidia.com> wrote:
That appears to be correct, but it's of no use in this case. Setting
distcc daemon PATH to either "." or "/path/to/toolchain/root/." does
make distccd find the compiler from relative path, but something goes
wonky and distcc exits with COMPILER_ERROR.

What exactly goes wonky?

You shouldn't set PATH to just "." or "/path/to/toolchain/root/.".
You need to also include the standard directories such as /bin and /usr/bin in the PATH.

On Fri, Nov 04, 2011 at 01:25:22PM +0100, Fergus Henderson wrote:
> My understanding is that execvp handles relative paths iff you have "." in
> your PATH.
>
> On Nov 4, 2011 5:44 PM, "Atte Peltomaki" <[1]apelt...@nvidia.com> wrote:
>
> On Thu, Nov 03, 2011 at 11:23:03PM +0100, Fergus Henderson wrote:
> > I had a closer look at this patch.
> > The code is already calling execvp() which already searches using the
> > PATH environment variable.
> > So I don't understand why there is a need to explicitly search using
> > PATH after execvp() fails.
>
> execvp() doesn't support executing a binary with a relative path, even
> if that path would be in PATH. My patch rewrites argv[0] to include
> absolute path for execvp(), if it finds the binary.

--
Atte Peltomäki
NVIDIA Corporation
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

Fergus Henderson

unread,
Dec 15, 2011, 12:57:05 PM12/15/11
to Atte Peltomaki, distcc-...@googlegroups.com
My understanding is that what this patch achieves can be done without any changes in distcc sources, just by setting PATH appropriately before running distccd.  See my other email.  If that is the case, then I don't see the advantage of merging this patch into the distcc sources.  If I'm wrong about that, please explain in more detail so that I can correct my understanding.

Atte Peltomaki

unread,
Dec 20, 2011, 8:28:30 AM12/20/11
to Fergus Henderson, distcc-...@googlegroups.com
On Thu, Dec 15, 2011 at 06:54:38PM +0100, Fergus Henderson wrote:

> On Fri, Nov 4, 2011 at 8:53 AM, Atte Peltomaki <[1]apelt...@nvidia.com> wrote:
>
> That appears to be correct, but it's of no use in this case. Setting
> distcc daemon PATH to either "." or "/path/to/toolchain/root/." does
> make distccd find the compiler from relative path, but something goes
> wonky and distcc exits with COMPILER_ERROR.
>
> What exactly goes wonky?
> You shouldn't set PATH to just "." or "/path/to/toolchain/root/.".
> You need to also include the standard directories such as /bin and /usr/bin in the PATH.

Result is always the same, regardless of what is set in PATH.
You can test this yourself:

server:/opt> mkdir test
server:/opt> cp `which gcc` test/gcc-test
server:/opt> export DISTCCD_PATH=/opt:/opt/.:.:$PATH
server:/opt> sudo -E -u distccd /usr/bin/distccd --wizard --allow 10.0.0.0/8 --listen 10.0.0.1

client:/opt> export DISTCC_HOSTS='server'
client:/opt> mkdir test
client:/opt> cp `which gcc` test/gcc-test
client:/opt> echo 'int main() { return 0; }' > test.c
client:/opt> distcc test/gcc-test -c test.c

After some more investigation, execvp() *does* indeed search for
relative path as well, but only in the current working directory. I
don't see any way to change the distccd working directory without
code changes.

> On Fri, Nov 04, 2011 at 01:25:22PM +0100, Fergus Henderson wrote:
> > My understanding is that execvp handles relative paths iff you have "." in
> > your PATH.
> >

Fergus Henderson

unread,
Dec 20, 2011, 10:17:28 AM12/20/11
to Atte Peltomaki, distcc-...@googlegroups.com
On Tue, Dec 20, 2011 at 8:28 AM, Atte Peltomaki <apelt...@nvidia.com> wrote:
On Thu, Dec 15, 2011 at 06:54:38PM +0100, Fergus Henderson wrote:
> On Fri, Nov 4, 2011 at 8:53 AM, Atte Peltomaki <[1]apelt...@nvidia.com> wrote:
>
> That appears to be correct, but it's of no use in this case. Setting
> distcc daemon PATH to either "." or "/path/to/toolchain/root/." does
> make distccd find the compiler from relative path, but something goes
> wonky and distcc exits with COMPILER_ERROR.
>
> What exactly goes wonky?
> You shouldn't set PATH to just "." or "/path/to/toolchain/root/.".
> You need to also include the standard directories such as /bin and /usr/bin in the PATH.

Result is always the same, regardless of what is set in PATH.
You can test this yourself:

server:/opt> mkdir test
server:/opt> cp `which gcc` test/gcc-test
server:/opt> export DISTCCD_PATH=/opt:/opt/.:.:$PATH

You're setting DISTCCD_PATH rather than PATH.

Atte Peltomaki

unread,
Dec 20, 2011, 11:01:09 AM12/20/11
to Fergus Henderson, distcc-...@googlegroups.com
On Tue, Dec 20, 2011 at 04:17:28PM +0100, Fergus Henderson wrote:

> On Tue, Dec 20, 2011 at 8:28 AM, Atte Peltomaki <[1]apelt...@nvidia.com> wrote:
> On Thu, Dec 15, 2011 at 06:54:38PM +0100, Fergus Henderson wrote:
> > On Fri, Nov 4, 2011 at 8:53 AM, Atte Peltomaki <[1][2]apelt...@nvidia.com> wrote:
> >
> > That appears to be correct, but it's of no use in this case. Setting
> > distcc daemon PATH to either "." or "/path/to/toolchain/root/." does
> > make distccd find the compiler from relative path, but something goes
> > wonky and distcc exits with COMPILER_ERROR.
> >
> > What exactly goes wonky?
> > You shouldn't set PATH to just "." or "/path/to/toolchain/root/.".
> > You need to also include the standard directories such as /bin and /usr/bin in the PATH.
>
> Result is always the same, regardless of what is set in PATH.
> You can test this yourself:
>
> server:/opt> mkdir test
> server:/opt> cp `which gcc` test/gcc-test
> server:/opt> export DISTCCD_PATH=/opt:/opt/.:.:$PATH
>
> You're setting DISTCCD_PATH rather than PATH.

Very observant. As I've mentioned before: if set, DISTCCD_PATH value is
used to rewrite PATH environment variable, long before execvp() is
called.

Reply all
Reply to author
Forward
0 new messages