Application start code crashes since r234

Nicolas Alvarez

Aug 23, 2008, 8:57:24 PM
to synecdo...@googlegroups.com
Since r234, all workunits finish with "computation error", and I get a
stacktrace on the client's stderr. However, the client doesn't quit due to
that error. This is when running Synecdoche under Linux.

I found that this strange behavior happens because the client code crashes
after fork()ing, but before exec()ing the science app. So the forked process
gets a segfault, but the parent process (the core client) keeps running just
fine.
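
To illustrate why the core client survives, here is a minimal sketch (not Synecdoche code). The function name signal_that_killed_child is hypothetical, and std::abort() merely stands in for whatever crashes between fork() and exec(). The crash only takes down the child; the parent just sees it in the wait status.

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo, not taken from the client sources.
// Returns the number of the signal that killed the child,
// 0 if the child exited normally, or -1 if fork()/waitpid() failed.
int signal_that_killed_child() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed, no child was created
    }
    if (pid == 0) {
        // Child: crash before ever reaching exec().
        // abort() is a stand-in for the real crash (assumption).
        std::abort();
    }
    // Parent: unaffected by the child's crash, it only collects the status.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Under these assumptions the parent gets SIGABRT back and carries on, which matches the client reporting a "computation error" for the workunit while it keeps running.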

I've yet to find the exact cause.

Nicolas Alvarez

Aug 23, 2008, 9:31:51 PM
to synecdo...@googlegroups.com

Stacktrace:
ACTIVE_TASK::start(bool) at client/app_start.C:760
ACTIVE_TASK::resume_or_start(bool) at client/app_start.C:875
CLIENT_STATE::enforce_schedule() at client/cpu_sched.C:986
CLIENT_STATE::poll_slow_events() at client/client_state.C:605
boinc_main_loop() at client/main.C:475
main at client/main.C:738

Something there throws a std::logic_error.

Nicolas Alvarez

Aug 23, 2008, 10:03:28 PM
to synecdo...@googlegroups.com

Problem found. getenv returns NULL if the requested environment variable
isn't set. $LD_LIBRARY_PATH isn't set on my machine.
std::string libpath(getenv("LD_LIBRARY_PATH")) makes the constructor throw a
std::logic_error when getenv returns NULL.
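
A null-safe lookup could look roughly like the sketch below. getenv_or is a hypothetical helper name, not something from the Synecdoche or BOINC sources, and this is not necessarily how the actual fix is written. Note that constructing a std::string from a null pointer is undefined behavior as far as the C++ standard is concerned; throwing std::logic_error is what libstdc++ happens to do.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (assumption, not from the client sources):
// return the value of an environment variable, or `fallback`
// if the variable is not set.
std::string getenv_or(const char* name, const std::string& fallback = "") {
    const char* value = std::getenv(name);  // NULL when the variable is unset
    // Only build a std::string from the pointer once we know it is non-null.
    return value ? std::string(value) : fallback;
}
```

With this, something like std::string libpath = getenv_or("LD_LIBRARY_PATH"); gives an empty string on machines where the variable isn't set, instead of throwing.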

The question is, why didn't it crash before the r234 change? The original
printf-based code wasn't checking the getenv return value either.

Turns out glibc printf doesn't crash with NULL pointers. printf("%s", NULL)
prints the string "(null)". However, BOINC would have crashed with other C
implementations that don't handle the null pointer this way (like Solaris,
afaik). printf'ing a null pointer is undefined behavior.
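
A portable way to format a possibly-unset variable is to substitute a placeholder before the pointer ever reaches a %s conversion. The sketch below uses hypothetical helper names (or_placeholder, describe_env) and an arbitrary "(unset)" placeholder; none of it is from the BOINC code.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: never hand a null pointer to a "%s" conversion.
const char* or_placeholder(const char* s, const char* placeholder = "(unset)") {
    return s ? s : placeholder;
}

// Hypothetical helper: format "NAME=value" without relying on glibc's
// "(null)" behavior. Output is truncated to the buffer size.
std::string describe_env(const char* name) {
    char buf[256];
    std::snprintf(buf, sizeof buf, "%s=%s", name,
                  or_placeholder(std::getenv(name)));
    return std::string(buf);
}
```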

I have reported the printf issue to the boinc_dev mailing list. A fix for the
Synecdoche string-based code is coming soon.

Der Meister

Aug 24, 2008, 7:12:27 AM
to synecdo...@googlegroups.com
Nicolas Alvarez wrote:
> The question is, why didn't it crash before the r234 change? The original
> printf-based code wasn't checking the getenv return value either.
That's obviously my fault. As the return value wasn't checked before and
I didn't see any crashes because of this, I thought it would be OK this
way. And yes, I only really tested the Windows code and just checked that
the Linux code compiles, my bad.

> Turns out glibc printf doesn't crash with NULL pointers. printf("%s", NULL)
> prints the string "(null)". However, BOINC would have crashed with other C
> implementations that don't handle the null pointer this way (like Solaris,
> afaik). printf'ing a null pointer is undefined behavior.

That's interesting. But I didn't see anything about this in the manpage
for printf. Seems to be nonstandard behaviour on Linux, which is bad
(but obviously we can't do anything about that).

Anyway, thanks for the fix.

Nicolas Alvarez

Aug 24, 2008, 4:09:47 PM
to synecdo...@googlegroups.com
Der Meister wrote:
> That's interesting. But I didn't see anything about this in the manpage
> for printf. Seems to be nonstandard behaviour on Linux, which is bad
> (but obviously we can't do anything about that).

Non-standard behavior for GNU libc, to be more specific.

