IRB, Mac OS X, command-line require via "-r" and Bus Errors

James Adam

May 16, 2005, 7:14:33 AM
Hey All,

I just wanted to put this out there and see firstly if anyone has
experienced the same problem, and secondly maybe someone has some ideas
as to how to go about fixing it.

I've posted some stuff mentioning this problem with Ruby-ODBC, the
Ruby-OpenGL module, and Timothy Hunter has also noticed it with RMagick
(http://www.codecomments.com/message491846.html). My current problem is
with the ruby-odbc module on Tiger with Ruby 1.8.2 (self-compiled,
although the problem also exists with Ruby builds from Fink, DarwinPorts,
and Apple). I have also seen this in the past (on Panther, too) when I
did some work to compile OpenGL on the mac using native frameworks, and
a similar problem has been hinted at with RubyCocoa and IRB
(http://rubycocoa.sourceforge.net/doc/programming.en.html#label-2).

The problem is *only* present when requiring the library from the
command-line, and not once IRB has loaded, i.e.

$ irb -r odbc
==> bus error

..but...

$ irb
irb(main):0> require 'odbc'
==> true

I'm confident that you can substitute "opengl", "rmagick", or maybe
even rubycocoa for odbc wherever I use it below. It gets worse. I then
added the following to my .irbrc file:

IRB.conf[:LOAD_MODULES] << "odbc"

.. which basically adds the ODBC module to the array used to determine
which modules IRB should load. Now, ODBC gets loaded automatically into
IRB, without needing to be specified, i.e.:

$ irb
irb(main):0> puts ODBC.class
==> Module

Which means that it loaded fine! Here's why this is very strange - you
can trace the execution within IRB to get to this point as follows:

/usr/local/bin/irb
/usr/local/lib/ruby/1.8/irb.rb (IRB.start, around line 50)
/usr/local/lib/ruby/1.8/irb/init.rb (IRB.setup, at the top)

In IRB.setup, a bunch of stuff happens, including parsing the
command-line options and adding anything after a "-r" to
@CONF[:LOAD_MODULES] (IRB.parse_opts), and then finally taking each
string in that Array and loading each module (IRB.load_modules, at the
bottom of init.rb).

So what I can tell from this is that at the point where the module is
actually loaded (IRB.load_modules) adding a module to the @CONF
structure has resulted in exactly the same situation as having
specified it with "-r" on the command line. The @CONF[:LOAD_MODULES]
array is in exactly the same state if you specify modules in .irbrc as
it is if you require them with "-r" on the command line, and yet one
method works, and the other fails. Very, very strange.

BUT - here's where it gets TRULY weird. Keep this line in .irbrc, so
the offending module will be loaded. Next, in /usr/local/bin/irb,
immediately after the #! line, add this:

ARGV.clear

... so basically I'm nuking ANY command-line arguments at all. Watch
and be amazed:

$ irb
irb(main):0> puts ODBC.class
==> Module
irb(main):1> exit
$ irb a_nonsense_argument_that_will_get_cleared_anyway
.... Bus Error.

It basically looks like having ANYTHING as a command line argument,
even if IRB doesn't do anything with that data (remember, the ARGV
array gets wiped immediately so IRB.parse_opts doesn't do anything), we
still get a bus error.

You can replicate this effect without even clearing ARGV. Try this:

$ irb -I .

... it should crash too. I'm at the end of my tether, and can only see
this as being a bug in Mac OS X. However, I can't even find a reference
for memcmp in __CFInitialize (see
http://darwinsource.opendarwin.org/10.4/CF-368/Base.subproj/CFRuntime.c,
the Darwin source for Tiger). I am totally stumped.

Anyway, if ANYONE has got anything that might enlighten me, please
please let me know.

- James

Luc Heinrich

May 16, 2005, 7:53:17 AM
James Adam <james...@gmail.com> wrote:

> I just wanted to put this out there and see firstly if anyone has
> experienced the same problem, and secondly maybe someone has some ideas
> as to how to go about fixing it.

I've been tracking this down to the same level as you did, and just like
you I am stumped by the __CFInitialize stack trace (which ends up in
'bcmp' here, not in 'memcmp' btw).

I just hope that the 10.4.1 update will fix this; it's really annoying.

--
Luc Heinrich - luc...@mac.com

James Adam

May 16, 2005, 9:14:31 AM
I'm not sure it's going to have much to do with anything new in
10.4.1... just because I've seen this happen in Panther too, and it
wasn't fixed in any update or in Tiger. Still, I hope at least for some
enlightenment! It's the most bizarre bug/code interaction I've ever
come across in 10 years of programming.

- J

Luc Heinrich

May 17, 2005, 2:10:05 AM
James Adam <james...@gmail.com> wrote:

> I'm not sure it's going to have much to do with anything new in
> 10.4.1...

Right, I have just installed 10.4.1 (which is now available) and the crash
still occurs.

Luc Heinrich

May 17, 2005, 7:44:41 AM
James Adam <james...@gmail.com> wrote:

> I'm at the end of my tether, and can only see
> this as being a bug in Mac OS X. However, I can't even find a reference
> for memcmp in __CFInitialize (see
> http://darwinsource.opendarwin.org/10.4/CF-368/Base.subproj/CFRuntime.c,
> the Darwin source for Tiger).

All right, here's some more data.

As already shown, this happens when calling irb with a request to load a
compiled extension through the '-r' option. *Any* compiled extension. So
to track this problem down a little bit more I created a simple dummy
extension but linked with the debug version of the CoreFoundation
framework, to hopefully be able to have more info in the stack trace.

Here's the code of the dummy extension.

## [begin test.c] ##

#include <ruby.h>
void Init_test( void )
{
    rb_define_module( "Test" );
}

## [end test.c] ##

And here is how it is being built on my machine:

gcc -fno-common \
-pipe \
-I. \
-I/opt/local/include \
-I/opt/local/lib/ruby/1.8/powerpc-darwin8.0.0 \
-I/opt/local/lib/ruby/1.8/powerpc-darwin8.0.0 \
-O \
-c test.c

cc -dynamic \
-bundle \
-undefined suppress \
-flat_namespace \
-L/opt/local/lib \
-lruby \
-ldl \
-lobjc \
-o test.bundle \
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation_debug \
test.o

Now let's see what we get:

% irb
irb(main):001:0> require 'test'
2005-05-17 13:31:03.113 irb[2408] CFLog (0): Assertions enabled
=> true

Looks like it works and we have some debug messages coming from the
debug version of the CoreFoundation framework. Now let's see what the
crashing version gives us:

% irb -r test
./test.bundle: [BUG] Bus Error
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0.0]

As expected, but this time the stack trace is a little bit more explicit
(showing here only the relevant lines):

Thread 0 Crashed:
0 libSystem.B.dylib 0x900033c0 strcmp + 192
1 CoreFoundation_debug 0x004bbfc4 __CFInitialize+600 (CFRuntime.c:720)

Two things:
- The crashing memcmp/bcmp that we got before is actually an strcmp
(probably inlined in the non-debug version).
- This strcmp is called from CFRuntime.c *around* line 720.

"Around" line 720 and not "at" line 720 because it appears that the
CoreFoundation framework that ships with OS X is a little bit different
from the one for which we can have the sources.

Now, if we look at CFRuntime.c and find the strcmp closest to line 720
we find this block of code:

{
    CFIndex idx, cnt;
    char **args = *_NSGetArgv();
    cnt = *_NSGetArgc();
    for (idx = 1; idx < cnt - 1; idx++) {
        if (0 == strcmp(args[idx], "-AppleLanguages")) {
            CFIndex length = strlen(args[idx + 1]);
            __CFAppleLanguages = malloc(length + 1);
            memmove(__CFAppleLanguages, args[idx + 1], length + 1);
            break;
        }
    }
}

And not so surprisingly, this block is noodling with command line
arguments. What we need to know now is WHY exactly does this strcmp
crash.

James Edward Gray II

May 17, 2005, 8:58:31 AM
On May 17, 2005, at 1:10 AM, Luc Heinrich wrote:

> James Adam <james...@gmail.com> wrote:
>
>
>> I'm not sure it's going to have much to do with anything new in
>> 10.4.1...
>>
>
> Right, I have just installed 10.4.1 (which is now available) the crash
> still occurs.

Just FYI, my secret to OS X Ruby happiness was to build my own copy
in /usr/local/, then adjust my path so it comes up first. This gives
me Ruby exactly as I want it, I never have to fiddle with Apple's
version, upgrades don't break it, etc. To me, it's the only way to go.

James Edward Gray II

Jonathan Paisley

May 17, 2005, 9:06:02 AM
to luc...@mac.com
On Tue, 17 May 2005 14:44:41 +0200, Luc Heinrich wrote:

> CFIndex idx, cnt;
> char **args = *_NSGetArgv();
> cnt = *_NSGetArgc();
> for (idx = 1; idx < cnt - 1; idx++) {
> if (0 == strcmp(args[idx], "-AppleLanguages")) {
> CFIndex length = strlen(args[idx + 1]);
> __CFAppleLanguages = malloc(length + 1);
> memmove(__CFAppleLanguages, args[idx + 1], length + 1);
> break;
>
> And not so surprisingly, this block is noodling with command line
> arguments. What we need to know now is WHY exactly does this strcmp
> crash.

I've investigated this a bit more, after seeing your post which was very
useful.

The problem is that before the code above runs, the ruby interpreter has
changed the arguments array to contain just the name of the program. So,
for example, if you run something like this:

$ irb arg1 arg2

argv will contain, at startup, {"irb","arg1","arg2"}, with argc=3.
Assigning to $0 in ruby sets argv[0] to whatever you've set $0 to, and
sets the rest of the arguments to NULL (see set_arg0 in ruby.c).

So, by the time the CFRuntime code executes, argv looks like this:
{"irb",NULL,NULL}. Since the code doesn't check for potentially NULL
entries, it crashes.
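
To make the failure mode concrete, here is a minimal sketch in plain C. It
does not use the real CoreFoundation or Ruby sources: blank_tail_args and
find_language_arg are hypothetical names that only imitate what set_arg0 and
the __CFInitialize loop are described as doing above. The scan adds the NULL
check that the quoted CFRuntime block lacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Imitates the effect described for set_arg0: keep argv[0] and set
 * every later entry to NULL, leaving the argument count untouched.
 * (Hypothetical helper, not the code from ruby.c.) */
void blank_tail_args(int argc, char **argv)
{
    assert(argv != NULL);
    for (int i = 1; i < argc; i++)
        argv[i] = NULL;
}

/* Same loop shape as the quoted CFRuntime block, but it skips NULL
 * entries instead of handing them to strcmp. Returns the value that
 * follows "-AppleLanguages", or NULL if the flag is not present. */
const char *find_language_arg(int argc, char **argv)
{
    assert(argv != NULL);
    for (int idx = 1; idx < argc - 1; idx++) {
        if (argv[idx] == NULL)
            continue;   /* the guard the quoted loop does not have */
        if (strcmp(argv[idx], "-AppleLanguages") == 0)
            return argv[idx + 1];
    }
    return NULL;
}
```

With {"irb", NULL, NULL} and a count of 3, the guarded scan simply returns
NULL, whereas the unguarded loop would pass a NULL pointer to strcmp.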

The quick-fix is to just disable the assignment to $0 in irb.rb - the
first line of IRB.start.

I've filed a bug report with Apple, number 4121317.

Thanks,
Jonathan

Gavin Kistner

May 17, 2005, 9:35:40 AM
On May 17, 2005, at 6:58 AM, James Edward Gray II wrote:
> Just FYI, my secret to OS X Ruby happiness was to build my own copy
> in /usr/local/, then adjust my path so it comes up first. This
> gives me Ruby exactly as I want it, I never have to fiddle with
> Apple's version, upgrades don't break it, etc. To me, it's the
> only way to go.

Amen to that. I did the same, and life has been quite happy for a
long time since.

Luc Heinrich

May 17, 2005, 11:51:03 AM
James Edward Gray II <ja...@grayproductions.net> wrote:

> Just FYI, my secret to OS X Ruby happiness was to build my own copy
> in /usr/local/, then adjust my path so it comes up first. This gives
> me Ruby exactly as I want it, I never have to fiddle with Apple's
> version, upgrades don't break it, etc. To me, it's the only way to go.

Oh, we do agree on this, that's also what I do, although I have started
to use DarwinPorts since Tiger instead of manually building from the
source tarball.

Are you saying that the crash does *not* occur on your machine with your
custom built ruby ? That would surprise me after what was already found.

Luc Heinrich

May 17, 2005, 11:51:03 AM
Jonathan Paisley <jp-...@dcs.gla.ac.uk> wrote:

> The quick-fix is to just disable the assignment to $0 in irb.rb - the
> first line of IRB.start.
>
> I've filed a bug report with Apple, number 4121317.

Very interesting. So we now know exactly what happens and why it
happens. Cool :)

But whose bug is it, really ?

On one hand, I agree that checking for NULL values in argv should
probably be done in CFRuntime.c. But on the other hand, nullifying argv
entries in set_arg0 while keeping the argc value intact is really asking
for trouble, don't you think ? :)

James Edward Gray II

May 17, 2005, 12:05:04 PM
On May 17, 2005, at 10:55 AM, Luc Heinrich wrote:

> Are you saying that the crash does *not* occur on your machine with
> your
> custom built ruby ? That would surprise me after what was already
> found.

Oh no. Sorry if I implied that. I was just trying to be helpful in
a thread I haven't been following as closely as I should have before
posting. :)

James Edward Gray II

Jonathan Paisley

May 17, 2005, 12:21:36 PM
On Tue, 17 May 2005 18:51:03 +0200, Luc Heinrich wrote:

> Very interesting. So we now know exactly what happens and why it
> happens. Cool :)
>
> But whose bug is it, really ?
>
> On one hand, I agree that checking for NULL values in argv should
> probably be done in CFRuntime.c. But on the other hand, nullifying argv
> entries in set_arg0 while keeping the argc value intact is really asking
> for troubles don't you think ? :)

Fiddling with command line arguments like this is a common unix idiom for
changing the way the process appears in output from the 'ps' command, so
it's fair game for ruby to be doing this.

Unfortunately there's nothing you can do about the argc value - since C is
call-by-value there's no variable that can be updated to reflect the new
length of the argument list.
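
As a small illustration of the argc point, the sketch below contrasts a
function that receives the count by value with one that receives its
address. Both names are made up for this example and neither is taken from
ruby.c; the only claim is the general C behaviour that an int parameter is a
private copy while the pointer array is shared with the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Receives a copy of the count: the caller's variable is unaffected. */
void truncate_by_value(int argc, char **argv)
{
    assert(argv != NULL);
    for (int i = 1; i < argc; i++)
        argv[i] = NULL;   /* visible to the caller: the array is shared */
    argc = 1;             /* invisible to the caller: only the copy changes */
    (void)argc;
}

/* Receives the address of the count, so the update reaches the caller. */
void truncate_by_pointer(int *argc, char **argv)
{
    assert(argc != NULL && argv != NULL);
    for (int i = 1; i < *argc; i++)
        argv[i] = NULL;
    *argc = 1;
}
```

After truncate_by_value the caller still believes it has three arguments
even though two of them are now NULL, which is the mismatch discussed above.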

Since Apple are overloading the interpretation of arguments to command
line applications, I think it's up to them to make sure they do it
defensively.

nobu....@softhome.net

May 17, 2005, 1:40:57 PM
Hi,

At Tue, 17 May 2005 20:45:31 +0900,
Luc Heinrich wrote in [ruby-talk:142884]:


> Now, if we look at CFRuntime.c and find the strcmp closest to line 720
> we fine this block of code:
>
> {
> CFIndex idx, cnt;
> char **args = *_NSGetArgv();
> cnt = *_NSGetArgc();
> for (idx = 1; idx < cnt - 1; idx++) {
> if (0 == strcmp(args[idx], "-AppleLanguages")) {
> CFIndex length = strlen(args[idx + 1]);
> __CFAppleLanguages = malloc(length + 1);
> memmove(__CFAppleLanguages, args[idx + 1], length + 1);
> break;
> }
> }
> }

Is this function called each time a .so is loaded, and must
the area pointed to by _NSGetArgv() not be changed?

I suspect that system-dependent initialization of arguments
should be integrated.


Index: main.c
===================================================================
RCS file: /cvs/ruby/src/ruby/main.c,v
retrieving revision 1.13
diff -U2 -p -r1.13 main.c
--- main.c 23 Jun 2004 12:59:01 -0000 1.13
+++ main.c 17 May 2005 17:36:48 -0000
@@ -13,8 +13,4 @@
#include "ruby.h"

-#if defined(__MACOS__) && defined(__MWERKS__)
-#include <console.h>
-#endif
-
/* to link startup code with ObjC support */
#if (defined(__APPLE__) || defined(__NeXT__)) && defined(__MACH__)
@@ -27,11 +23,5 @@ main(argc, argv, envp)
char **argv, **envp;
{
-#ifdef _WIN32
- NtInitialize(&argc, &argv);
-#endif
-#if defined(__MACOS__) && defined(__MWERKS__)
- argc = ccommand(&argv);
-#endif
-
+ ruby_sysinit(&argc, &argv);
ruby_init();
ruby_options(argc, argv);
Index: ruby.c
===================================================================
RCS file: /cvs/ruby/src/ruby/ruby.c,v
retrieving revision 1.101
diff -U2 -p -r1.101 ruby.c
--- ruby.c 14 May 2005 02:48:07 -0000 1.101
+++ ruby.c 17 May 2005 17:24:44 -0000
@@ -1253,2 +1253,28 @@ ruby_process_options(argc, argv)
}
}
+
+#if defined(__MACOS__) && defined(__MWERKS__)
+#include <console.h>
+#endif
+
+#if defined(_WIN32)
+void
+ruby_sysinit(argc, argv)
+ int *argc;
+ char ***argv;
+{
+#if defined(__APPLE__) && (defined(__MACH__) || defined(__DARWIN__)) && !defined(__MacOS_X__)
+ if (*argv == *_NSGetArgv()) {
+ int i, n = *argc, len = 0;
+ char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);;
+ for (i = 0; i < n; ++i) {
+ *v2[i] = strdup(*v1[i]);
+ }
+ v2[i] = 0;
+ *_NSGetArgv() = v2;
+ }
+#elif defined(__MACOS__) && defined(__MWERKS__)
+ *argc = ccommand(argv);
+#endif
+}
+#endif
Index: ruby.h
===================================================================
RCS file: /cvs/ruby/src/ruby/ruby.h,v
retrieving revision 1.113
diff -U2 -p -r1.113 ruby.h
--- ruby.h 15 May 2005 09:56:49 -0000 1.113
+++ ruby.h 17 May 2005 17:13:11 -0000
@@ -561,4 +561,5 @@ NORETURN(void rb_throw _((const char*,VA
VALUE rb_require _((const char*));

+void ruby_sysinit _((int*, char***));
void ruby_init _((void));
void ruby_options _((int, char**));
Index: win32/win32.c
===================================================================
RCS file: /cvs/ruby/src/ruby/win32/win32.c,v
retrieving revision 1.151
diff -U2 -p -r1.151 win32.c
--- win32/win32.c 17 May 2005 02:50:42 -0000 1.151
+++ win32/win32.c 17 May 2005 17:26:00 -0000
@@ -418,5 +518,10 @@ void
NtInitialize(int *argc, char ***argv)
{
+ ruby_sysinit(argc, argv);
+}

+void
+ruby_sysinit(int *argc, char ***argv)
+{
WORD version;
int ret;

--
Nobu Nakada


Jonathan Paisley

May 17, 2005, 2:26:11 PM
On Wed, 18 May 2005 03:40:57 +0900, nobu.nokada wrote:


> Is this function called at each time when .so is loaded, and
> the area pointed by _NSGetArgv() shouldn't be changed?
>
> I suspect that system dependent initialization of arguments
> should be integrated.

The _NSGetArgv function would appear to be a private function (inferring
this from the leading underscore), so perhaps it'd be inappropriate to
use it in the ruby core. Having said that, it is declared in <crt_externs.h>...

Perhaps an alternative would be to modify set_arg0() to change the
non-first arguments to be empty C strings ("") rather than NULL?


More below:

> +#if defined(__APPLE__) && (defined(__MACH__) || defined(__DARWIN__)) && !defined(__MacOS_X__)

What is the purpose of the !defined(__MacOS_X__) ?

> + int i, n = *argc, len = 0;
> + char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);;
> + for (i = 0; i < n; ++i) {
> + *v2[i] = strdup(*v1[i]);

I think the above line should be:

v2[i] = strdup(v1[i]);

Thanks
Jonathan

nobu....@softhome.net

May 17, 2005, 2:42:47 PM
Hi,

At Wed, 18 May 2005 03:30:31 +0900,
Jonathan Paisley wrote in [ruby-talk:142938]:


> > Is this function called at each time when .so is loaded, and
> > the area pointed by _NSGetArgv() shouldn't be changed?
> >
> > I suspect that system dependent initialization of arguments
> > should be integrated.
>
> The _NSGetArgv function would appear to be an private function (inferring
> this from the leading underscore), so perhaps it'd be inappropriate to
> use it in the ruby core. Having said that, it is declared in <crt_externs.h>...
>
> Perhaps an alternative would be to modify set_arg0() to change the
> non-first argument to be empty C strings ("") rather than NULL?

"-AppleLanguages" can be disappeared?

> More below:
>
> > +#if defined(__APPLE__) && (defined(__MACH__) || defined(__DARWIN__)) && !defined(__MacOS_X__)
>
> What is the purpose of the !defined(__MacOS_X__) ?

Just copied from process.c.

> > + int i, n = *argc, len = 0;
> > + char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);;
> > + for (i = 0; i < n; ++i) {
> > + *v2[i] = strdup(*v1[i]);
>
> I think the above line should be:
>
> v2[i] = strdup(v1[i]);

Yes, of course.

What about this?

void
ruby_sysinit(argc, argv)
    int *argc;
    char ***argv;
{
#if defined(__APPLE__) && (defined(__MACH__) || defined(__DARWIN__))
    int i, n = *argc, len = 0;
    char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);
    MEMCPY(v2, v1, char*, n);
    v2[n] = 0;
    for (i = 0; i < n; ++i) {
        v1[i] = strdup(v1[i]);
    }
    *argv = v2;
#elif defined(__MACOS__) && defined(__MWERKS__)
    *argc = ccommand(argv);
#endif
}

--
Nobu Nakada


Jonathan Paisley

May 17, 2005, 5:37:06 PM
On Wed, 18 May 2005 04:42:47 +0900, nobu.nokada wrote:

> Hi,
>
> At Wed, 18 May 2005 03:30:31 +0900,
> Jonathan Paisley wrote in [ruby-talk:142938]:
>> > Is this function called at each time when .so is loaded, and
>> > the area pointed by _NSGetArgv() shouldn't be changed?
>> >
>> > I suspect that system dependent initialization of arguments
>> > should be integrated.
>>
>> The _NSGetArgv function would appear to be an private function (inferring
>> this from the leading underscore), so perhaps it'd be inappropriate to
>> use it in the ruby core. Having said that, it is declared in <crt_externs.h>...
>>
>> Perhaps an alternative would be to modify set_arg0() to change the
>> non-first argument to be empty C strings ("") rather than NULL?
>
> "-AppleLanguages" can be disappeared?

I'm not entirely sure what you mean here. The CFRuntime code removes -AppleLanguages
arguments from the command line, but the code that does it crashes if there are
NULLs present. We see the error if a ruby script calls set_arg0 before this happens.
set_arg0 sets argv[0], and sets argv[1 to argc-1] to NULL.

One of my suggestions was to use empty strings ("") instead of NULL
values in set_arg0. This would prevent the crash.
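
A rough sketch of that empty-string idea, in plain C and independent of the
real ruby.c: clear_tail_args is a hypothetical stand-in for the relevant part
of set_arg0, and count_matches plays the role of any later code that calls
strcmp on every entry without checking for NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shared empty string; cleared entries point here instead of being NULL. */
static char empty_arg[] = "";

/* Hypothetical variant of the tail-clearing step: every argument after
 * argv[0] becomes "" so later strcmp/strlen calls stay valid. */
void clear_tail_args(int argc, char **argv)
{
    assert(argv != NULL);
    for (int i = 1; i < argc; i++)
        argv[i] = empty_arg;
}

/* Counts entries equal to `wanted`, calling strcmp on each entry
 * unguarded, the way the quoted CFRuntime loop does. */
int count_matches(int argc, char **argv, const char *wanted)
{
    int hits = 0;
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], wanted) == 0)
            hits++;
    return hits;
}
```

The trade-off is that the arguments are still lost to anything that reads
them later; the change only keeps unguarded readers from dereferencing NULL.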

An alternative would be just to link ruby with the CoreFoundation framework. That way,
the CFRuntime initialisation code should get executed before ruby initialises, and
the problem goes away.


> What about this?
>
> void
> ruby_sysinit(argc, argv)
> int *argc;
> char ***argv;
> {
> #if defined(__APPLE__) && (defined(__MACH__) || defined(__DARWIN__))
> int i, n = *argc, len = 0;
> char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);
> MEMCPY(v2, v1, char*, n);
> v2[n] = 0;
> for (i = 0; i < n; ++i) {
> v1[i] = strdup(v1[i]);
> }
> *argv = v2;
> #elif defined(__MACOS__) && defined(__MWERKS__)
> *argc = ccommand(argv);
> #endif
> }

The problem with this is that it causes set_arg0 to not have the desired effect.

set_arg0 is designed to change the process name, as it appears in 'ps' output.
To do this, it's necessary to overwrite the existing argument strings.
Note: what I mean here is overwriting the character data in the existing
strings, and not changing string pointers in the argv array. 'ps' (or the
kernel's process info, depending on OS) reads directly from the argument
strings, not going through any pointers in an argv array.
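
The distinction between overwriting character data and swapping pointers can
be sketched as follows. overwrite_in_place is a hypothetical helper written
for this illustration, not the code from ruby.c, and whether 'ps' actually
reflects such a change depends on the operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Overwrites the character data of an existing argument buffer with
 * `name`, truncating to the buffer's capacity (`cap` bytes, including
 * the terminator) and zero-filling the remainder. The pointer itself
 * never changes, which is the property the 'ps' trick relies on.
 * Returns the number of characters written. */
size_t overwrite_in_place(char *buf, size_t cap, const char *name)
{
    assert(buf != NULL && name != NULL && cap > 0);
    size_t n = strlen(name);
    if (n > cap - 1)
        n = cap - 1;              /* never write past the original storage */
    memcpy(buf, name, n);
    memset(buf + n, 0, cap - n);  /* terminate and clear leftover bytes */
    return n;
}
```

Assigning a freshly allocated string to argv[0] would leave the original
bytes untouched, so anything reading the original storage would see no change.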

The upshot is that changing argv for ruby in the proposed ruby_sysinit
means that a later set_arg0 will not change the necessary kernel-known
arguments.

I still feel the solution is either to use empty strings rather than NULL
for filling the remaining argv pointers in set_arg0, or link ruby on OSX
to CoreFoundation by default. (I haven't tested whether the latter solves
the problem at hand - it should though).

Please let me know if I'm not making sense! :)

Thanks
Jonathan

Luc Heinrich

May 17, 2005, 4:05:29 PM
<nobu....@softhome.net> wrote:

> Is this function called at each time when .so is loaded, and
> the area pointed by _NSGetArgv() shouldn't be changed?

This block of code is taken from the __CFInitialize function in
CFRuntime.c and is "protected" from being executed multiple times by a
static at the beginning of the function.

The complete file is here if you want to have a look:
<http://darwinsource.opendarwin.org/10.4.1/CF-368.1/Base.subproj/CFRunti
me.c>

nobu....@softhome.net

May 17, 2005, 9:50:28 PM
Hi,

At Wed, 18 May 2005 06:40:31 +0900,
Jonathan Paisley wrote in [ruby-talk:142962]:


> I'm not entirely sure what you mean here. The CFRuntime code removes -AppleLanguages
> arguments from the command line, but the code that does it crashes if there are
> NULLs present. We see the error if a ruby script calls set_arg0 before this happens.
> set_arg0 sets argv[0], and sets argv[1 to argc-1] to NULL.

The code "around line 720" you posted doesn't remove
-AppleLanguages, but copies it to __CFAppleLanguages. So I
guess it might be used later if it was given.

> One of my suggestions was: in set_arg0 use empty strings ("") instead
> of NULL values in set_arg0. This would prevent the crash.
>
> An alternative would be just to link ruby with the CoreFoundation framework. That way,
> the CFRuntime initialisation code should get executed before ruby initialises, and
> the problem goes away.

Does the initialization run just once in the whole process?

> The problem with this is that it causes set_arg0 to not have the desired effect.

No, it duplicates the argument strings and leaves the duplicates in the
original argv (i.e., the one pointed to by *_NSGetArgv()), while the
original string area is pointed to by the newly allocated argv, which
is returned to the caller.

--
Nobu Nakada


Jonathan Paisley

May 18, 2005, 3:15:12 AM
On Wed, 18 May 2005 11:50:28 +0900, nobu.nokada wrote:


> At Wed, 18 May 2005 06:40:31 +0900,
> Jonathan Paisley wrote in [ruby-talk:142962]:
>> I'm not entirely sure what you mean here. The CFRuntime code removes -AppleLanguages
>> arguments from the command line, but the code that does it crashes if there are
>> NULLs present. We see the error if a ruby script calls set_arg0 before this happens.
>> set_arg0 sets argv[0], and sets argv[1 to argc-1] to NULL.
>
> The code "around line 720" you posted doesn't remove
> -AppleLanguages, but copies it to __CFAppleLanguages. So I
> guess it might be used later if it was given.

Sorry, you are right. I misread the CFRuntime code. It just copies the
argument to __CFAppleLanguages.

>
>> One of my suggestions was: in set_arg0 use empty strings ("") instead
>> of NULL values in set_arg0. This would prevent the crash.
>>
>> An alternative would be just to link ruby with the CoreFoundation framework. That way,
>> the CFRuntime initialisation code should get executed before ruby initialises, and
>> the problem goes away.
>
> Does the initialization run just once in the whole process?

Yes, just once. You can see if you look at the complete source:

http://darwinsource.opendarwin.org/10.4.1/CF-368.1/Base.subproj/CFRuntime.c

The whole __CFInitialize function is protected by a static 'done' variable.
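
For readers unfamiliar with the pattern, a run-once guard of this kind looks
roughly like the sketch below. It is a generic illustration with made-up
names (initialize_once, init_runs), not a copy of __CFInitialize, and it is
not thread-safe.

```c
#include <assert.h>

/* Number of times the guarded body has really run (for illustration). */
int init_runs = 0;

/* Runs its body only on the first call, using a static flag. */
void initialize_once(void)
{
    static int done = 0;
    if (done)
        return;
    done = 1;
    init_runs++;              /* stands in for the real one-time work */
    assert(init_runs == 1);   /* the body can never run a second time */
}
```

This is why the state of argv matters only at the moment the first
CoreFoundation-linked bundle is loaded: later loads skip the argument scan.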

>> The problem with this is that it causes set_arg0 to not have the desired effect.
>>
>> set_arg0 is designed to change the process name, as it appears in 'ps' output.
>> To do this, it's necessary to overwrite the existing argument strings.
>> Note: what I mean here is overwriting the character data in the existing
>> strings, and not changing string pointers in the argv array. 'ps' (or in
>> kernel process info, depending on OS) reads directly from the argument
>> strings, not going through any pointers in an argv array.
>>
>> The upshot is that changing argv for ruby in the proposed ruby_sysinit
>> means that a later set_arg0 will not change the necessary kernel-known
>> arguments.
>
> No, it duplicates the argument strings and remains them in the
> original argv (i.e., pointed from *_NSGetArgv()), but let the
> entire string area to be pointed by newly allocated argv, which
> is returned to the caller.

The effect you describe is right, but my point is that set_arg0 (defined
in ruby.c) needs to operate on the original argv in order to be able to
change the process name as reported by 'ps'. If set_arg0 sees strdup()-ed
strings, its changes will have no effect.

Nakada, Nobuyoshi

May 18, 2005, 4:44:38 AM
Hi,

At Wed, 18 May 2005 16:20:13 +0900,
Jonathan Paisley wrote in [ruby-talk:143017]:


> > No, it duplicates the argument strings and remains them in the
> > original argv (i.e., pointed from *_NSGetArgv()), but let the
> > entire string area to be pointed by newly allocated argv, which
> > is returned to the caller.
>
> The effect your describe is right, but my point is that set_arg0 (defined
> in ruby.c) needs to operate on the original argv in order to be able to
> change the process name as reported by 'ps'. If set_arg0 sees strdup()-ed
> strings, its changes will have no effect.

Caller's argv is changed but *argv isn't changed. Pointers pointed by
original argv are changed. The pointers are not important.

        +-----+-----+-----+-----+-----+
    ,-- |  0  |  1  |  2  |  3  |  4  |  <----- original argv
    |   +--|--+--|--+--|--+--|--+--|--+
    |      |     |     |     |     |
    |      V     V     V     V     V
  ,-|- "./argv" "a"   "b"   "c"   NULL   <----- original strings
  | |
~~|~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  | |
  | |  "./argv" "a"   "b"   "c"   NULL   <----- duplicated strings
  | |      A     A     A     A     A
  | |      |     |     |     |     |
  | |   +--|--+--|--+--|--+--|--+--|--+
  | `-> |  0  |  1  |  2  |  3  |  4  |  <----- original argv kept
  |     +-----+-----+-----+-----+-----+         in the runtime internal
  |
  `--> "./argv" "a"   "b"   "c"   NULL   <----- original strings
           A     A     A     A     A            in the system area
           |     |     |     |     |
        +--|--+--|--+--|--+--|--+--|--+
        |  0  |  1  |  2  |  3  |  4  |  <----- duplicated argv returned
        +-----+-----+-----+-----+-----+         to the caller.


$ cat argv.c
#include "ruby.h"

#undef xmalloc
#define xmalloc malloc

void
ruby_sysinit(argc, argv)
    int *argc;
    char ***argv;
{
    int i, n = *argc, len = 0;
    char **v1 = *argv, **v2 = ALLOC_N(char*, n + 1);
    MEMCPY(v2, v1, char*, n);
    v2[n] = 0;
    for (i = 0; i < n; ++i) {
        v1[i] = strdup(v1[i]);
    }
    *argv = v2;
}

void
dump_argv(name, argc, argv)
    char *name;
    int argc;
    char **argv;
{
    int i;
    printf("%s: argc=%d argv=%p\n", name, argc, argv);
    for (i = 0; i < argc; ++i) {
        char *arg = argv[i];
        printf("argv[%d] = %p \"%s\"\n", i, arg, arg);
    }
}

int main(argc, argv)
    int argc;
    char **argv;
{
    int origargc = argc;
    char **origargv = argv;
    dump_argv("before", argc, argv);
    ruby_sysinit(&argc, &argv);
    dump_argv("after", argc, argv);
    dump_argv("original", origargc, origargv);
    return 0;
}

$ make argv.o && gcc -o argv argv.o

$ ./argv a b c
before: argc=4 argv=0x61813d80
argv[0] = 0x61813dc8 "./argv"
argv[1] = 0x61813df0 "a"
argv[2] = 0x61813e08 "b"
argv[3] = 0x61813e20 "c"
after: argc=4 argv=0xa050008
argv[0] = 0x61813dc8 "./argv"
argv[1] = 0x61813df0 "a"
argv[2] = 0x61813e08 "b"
argv[3] = 0x61813e20 "c"
original: argc=4 argv=0x61813d80
argv[0] = 0xa0505c0 "./argv"
argv[1] = 0xa0505d0 "a"
argv[2] = 0xa0505e0 "b"
argv[3] = 0xa0505f0 "c"

--
Nobu Nakada


Jonathan Paisley

May 18, 2005, 6:02:33 AM
On Wed, 18 May 2005 18:44:38 +0900, Nakada, Nobuyoshi wrote:

> Caller's argv is changed but *argv isn't changed. Pointers pointed by
> original argv are changed. The pointers are not important.

You are right. I was getting mixed up. Sorry! This solution looks like it
should work. Ruby in set_arg0 is no longer stomping on the argv array that
CFRuntime will look at, but it's still looking at the same system area
strings.
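
To check that reasoning, here is a small standalone sketch of the same swap
using only the C standard library. split_argv is a hypothetical name and the
code is an approximation of the proposed ruby_sysinit, not the patch itself:
malloc and memcpy stand in for ALLOC_N, MEMCPY and strdup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* On return, the original pointer array `shared` holds private copies
 * of the strings, and the returned array holds the original string
 * pointers. Returns NULL if the new array cannot be allocated. */
char **split_argv(int argc, char **shared)
{
    assert(argc >= 0 && shared != NULL);
    char **mine = malloc(((size_t)argc + 1) * sizeof *mine);
    if (mine == NULL)
        return NULL;
    memcpy(mine, shared, (size_t)argc * sizeof *mine);
    mine[argc] = NULL;
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(mine[i]) + 1;
        char *copy = malloc(len);
        if (copy != NULL) {       /* on failure keep the original pointer */
            memcpy(copy, mine[i], len);
            shared[i] = copy;
        }
    }
    return mine;
}
```

Setting entries of the returned array to NULL, or overwriting the bytes
behind its first entry, leaves the array that the runtime kept fully
populated with valid strings, which matches the diagram.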

Thanks for the delightful diagram! Did you make that by hand, or was there
some tool involved?

Jonathan

nobu....@softhome.net

May 18, 2005, 11:10:37 AM
Hi,

At Wed, 18 May 2005 19:05:13 +0900,
Jonathan Paisley wrote in [ruby-talk:143027]:


> Thanks for the delightful diagram! Did you make that by hand, or was there
> some tool involved?

By hand.

--
Nobu Nakada

