Reading the symbol table of the currently running executable

Clifford Neuman

Aug 30, 1989, 4:22:32 PM
Does anyone know how to read the symbol table of a program from within
that program itself? More precisely, from within a procedure in a
library which was used in linking the executable. The simplest way is
to read the symbol table from the executable. Unfortunately, I might
not know the name of the executable. I can solve my problem in any of
several ways, and I would appreciate suggested solutions to any of
these problems.

1) Directly reading the symbol table from within the running program

2) Obtaining the full path name of the presently running executable.
Remember, argv[0] will not contain the full path if the executable was
found through the search path. Also, the program may have been
started by execl with an argv[0] that is unrelated to the file name.

3) Obtaining a file descriptor for the currently running executable
(i.e. I don't need the name as long as I can read it).

Please respond to me directly since I do not normally read this list.
I will summarize any personal responses I receive. Thanks,

~ Cliff

PS: I am presently using a Vax running Ultrix 3.0, but I
will ultimately need to do this under other versions of
Unix as well.

David Goodenough

Sep 1, 1989, 1:14:25 PM
b...@cs.washington.edu (Clifford Neuman) asks:

> Does anyone know how to read the symbol table of a program from within
> that program itself? More precisely, from within a procedure in a
> library which was used in linking the executable. The simplest way is
> to read the symbol table from the executable. Unfortunately, I might
> not know the name of the executable. I can solve my problem in any of
> several ways, and I would appreciate suggested solutions to any of
> these problems.

This is fairly grotesque, but it might just work:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int cpid;
char pidbuf[16];

sprintf(pidbuf, "%d", (int) getpid());
if ((cpid = fork()) == -1)
{
    perror("fork");         /* the fork failed */
}
else if (cpid == 0)
{
    /* child process */
    sleep(1);               /* snooze a while to make sure parent is
                             * in the wait */
    execl("/usr/ucb/gcore", "gcore", pidbuf, (char *) 0);
    perror("execl");        /* only reached if the execl failed */
    _exit(1);
}
else
    wait((int *) 0);        /* wait for child to do its bit */

/* now you have a core image in file core.<pid>. Take nm to it,
 * open it and do nlist on it, whatever */

It does assume you have gcore, which living in /usr/ucb may be a beserkley
enhancement. Still, it is possible to achieve the equivalent by opening
/dev/mem (you can set this to run effective uid 0 right :-) ), seeking
and reading, it's just a bit of an art to know where to go, and how much
to grab.
--
d...@lakart.UUCP - David Goodenough +---+
IHS | +-+-+
....... !harvard!xait!lakart!dg +-+-+ |
AKA: dg%lakar...@xait.xerox.com +---+

Conor P. Cahill

Sep 3, 1989, 11:36:57 AM
In article <6...@lakart.UUCP>, d...@lakart.UUCP (David Goodenough) writes:
> b...@cs.washington.edu (Clifford Neuman) asks:
> > [question about how to examine the symbol table at run time]
>
> This is fairly grotesque, but it might just work:
>
> [ sample of using vfork/gcore deleted]
>
> It does assume you have gcore, which living in /usr/ucb may be a beserkley
> enhancement. Still, it is possible to achieve the equivalent by opening
> /dev/mem (you can set this to run effective uid 0 right :-) ), seeking
> and reading, it's just a bit of an art to know where to go, and how much
> to grab.

Using gcore to generate a core image, or trying to read /dev/kmem won't
work because the executable image of a program at run time does not
include the symbol table. Why would the system choose to load a large,
totally useless (as far as execution is concerned) portion of the executable
file? There is no reason to, and it does not happen (disclaimer: on any system
that I have seen; I guess there could be some exception).

Did you ever happen to note that you need an unstripped version of
the program to examine a core file? That is because the core file
does not have the symbol table; it is in the unstripped executable.

The only place that information is available is in the disk copy
of the program's executable, so if you need that information, you
need to have read access to the executable.
--
+-----------------------------------------------------------------------+
| Conor P. Cahill uunet!virtech!cpcahil 703-430-9247 !
| Virtual Technologies Inc., P. O. Box 876, Sterling, VA 22170 |
+-----------------------------------------------------------------------+

Adam R de Boor

Sep 3, 1989, 7:02:35 PM

Wrong. The core file doesn't contain the symbol table. The way I've always
handled things of this nature is to use argv[0] and look for an executable
file of the same name along $PATH. Useful also for having a configuration
file that resides in the same directory as the executable, but they
both can be wherever one wants them to be.

True, this plan does have a few flaws (you have to have argv[0]), but you're
not going to get anywhere via /dev/mem short of locating the text-table entry
for the process in the kernel, reading the inode structure to obtain the
inode number, then looking through the filesystem (or $PATH again :) for a file
with that number...

a

Mitch Bunnell

Sep 5, 1989, 1:17:36 PM
In article <91...@june.cs.washington.edu> b...@cs.washington.edu (Clifford Neuman) writes:
>
> 1) Directly reading the symbol table from within the running program
>
> 2) Obtaining the full path name of the presently running executable.
>
> 3) Obtaining a file descriptor for the currently running executable


1 - Not possible. The symbol table is NOT loaded with the program.

2 - Not possible.

3 - Not possible.

SORRY

---------------

Mitch Bunnell - Lynx Real-time systems
mitch@lynx. voice - (408) 370-2233


Robot: "They will never believe that it is the real amulet."

Dr. Smith: "Of course they will. These naive aliens will believe anything."

David Barts

Sep 7, 1989, 11:49:51 AM
In article <61...@lynx.UUCP>, mi...@lynx.uucp (Mitch Bunnell) writes:
> In article <91...@june.cs.washington.edu> b...@cs.washington.edu (Clifford Neuman) writes:
> >
> > 1) Directly reading the symbol table from within the running program
> >
> > 2) Obtaining the full path name of the presently running executable.
> >
> > 3) Obtaining a file descriptor for the currently running executable
>
>
> 1 - Not possible. The symbol table is NOT loaded with the program.
>
> 2 - Not possible.

True, it's not possible to do this in a fail-safe manner guaranteed
to work all the time. But you can usually figure it out by reading
argv[0] and PATH and applying the same rules the shell does in
determining the location of an executable. Should work if the
program was run from the shell by its true name, and you do this
before modifying PATH or calling chdir(2) or chroot(2).

If the program was invoked via an alias, you may or may not have
trouble depending on how the shell you use handles aliases (they are
not all the same!). All bets are off if the program has been exec'ed
with argv[0] != program_name (often the case if the executable is a
shell or was exec'ed by login(1)).
--
David Barts Pacer Corporation
dav...@pacer.uucp ...!fluke!pacer!davidb
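
A minimal sketch of the lookup described above, assuming a POSIX-like system with access(2) and stat(2). The name find_in_path and its parameters are hypothetical and do not come from any post in this thread. It only mimics the usual shell rules, so it shares every limitation already listed: a misleading argv[0], a changed PATH, or a changed working directory will defeat it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* True if `p` names a regular file that the caller may execute. */
static int is_executable_file(const char *p)
{
    struct stat st;

    return stat(p, &st) == 0 && S_ISREG(st.st_mode) && access(p, X_OK) == 0;
}

/*
 * Hypothetical helper: locate `name` the way a shell would.
 * A name containing '/' is used as given; otherwise each directory in
 * the colon-separated `path` is tried in order (an empty entry means ".").
 * Returns 0 and fills `out` on success, -1 otherwise (`out` is then
 * unspecified).
 */
int find_in_path(const char *name, const char *path, char *out, size_t outlen)
{
    const char *start, *end;

    assert(out != NULL && outlen > 0);
    if (name == NULL || *name == '\0')
        return -1;

    if (strchr(name, '/') != NULL) {        /* absolute or relative name */
        if (strlen(name) >= outlen || !is_executable_file(name))
            return -1;
        strcpy(out, name);
        return 0;
    }

    if (path == NULL || *path == '\0')
        path = ".";                         /* no PATH: try the cwd only */

    for (start = path; ; start = end + 1) {
        size_t dlen;
        int n;

        end = strchr(start, ':');
        dlen = end ? (size_t)(end - start) : strlen(start);
        if (dlen == 0)                      /* empty entry means "." */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dlen, start, name);

        if (n > 0 && (size_t)n < outlen && is_executable_file(out))
            return 0;                       /* first match wins */
        if (end == NULL)
            break;                          /* last entry was tried */
    }
    return -1;
}
```

It would typically be called early in main(), before any chdir(2), as find_in_path(argv[0], getenv("PATH"), buf, sizeof buf). Unlike a search that writes NUL bytes into the PATH string, this one leaves the environment untouched.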

Greg Limes

Sep 7, 1989, 6:31:03 PM
In article <61...@lynx.UUCP> mi...@lynx.uucp (Mitch Bunnell) writes:

> In article <91...@june.cs.washington.edu> b...@cs.washington.edu (Clifford Neuman) writes:
> > 2) Obtaining the full path name of the presently running executable.

> 2 - Not possible.

Back before I knew this was impossible, I wrote the following piece of
support code. It has been doing the impossible for me for quite some
time (geez, has it been that long?) with limitations as stated.

/*
* findx package 25may88 li...@sun.com
*
* Over the last few days (weeks?) there has been some traffic about how to
* tell where a running program came from. Well, there is a way to find
* out without changing the shell, the kernel, C language startup
* conventions, or whatever.
*
* Anyway, here is the basic idea, presented as a package that should compile
* and run without too many problems.
*
* WHAT IT DOES First, it locates the path to the executable that was used by
* the exec() that started this process. If the command name starts with a
* "/", it must be taken literally; if it contains a "/", then it is
* always relative to the current working directory at the start of the
* program; otherwise, we have to chase across the PATH value in the
* environment. If there is no PATH, or the PATH is empty, check the
* current working directory.
*
* On systems with symbolic links, we are not through yet. The purpose is to
* locate the directory it is in, so we can get at any related data files.
* So, we chase symbolic links until we have the real path name of the
* final resolution file.
*
* SECURITY This can be spoofed easily by making a hard link to, or a copy
* of, the executable. If you want your program to be sure that it has
* found the one true installation location, you will have to verify that
* for yourself. findx() just locates the most likely candidate.
*
* PORTABILITY This was developed on a Sun3 running SunOS 4.0, but I think I
* at least made the algorithm portable. You may need to mess with include
* files and such. Symbolic link searching is turned on if your errno.h
* supplies ELOOP, and off otherwise; I assume that all systems with
* symlinks have a readlink() call.
*
* HOW TO USE IT Here is the definition of the various parameters. Further
* down you will find an example main, so fear not ...
*
* findx (cmd, cwd, dir, pgm, run, path)
*
* cmd pass the command name (argv[0]) here. findx() knows how to handle
* just about anything. If it starts with /, then we use the absolute
* name, and ignore the path. If it contains a /, then use the relative
* name and ignore the path. Otherwise, look for the file in each
* directory named in the path for the file; if there is no path, pretend
* it's "." like execvp does.
*
* cwd pass a big buffer here. if this begins with a slash, I will assume
* it is filled in with the current working directory; otherwise, I will
* fill it in using getcwd(). Should be at least MAXPATHLEN bytes, if you
* do not fill it in yourself.
*
* dir pass a big buffer here. this gets the full path name of the
* directory that the executable was read from. Should be at least
* MAXPATHLEN bytes.
*
* pgm pass THE ADDRESS of a pointer variable here. findx() will fill the
* pointer variable with a pointer to the final component of the string
* passed as cmd above. Send a (char **)0 if you don't care about this.
*
* run pass THE ADDRESS of a pointer variable here. findx() will fill the
* pointer variable with a pointer to the final component of the name of
* the running program. Send a (char **)0 if you don't care about this.
*
* path pass the user's PATH variable here. I made it a parameter so you
* can fiddle with the path first. If you do not want to fiddle, pass
* getenv("PATH").
*
* RETURN VALUES: Normally, findx() will return zero if all is well. If
* something goes wrong, it will return -1 with the global variable
* "errno" set to a corresponding error number.
*/

#include <stdio.h>
#include <strings.h>
#include <errno.h>
#include <sys/param.h>

#define X_OK 1

#ifndef MAXPATHLEN
#define MAXPATHLEN 1024
#endif

#ifndef ENAMETOOLONG
#define ENAMETOOLONG EINVAL
#endif

int findx (); /* get location of directory */
int resolve (); /* get link resolution name */

#ifdef TESTMAIN
extern char *getenv ();         /* read value from environment */
char *pn = (char *) 0;          /* program name */
char *rn = (char *) 0;          /* run name */
char rd[MAXPATHLEN];            /* run directory */
char wd[MAXPATHLEN] = ".";      /* working directory */

int
main (argc, argv)
    int argc;
    char **argv;
{
    if (findx (*argv, wd, rd, &pn, &rn, getenv ("PATH")) < 0) {
        perror (*argv);         /* rn is not set when the search fails */
        return 1;
    }
    printf ("%s: %s running in %s from %s\n", pn, rn, wd, rd);
    return 0;
}

#endif

/*-
* findx - find executable file in PATH
* PARAMETERS:
* cmd filename as typed by user
* cwd where to return working directory
* dir where to return program's directory
* pgm where to return what user called it
* run where to return final resolution name
* path user's path from environment
* RETURNS: returns zero for success, -1 for error (with errno set properly).
*/
int
findx (cmd, cwd, dir, pgm, run, path)
    char *cmd;
    char *cwd;
    char *dir;
    char **pgm;
    char **run;
    char *path;
{
    int rv = 0;
    char *f, *s;

    if (!cmd || !*cmd || !cwd || !dir) {
        errno = EINVAL;         /* stupid arguments! */
        return -1;
    }
    if (!path || !*path)        /* missing or null path */
        path = ".";             /* assume sanity */

    if (*cwd != '/')
        if (!(getcwd (cwd, MAXPATHLEN)))
            return -1;          /* can't get working directory */

    f = rindex (cmd, '/');
    if (pgm)                    /* user wants program name */
        *pgm = f ? f + 1 : cmd;

    if (dir) {                  /* user wants program directory */
        rv = -1;
        if (*cmd == '/')        /* absname given */
            rv = resolve ("", cmd + 1, dir, run);
        else if (f)             /* relname given */
            rv = resolve (cwd, cmd, dir, run);
        else if ((f = path) != 0) {     /* from searchpath */
            rv = -1;
            errno = ENOENT;     /* errno gets this if path empty */
            while (*f && (rv < 0)) {
                char ebuf[MAXPATHLEN];          /* one PATH element */
                char abuf[2 * MAXPATHLEN + 2];  /* cwd + "/" + element */
                int n;

                s = f;
                while (*f && (*f != ':'))
                    ++f;
                n = f - s;
                if (*f)
                    ++f;        /* step over ':' without modifying PATH */
                if (n >= MAXPATHLEN)
                    continue;   /* element too long to be usable */
                strncpy (ebuf, s, n);
                ebuf[n] = 0;
                if (*ebuf == '/')
                    rv = resolve (ebuf, cmd, dir, run);
                else {
                    sprintf (abuf, "%s/%s", cwd, ebuf);
                    rv = resolve (abuf, cmd, dir, run);
                }
            }
        }
    }
    return rv;
}

/*
* resolve - check for specified file in specified directory sets up
* dir, following symlinks. returns zero for success, or -1 for error
* (with errno set properly)
*/
int
resolve (indir, cmd, dir, run)
    char *indir;                /* search directory */
    char *cmd;                  /* search for name */
    char *dir;                  /* directory buffer */
    char **run;                 /* resolution name ptr ptr */
{
    char *p;
    int rv = -1;

#ifdef ELOOP
    int lcc = 0;
    int sll;
    char symlink[MAXPATHLEN + 1];

#endif

    do {
        errno = ENAMETOOLONG;
        if (strlen (indir) + strlen (cmd) + 2 > MAXPATHLEN)
            break;

        sprintf (dir, "%s/%s", indir, cmd);
        if (access (dir, X_OK) < 0)
            break;              /* not an executable program */

#ifdef ELOOP
        while ((sll = readlink (dir, symlink, MAXPATHLEN)) >= 0) {
            symlink[sll] = 0;
            if (*symlink == '/')
                strcpy (dir, symlink);
            else
                sprintf (rindex (dir, '/'), "/%s", symlink);
        }
        if (errno != EINVAL)    /* EINVAL means "not a symlink": done */
            break;
#endif

        p = rindex (dir, '/');
        *p++ = 0;
        if (run)                /* user wants resolution name */
            *run = p;
        rv = 0;                 /* complete, with success! */

    } while (0);

    return rv;
}

--
-- Greg Limes li...@sun.com ...!sun!limes 73327,2473 [choose one]

Mark Rosenthal

Sep 8, 1989, 5:11:55 PM
In article <LIMES.89S...@ouroborous.wseng.sun.com> li...@sun.com (Greg Limes) writes:
>In article <61...@lynx.UUCP> mi...@lynx.uucp (Mitch Bunnell) writes:
>
>> In article <91...@june.cs.washington.edu> b...@cs.washington.edu (Clifford Neuman) writes:
>> > 2) Obtaining the full path name of the presently running executable.
>
>> 2 - Not possible.
>
>Back before I knew this was impossible, I wrote the following piece of
>support code. It has been doing the impossible for me for quite some
>time (geez, has it been that long?) with limitations as stated.

Your approach works if the program was exec'd by a reasonably well-behaved
program. 'sh' and 'csh' fall into this category. Unfortunately, your code
(or anybody else's) fails to solve the general case. Your code depends on the
value in argv. Let's say you call your findx() inside a program called
yourprog in /usr/bin. If you invoke it from either 'sh' or 'csh' with either a
full or partial pathname, it should come up with the right directory. But try
invoking your code by running the following:

main()
{
    execl("/usr/bin/yourprog", "garbagename", "arg1", "arg2", (char *) 0);
}

The point is that the value in argv[0] is not necessarily guaranteed to have
anything to do with the name of the file the program resided in.

This is not merely hypothetical. There is code in vi to execute a command from
within vi. Have you checked it to see if it calls system() or parses the
command itself, and does its own fork() and exec()? What about emacs? In order
to be sure you can count on argv[0], you would have to have checked every single
Unix utility that calls any version of exec() to make sure it passes the right
thing for argv[0].

More bad news. You can't depend on the value of environment variables like
PATH. Your program could have been invoked with execve() or execle(), in
which case the PATH variable your program sees has no necessary relationship
to the PATH variable used to find your program in order to exec() it.

Sorry, but it really is not possible using information legally available to
the program. It might be possible by examining tables in kernel memory (if
your program has privileges to do this), but I'm not certain of that.
--
Mark of the Valley of Roses
...!bbn.com!aoa!mbr

Dan McCue

Sep 14, 1989, 9:45:46 AM

Recent discussions on the net about a program's inability to find out
the name of the running executable point up a deficiency in the
OS interface of UNIX(s). Of course, on systems that support dynamic
loading, this problem generalizes to finding out the name of
ALL of the executable images in an address space (and where they are
loaded). The only UNIX operating system I am aware of that
provides this service (reliably, in user mode) is Apollo's Domain/OS.

Two questions:

Are there other systems that provide a "system service" for finding
the (path)name/load map of the running executable(s)?

Are any of the UNIX standardization bodies either official (e.g.,
POSIX, X/OPEN) or unofficial (e.g., OSF, Unix International) working
on this problem?

Greg Limes

Sep 14, 1989, 9:42:07 PM
In article <9...@aoa.UUCP> m...@aoa.UUCP (Mark Rosenthal) writes:
> In article <LIMES.89S...@ouroborous.wseng.sun.com> li...@sun.com (Greg Limes) writes:
> >In article <61...@lynx.UUCP> mi...@lynx.uucp (Mitch Bunnell) writes:
> >
> >> In article <91...@june.cs.washington.edu> b...@cs.washington.edu (Clifford Neuman) writes:
> >> > 2) Obtaining the full path name of the presently running executable.
> >
> >> 2 - Not possible.
> >
> >Back before I knew this was impossible, I wrote the following piece of
> >support code. It has been doing the impossible for me for quite some
> >time (geez, has it been that long?) with limitations as stated.
>
> Your approach works if the program was exec'd by a reasonably well-behaved
> program. 'sh' and 'csh' fall into this category. Unfortunately, your code
> (or anybody else's) fails to solve the general case.

After receiving several pieces of mail -- as usual with postings of
findx(), generally ranging from flames to thanks -- I finally went
back and re-read the disclaimers at the top.

It appears that I nuked the section that described in detail the
assumptions that findx() made and the ways that these assumptions
might break down, and some specific limits on the utility of the
routine based on those assumptions.

Boiling it all down: findx() assumes you have a list of candidate
directories in the form of a normal $PATH, and the name of a file to
be found within those directories.

If your application keeps a name or list of names under which it can
be installed, it can dispense with looking in argv[0] for the name;
similarly, if your application keeps a directory or list of
directories under which it could be installed you can dispense with
querying the $PATH variable.

As Mark notes, there is no way of getting the *one*true*name* of the
binary out of the kernel, and even if you could (which he does not
note), you can still be in a position where the binary has been
replaced by something else (consider a binary that has been updated
with a new version).

Clifford Neuman

Sep 17, 1989, 9:03:49 PM
Thanks for all the responses to my question on reading the symbol
table from a running program. As promised, here is a summary of
responses.

Approach one: Directly reading the symbol table from within the
running program.

A number of people suggested including the symbol table as an integral
part of the program. This is best accomplished by declaring a large
static and initialized data area within the program. Once the program
is linked, the executable can then be postprocessed to copy the symbol
table into the data area. This approach has the added advantage that even
if someone later strips the executable, the copied symbol table
remains.
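
One way the reserved area might look, as a rough sketch only: the marker string, the size, and the names symarea and find_symarea are all hypothetical, and no respondent's actual code is reproduced here. The program declares an initialized array that starts with a recognizable marker; the post-processing step scans the bytes of the linked executable for that marker to learn where the symbol table should be copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SYMAREA_MAGIC "@@SYMTAB-AREA@@"   /* hypothetical marker string */
#define SYMAREA_SIZE  4096                /* hypothetical reserved size */

/* Initialized, so it is stored in the executable file's data segment
 * rather than in bss, and large enough for the copied symbol table. */
char symarea[SYMAREA_SIZE] = SYMAREA_MAGIC;

/*
 * Post-processing step: given the bytes of the linked executable,
 * return the file offset where the reserved area begins, or -1 if the
 * marker is not present. In practice this would live in a separate
 * tool, so that the tool's own copy of the marker is not in the image.
 */
long find_symarea(const unsigned char *image, size_t len)
{
    size_t mlen = strlen(SYMAREA_MAGIC);
    size_t i;

    assert(image != NULL);
    for (i = 0; i + mlen <= len; i++)
        if (memcmp(image + i, SYMAREA_MAGIC, mlen) == 0)
            return (long) i;
    return -1;
}
```

The tool would then overwrite up to SYMAREA_SIZE bytes at the returned offset, and the running program reads its symbols straight out of symarea.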

If it is known ahead of time which symbols will be needed, then you
can compile your own symbol table into the program, and do away with
the postprocessing entirely. In my case, I am doing dynamic linking,
but it was acceptable for me to restrict the procedures that should be
callable. This is the approach I am using.
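
A minimal sketch of such a compiled-in table, assuming every callable procedure has the same signature. The procedures twice and negate, the struct layout, and lookup_symbol are hypothetical illustrations, not the code actually used with the dynamic linking package.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*callable_fn)(int);

/* Hypothetical procedures that dynamically linked code may call. */
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

struct symbol {
    const char *name;           /* name as the loaded code refers to it */
    callable_fn addr;           /* address inside the running program */
};

/* The hand-built symbol table: only the procedures listed here are
 * callable, which is the restriction described above. */
static const struct symbol symtab[] = {
    { "negate", negate },
    { "twice",  twice  },
};

/* Return the address registered under `name`, or NULL if it is not
 * in the table. */
callable_fn lookup_symbol(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].addr;
    return NULL;
}
```

Because the table is ordinary initialized data, it survives strip(1) and needs neither the executable's name nor read access to the file.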

Approach two: Obtaining the full path name of the presently running
executable. Remember, argv[0] will not contain the full path if the
executable was found through the search path. Also, the program may
have been started by execl with an argv[0] that is unrelated to the
file name.

There were basically two ways of approaching the problem from this
angle. In the simplest approach, one would create a wrapper script
which would call the desired program with the full path name, and the
user would call the wrapper. This assumed, of course, that the user never
called the actual program directly. It also required that the wrapper
know where the program is installed.
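
The wrapper need not be a script. A small C sketch of the same idea follows, where exec_with_full_path and the install path in the usage note are hypothetical. It overwrites argv[0] with the full installed path before calling execv(2), so the real program can trust its argv[0].

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical wrapper core: make argv[0] the full installed path and
 * exec the real program with the remaining arguments unchanged.
 * Returns only if execv fails (-1, with errno set by execv).
 */
int exec_with_full_path(const char *installed_path, char **argv)
{
    assert(installed_path != NULL && argv != NULL);
    argv[0] = (char *) installed_path;  /* real program sees its full path */
    return execv(installed_path, argv);
}
```

The wrapper's main(argc, argv) would consist of little more than a call such as exec_with_full_path("/usr/local/lib/realprog", argv) followed by an error message if it returns. The same caveats apply as for a script: the path is fixed at build time, and nothing stops a user from running the real program directly.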

In the second approach, one searched the path to find the executable
with the name found in argv[0]. This approach makes a few
assumptions, but in most situations, it would work. The code to do
the search was posted to this newsgroup.

One response pointed out that ksh sets a variable $_ that refers
to the full path of the file to be executed, and suggested looking
at it. Another suggested looking at the "-c" option to ps.

Approach three: Obtaining a file descriptor for the currently running
executable (i.e. I don't need the name as long as I can read it).

The answers along these lines involved adding a new device
(/dev/text) with an open routine that creates a file descriptor
pointing at the inode hidden in the text structure.

Again, thanks to everyone who responded. Responses were received from
Bill Griswold (who also implemented, and made the necessary changes to
the dynamic linking package I am using), Bill Sommerfeld, Chris Torek,
Oliver Laumann, Rich Salz, Conor P. Cahill, David Barts, Adam R de
Boor, David Goodenough, Guy Harris, Gordon Burditt, Greg Limes, Mark
Rosenthal, Dan McCue, Perry Hutchison, siswat!bu...@gazette.bcm.tmc.edu,
and vsi!fri...@uunet.uu.net.

~ Cliff

Diomidis Spinellis

Sep 21, 1989, 6:09:36 AM
In article <1989Sep14....@newcastle.ac.uk> mc...@turing.newcastle.ac.uk (Dan McCue) writes:
[...]

> Are there other systems that provide a "system service" for finding
> the (path)name/load map of the running executable(s)?

The 8th Research Unix Edition provides a method of obtaining a file descriptor
to the text file of a running executable. The idea is to open the image of
the process in the process file system /proc and send the appropriate ioctl.

#include <stdio.h>
#include <sys/proc.h>
#include <sys/pioctl.h>

/*
 * Return a read only file descriptor to the text file of an executable
 * given its process id. Returns -1 on failure. (Not tested).
 */
int
pid2fd(pid)
    int pid;
{
    static char fname[1024];
    long result;
    int fd;

    sprintf(fname, "/proc/%d", pid);
    if ((fd = open(fname, 0)) == -1)    /* mode 0: read only */
        return -1;
    if (ioctl(fd, PIOCOPENT, &result) == -1) {
        close(fd);              /* don't leak the /proc descriptor */
        return -1;
    }
    close(fd);                  /* only the text file descriptor is needed */
    return (int)result;
}

Diomidis
--
Diomidis Spinellis European Computer-Industry Research Centre (ECRC)
Arabellastrasse 17, D-8000 Muenchen 81, West Germany +49 (89) 92699199
USA: diomidis%ecrcva...@pyramid.pyramid.com ...!pyramid!ecrcvax!diomidis
Europe: diom...@ecrcvax.uucp ...!unido!ecrcvax!diomidis
