I was told by a friend that a tool which could add an rpath to
an already linked binary would help a lot of people.
So I wrote a small utility (in C) to do it, and it works quite well,
apart from a very crude interface.
The one big problem is that, to do things properly, it moves the
program header table, which triggers a bug in the Linux kernel
(2.6.21.1 has it)(*). I have a kernel patch which corrects
the problem(#). There is also a compile-time option that makes the
binaries it produces work on an unpatched kernel, but that does not work
for PIE, for a reason I don't understand (on patched kernels it does
work). Maybe it is a bug in ld.so.
You will find a page which explains a bit how it works, and which
provides a link to the source.
http://www.eleves.ens.fr/home/godfroy/addrpathen.html
I would be happy to receive any comments on this (or on the kernel
patch).
Thanks.
(*) And it's not the only software for which the produced binaries trigger
bugs. libBFD seems to make dirty assumptions about ELF binaries,
as does eu-strip.
(#) I have submitted it to linux-kernel, but it seems to have been
forgotten. I will resubmit it when I have received enough reviews of it.
--
Quentin Godfroy
> Hi all,
>
> I was told by a friend that a tool which could add an rpath to
> an already linked binary would help a lot of people.
Hi,
It already exists:
man chrpath
(but you may have to install it - it's not always lying around on a default
install).
Why not have a look at its source too - I presume it does what you are
trying to achieve.
HTH
Tim
I don't think so.
chrpath can shrink an rpath or remove one, but it cannot add one or
expand one
(at least in the version I have on my Debian).
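To make the limitation concrete, here is a minimal sketch in C of an in-place string overwrite that succeeds only when the new value fits in the bytes the old one occupied. It is not chrpath's actual code; the function name overwrite_rpath_in_place and the buffer layout are hypothetical and only illustrate the constraint. Growing the string would mean moving other data in the file, which is what a tool that adds an rpath has to deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from chrpath.
 * Overwrite the NUL-terminated string stored at `slot` with `replacement`,
 * but only if the replacement fits in the space of the old string.
 * Returns 0 on success and -1 if the replacement is too long.
 */
static int overwrite_rpath_in_place(char *slot, const char *replacement)
{
    assert(slot != NULL && replacement != NULL);

    size_t old_len = strlen(slot);
    size_t new_len = strlen(replacement);

    if (new_len > old_len)
        return -1;              /* would run into the next string */

    memcpy(slot, replacement, new_len);
    /* Zero the leftover bytes so the string stays NUL-terminated. */
    memset(slot + new_len, 0, old_len - new_len);
    return 0;
}
```

Called on a buffer that holds "/opt/old/lib" followed by another string, replacing it with "/usr/lib" succeeds and leaves the neighbouring string intact, while any longer path is refused.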
Cheers,
Quentin
You're right - having bothered to actually read the man page myself *blush*.
I see the problem now... Unfortunately I'm not in a position to help :(
Cheers
Tim
> I was told by a friend that a tool which could add an rpath to
> an already linked binary would help a lot of people.
> So I wrote a small utility (in C) to do it, and it works quite well,
> apart from a very crude interface.
Not to diminish your work in any way, but I wonder why these people
can't either set LD_LIBRARY_PATH [1], or use chrpath? [2]
[1] Yes, I know it doesn't work for set-uid executables. But how
often do such executables need their RPATH modified?
[2] The limitation that new RPATH must not be longer than the
original has a trivial workaround -- just create a short symlink
to the (desired) long path.
Cheers,
--
In order to understand recursion you must first understand recursion.
Remove /-nsp/ for email.
And I wonder why people obviously not interested in developing
something keep posting this ...
LD_LIBRARY_PATH is not really suitable when you have tons of
directories where programs & libraries are compiled. Personally I
never compile things into the same root, because it very quickly
becomes a real mess.
And shell wrappers are not really reliable.
But that's a question of taste I suppose.
> [1] Yes, I know it doesn't work for set-uid executables. But how
> often do such executables need their RPATH modified?
Quite rarely indeed. But you don't always have the sources, or wish to
recompile everything to move a library. Or even to try to understand why
the makefile or libtool decides to put in a bad rpath, or something.
>
> [2] The limitation that new RPATH must not be longer than the
> original has a trivial workaround -- just create a short symlink
> to the (desired) long path.
That supposes you have access to large parts of the filesystem, which
is not always the case, and personally I would not like to have my
libraries being looked up through a symlink like /var/tmp/libfoo.so. IMHO,
making a symlink is really worse than everything else.
But anyway, nobody asks you to use the tool. I'm just suggesting it.
[...]
> http://www.eleves.ens.fr/home/godfroy/addrpathen.html
>
> I would be happy to receive any comments on this (or on the kernel
> patch).
I'll confine myself to the kernel patch. An introductory remark: If
everybody else is doing it in a different way and there is a reason
to assume that 'everybody else' may have been doing it quite some time
longer, it is not an unreasonable assumption that 'the different way'
may actually be the correct one.
+++ linux-2.6.21.1-patch/fs/binfmt_elf.c 2007-05-12 21:10:46.000000000 -0400
@@ -133,7 +133,7 @@ static int padzero(unsigned long elf_bss
static int
create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
- int interp_aout, unsigned long load_addr,
+ int interp_aout, unsigned long phdr_addr,
unsigned long interp_load_addr)
The person who called this 'load address' instead of 'program header
address' presumably thought 'load address' would make more
sense. Since you didn't change the caller, you shouldn't change the
parameter name. It doesn't matter if you think it makes more sense
this way, because you might be wrong and apart from that, the
maintainer, if he or she isn't the very person with the differing
opinion, will consider this to be a pointless change (if you were
designing a language called C++, doing it wrong from the start and
then doing it wrong again by changing to what should have been used
from the beginning after everybody got accustomed to the other would be
normal behaviour ...).
- NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
+ NEW_AUX_ENT(AT_PHDR, phdr_addr);
The e_phoff member of an ELF header is defined as
This member holds the program header table's file offset in bytes.
If the file has no program header table, this member holds
zero.
(SysV ABI, p. 50)
This means that 'load_addr + exec->e_phoff' is decidedly the location
where the program header should be.
- for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
+ for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
+ if (elf_ppnt->p_type == PT_PHDR) {
+ phdr_addr = elf_ppnt->p_vaddr;
+ continue;
+ }
At this point, you overwrite the value passed as 'load address' into
the subroutine with the virtual address stored in the program header.
The p_vaddr member of the program header structure is defined as
This member gives the virtual address at which the first byte of
the segment resides in memory.
(ib, p. 75)
Since the exact details of program loading are architecture specific,
the gabi specification refers to a 'processor specific
supplement'. For i386, this contains the following text (on page 2-3)
Base Address
The virtual addresses in the program headers might not
represent the actual virtual addresses of the program's memory
image. Executable files typically contain absolute code. To
let the process execute correctly, the segments must reside at
the virtual addresses used to build the executable file. On
the other hand, shared object segments typically contain
position-independent code. This lets a segment's virtual
address change from one process to another, without
invalidating execution behavior. Though the system chooses
virtual addresses for individual processes, it maintains the
segments' relative positions. Because
Conclusion: The kernel is right and your code is wrong.
Agreed.
Yes, in the file, not in the memory map.
To me, giving ld.so load_addr + exec->e_phoff seems more like a
guess than looking up where the table actually is in memory.
Well, if the program headers are NOT in the first loaded segment, this
code is wrong. And I don't see where the norm prohibits this.
>
> - for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
> + for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
> + if (elf_ppnt->p_type == PT_PHDR) {
> + phdr_addr = elf_ppnt->p_vaddr;
> + continue;
> + }
>
> At this point, you overwrite the value passed as 'load address' into
> the subroutine with the virtual address stored in the program header.
> The p_vaddr member of the program header structure is defined as
Yes, but it is corrected later by adding load_bias, which makes
> This member gives the virtual address at which the first byte of
> the segment resides in memory.
> (ib, p. 75)
>
> Since the exact details of program loading are architecture specific,
> the gabi specification refers to a 'processor specific
> supplement'. For i386, this contains the following text (on page 2-3)
>
> Base Address
>
> The virtual addresses in the program headers might not
> represent the actual virtual addresses of the program's memory
> image. Executable files typically contain absolute code. To
> let the process execute correctly, the segments must reside at
> the virtual addresses used to build the executable file. On
> the other hand, shared object segments typically contain
> position-independent code. This lets a segment's virtual
> address change from one process to another, without
> invalidating execution behavior. Though the system chooses
> virtual addresses for individual processes, it maintains the
> segments' relative positions. Because
>
> Conclusion: The kernel is right and your code is wrong.
This patch works for PIE. I know perfectly well that elf_ppnt->p_vaddr
might not be the address where the program headers are in the actual
map.
I did change it. Or I do not understand what you mean.
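As an illustration of the p_vaddr plus load bias relationship from the user-space side, here is a small sketch, assuming Linux with glibc. It takes the runtime address of the program header table from the AT_PHDR auxiliary vector entry, subtracts the p_vaddr of the PT_PHDR entry to recover the load bias, and then uses that bias to locate the PT_LOAD segment containing a given address. The function names main_load_bias and segment_containing are hypothetical, and falling back to a bias of 0 when there is no PT_PHDR entry is an assumption that only holds for non-PIE executables.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>       /* ElfW() */
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval() */

/*
 * Load bias of the main program: runtime address of the program header
 * table (AT_PHDR) minus the virtual address recorded in PT_PHDR.
 * Assumption: if there is no PT_PHDR entry, the bias is taken to be 0.
 */
static uintptr_t main_load_bias(void)
{
    const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(uintptr_t)getauxval(AT_PHDR);
    unsigned long n = getauxval(AT_PHNUM);

    assert(ph != NULL && n > 0);
    for (unsigned long i = 0; i < n; i++)
        if (ph[i].p_type == PT_PHDR)
            return (uintptr_t)ph - (uintptr_t)ph[i].p_vaddr;
    return 0;
}

/* Index of the PT_LOAD segment that contains `addr`, or -1 if none does. */
static int segment_containing(uintptr_t addr)
{
    const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(uintptr_t)getauxval(AT_PHDR);
    unsigned long n = getauxval(AT_PHNUM);
    uintptr_t bias = main_load_bias();

    for (unsigned long i = 0; i < n; i++) {
        if (ph[i].p_type != PT_LOAD)
            continue;
        uintptr_t start = bias + (uintptr_t)ph[i].p_vaddr;
        if (addr >= start && addr - start < ph[i].p_memsz)
            return (int)i;
    }
    return -1;
}
```

If the kernel reports AT_PHDR consistently with PT_PHDR, the address of any function in the main program should fall inside one of the PT_LOAD segments found this way, for PIE and non-PIE executables alike.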
[...]
>> - NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
>> + NEW_AUX_ENT(AT_PHDR, phdr_addr);
>>
>> The e_phoff member of an ELF header is defined as
>>
>> This member holds the program header table's file offset in bytes.
>> If the file has no program header table, this member holds
>> zero.
>> (SysV ABI, p. 50)
>>
>> This means that 'load_addr + exec->e_phoff' is decidedly the location
>> where the program header should be.
>
> Yes, in the file, not in the memory map.
>
> To me, giving ld.so load_addr + exec->e_phoff seems more like a
> guess than looking up where the table actually is in memory.
load_addr is calculated in the caller as p_vaddr - p_offset of the
first text segment found in the file. This is by definition the
beginning of the loaded file (see TIS/ ELF, p 2-2). Since e_phoff is
the offset of the program header from the beginning of the file,
e_phoff + (p_vaddr - p_offset) is the location of the program header
if the code that actually maps the respective pages works like the
code used here does (which freaked-outedly loads the file into some
memory from the start ...).
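The two computations under discussion can be put side by side in a short C sketch. It works on in-memory copies of an ELF64 header and program header table and is not the kernel code; the function names phdr_addr_from_offset and phdr_addr_from_pt_phdr are hypothetical, and the load_bias parameter stands for the relocation applied to a PIE (0 for a fixed-address executable). The first function follows the formula described above, the second follows the approach of the patch.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Address derived from the file offset: (p_vaddr - p_offset) of the first
 * PT_LOAD segment, plus e_phoff. This is only right if the program header
 * table lies inside that first segment.
 */
static uint64_t phdr_addr_from_offset(const Elf64_Ehdr *eh,
                                      const Elf64_Phdr *ph,
                                      uint64_t load_bias)
{
    assert(eh != NULL && ph != NULL);
    for (size_t i = 0; i < eh->e_phnum; i++)
        if (ph[i].p_type == PT_LOAD)
            return load_bias + ph[i].p_vaddr - ph[i].p_offset + eh->e_phoff;
    return 0;   /* no loadable segment at all */
}

/*
 * Address taken from the PT_PHDR entry when there is one, falling back to
 * the offset-based computation otherwise.
 */
static uint64_t phdr_addr_from_pt_phdr(const Elf64_Ehdr *eh,
                                       const Elf64_Phdr *ph,
                                       uint64_t load_bias)
{
    assert(eh != NULL && ph != NULL);
    for (size_t i = 0; i < eh->e_phnum; i++)
        if (ph[i].p_type == PT_PHDR)
            return load_bias + ph[i].p_vaddr;
    return phdr_addr_from_offset(eh, ph, load_bias);
}
```

For a conventional layout (table at file offset 64, first PT_LOAD mapping offset 0 at 0x400000) both functions give 0x400040. They only disagree once the table has been moved out of the first loaded segment, which is exactly the case the thread is arguing about.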
> Well, If the program headers are NOT in the first loaded segment, this
> code is wrong. And I don't see where the norm prohibits this.
Reading it could help here. Try the figure on page 1-1.
No. This is a guess.
>> Well, If the program headers are NOT in the first loaded segment, this
>> code is wrong. And I don't see where the norm prohibits this.
> Reading it could help here. Try the figure on page 1-1.
You should read it too :
NOTE
Although the figure shows the program header table immediately after the
ELF header, and the section header table following the sections, actual
files may differ. Moreover, sections and segments have no specified order.
Only the ELF header has a fixed position in the file.
(Book I: ELF, 1-2, Tool Interface Standard, Executable and Linking Format
Specification, Version 1.2)
Has your version of the PDF got holes??
It is the required file format for SVR4 ELF executables on
Intel. Which happens to be the executable file format that is used on
Linux, too. Even if this wasn't true, the kernel works the way it
works and your program modifies a working executable in a way that it
can no longer be executed. It is, of course, possible to modify the
ELF code in the kernel to work with QG-ELF, too, but why?
BTW, don't bother to respond. You are too much of a nuisance to
continue to read your posts.
Why can't you use /etc/ld.so.conf?
>> [1] Yes, I know it doesn't work for set-uid executables. But how
>> often do such executables need their RPATH modified?
>
> Quite rarely indeed. But you don't always have the sources or wish to
> recompile everything to move a library. Or even try to understand why
> the makefile or libtool decides to put a bad rpath, or something.
Again, that's what /etc/ld.so.conf is for.
--
`On a scale of one to ten of usefulness, BBC BASIC was several points ahead
of the competition, scoring a relatively respectable zero.' --- Peter Corlett
So SVR4 ELF files do not respect the norm itself.
By the way, why bother writing code like load_address + header->e_phoff
if we all know that the header is 52 bytes long? Why not
load_address + sizeof(Elf32_Ehdr)?
> Which happens to be the executable file format that is used on
> Linux, too. Even if this wasn't true, the kernel works the way it
> works and your program modifies a working executable in a way that it
> can no longer be executed. It is, of course, possible to modify the
> ELF code in the kernel to work with QG-ELF, too, but why?
That is NOT true. The patch perfectly works for every executable and
shared libraries produced by the binutils. Please show me an example
where the code fails instead of saying that I am wrong.
Because a user may not wish to ask the administrator to add their
libraries to ld.so.conf.
> >> [1] Yes, I know it doesn't work for set-uid executables. But how
> >> often do such executables need their RPATH modified?
>
> > Quite rarely indeed. But you don't always have the sources or wish to
> > recompile everything to move a library. Or even try to understand why
> > the makefile or libtool decides to put a bad rpath, or something.
>
> Again, that's what /etc/ld.so.conf is for.
For root. Not for a simple user.
Because it adds a lot of generality at the cost of changing TEN lines
of code, and it makes the kernel respect the norm.