glibc on ia64 or how relocations bootstrap
It was a rainy evening on #gentoo-ia64 and suddenly:
00:40 < undersys> trying out glibc 2.21 on my ia64 box
00:41 < undersys> all compiles fine, gets to preinstall test and fails
00:41 < undersys> "simple run test (/usr/bin/cal) failed" wat :C
... <some trials and errors, still fails>
16:53 < undersys> /usr/portage/sys-libs/glibc/files/eblits/pkg_preinst.eblit:
line 24: 17141 Segmentation fault
LC_ALL=C ./ld-*.so --library-path . ${x} > /dev/null
Brendan tried hard to get glibc building on ia64 by picking various gcc versions (4.7, 4.8, 4.9), but had no luck. The late-stage libc sanity check suggested that the resulting glibc was completely busted. The bug has been known for a while as bug 503838.
the problem
I tried to reproduce the crash on one of the ia64 boxes:
$ emerge -v1 =glibc-2.22-r1
...
>>> Installing (1 of 1) sys-libs/glibc-2.22-r1
* Defaulting /etc/host.conf:multi to on
/bound/portage/sys-libs/glibc/files/eblits/pkg_preinst.eblit: line 24:
6623 Segmentation fault (core dumped) LC_ALL=C ./ld-*.so --library-path . ${x} > /dev/null
Got the same thing!
I think I had seen the same crash before but never bothered to find the actual cause of the glibc failure. I always ignored it and kept using the old glibc, as I was usually up to something else when visiting ia64 (likely a GHC binary rebuild).
But not this time :)
reproducing
The crash happened in the pkg_preinst phase after a successful attempt to build the package:
* ERROR: sys-libs/glibc-2.22-r1 failed (preinst phase):
* simple run test (/usr/bin/cal) failed
*
* Call stack:
* ebuild.sh, line 93: Called pkg_preinst
* environment, line 2841: Called eblit-run 'pkg_preinst'
* environment, line 930: Called eblit-glibc-pkg_preinst
* pkg_preinst.eblit, line 57: Called glibc_sanity_check
* pkg_preinst.eblit, line 36: Called die
* The specific snippet of code:
* LC_ALL=C \
* ./ld-*.so --library-path . ${x} > /dev/null \
* || die "simple run test (${x}) failed"
sys-libs/glibc/files/eblits/pkg_preinst.eblit tries to run the following code:
pushd ../../image/lib # 'make install' places freshly built glibc here
for x in cal date env free ls true uname uptime ; do
x=$(type -p ${x})
LC_ALL=C \
./ld-*.so --library-path . ${x} > /dev/null \
|| die "simple run test (${x}) failed"
done
This code snippet picks commonly installed programs from the user’s system and tries to run them under the freshly built dynamic interpreter (as opposed to the default /lib/ld-linux-ia64.so.2). And that new interpreter mysteriously crashes.
We can try to reproduce the crash right from the build directory:
$ pwd
/var/tmp/portage/sys-libs/glibc-2.22-r1/work/build-ia64-ia64-unknown-linux-gnu-nptl
$ elf/ld.so --library-path ../../image/lib /usr/bin/cal
Segmentation fault (core dumped)
$ readelf -a /usr/bin/cal | grep interpreter
[Requesting program interpreter: /lib/ld-linux-ia64.so.2]
$ /lib/ld-linux-ia64.so.2 --library-path ../image/lib /usr/bin/cal
December 2015
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
Old ld.so works, the new one doesn’t.
nailing down the location
The next step is to drop into gdb and look at the crash:
# gdb -q --args elf/ld.so --library-path ../image/lib /usr/bin/cal
(gdb) run
Starting program: /var/tmp/portage/sys-libs/glibc-2.22-r1/work/build-ia64-ia64-unknown-linux-gnu-nptl/elf/ld.so --library-path ../image/lib /usr/bin/cal
Failed to read a valid object file image from memory.
Program received signal SIGSEGV, Segmentation fault.
0x200000080000b1f1 in elf_get_dynamic_info (temp=0x0, l=0x2000000800052ef8 <_rtld_local+2456>) at get-dynamic-info.h:70
70 + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
(gdb) list
65 else if ((d_tag_utype) DT_VALTAGIDX (dyn->d_tag) < DT_VALNUM)
66 info[DT_VALTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
67 + DT_VERSIONTAGNUM + DT_EXTRANUM] = dyn;
68 else if ((d_tag_utype) DT_ADDRTAGIDX (dyn->d_tag) < DT_ADDRNUM)
69 info[DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
70 + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
71 ++dyn;
72 }
73
74 #define DL_RO_DYN_TEMP_CNT 8
(gdb) bt
#0 0x200000080000b1f1 in elf_get_dynamic_info (temp=0x0, l=0x2000000800052ef8 <_rtld_local+2456>) at get-dynamic-info.h:70
#1 _dl_start (arg=0x60000fffffffb2a0) at rtld.c:382
#2 0x2000000800001a50 in _start ()
gdb shows the exact line number where the crash happens, that’s good. I tried to check the disassembly to see if anything obvious stands out.
(gdb) disassemble
Dump of assembler code for function _dl_start:
0x200000080000a800 <+0>: [MMB] alloc r51=ar.pfs,26,22,0
0x200000080000a801 <+1>: mov r52=r12
0x200000080000a802 <+2>: nop.b 0x0
0x200000080000a810 <+16>: [MII] adds r12=-16,r12
0x200000080000a811 <+17>: mov r50=b0
... <some pages later>
0x200000080000b1d0 <+2512>: [MMI] sub r16=r25,r14;;
0x200000080000b1d1 <+2513>: cmp.ltu p6,p7=10,r16
0x200000080000b1d2 <+2514>: nop.i 0x0;;
0x200000080000b1e0 <+2528>: [MMI] nop.m 0x0
0x200000080000b1e1 <+2529>: (p07) shladd r14=r14,3,r0
0x200000080000b1e2 <+2530>: nop.i 0x0;;
0x200000080000b1f0 <+2544>: [MMI] (p07) sub r14=r26,r14;;
=> 0x200000080000b1f1 <+2545>: (p07) st8 [r14]=r15
0x200000080000b1f2 <+2546>: adds r15=16,r15;;
0x200000080000b200 <+2560>: [MMI] nop.m 0x0
0x200000080000b201 <+2561>: ld8 r14=[r15]
0x200000080000b202 <+2562>: nop.i 0x0;;
0x200000080000b210 <+2576>: [MIB] nop.m 0x0
... <some pages later>
0x200000080000b711 <+3857>: nop.i 0x0
0x200000080000b712 <+3858>: br.few 0x200000080000b620 <_dl_start+3616>
0x200000080000b720 <+3872>: [MMI] nop.m 0x0
0x200000080000b721 <+3873>: ld8 r14=[r15]
0x200000080000b722 <+3874>: nop.i 0x0;;
0x200000080000b730 <+3888>: [MIB] nop.m 0x0
0x200000080000b731 <+3889>: cmp.eq p7,p6=0,r14
0x200000080000b732 <+3890>: br.few 0x200000080000a940 <_dl_start+320>;;
End of assembler dump.
All in all, the _dl_start disassembly was 720 lines long (15 pages of code). I was not able to easily find where the r15 register was assigned. Basically, I had no idea what I was looking at :)
First off, it’s worth understanding why the disassembly shows us _dl_start() and not elf_get_dynamic_info().
Here is the annotated backtrace:
- #0 0x200000080000b1f1 in elf_get_dynamic_info (temp=0x0, l=0x2000000800052ef8 <_rtld_local+2456>) at get-dynamic-info.h:70
- #1 _dl_start (arg=0x60000fffffffb2a0) at rtld.c:382
- #2 0x2000000800001a50 in _start ()
It is easy to trace the whole chain from _start (every dynamically linked program starts like that on ia64):
- _start (ld.so entry point), where very little happens:
  - the module-base register gp (also known as r1) is computed
  - control is passed to _dl_start()
- _dl_start is the C entry point, where ld.so’s own ELF header is parsed and bootstrap_map (the dynamic linker context) is initialised:
static ElfW(Addr)
_dl_start (void *arg)
{
  ...
#include "dynamic-link.h" /* includes definition of elf_get_dynamic_info() */
  ...
  __builtin_memset (bootstrap_map.l_info, '\0', sizeof (bootstrap_map.l_info));
  bootstrap_map.l_addr = elf_machine_load_address ();
  bootstrap_map.l_ld = (void *) bootstrap_map.l_addr + elf_machine_dynamic ();
  elf_get_dynamic_info (&bootstrap_map, NULL); /* crash happens here */
  ...
}
Thus elf_get_dynamic_info() is a local inline function:
auto inline void __attribute__ ((unused, always_inline))
elf_get_dynamic_info (struct link_map *l, ElfW(Dyn) *temp)
{
  ...
That’s it. The reason for the unreadable _dl_start is excessive inlining.
I undid the inline damage to make the disassembly slightly more readable. Basically, I changed inline to noinline and moved the exact bit of code that crashed out into yet another noinline function, elf_get_dynamic_info_addr_tag:
--- ../glibc-2.22/elf/get-dynamic-info.h.orig 2015-12-27 12:29:22.468333779 +0000
+++ ../glibc-2.22/elf/get-dynamic-info.h 2015-12-27 12:33:43.124279949 +0000
@@ -1,100 +1,113 @@
#include <assert.h>
#include <libc-internal.h>
+#ifndef RESOLVE_MAP
+static
+#else
+auto
+#endif
+void __attribute__ ((unused, noinline))
+elf_get_dynamic_info_addr_tag (struct link_map *l, ElfW(Dyn) *dyn)
+{
+ ElfW(Dyn) **info = l->l_info;
+
+ info[DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
+ + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
+}
+
#ifndef RESOLVE_MAP
static
#else
auto
#endif
-inline void __attribute__ ((unused, always_inline))
+void __attribute__ ((unused, noinline))
elf_get_dynamic_info (struct link_map *l, ElfW(Dyn) *temp)
{
ElfW(Dyn) *dyn = l->l_ld;
ElfW(Dyn) **info;
#if __ELF_NATIVE_CLASS == 32
typedef Elf32_Word d_tag_utype;
#elif __ELF_NATIVE_CLASS == 64
typedef Elf64_Xword d_tag_utype;
#endif
#ifndef RTLD_BOOTSTRAP
if (dyn == NULL)
return;
#endif
info = l->l_info;
while (dyn->d_tag != DT_NULL)
{
if ((d_tag_utype) dyn->d_tag < DT_NUM)
info[dyn->d_tag] = dyn;
else if (dyn->d_tag >= DT_LOPROC &&
dyn->d_tag < DT_LOPROC + DT_THISPROCNUM)
{
/* This does not violate the array bounds of l->l_info, but
gcc 4.6 on sparc somehow does not see this. */
DIAG_PUSH_NEEDS_COMMENT;
DIAG_IGNORE_NEEDS_COMMENT (4.6,
"-Warray-bounds");
info[dyn->d_tag - DT_LOPROC + DT_NUM] = dyn;
DIAG_POP_NEEDS_COMMENT;
}
else if ((d_tag_utype) DT_VERSIONTAGIDX (dyn->d_tag) < DT_VERSIONTAGNUM)
info[VERSYMIDX (dyn->d_tag)] = dyn;
else if ((d_tag_utype) DT_EXTRATAGIDX (dyn->d_tag) < DT_EXTRANUM)
info[DT_EXTRATAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
+ DT_VERSIONTAGNUM] = dyn;
else if ((d_tag_utype) DT_VALTAGIDX (dyn->d_tag) < DT_VALNUM)
info[DT_VALTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
+ DT_VERSIONTAGNUM + DT_EXTRANUM] = dyn;
else if ((d_tag_utype) DT_ADDRTAGIDX (dyn->d_tag) < DT_ADDRNUM)
- info[DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
- + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
+ elf_get_dynamic_info_addr_tag (l, dyn);
  ++dyn;
  }
That way I got the following crash dump:
# gdb -q --args elf/ld.so --library-path ../image/lib /usr/bin/cal
Reading symbols from elf/ld.so...done.
(gdb) run
Starting program: /var/tmp/portage/sys-libs/glibc-2.22-r1/work/build-ia64-ia64-unknown-linux-gnu-nptl/elf/ld.so --library-path ../image/lib /usr/bin/cal
Failed to read a valid object file image from memory.
Program received signal SIGSEGV, Segmentation fault.
0x200000080000a8b0 in elf_get_dynamic_info_addr_tag (dyn=0x200000080004e4b0, l=0x2000000800053178 <_rtld_local+2456>)
at get-dynamic-info.h:33
33 + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
(gdb) bt
#0 0x200000080000a8b0 in elf_get_dynamic_info_addr_tag (dyn=0x200000080004e4b0, l=0x2000000800053178 <_rtld_local+2456>)
at get-dynamic-info.h:33
#1 0x200000080000ade0 in elf_get_dynamic_info (temp=0x0, l=0x2000000800053178 <_rtld_local+2456>) at get-dynamic-info.h:83
#2 0x200000080000afe0 in _dl_start (arg=0x60000fffffffb2a0) at rtld.c:382
#3 0x2000000800001a50 in _start ()
(gdb) disassemble
Dump of assembler code for function elf_get_dynamic_info_addr_tag:
0x200000080000a880 <+0>: [MMI] ld8 r14=[r32];;
0x200000080000a881 <+1>: shladd r15=r14,3,r0
0x200000080000a882 <+2>: addl r14=163120,r1;;
0x200000080000a890 <+16>: [MMI] ld8 r14=[r14];;
0x200000080000a891 <+17>: adds r14=992,r14
0x200000080000a892 <+18>: nop.i 0x0;;
0x200000080000a8a0 <+32>: [MMI] nop.m 0x0
0x200000080000a8a1 <+33>: sub r14=r14,r15
0x200000080000a8a2 <+34>: nop.i 0x0;;
=> 0x200000080000a8b0 <+48>: [MIB] st8 [r14]=r32
0x200000080000a8b1 <+49>: nop.i 0x0
0x200000080000a8b2 <+50>: br.ret.sptk.many b0;;
End of assembler dump.
12 instructions (4 of which are nops) are more manageable: more readable, but what is going on is still completely unclear. r32 is the only input register used here (r33 would be the second), while elf_get_dynamic_info_addr_tag() clearly has two arguments:
void __attribute__ ((unused, noinline))
elf_get_dynamic_info_addr_tag (struct link_map *l, ElfW(Dyn) *dyn)
At this point I started looking at what exactly the crashing code is supposed to do.
the first workaround
_rtld_local is the whole linker context, of type struct rtld_global (slightly simplified):
struct rtld_global
{
struct link_namespaces
{
....
} _dl_ns[DL_NNS];
...
struct link_map _dl_rtld_map;
...
};
...
struct link_map
{
  ElfW(Addr) l_addr; /* Difference between the address in the ELF
                        file and the addresses in memory. */
  char *l_name; /* Absolute file name object was found in. */
  ElfW(Dyn) *l_ld; /* Dynamic section of the shared object. */
  struct link_map *l_next, *l_prev; /* Chain of loaded objects. */
  ...
  ElfW(Dyn) *l_info[DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGNUM
                    + DT_EXTRANUM + DT_VALNUM + DT_ADDRNUM];
  ...
};
The code crashed when it tried to fill in the _rtld_local._dl_rtld_map.l_info global variable. ld.so had even succeeded at the previous step (inspecting values right after the crash):
(gdb) print _rtld_local._dl_rtld_map.l_info
$4 = {0x0 <repeats 14 times>, 0x200000080004e4a0, 0x0 <repeats 62 times>}
(gdb) print _rtld_local._dl_rtld_map.l_info[14]->d_tag
$3 = 14 # DT_SONAME
but was not able to handle the current section type:
(gdb) print dyn->d_tag
$5 = 1879047925 # 0x6ffffef5
Looking at /usr/include/elf.h it’s a section of DT_GNU_HASH type.
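To see where this tag is supposed to land in l_info[], here is a minimal C sketch of the index arithmetic from get-dynamic-info.h. The X_ counts are assumptions: they are values that add up to the 77-entry l_info printed by gdb. slot_for_tag() is a hypothetical helper, not a glibc function, and the processor-specific and extra tag ranges are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed counts (glibc 2.22 on ia64); together they give 77 slots. */
#define X_DT_NUM            34
#define X_DT_THISPROCNUM     1
#define X_DT_VERSIONTAGNUM  16
#define X_DT_EXTRANUM        3
#define X_DT_VALNUM         12
#define X_DT_ADDRNUM        11
#define X_L_INFO_LEN        77

/* Upper ends of the tag ranges, as found in <elf.h>. */
#define X_DT_VERNEEDNUM 0x6fffffffULL
#define X_DT_VALRNGHI   0x6ffffdffULL
#define X_DT_ADDRRNGHI  0x6ffffeffULL

/* Return the l_info[] slot for a dynamic tag, or -1 if this sketch
   does not handle the tag.  Unsigned wrap-around makes every
   "HI - tag < COUNT" test fail for tags outside the range.  */
int
slot_for_tag (uint64_t tag)
{
  int slot = -1;

  if (tag < (uint64_t) X_DT_NUM)
    slot = (int) tag;
  else if (X_DT_VERNEEDNUM - tag < (uint64_t) X_DT_VERSIONTAGNUM)
    slot = (int) (X_DT_VERNEEDNUM - tag) + X_DT_NUM + X_DT_THISPROCNUM;
  else if (X_DT_VALRNGHI - tag < (uint64_t) X_DT_VALNUM)
    slot = (int) (X_DT_VALRNGHI - tag) + X_DT_NUM + X_DT_THISPROCNUM
           + X_DT_VERSIONTAGNUM + X_DT_EXTRANUM;
  else if (X_DT_ADDRRNGHI - tag < (uint64_t) X_DT_ADDRNUM)
    slot = (int) (X_DT_ADDRRNGHI - tag) + X_DT_NUM + X_DT_THISPROCNUM
           + X_DT_VERSIONTAGNUM + X_DT_EXTRANUM + X_DT_VALNUM;

  /* Whatever branch was taken, the slot must stay inside l_info[]. */
  assert (slot < X_L_INFO_LEN);
  return slot;
}
```

Under these assumptions slot_for_tag (0x6ffffef5) is 76, the very last slot of l_info[], while DT_SONAME (14) maps to slot 14, which matches the gdb output above.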
What kinds of dynamic sections does ld.so have?
# readelf -d elf/ld.so
Dynamic section at offset 0x3e4a0 contains 21 entries:
Tag Type Name/Value
0x000000000000000e (SONAME) Library soname: [ld-linux-ia64.so.2]
0x000000006ffffef5 (GNU_HASH) 0x190
0x0000000000000005 (STRTAB) 0x8d0
0x0000000000000006 (SYMTAB) 0x318
0x000000000000000a (STRSZ) 952 (bytes)
0x000000000000000b (SYMENT) 24 (bytes)
0x0000000070000000 (IA_64_PLT_RESERVE) 0x52660 -- 0x52678
0x0000000000000003 (PLTGOT) 0x2a980
0x0000000000000002 (PLTRELSZ) 120 (bytes)
0x0000000000000014 (PLTREL) RELA
0x0000000000000017 (JMPREL) 0x1278
0x0000000000000007 (RELA) 0xdb0
0x0000000000000008 (RELASZ) 1224 (bytes)
0x0000000000000009 (RELAENT) 24 (bytes)
0x000000006ffffffc (VERDEF) 0xd08
0x000000006ffffffd (VERDEFNUM) 5
0x000000000000001e (FLAGS) BIND_NOW
0x000000006ffffffb (FLAGS_1) Flags: NOW
0x000000006ffffff0 (VERSYM) 0xc88
0x000000006ffffff9 (RELACOUNT) 17
0x0000000000000000 (NULL) 0x0
ld.so managed to load the SONAME (first) section and failed at GNU_HASH (second). What if we drop GNU_HASH from the ld.so image?
I tried to relink it with the sysv hash style.
The default command to link ld.so is:
# ia64-unknown-linux-gnu-gcc \
-Wl,-O1 -Wl,--as-needed \
\
-Wl,--hash-style=gnu \
\
-nostdlib -nostartfiles \
-shared \
\
-o elf/ld.so.new \
\
-Wl,-z,combreloc -Wl,-z,relro -Wl,-z,defs -Wl,-z,now \
elf/librtld.os \
-Wl,--version-script=ld.map \
-Wl,-soname=ld-linux-ia64.so.2 \
-Wl,-defsym=_begin=0
I’ve changed -Wl,--hash-style=gnu to -Wl,--hash-style=sysv:
# ia64-unknown-linux-gnu-gcc \
-Wl,-O1 -Wl,--as-needed \
\
-Wl,--hash-style=sysv \
\
-nostdlib -nostartfiles \
-shared \
\
-o elf/ld.so.new \
\
-Wl,-z,combreloc -Wl,-z,relro -Wl,-z,defs -Wl,-z,now \
elf/librtld.os \
-Wl,--version-script=ld.map \
-Wl,-soname=ld-linux-ia64.so.2 \
-Wl,-defsym=_begin=0
And behold! The resulting ld.so can load simple binaries:
# elf/ld.so.new --library-path ../image/lib /usr/bin/cal
December 2015
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
# readelf -d /usr/bin/cal | grep GNU_HASH
0x000000006ffffef5 (GNU_HASH) 0x4000000000000270
This works even though these binaries contain GNU_HASH themselves. That is unexpected.
Thus the first workaround to get a working glibc on ia64 is to tweak LDFLAGS:
LDFLAGS="-Wl,--hash-style=sysv" emerge -av sys-libs/glibc
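A possible reading of why the relink helps, sketched in C. In the readelf listing above GNU_HASH is the only entry whose tag falls into the window handled by the DT_ADDRTAGIDX branch, so without that entry the crashing store is never reached. The helper names and the X_DT_ADDRNUM value are assumptions made for illustration; the tag list is copied from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X_DT_ADDRRNGHI 0x6ffffeffULL  /* as in <elf.h> */
#define X_DT_ADDRNUM   11             /* assumed value for glibc 2.22 */

/* Non-zero if the tag would be stored by the DT_ADDRTAGIDX branch. */
int
takes_addr_branch (uint64_t tag)
{
  return X_DT_ADDRRNGHI - tag < (uint64_t) X_DT_ADDRNUM;
}

/* Count the tags in a list that end up in that branch. */
size_t
count_addr_branch_tags (const uint64_t *tags, size_t n)
{
  size_t count = 0;

  assert (tags != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (takes_addr_branch (tags[i]))
      count++;
  return count;
}

/* Tag column of the "readelf -d elf/ld.so" listing, in order. */
const uint64_t ld_so_tags[] = {
  0x0e, 0x6ffffef5, 0x05, 0x06, 0x0a, 0x0b, 0x70000000,
  0x03, 0x02, 0x14, 0x17, 0x07, 0x08, 0x09,
  0x6ffffffc, 0x6ffffffd, 0x1e, 0x6ffffffb, 0x6ffffff0, 0x6ffffff9,
  0x00
};
const size_t ld_so_tags_len = sizeof ld_so_tags / sizeof ld_so_tags[0];
```

count_addr_branch_tags (ld_so_tags, ld_so_tags_len) is 1, and that single hit is GNU_HASH. DT_HASH has tag 4 in elf.h, which is below DT_NUM and is handled by the very first branch of elf_get_dynamic_info().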
down the rabbit hole
But why does SIGSEGV happen in the first place? Who is at fault here?
The primary suspects are gcc, glibc and binutils.
What is the difference between the SONAME and GNU_HASH sections? All the faulty code does is store a pointer to ElfW(Dyn):
void __attribute__ ((unused, noinline))
elf_get_dynamic_info_addr_tag (struct link_map *l, ElfW(Dyn) *dyn)
{
  ElfW(Dyn) **info = l->l_info;

  info[DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
       + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
}
I spent some time trying to write a simple example that reproduces the crash. No matter how hard I tried, all samples worked just fine. The best approximation is:
#include <stdio.h>
struct dyn_t {
  long d_tag;
  long foo;
};
struct g_t {
  long b[307];
  long a[8];
  struct dyn_t *p[77];
};
struct g_t _l __attribute__ ((visibility ("hidden"), section (".sdata")));
static void __attribute__ ((unused, noinline))
f (struct g_t * g, struct dyn_t * dyn)
{
  struct dyn_t **info = g->p;
  info[(0x6ffffeff - dyn->d_tag) + 66] = dyn;
}
void __attribute__ ((unused, noinline))
foo (struct dyn_t * dyn)
{
  f (&_l, dyn);
}
int __attribute__ ((noinline))
main (int argc) {
  struct dyn_t d = { argc + 66 };
  foo (&d);

  return _l.p[0] != 0;
}
When built with -O2 -fPIC it generates very similar asm for function f (the immediate in addl r14 is the only deviation):
$ objdump -r -d a
...
<f.constprop.0>:
  0b 70 00 40 18 10  [MMI]  ld8 r14=[r32];;          # read dyn->d_tag
  f0 70 00 24 40 c0         shladd r15=r14,3,r0      # multiply dyn->d_tag by 8
  c1 ed d7 9f               addl r14=-1316,r1;;      # ?
  0b 70 00 1c 18 10  [MMI]  ld8 r14=[r14];;          # ??
  e0 00 3b 0e 42 00         adds r14=992,r14         # ???
  00 00 04 00               nop.i 0x0;;
  09 00 00 00 01 00  [MMI]  nop.m 0x0
  e0 70 3c 0a 40 00         sub r14=r14,r15          # compute pointer in info[]
  00 00 04 00               nop.i 0x0;;
  11 00 80 1c 98 11  [MIB]  st8 [r14]=r32            # perform store
  00 00 00 02 00 80         nop.i 0x0
  08 00 84 00               br.ret.sptk.many b0;;
...
gcc’s dump is slightly more readable:
$ ia64-unknown-linux-gnu-gcc -O2 -S a.c -o a.S
...
f.constprop.0:
  .prologue
  .body
  .mmi
  ld8 r14 = [r32]
  ;;
  shladd r15 = r14, 3, r0
  addl r14 = @ltoffx(_l#+15032385536), r1
  ;;
  .mmi
  ld8.mov r14 = [r14], _l#+15032385536
  ;;
  adds r14 = 992, r14
  nop 0
  ;;
  .mmi
  nop 0
  sub r14 = r14, r15
  nop 0
  ;;
  .mib
  st8 [r14] = r32
  nop 0
  br.ret.sptk.many rp
  .endp f.constprop.0#
After the line adds r14 = 992, r14 register r14 should contain the absolute address _l#+15032385536+992. But where does that huge number come from, and why don’t we see it in objdump?
The magic is in the ld8.mov r14 = [r14], _l#+15032385536 line. It instructs the assembler to load this (absolute) address from memory at ltoffx(_l#+15032385536) + r1 (aka -1316 + r1).
I returned to the broken ld.so and checked how this same relocation looks:
# gdb -q elf/ld.so
Reading symbols from elf/ld.so...done.
(gdb) run
Starting program: /var/tmp/portage/sys-libs/glibc-2.22-r1/work/build-ia64-ia64-unknown-linux-gnu-nptl/elf/ld.so
Failed to read a valid object file image from memory.
Program received signal SIGSEGV, Segmentation fault.
0x200000080000a8b0 in elf_get_dynamic_info_addr_tag (dyn=0x200000080004e4b0, l=0x2000000800053178 <_rtld_local+2456>)
at get-dynamic-info.h:33
33 + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
(gdb) disassemble
Dump of assembler code for function elf_get_dynamic_info_addr_tag:
0x200000080000a880 <+0>: [MMI] ld8 r14=[r32];;
0x200000080000a881 <+1>: shladd r15=r14,3,r0
0x200000080000a882 <+2>: addl r14=163120,r1;;
0x200000080000a890 <+16>: [MMI] ld8 r14=[r14];;
0x200000080000a891 <+17>: adds r14=992,r14
0x200000080000a892 <+18>: nop.i 0x0;;
0x200000080000a8a0 <+32>: [MMI] nop.m 0x0
0x200000080000a8a1 <+33>: sub r14=r14,r15
0x200000080000a8a2 <+34>: nop.i 0x0;;
=> 0x200000080000a8b0 <+48>: [MIB] st8 [r14]=r32
0x200000080000a8b1 <+49>: nop.i 0x0
0x200000080000a8b2 <+50>: br.ret.sptk.many b0;;
(gdb) print *(void**)(163120+$r1)
$1 = (void *) 0x3800527e0
(gdb) quit
# readelf -r elf/ld.so | fgrep 3800527e0
0000000526b0 00000000006f R_IA64_REL64LSB 3800527e0
# objdump -R elf/ld.so | fgrep 3800527e0
00000000000526b0 REL64LSB *ABS*+0x00000003800527e0
This actually is an absolute relocation (the slot contains an absolute address). I did not expect such things to be present in PIC mode. Such relocations work for normal binaries but not for ld.so, for a simple reason: at this point ld.so has not adjusted any relocations yet.
ld.so’s own section info is read at rtld.c:382 and relocations are applied later, at rtld.c:397.
The SONAME section does not use an R_IA64_REL64LSB relocation while GNU_HASH does. That explains the crash but does not explain why the generated code is different.
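To make the failure concrete, here is a small C sketch that replays the arithmetic of the disassembly above with plain integers. The function names are hypothetical, the load base 0x2000000800000000 is inferred from the gdb addresses, and the relocation is modelled simply as adding the load base to the link-time value, which is an assumption about how R_IA64_REL64LSB is processed here.

```c
#include <assert.h>
#include <stdint.h>

/* Replays elf_get_dynamic_info_addr_tag as disassembled:
     shladd r15 = r14, 3, r0   (r15 = d_tag * 8)
     ld8    r14 = [r14]        (r14 = value stored in the .got slot)
     adds   r14 = 992, r14
     sub    r14 = r14, r15
     st8    [r14] = r32        (the returned value is this address)  */
uint64_t
store_address (uint64_t got_entry, uint64_t d_tag)
{
  uint64_t r15 = d_tag << 3;
  uint64_t r14 = got_entry;

  r14 += 992;
  r14 -= r15;
  return r14;
}

/* Assumed model of the relocation: link-time value plus load base. */
uint64_t
relocate_relative (uint64_t link_time_value, uint64_t load_base)
{
  uint64_t value = link_time_value + load_base;

  assert (value >= load_base);  /* no 64-bit wrap-around expected */
  return value;
}
```

With the unrelocated slot value 0x3800527e0 and d_tag 0x6ffffef5 the store goes to 0x53418, an address that only makes sense relative to the load base. With the load base added first it becomes 0x2000000800053418, which is l + 64 + 76 * 8 for the l printed by gdb, assuming l_info sits 64 bytes into struct link_map (2520 - 2456).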
the cause
The answer lies in how gcc optimises the following code:
elf_get_dynamic_info_addr_tag (&_rtld_local._dl_rtld_map, NULL);
There are a lot of constants to expand here:
void __attribute__ ((unused, noinline))
elf_get_dynamic_info_addr_tag (struct link_map *l, ElfW(Dyn) *dyn)
{
  ElfW(Dyn) **info = l->l_info;

  info[DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
       + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM] = dyn;
  // or
  // info[(0x6ffffeff - dyn->d_tag) + 66] = dyn;
}
gcc infers that elf_get_dynamic_info_addr_tag() gets the constant _rtld_local._dl_rtld_map (aka _rtld_local+2456) as its first argument l and specialises the function into a 1-argument variant:
void __attribute__ ((unused, noinline))
elf_get_dynamic_info_addr_tag_constprop (ElfW(Dyn) *dyn)
{
  // p is of type __attribute__ ((section (".sdata")))
  static const ElfW(Dyn) ** p = &_rtld_local._dl_rtld_map.l_info[0x6ffffeff + 66];
  p[-dyn->d_tag] = dyn;
}
To compile that code gcc infers the following facts about p:
- the offset from _rtld_local to p exceeds a 22-bit value (the limit of the addl <imm22>, gp instruction): (0x6ffffeff + 66) * 8 + 2520 = 0x3800003e0 = 0x380000000 + 992
- gcc decides to push the address of p to .got and load it from there
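The arithmetic in the list can be checked with a few lines of C. This is a sketch with hypothetical helper names. It takes the 2520-byte offset of l_info and the 22-bit signed immediate of addl from this post, and additionally assumes that adds takes a 14-bit signed immediate. Offsets are computed as integers because forming the out-of-bounds pointer p directly would not be valid C.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero if value is representable as a signed immediate of `bits` bits. */
int
fits_signed (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 63);
  int64_t lo = -((int64_t) 1 << (bits - 1));
  int64_t hi = ((int64_t) 1 << (bits - 1)) - 1;
  return value >= lo && value <= hi;
}

/* Byte offset from the start of the global to the rebased pointer p,
   i.e. to &l_info[0x6ffffeff + 66].  */
int64_t
rebased_offset (int64_t l_info_offset)
{
  return (0x6ffffeffLL + 66) * 8 + l_info_offset;
}

/* Byte offset of info[(0x6ffffeff - tag) + 66], the original form. */
int64_t
direct_offset (int64_t l_info_offset, int64_t tag)
{
  return l_info_offset + ((0x6ffffeffLL - tag) + 66) * 8;
}

/* Byte offset of p[-tag], the specialised form. */
int64_t
via_p_offset (int64_t l_info_offset, int64_t tag)
{
  return rebased_offset (l_info_offset) - tag * 8;
}
```

rebased_offset (2520) is 0x3800003e0, which does not fit the 22-bit immediate, while 2520 alone does. Both forms agree on the final offset: for tag 0x6ffffef5 they give 2520 + 76 * 8 = 3128.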
The workaround to avoid the .got reload is simple (but fragile): force gcc to compute the final offset first:
void __attribute__ ((unused, noinline))
elf_get_dynamic_info_addr_tag (struct link_map *l, ElfW(Dyn) *dyn)
{
  ElfW(Dyn) **info = l->l_info;

  long o = DT_ADDRTAGIDX (dyn->d_tag) + DT_NUM + DT_THISPROCNUM
           + DT_VERSIONTAGNUM + DT_EXTRANUM + DT_VALNUM;
  info[o] = dyn;
}
I tried it on the toy example first. The hack made gcc avoid ltoffx and use a small gprel offset:
f.constprop.0:
  .prologue
  .body
  .mlx
  ld8 r15 = [r32]
  movl r14 = 1879048001
  ;;
  .mii
  sub r14 = r14, r15
  addl r15 = @gprel(_l#+2520), gp
  ;;
  shladd r14 = r14, 3, r15
  ;;
  .mib
  st8 [r14] = r32
  nop 0
  br.ret.sptk.many rp
or the same in final binary:
<f.constprop.0>:
  05 78 00 40 18 d0  [MLX]  ld8 r15=[r32]
  6f 00 00 00 00 c0         movl r14=0x6fffff41;;
  11 f4 fb 67
  03 70 38 1e 05 20  [MII]  sub r14=r14,r15
  f0 60 07 12 48 c0         addl r15=1260,r1;;
  e1 78 48 80               shladd r14=r14,3,r15;;
  11 00 80 1c 98 11  [MIB]  st8 [r14]=r32
  00 00 00 02 00 80         nop.i 0x0
  08 00 84 00               br.ret.sptk.many b0;;
  08 00 00 00 01 00  [MMI]  nop.m 0x0
  00 00 00 02 00 00         nop.m 0x0
  00 00 04 00               nop.i 0x0
No memory loads, only a single store \o/.
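The new sequence can be replayed with plain integers in C. This is a sketch with a hypothetical function name; gp and the gprel displacement are parameters because their real values are only fixed at run time and link time respectively.

```c
#include <assert.h>
#include <stdint.h>

/* Replays the instruction sequence produced with the workaround:
     ld8    r15 = [r32]                  (r15 = d_tag)
     movl   r14 = 0x6fffff41             (0x6ffffeff + 66)
     sub    r14 = r14, r15               (slot index)
     addl   r15 = @gprel(_l#+2520), gp   (address of info[0])
     shladd r14 = r14, 3, r15            (index * 8 + base)
     st8    [r14] = r32                  (the returned value is this address)  */
uint64_t
store_address_gprel (uint64_t gp, int64_t gprel, uint64_t d_tag)
{
  assert (d_tag <= 0x6fffff41ULL);  /* keep the slot index non-negative */

  uint64_t r15 = d_tag;
  uint64_t r14 = 0x6fffff41ULL;

  r14 -= r15;
  r15 = gp + (uint64_t) gprel;
  r14 = (r14 << 3) + r15;
  return r14;
}
```

For d_tag 0x6ffffef5 the slot index is 76, so the store lands 608 bytes past gp + gprel. Every input is either an immediate or gp, which _start computes before _dl_start() runs, so nothing in this sequence depends on a .got slot that still awaits relocation.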
The workaround has been sent to the libc-alpha mailing list for review.
wrapping up
Random observations:
- LDFLAGS=-Wl,--hash-style=sysv is a simple way to get a modern glibc working on ia64
- uninlining things did not make the bug disappear
- dynamic linkers are simple yet delicate in the bootstrap phase, when no relocations have been adjusted yet
- the bug does not happen at the -O1 optimisation level (it is triggered by the -fipa-cp knob)
- the workaround is weak and can break at any time in the future
- perhaps gcc could be smarter and use an MLX instruction to embed the large offset:
    movl r32 = @gprel(_l#+15032385536)
    add r32 = r32, gp
  But the separate section for _l makes things more complicated.
- it took me a day to get the -Wl,--hash-style=sysv workaround and 6 days to figure out why it works
- ld.so uses ELF relocations extensively and sets them up in C code
- gdb’s output is misleading for specialised functions (the argument order is flipped)
Have fun!