ghc and static linkage

TL;DR variant:

# force ld to inject binary-local search path
$ cabal configure --ghc-option=-optl-Wl,-rpath='$ORIGIN'
$ cabal build
$ cp /usr/lib/$YOURONE dist/build/.../

Some thoughts:

If you have ever tried to distribute binaries built with ghc, you know what I’m talking about. ghc is a huge compiler with a huge runtime, so there are some things to note.

Let’s explore a minimal binary a bit before linking a more advanced project.

-- minimal.hs
main = print 1
$ ghc --make minimal.hs
    [1 of 1] Compiling Main             ( minimal.hs, minimal.o )
    Linking minimal ...
$ ./minimal
$ ldd minimal
    =>  (0x00007fffc0dff000)
    => /usr/lib64/ (0x00007f1dd6233000)
    => /lib64/ (0x00007f1dd5fb1000)
    => /lib64/ (0x00007f1dd5da8000)
    => /lib64/ (0x00007f1dd5ba4000)
    => /lib64/ (0x00007f1dd581d000)
    => /lib64/ (0x00007f1dd5600000)
    /lib64/ (0x00007f1dd64a2000)

Looks good. All the libraries belong to GNU libc except for one: GMP. GMP changed its ABI recently, and the change is reflected in its SONAME.

As GMP is an LGPL library, we would like to ship it as a separate shared library. Sometimes the user would prefer to use the host’s copy instead.

$ ldd /usr/lib/
    =>  (0x00007fff2dbff000)
    => /lib64/ (0x00007f1383d44000)
    /lib64/ (0x00007f1384367000)

As we can see, that is feasible: GMP does not depend on anything unusual. Its only dependency is libc.

When a binary is loaded by the kernel, the kernel either maps the binary and hands control over to its entry point (the simple a.out case), or passes control to the interpreter. For most dynamically linked programs the interpreter path is stored in the .interp section (the INTERP program header):

$ readelf -l /bin/ls
  INTERP         0x0000000000000270 0x0000000000400270 0x0000000000400270
  [Requesting program interpreter: /lib64/]

So, my interpreter is libc’s /lib64/ loader. It’s called indirectly when I run the binary as:

$ /bin/ls

and directly when I run it as:

$ /lib64/ /bin/ls
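
To check which interpreter a given binary requests without reading the whole program header table, the relevant line can be filtered out. A minimal sketch, assuming readelf prints the "Requesting program interpreter" line in the format shown above; parse_interp is a hypothetical helper name, not part of any tool:

```shell
# parse_interp: read `readelf -l` output on stdin and print only the
# interpreter path from the "Requesting program interpreter" line.
# (hypothetical helper; assumes the output format shown above)
parse_interp() {
    sed -n 's/.*Requesting program interpreter: \(.*\)]$/\1/p'
}

# Usage (needs binutils installed):
#   readelf -l /bin/ls | parse_interp
```

It prints nothing for a binary without an INTERP header, such as a statically linked one.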

The interpreter accepts command-line arguments (such as --library-path). It also understands environment variables that adjust its behavior, such as LD_LIBRARY_PATH, LD_DEBUG and LD_PRELOAD.

These let us override the library search path defined in /etc/ or force-inject other libraries to hook libc functions (see the tiny tsocks project for an example).
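
As a sketch of the environment-variable route, a small wrapper can prepend a directory to LD_LIBRARY_PATH for a single command only. with_lib_path and the /opt/myapp paths are made-up names for illustration:

```shell
# with_lib_path DIR CMD [ARGS...]
# Run CMD with DIR prepended to LD_LIBRARY_PATH, leaving the calling
# shell's environment untouched. (hypothetical helper name)
with_lib_path() {
    dir=$1
    shift
    # ${VAR:+...} appends the old value only when it is set and non-empty,
    # so we never produce a trailing ":" (an empty entry means the
    # current working directory).
    env LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}

# Usage (paths are placeholders):
#   with_lib_path /opt/myapp/lib /opt/myapp/bin/myapp
```

The drawback is that the user has to start the program through such a launcher every time.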

But we can also embed search paths in the ELF file itself, using the linker’s -rpath option.

You can pass either an absolute path (or one relative to the current working directory) or a special magic value that stands for the directory containing the running binary: ‘$ORIGIN’.

This technique is commonly used by relocatable software that ships its own shared libraries, like wine:

$ readelf -a `which wine` | grep PATH
    0x0000000f (RPATH)      Library rpath: [$ORIGIN/../lib32]
    0x0000001d (RUNPATH)    Library runpath: [$ORIGIN/../lib32]

It allows bin/wine to load its libs from a sibling directory: bin/../lib32.
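
To make the substitution concrete, here is a rough sketch of how an entry like the one in wine's RPATH turns into a real directory. It only mimics the textual replacement; the real loader determines the binary's directory on its own and also accepts the ${ORIGIN} spelling. expand_origin and the /opt/wine prefix are made up for the example:

```shell
# expand_origin BINARY RPATH_ENTRY
# Print RPATH_ENTRY with every literal $ORIGIN replaced by the directory
# that contains BINARY. (hypothetical helper; assumes that directory
# contains no "|", "&" or backslash, which would confuse the sed
# replacement)
expand_origin() {
    origin=$(dirname "$1")
    printf '%s\n' "$2" | sed "s|[\$]ORIGIN|$origin|g"
}

# Usage:
#   expand_origin /opt/wine/bin/wine '$ORIGIN/../lib32'
#   # prints /opt/wine/bin/../lib32
```

Note the single quotes around '$ORIGIN': without them the shell would expand it as a (most likely empty) variable before the linker or this helper ever sees it.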

So, the cabal configure --ghc-option=-optl-Wl,-rpath='$ORIGIN' trick allows us to distribute all the needed shared libraries alongside the built binary.

Another approach would be to link everything statically, but glibc uses dlopen() for encoding conversion (the iconv() call), and ghc uses iconv heavily when it performs I/O on String values.
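
Finding out which libraries to copy next to the binary can be scripted too. A minimal sketch, assuming ldd prints lines of the form "name => path (address)"; libs_under is a hypothetical helper name:

```shell
# libs_under PREFIX
# Read `ldd` output on stdin and print the resolved path of every
# library whose path starts with PREFIX. (hypothetical helper)
libs_under() {
    awk -v prefix="$1" '$2 == "=>" && index($3, prefix) == 1 { print $3 }'
}

# Usage (binary path is a placeholder):
#   ldd ./minimal | libs_under /usr/lib
```

The match is a plain string prefix, so /usr/lib also covers /usr/lib64. Libraries coming from libc itself are better left to the host system, which is why filtering by directory is a reasonable first cut.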

$ ghc --make -fforce-recomp -optl-static -optl-pthread  minimal.hs
    [1 of 1] Compiling Main             ( minimal.hs, minimal.o )
    Linking minimal ...
    /usr/lib64/ghc-7.0.4/libHSrts.a(Linker.o): In function `internal_dlopen':
    Linker.c:(.text+0x11f4): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
$ ./minimal
    minimal: mkTextEncoding: invalid argument (Invalid argument)

The error comes from newIConv in ghc-7.0.4/libraries/base/GHC/IO/Encoding/Iconv.hs, which calls iconv_open().

As we can see, static linking does not even work. It looks like a long-standing bug in glibc, though: its manual explicitly allows static linking, but requires the target system to have the charset conversion shared libraries.

Just use the -rpath trick and/or distribute your stuff via Hackage :]

Posted on July 16, 2011