Chapter 7 Linking
Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed.
Linking can be performed
- At compile time, when the source code is translated into machine code
- At load time, when the program is loaded into memory and executed by the loader
- At run time, by application programs
Linkers play a crucial role in software development because they enable separate compilation. We can separate a huge software project into different modules. When we change one of these modules, we simply recompile it and relink the application, without having to recompile the other files.
Why bother learning about linking?
- Understanding linkers will help you build large programs
- Understanding linkers will help you avoid dangerous programming errors
- Understanding linking will help you understand how language scoping rules are implemented.
- Understanding linking will help you understand other important systems concepts
- Understanding linking will enable you to exploit shared libraries
This chapter provides a thorough discussion of all aspects of linking, from traditional static linking, to dynamic linking of shared libraries at load time, to dynamic linking of shared libraries at run time.
7.1 Compiler Drivers
Most compilation systems provide a compiler driver that invokes the language preprocessor, compiler, assembler, and linker, as needed, on behalf of the user.
Let’s look at a concrete example:
1 | //source of main.c |
1 | //source of sum.c |
To compile these two C source files into an executable file, we simply use gcc like this:
$ gcc -Og -o program main.c sum.c
In fact, a lot happens silently. Let’s dig deeper:
$ gcc -Og -o program main.c sum.c --verbose 2>compilelog
(In some versions of gcc, the preprocessor cpp is integrated into the compiler cc1, so it does not appear as a separate step.)
$ /usr/lib/gcc/x86_64-linux-gnu/7/cc1 -quiet -v -imultiarch x86_64-linux-gnu main.c -quiet -dumpbase main.c -mtune=generic -march=x86-64 -auxbase main -Og -version -fstack-protector-strong -Wformat -Wformat-security -o /tmp/cc4yMXEf.s
$ as -v --64 -o /tmp/ccJm9qgp.o /tmp/cc4yMXEf.s
$ /usr/lib/gcc/x86_64-linux-gnu/7/cc1 -quiet -v -imultiarch x86_64-linux-gnu sum.c -quiet -dumpbase sum.c -mtune=generic -march=x86-64 -auxbase sum -Og -version -fstack-protector-strong -Wformat -Wformat-security -o /tmp/cc4yMXEf.s
$ as -v --64 -o /tmp/ccCtbZSy.o /tmp/cc4yMXEf.s
$ /usr/lib/gcc/x86_64-linux-gnu/7/collect2
7.2 Static Linking
Static linkers such as the Linux ld program take as input a collection of relocatable object files and command-line arguments and generate as output a fully linked executable object file that can be loaded and run.
To build the executable, the linker must perform two main tasks:
Symbol resolution
Object files define and reference symbols, where each symbol corresponds to a function, a global variable, or a static variable.
The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition
Relocation
Compilers and assemblers generate code and data sections that start at address 0
The linker relocates these sections by associating a memory location with each symbol definition, and then modifying all of the references to those symbols so that they point to this memory location.
The linker blindly performs these relocations using detailed instructions, generated by the assembler, called relocation entries.
7.3 Object Files
Object files come in three forms:
Relocatable object file.
Contains binary code and data in a form that can be combined with other relocatable object files at compile time to create an executable object file.
Executable object file
Contains binary code and data in a form that can be copied directly into memory and executed
Shared object file
A special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time
Compilers and assemblers generate relocatable object files
Linkers generate executable object files.
Object files are organized according to specific object file formats, which vary from system to system.
- The first Unix systems from Bell Labs used the a.out format.
- Windows uses the Portable Executable (PE) format.
- Mac OS-X uses the Mach-O format.
- Modern x86-64 Linux and Unix systems use the Executable and Linkable Format (ELF).
7.4 Relocatable Object Files
The ELF header begins with a 16-byte sequence that describes the word size and byte ordering of the system that generated the file
# -h stands for --file-header
$ readelf -h sum.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 528 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 11
Section header string table index: 10
The rest of the ELF header contains information that allows a linker to parse and interpret the object file.
- size of the ELF header
- the object file type (e.g., relocatable, executable, or shared)
- the machine type (e.g., x86-64)
- ….
The locations and sizes of the various sections are described by the section header table, which contains a fixed-size entry for each section in the object file.
# -S stands for --section-headers
$ readelf -S sum.o
There are 11 section headers, starting at offset 0x210:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
000000000000001b 0000000000000000 AX 0 0 1
[ 2] .data PROGBITS 0000000000000000 0000005b
0000000000000000 0000000000000000 WA 0 0 1
[ 3] .bss NOBITS 0000000000000000 0000005b
0000000000000000 0000000000000000 WA 0 0 1
[ 4] .comment PROGBITS 0000000000000000 0000005b
000000000000002a 0000000000000001 MS 0 0 1
[ 5] .note.GNU-stack PROGBITS 0000000000000000 00000085
0000000000000000 0000000000000000 0 0 1
[ 6] .eh_frame PROGBITS 0000000000000000 00000088
0000000000000030 0000000000000000 A 0 0 8
[ 7] .rela.eh_frame RELA 0000000000000000 000001a0
0000000000000018 0000000000000018 I 8 6 8
[ 8] .symtab SYMTAB 0000000000000000 000000b8
00000000000000d8 0000000000000018 9 8 8
[ 9] .strtab STRTAB 0000000000000000 00000190
000000000000000b 0000000000000000 0 0 1
[10] .shstrtab STRTAB 0000000000000000 000001b8
0000000000000054 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
A typical ELF relocatable object file contains the following sections:
.text
The machine code of the compiled program.
.rodata
Read-only data such as the format strings in printf statements, and jump tables for switch statements.
.data
==Initialized== global and static C variables. Local C variables are maintained at run time on the stack and do not appear in either the .data or .bss sections.
.bss
Uninitialized global and static C variables, along with any global or static variables that are initialized to zero. This section occupies no actual space in the object file: uninitialized variables do not have to occupy any disk space. At run time, these variables are allocated in memory with an initial value of zero.
.symtab
A symbol table with information about functions, static local variables, and global variables (but NOT nonstatic local variables) that are defined and referenced in the program. Every relocatable object file has a symbol table in .symtab, unless the programmer has specifically removed it with the strip command.
# -s stands for --symbols
$ readelf -s sum.o
Symbol table '.symtab' contains 9 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS sum.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 2
4: 0000000000000000 0 SECTION LOCAL DEFAULT 3
5: 0000000000000000 0 SECTION LOCAL DEFAULT 5
6: 0000000000000000 0 SECTION LOCAL DEFAULT 6
7: 0000000000000000 0 SECTION LOCAL DEFAULT 4
8: 0000000000000000 27 FUNC GLOBAL DEFAULT 1 sum
# -R stands for remove
$ strip -R symtab -o stripped_sum.o sum.o
$ readelf -s stripped_sum.o
$ readelf -S stripped_sum.o
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
000000000000001b 0000000000000000 AX 0 0 1
[ 2] .data PROGBITS 0000000000000000 0000005b
0000000000000000 0000000000000000 WA 0 0 1
[ 3] .bss NOBITS 0000000000000000 0000005b
0000000000000000 0000000000000000 WA 0 0 1
[ 4] .comment PROGBITS 0000000000000000 0000005b
000000000000002a 0000000000000001 MS 0 0 1
[ 5] .note.GNU-stack PROGBITS 0000000000000000 00000085
0000000000000000 0000000000000000 0 0 1
[ 6] .eh_frame PROGBITS 0000000000000000 00000088
0000000000000030 0000000000000000 A 0 0 8
[ 7] .shstrtab STRTAB 0000000000000000 000000b8
000000000000003f 0000000000000000 0 0 1
.rel.text
A list of locations in the .text section that will need to be modified when the linker combines this object file with others. In general, any instruction that calls an external function or references a global variable will need to be modified.
# -r stands for --relocs
$ readelf -r sum.o
Relocation section '.rela.eh_frame' at offset 0x1a0 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000020 000200000002 R_X86_64_PC32 0000000000000000 .text + 0
.rel.data
Relocation information for any global variables that are referenced or defined by the module. In general, any initialized global variable whose initial value is the address of a global variable or externally defined function will need to be modified.
.debug
A debugging symbol table with entries for local variables and typedefs defined in the program, global variables defined and referenced in the program, and the original C source file.
.line
A mapping between line numbers in the original C source program and machine code instructions in the .text section.
The .debug and .line sections are only present if the compiler driver is invoked with the -g option.
.strtab
This section holds strings, most commonly the strings that represent the names associated with symbol table entries. A string table is a sequence of null-terminated character strings.
If you want to know more about ELF: man elf
The use of the term .bss to denote uninitialized data is universal. It was originally an acronym for the “block started by symbol” directive from the IBM 704 assembly language (circa 1957) and the acronym has stuck. A simple way to remember the difference between the .data and .bss sections is to think of “bss” as an abbreviation for “Better Save Space!”
7.5 Symbols and Symbol Tables
Each relocatable object module, m, has a symbol table that contains information about the symbols that are defined and referenced by m.
In the context of a linker, there are three different kinds of symbols:
Global symbols that are defined by module m and that can be referenced by other modules.
Global linker symbols correspond to nonstatic C functions and global variables
Global symbols that are referenced by module m but defined by some other module.
Such symbols are called externals and correspond to nonstatic C functions and global variables that are defined in other modules.
Local symbols that are defined and referenced exclusively by module m. These correspond to static C functions and global variables that are defined with the static attribute. These symbols are visible anywhere within module m, but cannot be referenced by other modules.
For example
1 | //source of fun.c |
1 | $ readelf -s fun.o |
Notice that each static variable is given a unique name, such as x.1795 and x.1800; they are different variables!
C programmers use the static attribute to hide variable and function declarations inside modules, much as you would use public and private declarations in Java and C++. In C, source files play the role of modules. Any global variable or function declared with the static attribute is private to that module. Similarly, any global variable or function declared without the static attribute is public and can be accessed by any other module. It is good programming practice to protect your variables and functions with the static attribute wherever possible.
An ELF symbol table is contained in the .symtab section. It contains an array of entries:
1 | typedef struct { |
The name is a byte offset into the string table that points to the null-terminated string name of the symbol. (For relocatable modules, the value is an offset from the beginning of the section where the object is defined.)
$ readelf -p .strtab fun.o
String dump of section '.strtab':
[ 1] fun.c
[ 7] staticlocal.1796
[ 18] x.1795
[ 1f] x.1800
[ 26] globalvariable
[ 35] f
[ 37] g
$ readelf -x .symtab fun.o
.....
0x00000130 0e000000 00000000 37000000 12000100
0x00000140 0e000000 00000000 0e000000 00000000
.....
# beware that the data is stored in little-endian byte order
# int name = 0x00000037; (g)
# char typebinding = 0b00010010; (FUNC GLOBAL)
# char reserved = 0x00;
# short section = 0x0001; (in .text section)
# long value = 0x000000000000000e; (000000000000000e)
# long size = 0x000000000000000e; (14 bytes)
Each symbol is assigned to some section of the object file, denoted by the section field, which is an index into the section header table.
There are three special pseudosections that don’t have entries in the section header table:
- ABS (absolute) is for symbols that should not be relocated.
- UNDEF is for undefined symbols, that is, symbols that are referenced in this object module but defined elsewhere.
- COMMON is for uninitialized data objects that are not yet allocated.
For COMMON symbols, the value field gives the alignment requirement, and size gives the minimum size.
Note that these pseudosections exist only in relocatable object files; they do not exist in executable object files.
The distinction between COMMON and .bss is subtle. Modern versions of gcc assign symbols in relocatable object files to COMMON and .bss using the following convention:
- COMMON: Uninitialized global variables
- .bss: Uninitialized static variables, and global or static variables that are initialized to zero
The distinction stems from the way the linker performs symbol resolution
Practice Problem 7.1
This problem concerns the m.o and swap.o modules from the following code. For each symbol that is defined or referenced in swap.o, indicate whether or not it will have a symbol table entry in the .symtab section in module swap.o. If so, indicate the module that defines the symbol (swap.o or m.o), the symbol type (local, global, or extern), and the section (.text, .data, .bss, or COMMON) it is assigned to in the module.
Symbol | .symtab entry? | Symbol type | Module where defined | Section |
---|---|---|---|---|
buf | | | | |
bufp0 | | | | |
bufp1 | | | | |
swap | | | | |
temp | | | | |
//source of swap.c
extern int buf[];
int *bufp0 = &buf[0];
int *bufp1;
void swap()
{
    int temp;
    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}
//source of m.c
void swap();
int buf[2] = {1, 2};
int main()
{
    swap();
    return 0;
}
My solution : :white_check_mark:
Symbol | .symtab entry? | Symbol type | Module where defined | Section |
---|---|---|---|---|
buf | YES | Extern | m.o | UNDEF |
bufp0 | YES | Global | swap.o | .data |
bufp1 | YES | Global | swap.o | COMMON |
swap | YES | Global | swap.o | .text |
temp | NO | local | swap.o | NO |
Verification:
1 | $ readelf -s swap.o |
7.6 Symbol Resolution
The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files
The compiler allows only one definition of each local symbol per module.
The compiler also ensures that static local variables, which get local linker symbols, have unique names (x.1780 and x.1775, for example).
When the compiler encounters a symbol (either a variable or function name) that is not defined in the current module, it assumes that it is defined in some other module, generates a linker symbol table entry, and leaves it for the linker to handle.
If the linker is unable to find a definition for the referenced symbol in any of its input modules, it prints an error message and terminates.
For instance:
void fun(void);
int main(){
    fun();
    return 0;
}
$ gcc -o error error.c
/tmp/cc10TnoD.o: In function `main':
error.c:(.text+0x5): undefined reference to `fun'
collect2: error: ld returned 1 exit status
Multiple object modules might define global symbols with the same name. In this case, the linker must either report an error or choose one of the definitions and discard the rest.
7.6.1 How Linkers Resolve Duplicate Symbol Names
Here is the approach that Linux compilation systems use.
At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file. Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.
Linux linkers use the following rules for dealing with duplicate symbol names:
- Rule 1. Multiple strong symbols with the same name are not allowed
- Rule 2. Given a strong symbol and multiple weak symbols with the same name, choose the strong symbol.
- Rule 3. Given multiple weak symbols with the same name, choose any of the weak symbols. (Which one is chosen is unpredictable, so treat it like undefined behavior!)
Many bugs can arise from sharing global variables between modules, which is why some people consider global variables a bad idea. Be careful when you use them! Consider compiling with the following options:
- -fno-common: report an error when the linker encounters multiply-defined global symbols
- -Werror: turn all warnings into errors
- -Wall: enable a broad set of common warnings
This is also the reason for the distinction between COMMON and .bss:
- When the compiler encounters an uninitialized static variable, it can confidently say that this is the definition, so it puts the symbol in .bss.
- When the compiler encounters a global symbol that is initialized to 0, it knows that it is a strong symbol and puts it in .bss.
- Only when the compiler encounters an uninitialized global symbol can it not guarantee that this is the definition, since the symbol is weak, so it puts it in COMMON.
Practice Problem 7.2
In this problem, let REF(x.i) -> DEF(x.k) denote that the linker will associate an arbitrary reference to symbol x in module i to the definition of x in module k. For each example that follows, use this notation to indicate how the linker would resolve references to the multiply-defined symbol in each module.
If there is a link-time error (rule 1), write “error”. If the linker arbitrarily chooses one of the definitions (rule 3), write “unknown”.
//A
//Module 1
int main(){}
//Module 2
int main;
int p2(){}
/*
(a)REF(main.1) -> DEF(______)
(b)REF(main.2) -> DEF(______)
*/
//B
//Module 1
void main(){}
//Module 2
int main = 1;
int p2(){}
/*
(a)REF(main.1) -> DEF(_____)
(b)REF(main.2) -> DEF(_____)
*/
//C
//Module 1
int x;
void main(){}
//Module 2
double x = 1.0;
int p2(){}
/*
(a)REF(x.1) -> DEF(_____)
(b)REF(x.2) -> DEF(_____)
*/
My solution : :white_check_mark:
1 | A: |
7.6.2 Linking with Static Libraries
In practice, all compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker
When it builds the output executable, the linker copies only the object modules in the library that are referenced by the application program.
Library functions can be compiled into separate object modules and then packaged in a single static library file.
Application programs can then use any of the functions defined in the library by specifying a single filename on the command line
# program uses the C standard library
$ gcc main.c /usr/lib/libc.a
At link time, the linker will only copy the object modules that are referenced by the program, which reduces the size of the executable on disk and in memory.
On Linux systems, static libraries are stored on disk in a particular file format known as an archive.
- An archive is a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file.
- Archive filenames are denoted with the .a suffix.
Let’s make our own little static library for illustration
1 | //source of addvec.c |
1 | //source of mulvec.c |
1 | $ gcc -c addvec.c mulvec.c |
1 | //source of vector.h |
1 | //source of main.c |
1 | $ gcc -c main.c |
- The -static argument tells the compiler driver that the linker should build a fully linked executable object file that can be loaded into memory and run without any further linking at load time.
- Since the program doesn’t reference any symbols defined by mulvec.o, the linker does not copy this module into the executable.
- The linker also copies the printf.o module from libc.a, along with a number of other modules from the C run-time system.
7.6.3 How Linkers Use Static Libraries to Resolve References
During the symbol resolution phase, the linker scans the relocatable object files and archives left to right in the same sequential order that they appear on the compiler driver’s command line.
During this scan, the linker maintains
- a set E of relocatable object files that will be merged to form the executable,
- a set U of unresolved symbols (i.e., symbols referred to but not yet defined),
- a set D of symbols that have been defined in previous input files.
- Initially, E, U, and D are empty.
1 | # scan process |
This algorithm can result in some baffling link-time errors because the ordering of libraries and object files on the command line is significant.
If the library that defines a symbol appears on the command line before the object file that references that symbol, then the reference will not be resolved and linking will fail. (Libraries should always come after the object files that use them.)
For example
$ gcc -static -o prog ./libvec.a main.o
main.o: In function `main':
main.c:(.text+0x1f): undefined reference to `addvec'
collect2: error: ld returned 1 exit status
# U is initially empty, so if a library comes first, no object in the archive is added to E.
# The reference to `addvec` then cannot be resolved later.
Libraries can be repeated on the command line if necessary to satisfy the dependence requirements.
(Object files do not need to be repeated, since they are added to E as soon as they appear.)
Practice Problem 7.3
Let a and b denote object modules or static libraries in the current directory, and let a→b denote that a depends on b, in the sense that b defines a symbol that is referenced by a. For each of the following scenarios, show the minimal command line (i.e., one with the least number of object file and library arguments) that will allow the static linker to resolve all symbol references.
A. p.o → libx.a
B. p.o → libx.a → liby.a
C. p.o → libx.a → liby.a and liby.a → libx.a → p.o
My solution : :white_check_mark:
1 | # A |
7.7 Relocation
After symbol resolution, the linker knows the exact sizes of the code and data sections in its input object modules. It is now ready to begin the relocation step, where it merges the input modules and assigns run-time addresses to each symbol
Relocation consists of two steps
- Relocating sections and symbol definitions
- The linker merges all sections of the same type into a new aggregate section of the same type
- The linker then assigns run-time memory addresses to the new aggregate sections, to each section defined by the input modules, and to each symbol defined by the input modules
- Relocating symbol references within sections
- the linker modifies every symbol reference in the bodies of the code and data sections so that they point to the correct run-time addresses.
- To perform this step, the linker relies on data structures in the relocatable object modules known as relocation entries, which we describe next.
7.7.1 Relocation Entries
Whenever the assembler encounters a reference to an object whose ultimate location is unknown, it generates a relocation entry that tells the linker how to modify the reference when it merges the object file into an executable
Relocation entries for code are placed in .rel.text. Relocation entries for data are placed in .rel.data.
1 | typedef struct { |
1 | $ readelf -r main.o |
- The offset is the section offset of the reference that will need to be modified.
- The symbol identifies the symbol that the modified reference should point to.
- The type tells the linker how to modify the new reference.
- The addend is a signed constant that is used by some types of relocations to bias the value of the modified reference.
ELF defines 32 different relocation types. We will see two of them:
R_X86_64_PC32
- Relocate a reference that uses a 32-bit PC-relative address
R_X86_64_32
- Relocate a reference that uses a 32-bit absolute address
7.7.2 Relocating Symbol References
Relocation algorithm
1 | foreach section s { |
Relocating PC-Relative References
1 | # sum here is a reference |
The assembler generates a relocation entry for sum in the .rela.text section.
Later, when the linker executes the relocation algorithm, it:
- Locates the reference to sum that must be patched, at offset r.offset within section s (its run-time address is refaddr = ADDR(s) + r.offset)
refptr = s + r.offset;
- Patches the reference with the PC-relative offset to the symbol
*refptr = (unsigned) (ADDR(r.symbol) + r.addend - refaddr);
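- The symbol’s run-time address, ADDR(r.symbol), is chosen by the linker and recorded in the value field of the symbol’s entry (Elf64_Symbol.value) in the executable’s symbol table
1 | Symbol table '.symtab' contains 12 entries: |
- After relocation, every call to sum reaches it through the patched PC-relative offset; .symtab is not consulted at run time
Relocating Absolute References
Almost the same as the PC-relative case, except that the linker writes ADDR(r.symbol) + r.addend directly into the reference, without subtracting refaddr:
*refptr = (unsigned) (ADDR(r.symbol) + r.addend);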
7.8 Executable Object Files
Our example C program, which began life as a collection of ASCII text files, has been transformed into a single binary file that contains all of the information needed to load the program into memory and run it.
- The format of an executable object file is similar to that of a relocatable object file (but NOT identical!).
- It includes the program’s entry point, which is the address of the first instruction to execute when the program runs.
- The .init section defines a small function, called _init, that will be called by the program’s initialization code.
- Since the executable is fully linked (relocated), it needs no .rel sections.
ELF executables are designed to be easy to load into memory, with contiguous chunks of the executable file mapped to contiguous memory segments.
This mapping is described by the program header table.
For any segment s in the object file, the linker must choose a starting address, vaddr, such that
vaddr % align == off % align
where off is the offset of the segment’s first section in the object file, and align is the alignment specified in the program header ($2^{21} = (200000)_{16}$).
This alignment requirement is an optimization that enables segments in the object file to be transferred efficiently to memory when the program executes
Both readelf and objdump are useful:
1 | # -l stand for --program-headers |
7.9 Loading Executable Object Files
If you want to run a program that is not a shell built-in, you type /path/program to run it.
The shell runs it for us by invoking some memory-resident operating system code known as the loader.
- Any Linux program can invoke the loader by calling the execve function.
- The loader copies the code and data in the executable object file from disk into memory and then runs the program by jumping to its first instruction, or entry point.
- When the loader runs, it creates a memory image.
- Guided by the program header table, it copies chunks of the executable object file into the code and data segments.
- Next, the loader jumps to the program’s entry point, which is always the address of the _start function.
  - This function is defined in the system object file crt1.o and is the same for all C programs.
  - The _start function calls the system startup function, __libc_start_main, which is defined in libc.so.
  - It initializes the execution environment, calls the user-level main function, handles its return value, and if necessary returns control to the kernel.
7.10 Dynamic Linking with Shared Libraries
Static libraries still have some significant disadvantages
- Static libraries, like all software, need to be maintained and updated periodically(relinking required in every update)
- Almost every C program uses standard I/O functions. At run time, the code for these functions is duplicated in the text segment of each running process
Shared libraries are modern innovations that address the disadvantages of static libraries.
A shared library is an object module that, at either run time or load time, can be loaded at an arbitrary memory address and linked with a program in memory.
- This process is known as dynamic linking and is performed by a program called a dynamic linker.
- Shared libraries are also referred to as shared objects, and on Linux systems they are indicated by the .so suffix.
Shared libraries are “shared” in two different ways.
- In any given file system, there is exactly one .so file for a particular library.
  - The code and data in this .so file are shared by all of the executable object files that reference the library, as opposed to the contents of static libraries, which are copied and embedded in the executables that reference them.
- A single copy of the .text section of a shared library in memory can be shared by different running processes.
A concrete example:
$ gcc -shared -fpic -o libvector.so addvec.c mulvec.c
- The -fpic flag directs the compiler to generate position-independent code.
- The -shared flag directs the linker to create a shared object file.
$ gcc -o prog2l main2.c ./libvector.so
The basic idea is to do some of the linking statically when the executable file is created, and then complete the linking process dynamically when the program is loaded.
The linker ONLY copies some relocation and symbol table information that will allow references to code and data in libvector.so to be resolved at load time
Execution
- When the loader loads and runs the executable prog2l, it loads the partially linked executable.
- Next, it notices that prog2l contains a .interp section, which contains the path name of the dynamic linker (e.g., ld-linux.so on Linux systems).
- The loader loads and runs the dynamic linker.
- The dynamic linker then finishes the linking task by performing the following relocations:
  - Relocating the text and data of libc.so into some memory segment
  - Relocating the text and data of libvector.so into another memory segment
  - Relocating any references in prog2l to symbols defined by libc.so and libvector.so
- Finally, the dynamic linker passes control to the application.
7.11 Loading and Linking Shared Libraries from Applications
It is also possible for an application to request the dynamic linker to load and link arbitrary shared libraries while the application is running.
Linux systems provide a simple interface to the dynamic linker that allows application programs to load and link shared libraries at run time.
1 | void *dlopen(const char *filename, int flag); |
The dlopen function loads and links the shared library filename. The external symbols in filename are resolved using libraries previously opened with the RTLD_GLOBAL flag. If the current executable was compiled with the -rdynamic flag, then its global symbols are also available for symbol resolution. The flag argument must include either RTLD_NOW, which tells the linker to resolve references to external symbols immediately, or the RTLD_LAZY flag, which instructs the linker to defer symbol resolution until code from the library is executed. Either of these values can be ORed (|) with the RTLD_GLOBAL flag.
1 | void *dlsym(void *handle, const char *symbol); |
The dlsym function takes a handle to a previously opened shared library and a symbol name and returns the address of the symbol, if it exists, or NULL otherwise.
1 | int dlclose(void *handle); |
The dlclose function unloads the shared library if no other shared libraries are still using it.
1 |
|
An example of using this interface to dynamically link the libvector.so shared library at run time and then invoke its addvec routine:
1 |
|
1 | $ gcc -rdynamic -o prog2r dll.c -ldl |
7.12 Position-Independent Code (PIC)
A key purpose of shared libraries is to allow multiple running processes to share the same library code in memory and thus save precious memory resources
Modern systems compile the code segments of shared modules so that they can be loaded anywhere in memory without having to be modified by the linker.
Users direct GNU compilation systems to generate PIC code with the -fpic option to gcc. Shared libraries must always be compiled with this option.
PIC Data References
Important fact: no matter where we load an object module (including shared object modules) in memory, the data segment is always the same distance from the code segment.
- Compilers that want to generate PIC references to global variables exploit this fact by creating a table called the global offset table (GOT) at the beginning of the data segment.
- The GOT contains an 8-byte entry for each global data object (procedure or global variable) that is referenced by the object module.
- The compiler also generates a relocation record for each entry in the GOT.
- At load time, the dynamic linker relocates each GOT entry so that it contains the absolute address of the object
PIC Function Calls
The compiler has no way of predicting the run-time address of the function, since the shared module could be loaded anywhere at run time.
GNU compilation systems solve this problem using an interesting technique, called lazy binding, that defers the binding of each procedure address until the first time the procedure is called.
Lazy binding is implemented with a compact yet somewhat complex interaction between two data structures: the GOT and the procedure linkage table (PLT).
- The GOT is part of the data segment. The PLT is part of the code segment.
- The PLT is an array of 16-byte code entries.
  - PLT[0] is a special entry that jumps into the dynamic linker.
  - PLT[1] invokes the system startup function (__libc_start_main), which initializes the execution environment, calls the main function, and handles its return value.
  - Entries starting at PLT[2] invoke functions called by the user code.
- When used in conjunction with the PLT, GOT[0] and GOT[1] contain information that the dynamic linker uses when it resolves function addresses.
  - GOT[2] is the entry point for the dynamic linker in the ld-linux.so module.
  - Each of the remaining entries corresponds to a called function whose address needs to be resolved at run time. Each has a matching PLT entry.
- Initially, each GOT entry points to the second instruction in the corresponding PLT entry.
- Instead of directly calling addvec, the program calls into PLT[2], which is the PLT entry for addvec.
- The first PLT instruction does an indirect jump through GOT[4]. Since each GOT entry initially points to the second instruction in its corresponding PLT entry, the indirect jump simply transfers control back to the next instruction in PLT[2].
- After pushing an ID for addvec (0x1) onto the stack, PLT[2] jumps to PLT[0].
- PLT[0] pushes an argument for the dynamic linker indirectly through GOT[1] and then jumps into the dynamic linker indirectly through GOT[2].
- The dynamic linker uses the two stack entries to determine the runtime location of addvec, overwrites GOT[4] with this address, and passes control to addvec.
Most of the work is done by the dynamic linker at run time, including filling in the GOT.
7.13 Library Interpositioning
Linux linkers support a powerful technique, called library interpositioning, that allows you to intercept calls to shared library functions and execute your own code instead.
Using interpositioning, you could trace the number of times a particular library function is called, validate and trace its input and output values, or even replace it with a completely different implementation.
Basic idea: Given some target function to be interposed on, you create a wrapper function whose prototype is identical to the target function.
Using some particular interpositioning mechanism, you then trick the system into calling the wrapper function instead of the target function. The wrapper function typically executes its own logic, then calls the target function and passes its return value back to the caller.
Interpositioning can occur at compile time, link time, or run time as the program is being loaded and executed.
Example program
1 | //int.c |
1 | //malloc.h |
1 | //mymalloc.c |
7.13.1 Compile-Time Interpositioning
Use the C preprocessor to interpose at compile time.
The local malloc.h header file instructs the preprocessor to replace each call to a target function with a call to its wrapper.
1 | $ gcc -DCOMPILETIME -c mymalloc.c |
Notice that the wrappers in mymalloc.c are compiled with the standard malloc.h header file (no -I option specified).
7.13.2 Link-Time Interpositioning
The Linux static linker supports link-time interpositioning with the --wrap f flag
- This flag tells the linker to resolve references to symbol f as __wrap_f
- and to resolve references to symbol __real_f as f.
1 | $ gcc -D LINKTIME -c mymalloc.c |
- The -Wl,option flag passes option to the linker.
- Each comma in option will be replaced with a space.
- So -Wl,--wrap,malloc passes --wrap malloc to the linker
7.13.3 Run-Time Interpositioning
Compile-time interpositioning requires access to a program’s source files. Link-time interpositioning requires access to its relocatable object files.
However, there is a mechanism for interpositioning at run time that requires access only to the executable object file
This fascinating mechanism is based on the dynamic linker’s LD_PRELOAD environment variable.
If the LD_PRELOAD environment variable is set to a list of shared library pathnames (separated by spaces or colons), then when you load and execute a program, the dynamic linker (ld-linux.so) will search the LD_PRELOAD libraries first, before any other shared libraries, when it resolves undefined references.
1 | //mymalloc.c |
1 | $ gcc -D RUNTIME -shared -fpic -o mymalloc.so mymalloc.c -ldl |
Notice that you can use LD_PRELOAD to interpose on the library calls of any executable program!
1 | $ LD_PRELOAD="./mymalloc.so" /usr/bin/uptime |
7.14 Tools for Manipulating Object Files
- ar: Creates static libraries, and inserts, deletes, lists, and extracts members.
- strings: Lists all of the printable strings contained in an object file.
- strip: Deletes symbol table information from an object file.
- nm: Lists the symbols defined in the symbol table of an object file.
- size: Lists the names and sizes of the sections in an object file.
- readelf: Displays the complete structure of an object file, including all of the information encoded in the ELF header. Subsumes the functionality of size and nm.
- objdump: The mother of all binary tools. Can display all of the information in an object file.
- ldd: Lists the shared libraries that an executable needs at run time.
7.15 Summary
Library interposition is very powerful