Linking
#lecture note based on 15-213 Introduction to Computer Systems
“The most useless useful lecture… Almost nothing in this lecture matters for the course… However, this is the most useful lecture for you as a computer scientist”
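Linking is what happens when we want to compile multiple files together into one program.
gcc: the translators (preprocessor, compiler, assembler) turn each source file into a relocatable object file; the linker takes the object files and combines them into an executable object file.
Linking is the last step of the compile process: preprocessing, compilation, assembly, then linking.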
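As a rough sketch of that pipeline, the module below could be one of several source files. The file names sum.c, main.c and prog are hypothetical, and the commands in the comment are the usual gcc invocations for separate compilation, not something taken from the lecture.

```c
/* sum.c (hypothetical module): translated on its own into sum.o.
 *
 * Assumed build steps, with a hypothetical main.c that calls sum():
 *   gcc -c sum.c  -o sum.o      translate sum.c into a relocatable object file
 *   gcc -c main.c -o main.o     translate main.c separately
 *   gcc sum.o main.o -o prog    link both object files into an executable
 */

/* Adds up the first n elements of a. */
int sum(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Only the last command involves the linker; the first two stop after producing object files.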
H2 Why?
- Modularity
  - Break programs into smaller files
  - Header files: list the signatures of functions without providing the implementations
  - #include <thing.h> simply copies thing.h into the source (equivalent to putting the signatures in the .c source file: it tells the compiler that the function exists somewhere, and the linker will figure out where)
- Efficiency
  - Separate compilation of different modules: after a change, recompile only the affected files and relink, which saves time
  - Concurrent compilation of different source files
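The header-file point above can be sketched in a single listing. The names thing.h, thing.c, caller.c, square and sum_of_squares are made up for illustration; in a real project the three parts would live in separate files.

```c
/* --- contents of a hypothetical thing.h --- */
int square(int x);   /* declaration (signature) only, no body */

/* --- a hypothetical caller.c, after #include "thing.h" was expanded --- */
int sum_of_squares(int a, int b)
{
    /* The compiler only needs the declaration above to type-check these
       calls; the linker later connects them to the definition. */
    return square(a) + square(b);
}

/* --- a hypothetical thing.c, compiled separately in a real project --- */
int square(int x)
{
    return x * x;
}
```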
H2 Linking methods
- Static linking - the executable includes a copy of the library code that it uses
- Dynamic linking - the executable does not include the actual library code. The library is shared across multiple executables, and a new version of the library can be loaded without recompiling the executable
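To make the two methods concrete, here is a small library routine with the build commands for each method in the comment. The names vecadd, libvec.a, libvec.so, main.c and prog are hypothetical; the commands are a sketch of a typical gcc and ar workflow on Linux.

```c
/* vecadd.c: a hypothetical library routine.
 *
 * Static linking (assumed commands):
 *   gcc -c vecadd.c
 *   ar rcs libvec.a vecadd.o           archive of relocatable object files
 *   gcc main.c libvec.a -o prog        the needed code is copied into prog
 *
 * Dynamic linking (assumed commands):
 *   gcc -shared -fPIC vecadd.c -o libvec.so
 *   gcc main.c ./libvec.so -o prog     prog only records that it depends on
 *                                      libvec.so; the code is mapped in when
 *                                      the program is loaded
 */

/* Element-wise sum: z[i] = x[i] + y[i] for i in [0, n). */
void vecadd(const int *x, const int *y, int *z, int n)
{
    for (int i = 0; i < n; i++) {
        z[i] = x[i] + y[i];
    }
}
```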
H2 Linking process
H6 Linker’s jobs
- Resolve symbols
  - Look up each symbol reference in the symbol tables and associate it with exactly one definition
- Relocate code
  - Merge separate code and data sections
  - Change relative locations in object files into absolute locations in the executable
  - Update references to point to the right locations
Assembler and linker
- The assembler builds the symbol table (name, size, location); the linker associates each symbol reference with its definition.
- The assembler leaves placeholders in the machine code; the linker fills in the actual memory addresses after rearranging the pieces.
- The linker expects a main function somewhere: it includes a program-startup object file that calls main, and main itself is presumably defined in one of the object files assembled from the source.
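The division of labour above can be modelled with a toy linker. This is a deliberately simplified sketch, not real ELF handling: the types toy_symbol and toy_reloc, the functions toy_resolve and toy_link, and the idea of code as an array of words are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy model, not real ELF: a symbol definition with its final address. */
struct toy_symbol {
    const char *name;
    unsigned    address;   /* address assigned after the pieces are merged */
};

/* Toy relocation entry: "patch code[offset] with the address of name". */
struct toy_reloc {
    size_t      offset;
    const char *name;
};

/* Step 1, symbol resolution: find the definition for a referenced name. */
static const struct toy_symbol *
toy_resolve(const struct toy_symbol *syms, size_t nsyms, const char *name)
{
    for (size_t i = 0; i < nsyms; i++) {
        if (strcmp(syms[i].name, name) == 0) {
            return &syms[i];
        }
    }
    return NULL;   /* undefined reference */
}

/* Step 2, relocation: overwrite each placeholder with the resolved address.
   Returns 0 on success, -1 if some referenced symbol has no definition. */
int toy_link(unsigned *code, const struct toy_reloc *relocs, size_t nrelocs,
             const struct toy_symbol *syms, size_t nsyms)
{
    for (size_t i = 0; i < nrelocs; i++) {
        const struct toy_symbol *def = toy_resolve(syms, nsyms, relocs[i].name);
        if (def == NULL) {
            return -1;
        }
        code[relocs[i].offset] = def->address;
    }
    return 0;
}
```

A real linker also merges sections and computes the addresses itself; here the addresses are simply given in the symbol table to keep the sketch short.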
Object file types
- .o - relocatable object file: code and data from exactly one source .c file
- a.out - executable object file: code and data in a form that can be executed directly
- .so - shared object file: a special type of relocatable object, set up to be linked dynamically at load time or run time (called a DLL, dynamic link library, on Windows)
- ELF - Executable and Linkable Format, the object file format on Linux. The magic bytes 0x7f 'E' 'L' 'F' (shown as .ELF) are the first 4 bytes of the file
The linker looks at all the files and figures out what needs to be brought in from the shared object (.so) file
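The ELF signature is easy to check by hand. The magic number is the four bytes 0x7f 'E' 'L' 'F' at the very start of the file; the helper name has_elf_magic below is hypothetical, and it works on a buffer that the caller has already read from the file.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the 4-byte ELF magic number, 0 otherwise. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

The same test applies to .o, executable and .so files, since all three use the ELF layout on Linux.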
H3 ELF format
- ELF header (starts at offset 0 of the file)
- Segment header table - page size, segment sizes, virtual addresses of the memory segments
- .text - code
- .rodata - read-only data: jump tables, string constants
- .data - initialised global vars (that are not 0)
- .bss - “block started by symbol / better save space” - uninitialised global vars or global vars initialised to 0; occupies no space in the file
- .symtab - symbol table
  - procedure and static var names
  - section names + locations
- .rel.text - relocation info for .text
  - addresses of instructions that need to be changed, and instructions for making the change
- .rel.data - relocation info for .data
  - similar to the previous, but for the .data section
- .debug - info for debugging
- Section header table - offsets + sizes of each section
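A short sketch of where typical C definitions end up. All names are made up, and the section placements are what gcc usually does on Linux; the exact result can vary with compiler version and flags.

```c
int initialised = 42;                   /* .data: global with a non-zero value */
int zeroed = 0;                         /* .bss: initialised to 0 */
int uninitialised;                      /* typically .bss (a common symbol
                                           under older gcc defaults) */
const int lookup[3] = {10, 20, 30};     /* .rodata: read-only data */

/* The machine code for this function goes in .text. */
int section_demo(void)
{
    return initialised + zeroed + uninitialised + lookup[0];
}
```

Assuming the file is saved as sections.c, running gcc -c sections.c followed by readelf -S sections.o and readelf -s sections.o should show the sections and the symbols placed in them.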
H3 Symbol types
- Global symbols - symbols defined by this module that can be referenced by other modules (non-static functions and non-static global vars)
- External symbols - global symbols referenced by this module but defined by another module
- Local symbols - symbols defined and referenced only within this module (like functions and globals declared static)
(The linker doesn't see non-static variables inside functions)
(The linker doesn't care about types!)
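The three kinds of symbol can appear together in one small module. The names below are hypothetical; strlen serves as the external symbol because it is declared in <string.h> but defined in the C library, that is, in another module.

```c
#include <string.h>   /* declares strlen; the definition lives in the C library */

int total_length = 0;      /* global symbol: other modules can reference it */
static int calls = 0;      /* local symbol: visible only inside this file */

/* Local symbol: a static function is private to this module. */
static int clamp(int value, int max)
{
    return value > max ? max : value;
}

/* Global symbol: a non-static function. */
int record(const char *s)
{
    /* strlen is an external symbol from this module's point of view.
       len is a plain stack variable, so the linker never sees it. */
    int len = (int)strlen(s);
    calls++;
    total_length += clamp(len, 100);
    return calls;
}
```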
H3 Name collisions?
static within function
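A variable declared as static type blah = ...; within a function behaves like a global that is only accessible within that function. It is stored in .bss or .data (whereas non-static locals go on the stack). If several functions declare a static local with the same name, the compiler gives each one a unique symbol name, so they do not collide.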
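A minimal sketch of the difference, using a hypothetical next_id function: the static local keeps its value between calls, while the ordinary local is recreated on every call.

```c
int next_id(void)
{
    static int counter = 0;   /* one copy for the whole program, kept between
                                 calls; zero-initialised, so typically in .bss */
    int step = 1;             /* ordinary local: lives on the stack per call */
    counter += step;
    return counter;
}
```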
duplicate symbols
The linker classifies symbols by strength:
- strong - defined and initialised (a var with a value or a func with a body)
- weak - uninitialised globals
- even weaker - extern declarations
Rules:
- Multiple strong symbols with the same name are not allowed (link error)
- One strong and one or more weak -> use the strong one
- Multiple weak -> pick an arbitrary one
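The rules can be illustrated with two hypothetical modules, foo.c and bar.c; the second is shown only as a comment so that the listing compiles on its own. One assumption matters here: the strong/weak behaviour for uninitialised globals depends on building with gcc -fcommon. Since GCC 10 the default is -fno-common, under which a duplicate uninitialised global is reported as a multiple-definition error instead.

```c
/* foo.c (hypothetical): holds the strong definitions. */
int limit = 100;              /* strong: initialised global */

int get_limit(void)           /* strong: function with a body */
{
    return limit;
}

/*
 * bar.c (hypothetical second module, shown as a comment only):
 *
 *     int limit;                        weak: uninitialised global
 *     void reset(void) { limit = 0; }   writes to foo.c's limit
 *
 * Strong plus weak: with -fcommon the linker binds bar.c's limit to the
 * strong definition above, so reset() silently changes foo.c's variable.
 *
 * Two strong: if bar.c said "int limit = 5;" instead, linking would fail
 * with a multiple-definition error.
 */
```

Because the linker does not check types, the same silent sharing would happen even if bar.c declared limit with a different type, which is one reason the behaviour is considered dangerous.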
Good practices:
- Avoid non-static globals
- Initialise globals to make them strong
- Put the declaration (with its type) in a header so the compiler can check types
- Use extern when referencing an external global
  - Treated as weak
  - Causes a link error if the symbol is not defined in some other file
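One way these practices might look together, with hypothetical names counter.h, counter.c, hits, step and record_hit. In a real project the header part would sit in its own file and be included by every module that uses hits.

```c
/* --- hypothetical counter.h: the single shared declaration, with its type --- */
extern int hits;          /* declaration only: no storage is allocated here */
void record_hit(void);

/* --- hypothetical counter.c: the one initialised (strong) definition --- */
int hits = 0;

static int step = 1;      /* static: private to counter.c, cannot collide
                             with a step defined in another module */

void record_hit(void)
{
    hits += step;
}
```

Since every module sees the same declaration from the header, a module that tried to treat hits as a different type would be caught by the compiler rather than slipping past the linker.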