Linking
#lecture note based on 15-213 Introduction to Computer Systems
“The most useless useful lecture… Almost nothing in this lecture matters for the course… however this is the most useful lecture for you as a computer scientist”
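Linking is what happens when we build one program from multiple source files.
gcc: the translators (preprocessor, compiler, assembler) turn each source file into a relocatable object file; the linker then combines the object files into one executable object file.
The compile process, in which linking is the last step: preprocess, compile, assemble, link.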

H2 Why?
- Modularity
- Breaking programs into smaller files
- Header files: list the signatures of functions in the code, without needing to provide the implementations
- #include <thing.h> simply copies thing.h into the source (equivalent to putting the signatures in the .c source file, telling the compiler that the function exists somewhere and the linker will figure out where)
- Efficiency
- Separate compilation of different modules: after a change, recompile only the files that changed, then relink
- Concurrent compilation of different source files
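A single-file sketch of what a header gives the compiler. The prototype stands in for a line copied in from a hypothetical header (mathutil.h), and the definition at the bottom stands in for a separately compiled source file. The names square and sum_of_squares are made up for illustration.

```c
/* What a hypothetical header (mathutil.h) would contribute after
 * #include: just the signature, no implementation. */
int square(int x);

/* Caller: compiles with only the prototype in scope. In a multi-file
 * build, the call to square is left for the linker to resolve. */
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

/* In a real project this definition would live in its own .c file
 * (hypothetically mathutil.c) and be compiled separately. */
int square(int x) {
    return x * x;
}
```

Calling sum_of_squares(3, 4) gives 25. Splitting the definition of square into another file would not change the caller at all, which is the point of the header.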
H2 Linking methods
- Static linking - the executable includes a copy of the library code it uses
- Dynamic linking - the executable doesn’t include the actual library code. One copy of a library is shared across multiple executables. Also allows loading a new version of the library without recompiling the executable
H2 Linking process
H6 Linker’s jobs
- Resolve symbols
- Look at the symbol tables and match each symbol reference to exactly one definition
- Relocate code
- Merge separate code and data
- Change relative locations in the object files into absolute locations in the executable
- Update references to point to the right locations
Assembler and linker
- Assembler makes the symbol table (name, size, location); the linker associates each symbol reference with its definition.
- Assembler puts placeholders in the object code wherever a final address isn’t known yet; the linker fills in the actual memory addresses after rearranging the pieces.
- Linker expects a main function somewhere; it includes a start-of-program file that calls main, presumably assembled by the assembler somewhere from the source.
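The placeholder idea can be sketched as a toy model. This is not how a real linker stores things: struct symbol, struct relocation, and apply_relocations are all hypothetical names, and the "code" is just an array of unsigned values where some slots are waiting for an address.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: real symbol tables and relocation entries carry more fields. */
struct symbol {
    const char *name;
    unsigned address;   /* final address chosen by the toy linker */
};

struct relocation {
    size_t offset;      /* index of the placeholder slot in the code image */
    const char *name;   /* symbol the placeholder refers to */
};

/* Look a name up in the symbol table; returns NULL if it is undefined. */
static const struct symbol *find_symbol(const struct symbol *table, size_t count,
                                        const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/* Overwrite each placeholder with its symbol's address.
 * Returns 0 on success, -1 on the first unresolved reference. */
int apply_relocations(unsigned *code, const struct relocation *relocs, size_t reloc_count,
                      const struct symbol *table, size_t symbol_count) {
    for (size_t i = 0; i < reloc_count; i++) {
        const struct symbol *sym = find_symbol(table, symbol_count, relocs[i].name);
        if (sym == NULL) {
            return -1;
        }
        code[relocs[i].offset] = sym->address;
    }
    return 0;
}
```

The -1 case corresponds to the familiar "undefined reference" link error: a reference exists but no module defines the symbol.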
Object file types
- .o relocatable object file - code from exactly one source .c file
- a.out executable object file - code and data needed for direct execution
- .so shared object file - special type of relocatable object, set up to be dynamically linked at load time or run time (called a DLL, dynamic link library, on Windows)
- ELF - Linux Executable and Linkable Format. The first 4 bytes of an ELF file are the byte 0x7f followed by the letters ELF (a dump often shows them as .ELF)
The linker looks at all the files and figures out what needs to come from shared object (.so) files.
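A minimal sketch of recognising an ELF file by its magic number, which is the byte 0x7f followed by the letters E, L, F. The function name has_elf_magic is made up for this note; the caller is assumed to have already read the first bytes of the file into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf starts with the 4-byte ELF magic number, 0 otherwise. */
int has_elf_magic(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

The same check applies to .o, .so, and executable files, since all three use the ELF format on Linux.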
H3 ELF format
- ELF header (starting address 0)
- Segment header table - page size, segment size, virtual address memory segments
- .text - code
- .rodata - jump tables, string constants (read-only data)
- .data - initialised global vars (that are not 0)
- .bss - “block started by symbol / better save space” - uninitialised global vars or global vars initialised to 0; doesn’t occupy space in the file
- .symtab - symbol table: procedure and static var names, section names + locations
- .rel.text - relocation info: instruction addresses that need to be changed, and instructions for making the change
- .rel.data - relocation info: similar to the previous, but for the .data section
- .debug - info for debugging
- Section header table - offsets + sizes of each section
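A small sketch of where typical C definitions tend to end up. The variable and function names are invented for illustration, and the placements in the comments are the usual ones on Linux with gcc; exact placement can vary with compiler and flags.

```c
int request_limit = 100;          /* initialised, non-zero global: typically .data */
int request_count;                /* uninitialised global: typically .bss */
static int error_count = 0;       /* zero-initialised static: typically .bss */
const char *greeting = "hello";   /* the string bytes typically land in .rodata;
                                     the pointer variable itself is initialised data */

/* Machine code for functions goes in .text. */
int remaining_requests(void) {
    return request_limit - request_count - error_count;
}
```

Compiling a file like this with gcc -c and running readelf -S or objdump -h on the resulting .o lists the sections and their sizes, which is a quick way to check the comments above on a given machine.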
H3 Symbol types
- Global symbols - aka non-static global vars and functions - defined in this module and can be referenced by other modules
- External symbols - symbols referenced by this module but defined by another module
- Local symbols - symbols defined and referenced only within this module (like those declared static)
(Linker doesn’t see non-static variables inside functions)
(Linker doesn’t care about types!)
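The three symbol kinds, sketched in one file. All names are hypothetical. The definition of scale at the bottom is only there so the sketch links as a single file; in a real program it would sit in a different module.

```c
int total = 0;                 /* global symbol: visible to other modules */
static int call_count = 0;     /* local symbol: visible only inside this file */
extern int scale;              /* external symbol: defined "elsewhere" */

/* Global symbol (non-static function). */
int add_scaled(int value) {
    int scaled = value * scale;   /* ordinary local variable: lives on the
                                     stack or in a register, no symbol */
    call_count++;
    total += scaled;
    return total;
}

/* Also a global symbol; exposes the file-private counter read-only. */
int calls_so_far(void) {
    return call_count;
}

/* Stand-in for the other module, so this sketch links as a single file. */
int scale = 2;
```

Running nm on the compiled object file should show total, add_scaled, calls_so_far, and scale as global entries and call_count as a local one, with no entry for scaled.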
H3 Name collisions?
static within function
static type blah = ...; within a function behaves like a global, but is only accessible within that function. It goes in .bss or .data (whereas non-static locals go on the stack).
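A minimal sketch of a static local, using a made-up next_id function. The static variable keeps its value between calls, while the ordinary local is recreated each time.

```c
/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void) {
    static int last_id = 0;   /* initialised once, persists across calls;
                                 zero-initialised, so typically in .bss */
    int step = 1;             /* ordinary local: recreated on every call */
    last_id += step;
    return last_id;
}
```

Two different functions can each declare their own static int last_id without colliding, because the compiler gives each one a distinct local symbol.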
duplicate symbols
Linker:
- Strong symbol - defined and initialised (var with a value or func with a body)
- Weak - uninitialised globals
- Even weaker - extern declarations
Rules:
- Multiple strong symbols with the same name are not allowed (link error)
- Strong and weak -> use the strong one
- Multiple weak -> pick an arbitrary one
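The rules can be sketched as a toy resolver. This is a model of the decision only, not real linker code: struct definition, enum strength, and resolve are hypothetical, and "arbitrary" is implemented here as "first one seen".

```c
#include <stddef.h>
#include <string.h>

enum strength { WEAK, STRONG };

struct definition {
    const char *name;
    enum strength strength;
};

/* Returns the index of the definition chosen for name,
 * -1 if there is no definition, -2 if several strong ones clash. */
int resolve(const struct definition *defs, size_t count, const char *name) {
    int chosen = -1;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(defs[i].name, name) != 0) {
            continue;
        }
        if (chosen == -1) {
            chosen = (int)i;                  /* first candidate seen */
        } else if (defs[i].strength == STRONG) {
            if (defs[chosen].strength == STRONG) {
                return -2;                    /* multiple strong: not allowed */
            }
            chosen = (int)i;                  /* strong beats weak */
        }
        /* Multiple weak: this toy keeps the first one; a real linker
         * may pick any of them. */
    }
    return chosen;
}
```

One caveat when trying this with real files: GCC 10 and later default to -fno-common, so duplicate uninitialised globals across files are reported as errors unless -fcommon is passed.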
Good practices:
- Avoid non-static globals
- Initialise to make things strong
- Put type in header, make compiler check type.
- Use extern when referencing external global
- Treated as weak
- Causes error if not found in some external file