ld(1)

ld [opts] files...
    -T <script>        use <script> as linker script
    --trace            report each file the linker touches
    --start-group archives --end-group
                       search archives repeatedly until no new
                       undefined references are created
                       (e.g. helpful with a list of static libraries)
    --verbose          dump the default linker script

Linker Script

Generally speaking, the linker script describes how to map a set of input sections from different input files to a set of output sections in the output file.

The output sections are defined as follows (full description at output section and input section).

section_name [vaddr] : [AT(paddr)] {
    file_pattern (section_pattern)
}

The following gives an example of an output section with two input section rules.

.foo : {
    abc.o (.foo)
    *.o (.foo.*)
}

The first rule includes the section .foo from the object file abc.o. The second rule is a wildcard rule, which includes all sections matching the glob .foo.* from all object files matching the glob *.o.
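The two rules above can be modelled in a few lines of C. This is only a sketch: it assumes that POSIX fnmatch() without flags is a close enough stand-in for ld's glob matching on such simple patterns, and rule_matches is a hypothetical helper, not something ld provides.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: does the input rule `file_pattern (section_pattern)`
 * select section `section` of the object file `file`?
 * Assumption: fnmatch() with flags == 0 approximates ld's glob matching. */
bool rule_matches(const char *file_pattern, const char *section_pattern,
                  const char *file, const char *section) {
    assert(file_pattern && section_pattern && file && section);
    /* Both the file name and the section name must match their pattern. */
    return fnmatch(file_pattern, file, 0) == 0 &&
           fnmatch(section_pattern, section, 0) == 0;
}
```

Under this model, .foo.bar from xyz.o is selected by the second rule, while a plain .foo from xyz.o is selected by neither rule, because the first rule only names abc.o and the second requires the .foo. prefix.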

Common linker script commands and functions

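The OUTPUT_FORMAT command defines the format of the output file the linker creates. It takes up to three arguments; possible values are listed by objdump -i. With three arguments, the first is the default, and the second and third are used when little-endian (-EL) or big-endian (-EB) output is requested on the command line.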

OUTPUT_FORMAT(default, little, big)

The ENTRY command takes a symbol name, which is used as the entry point.

ENTRY(sym)

Linker script snippets

Using the INSERT command, one can insert linker script snippets without overwriting the default linker script.

SECTIONS
{
        . = ALIGN(8);
        HIDDEN(_fntab_start = .);
        .fntab : {
               /* Sort entries for numerical priorities in section names. */
               KEEP(*(SORT(.fntab*)));
        }
        HIDDEN(_fntab_end = .);
}
INSERT AFTER .data
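One thing to keep in mind with the snippet above: SORT is an alias for SORT_BY_NAME, so the input sections are ordered by section name, not by numeric value. The following C sketch approximates that ordering with strcmp(). The helper names are hypothetical and the byte-wise comparison is an assumption about how names compare.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: order two section names with strcmp(), i.e. by name. */
static int by_name(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sort `n` section names in place, ascending by name. */
void sort_sections_by_name(const char **names, size_t n) {
    assert(names != NULL || n == 0);
    if (n > 1) {
        qsort(names, n, sizeof names[0], by_name);
    }
}
```

Sorting .fntab2, .fntab10 and .fntab1 this way places .fntab10 before .fntab2. Priority suffixes of equal width (e.g. 01, 02, 10) keep name order and numeric order in agreement.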

The following shows an example program using two linker script snippets generating two different sections in the final binary.

#include <stdio.h>

#define SECTION(S) __attribute__((section(S)))
#define USED       __attribute__((used))

// -- FUNCTION POINTER TABLE ---------------------------------------------------

typedef void (*fntab_t)();

// Define a function table entry with a priority.
#define FNTAB_ENTRY(E, P) \
  SECTION(".fntab" #P) USED static const fntab_t fnentry_##E##P = E;

// Iterate the function table.
#define FNTAB_FOREACH(V)    \
  extern char _fntab_start; \
  extern char _fntab_end;   \
  for (fntab_t* V = (fntab_t*)&_fntab_start; V < (fntab_t*)&_fntab_end; ++V)

// func10 & func11 are defined with the same priority, order not guaranteed.
// func20 has a lower priority, only runs after func10 & func11.

void func10() {
  puts("func10 called");
}
FNTAB_ENTRY(func10, 1);

void func11() {
  puts("func11 called");
}
FNTAB_ENTRY(func11, 1);

void func20() {
  puts("func20 called");
}
FNTAB_ENTRY(func20, 2);

// -- DATA TABLE ---------------------------------------------------------------

struct datatab_t {
  const char* name;
  void (*fn)();
};

#define DATATAB_ENTRY(E) \
  SECTION(".datatab")    \
  USED static const struct datatab_t dentry_##E = {.name = #E, .fn = E};

#define DATATAB_FOREACH(V)                                               \
  extern char _datatab_start;                                            \
  extern char _datatab_end;                                              \
  for (struct datatab_t* V = (struct datatab_t*)&_datatab_start; \
       V < (struct datatab_t*)&_datatab_end; ++V)

DATATAB_ENTRY(func10);
DATATAB_ENTRY(func20);

// -- RUN EXAMPLE --------------------------------------------------------------

int main() {
  FNTAB_FOREACH(f) {
    printf("call fntab entry @%p\n", *f);
    (*f)();
  }

  DATATAB_FOREACH(d) {
    printf("datatab entry d->name=%s call d->fn=%p\n", d->name, d->fn);
    (*d->fn)();
  }
}

The program can be compiled with gcc -o handler handler.c -T handler-fntab.ld -T handler-datatab.ld. All the sources can be found here. The section headers of the resulting binary show both new sections placed after .data.

> readelf -W -S handler
# Section Headers:
#   [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
#    ..
#   [23] .data             PROGBITS        0000000000404000 003000 000010 00  WA  0   0  8
#   [24] .datatab          PROGBITS        0000000000404010 003010 000020 00  WA  0   0 16
#   [25] .fntab            PROGBITS        0000000000404030 003030 000018 00  WA  0   0  8

Example: virtual vs physical (load) address

Sometimes code or data is initially stored at a different location than the one it is used from at runtime. For example, in embedded systems the image may initially reside in ROM, and startup code copies the section with writable data into RAM. Code that accesses the writable data then accesses it in RAM.

In this case we need different addresses for the same data.

  • The virtual or runtime address: the address used when the linker resolves accesses to the data. Hence, it is the address the data has while the code is running.
  • The physical or load address: the address at which the data is initially stored. Startup code typically copies the initial values from the physical to the virtual address.

The following shows an example linker script which uses virtual and physical addresses. The full source files can be found here.

OUTPUT_FORMAT(elf64-x86-64)
ENTRY(_entry)

SECTIONS {
    /* Set the initial location counter (vaddr) */
    . = 0x00800000;

    /* Create .text output section at current vaddr */
    .text : {
        *(.text*)
    }

    ASSERT(. == 0x00800000 + SIZEOF(.text), "inc loc counter automatically")

    /* Create .data section at location counter aligned to the next 0x100 (vaddr) */
    /* Set the load address to  0x00100000 (paddr) */
    .data ALIGN(0x100) : AT(0x00100000) {
        HIDDEN(_data_vaddr = .);
        HIDDEN(_data_paddr = LOADADDR(.data));
        *(.data*)
    }

    /* Create .rodata with explicit vaddr */
    /* Re-adjust the paddr location counter */
    .rodata 0x00804000 : AT(ADDR(.rodata)) {
        *(.rodata*)
    }

    ASSERT(. == 0x00804000 + SIZEOF(.rodata), "inc loc counter automatically")

    .stack ALIGN (0x1000) : {
        . += 0x1000;
        HIDDEN(_stack_top = .);
    }

    /DISCARD/ : {
        *(.*)
    }
}

/* Some example assertions */
ASSERT(ADDR(.data) != LOADADDR(.data), "DATA vaddr and paddr must be different")
ASSERT(ADDR(.rodata) == LOADADDR(.rodata), "RODATA vaddr and paddr must be equal")
ASSERT(ADDR(.stack) == 0x00805000, "STACK section must be aligned to 0x1000")
ASSERT(SIZEOF(.stack) == 0x1000, "STACK section must be 0x1000")

We can use the following assembly snippet to explore the linker script.

.section .text, "ax", @progbits
.global _entry
_entry:
    mov $_stack_top, %rsp
    mov $asm_array, %rax
    mov (asm_len), %eax

    hlt
    jmp _entry

.section .data.asm, "aw", @progbits
asm_array:
    .4byte 0xa
    .4byte 0xb
    .4byte 0xc
    .4byte 0xd
.rept 4
    .4byte 0xff
.endr

.section .rodata.asm, "a", @progbits
asm_len:
    .4byte 8

gcc -c data.S && ld -o link-nomem -T link-nomem.ld data.o

The ELF load segments show the different physical and virtual addresses of the segment containing the .data section.

> readelf -W -l link-nomem
# There are 4 program headers, starting at offset 64
#
# Program Headers:
#   Type   Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
#   LOAD   0x001100 0x0000000000800100 0x0000000000100000 0x000020 0x000020 RW  0x1000
#   LOAD   0x002000 0x0000000000800000 0x0000000000800000 0x000018 0x000018 R E 0x1000
#   LOAD   0x003000 0x0000000000804000 0x0000000000804000 0x000004 0x000004 R   0x1000
#   LOAD   0x000000 0x0000000000805000 0x0000000000805000 0x000000 0x001000 RW  0x1000
#
#  Section to Segment mapping:
#   Segment Sections...
#   00     .data
#   01     .text
#   02     .rodata
#   03     .stack
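The virtual addresses in the program headers above follow from the ALIGN arithmetic in the linker script. The ld manual documents ALIGN(align) as (. + align - 1) & ~(align - 1). The following C sketch reproduces that calculation; align_up is a hypothetical helper name, and the section sizes used in the note below are read off the FileSiz column.

```c
#include <assert.h>
#include <stdint.h>

/* Round `addr` up to the next multiple of `align`.
 * `align` must be a power of two, as the bit mask below relies on it. */
uint64_t align_up(uint64_t addr, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

With .text ending at 0x800000 + 0x18, align_up(0x800018, 0x100) gives 0x800100, the virtual address of .data. With .rodata ending at 0x804000 + 0x4, align_up(0x804004, 0x1000) gives 0x805000, the address of .stack.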

Startup code could copy data from _data_paddr to _data_vaddr.

> nm link-nomem
# 0000000000800100 d asm_array
# 0000000000804000 r asm_len
# 0000000000100000 a _data_paddr
# 0000000000800100 d _data_vaddr
# 0000000000800000 T _entry
# 0000000000806000 b _stack_top
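A minimal sketch of such a copy routine in C is shown below. It is written as a plain byte loop, since startup code often runs before any C library is usable. The function name copy_init_data is hypothetical, and the example linker script does not export the size of .data, so the size would have to come from an additional symbol that is assumed here and not part of the script.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `size` bytes of initial values from the load address (paddr)
 * to the runtime address (vaddr), one byte at a time.
 * The assert is a debug aid only; build with -DNDEBUG to drop it
 * where no assert support is available. */
void copy_init_data(void *vaddr, const void *paddr, size_t size) {
    assert(size == 0 || (vaddr != NULL && paddr != NULL));
    unsigned char *dst = vaddr;
    const unsigned char *src = paddr;
    for (size_t i = 0; i < size; ++i) {
        dst[i] = src[i];
    }
}
```

In a real image the first two arguments would be the addresses given by _data_vaddr and _data_paddr. On a hosted system the routine can be tried out with two ordinary arrays standing in for RAM and ROM.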

The linker resolves symbols to their virtual addresses, as the access to the asm_array variable shows.

> objdump -d link-nomem
# Disassembly of section .text:
#
# 0000000000800000 <_entry>:
#   800000:	48 c7 c4 00 60 80 00 	mov    $0x806000,%rsp
#   800007:	48 c7 c0 00 01 80 00 	mov    $0x800100,%rax   ;; mov $asm_array, %rax
#   80000e:	8b 04 25 00 40 80 00 	mov    0x804000,%eax
#   800015:	f4                   	hlt
#   800016:	eb e8                	jmp    800000 <_entry>

The following linker script shows an example with the MEMORY command.

OUTPUT_FORMAT(elf64-x86-64)
ENTRY(_entry)

MEMORY {
    ROM : ORIGIN = 0x00100000, LENGTH = 0x4000
    RAM : ORIGIN = 0x00800000, LENGTH = 0x4000
}

SECTIONS {
    /* Create .text output section at ROM (vaddr) */
    .text : {
        *(.text*)
    } > ROM

    ASSERT(. == ORIGIN(ROM) + SIZEOF(.text), "inc loc counter automatically")

    /* Create .data output section at RAM (vaddr) */
    /* Set load addr to ROM, right after .text (paddr) */
    .data : {
        HIDDEN(_data_vaddr = .);
        HIDDEN(_data_paddr = LOADADDR(.data));
        *(.data*)
    } > RAM AT > ROM

    /* Append .rodata output section at ROM (vaddr) */
    .rodata : {
        *(.rodata*)
    } > ROM

    /* Append .stack output section at RAM (vaddr) aligned up to next 0x1000 */
    .stack : ALIGN (0x1000) {
        . += 0x1000;
        HIDDEN(_stack_top = .);
    } > RAM

    /DISCARD/ : {
        *(.*)
    }
}

/* Some example assertions */
ASSERT(ADDR(.data) != LOADADDR(.data), "DATA vaddr and paddr must be different")
ASSERT(ADDR(.rodata) == LOADADDR(.rodata), "RODATA vaddr and paddr must be equal")
ASSERT(ADDR(.stack) == ORIGIN(RAM) + 0x1000, "STACK section must be aligned to 0x1000")
ASSERT(SIZEOF(.stack) == 0x1000, "STACK section must be 0x1000")

References