Intro

This page describes several mechanisms that protect against buffer overflow exploits. Details depend on the kernel version, the vendor, the compiler version, and the binutils version.

Address Space Layout Randomization

A recent kernel will by default randomize the address of the stack, the base address for memory areas allocated by mmap, the brk base address, and the address of the vdso page of a program, at exec() time. (In particular, shared libraries will be loaded at randomized addresses.) If this is not desired, the feature can be disabled system-wide by booting the kernel with the norandmaps boot parameter, or by
# echo 0 > /proc/sys/kernel/randomize_va_space
It is unclear whether it can be disabled for a specific program by modifying the ELF headers; maybe not. But it can be disabled for a specific program invocation (and its forked-off children) by
% setarch `uname -m` -R program program_args
The utility setarch uses the personality() system call to set the ADDR_NO_RANDOMIZE personality flag, and then execs the given program. However, for setuid programs the kernel strips the ADDR_NO_RANDOMIZE flag (and also ADDR_COMPAT_LAYOUT, MMAP_PAGE_ZERO and READ_IMPLIES_EXEC) from the personality again, even for the root user.

This randomize_va_space variable can have the values 0 (do not randomize), 1 (randomize stack and vdso page and mmap), and 2 (also randomize brk base address). Initially it is 0 when the kernel is booted with the norandmaps boot parameter, 1 when the kernel was compiled with CONFIG_COMPAT_BRK set, and 2 otherwise. When the brk base address is not randomized, it is the first address past the executable code.

Problems

Some programs have an expensive initialization, and do a dump/undump for faster startup. Some programs have a breakpoint/restart feature. In such cases memory is dumped directly to disk. Reading this data back may fail when addresses are randomized.

Non-executable stack

Buffer overflow exploits often put some code in a program's data area or stack, and then jump to it. If all writable addresses are non-executable, such an attack is prevented. This is OpenBSD's W^X. The implementation is straightforward when the hardware provides an NX bit, as it does on most architectures. On i386 software schemes are needed, like PaX or Exec Shield. NX is turned off by the noexec=off boot parameter.

PT_GNU_STACK

PT_GNU_STACK is an ELF program header item that indicates whether an executable stack is needed. If this item is missing, we have no information and must assume that an executable stack is needed. By default, gcc will mark the stack non-executable, unless an executable stack is needed for function trampolines. The gcc marking can be overridden via the -z execstack or -z noexecstack flags, which the compiler driver passes on to the linker.
% cc -z execstack prog.c -o prog
% objdump -p prog | grep -i -A1 stack
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**3
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rwx
% readelf -l prog | grep -i -A1 stack
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RWE    8
% scanelf -e prog
 TYPE   STK/REL/PTL FILE 
ET_EXEC RWX R-- RW- prog 
The marking can be examined and changed on an existing binary using the execstack utility.
% cc prog.c -o prog
% execstack prog
- prog
% execstack -s prog
% execstack prog
X prog

Some more detail

Trampolines. Gcc supports local (nested) functions in its dialect of C. When a local function is called locally, there is no problem. But when the address of a local function is taken and passed around, the function is later called via this pointer, and the function moreover uses non-local non-global variables (variables of the enclosing function), then these variables must be addressed via a separate frame pointer, and this separate frame pointer must be set upon call. The small stub of code setting it lives on the stack. Therefore, the very few programs that use this construction must have an executable stack.

Example:

% cat trampoline.c
#include <stdio.h>
int main(int ac, char **av) {
        int localfn(int a) {
                return a+ac;
        }
        int (*fptr)(int) = localfn;

        printf("%d\n", fptr(-1));
        return 0;
}
% cc -S -Os trampoline.c
% cat trampoline.s
	.file	"trampoline.c"
	.text
	.type	localfn.1658, @function
localfn.1658:
	pushl	%ebp			// save frame pointer
	movl	(%ecx), %eax		// get the non-local variable ac
	movl	%esp, %ebp		// set new frame pointer
	addl	8(%ebp), %eax		// add the local variable a = -1
	popl	%ebp			// restore frame pointer
	ret
	.size	localfn.1658, .-localfn.1658
	.section	.rodata.str1.1,"aMS",@progbits,1
.LC0:
	.string	"%d\n"
	.text
.globl main
	.type	main, @function
main:
	pushl	%ebp			// save frame pointer
	movl	%esp, %ebp		// set new frame pointer
	subl	$16, %esp		// stack space for vars and trampoline
	movl	8(%ebp), %eax		// get parameter ac
	leal	-16(%ebp), %edx		// %edx = D, the address of the copy of ac
	movl	%eax, -16(%ebp)		// copy to a local variable
	movl	$localfn.1658+2, %eax	// start computing the jmp displacement A
	subl	%ebp, %eax		// %eax = $localfn.1658+2-%ebp
	leal	-12(%ebp), %ecx		// %ecx = trampoline address
	movl	%edx, -11(%ebp)		// 2nd half of following: D = %edx
	movb	$-71, -12(%ebp)		// B9: movl $D, %ecx
	movb	$-23, -7(%ebp)		// E9: jmp .+A
	movl	%eax, -6(%ebp)		// 2nd half of previous: A = %eax
	pushl	$-1
	call	*%ecx			// call fptr(-1)
	pushl	%eax
	pushl	$.LC0
	call	printf
	xorl	%eax, %eax
	leave
	ret
	.size	main, .-main
	.ident	"GCC: (GNU) 4.0.2 20050901 (prerelease) (SUSE Linux)"
	.section	.note.GNU-stack,"x",@progbits
%
This sets up a stack area S at -16(%ebp), with first a copy of the variable ac, and then a 2-instruction trampoline: (i) the instruction movl $D, %ecx, where D is the address of the stack area S, and (ii) the instruction jmp .+A, where A is computed as $localfn.1658 + 2 - %ebp and . is the address just past the jmp, that is, %ebp - 2, so that the jmp lands at (%ebp - 2) + A = $localfn.1658. The call fptr(-1) is translated as call *%ecx with -1 on the stack; at that moment the %ecx register points to the start of the trampoline code (the movl) on the stack. The result is that $localfn.1658 is entered with %ecx pointing to the area S containing the necessary non-local variables.

Compiler. In its assembler output gcc outputs a line

        .section        .note.GNU-stack,"x",@progbits
when a trampoline was generated, and
        .section        .note.GNU-stack,"",@progbits
otherwise.
% cc -S prog.c
% tail -1 prog.s
        .section        .note.GNU-stack,"",@progbits
There may be no gcc flag that overrides this marking in the compiler output itself; it can be overridden at the assembler and linker stages.

Assembler. The assembler will do the right thing (and can be guided using the --execstack or --noexecstack flags). E.g.:

% cc -Wa,--execstack -c prog.c
% objdump -h ./prog.o | tail -2
  6 .note.GNU-stack 00000000  0000000000000000  0000000000000000  000000cd  2**0
                  CONTENTS, READONLY, CODE

Linker. The linker will do the right thing (and can be guided using the -z execstack or -z noexecstack flags).

Stack Smashing Protection

Typically, a buffer overflow exploit overwrites a return address so that a function will return to an attacker-chosen address. ASLR makes it difficult for the attacker to find an address to jump to. A W^X setup makes it difficult for the attacker to place code anywhere it can be executed. A third type of protection checks the stack integrity before returning from a function. E.g.,
% cat ff.c
#include <string.h>

int main(int ac, char **av)
{
        char buf[10];
        strcpy(buf, av[1]);
        return buf[5];
}
% cc -o ff ff.c
% ./ff xxxxxxxxxxxxxx
% ./ff xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
*** stack smashing detected ***: ./ff terminated
Segmentation fault
% ./ff xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
*** stack smashing detected ***: ./ff terminated
======= Backtrace: =========
/lib/libc.so.6(__fortify_fail+0x37)[0x2b1f97f6c887]
/lib/libc.so.6(__fortify_fail+0x0)[0x2b1f97f6c850]
./ff[0x4005c9]
/lib/libc.so.6(__libc_start_main+0xe6)[0x2b1f97e8b466]
./ff[0x4004b9]
======= Memory map: ========
...
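Here the compiler has placed a guard value (a canary) on the stack. When the function is about to return and finds that this value has been overwritten, stack smashing is detected and __stack_chk_fail is called. The presence of this check can be tested: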
% objdump -d ff | grep __stack_chk_fail    
0000000000400468 <__stack_chk_fail@plt>:
  4005c4:       e8 9f fe ff ff          callq  400468 <__stack_chk_fail@plt>
The check can be disabled:
% cc -fno-stack-protector -o ff ff.c
% ./ff xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault
% objdump -d ff | grep __stack_chk_fail
%