30 Low-Level Software Security by Example

Úlfar Erlingsson, Yves Younan, and Frank Piessens

Contents

30.1 Background
  30.1.1 The Difficulty of Eliminating Low-Level Vulnerabilities
  30.1.2 The Assumptions Underlying Software, Attacks, and Defenses
  30.1.3 The Presentation of Technical Details
30.2 A Selection of Low-Level Attacks on C Software
  30.2.1 Attack 1: Corruption of a Function Return Address on the Stack
  30.2.2 Attack 2: Corruption of Function Pointers Stored in the Heap
  30.2.3 Attack 3: Execution of Existing Code via Corrupt Pointers
  30.2.4 Attack 4: Corruption of Data Values that Determine Behavior
30.3 Defenses that Preserve High-Level Language Properties
  30.3.1 Defense 1: Checking Stack Canaries on Return Addresses
  30.3.2 Defense 2: Making Data not Executable as Machine Code
  30.3.3 Defense 3: Enforcing Control-Flow Integrity on Code Execution
  30.3.4 Defense 4: Randomizing the Layout of Code and Data in Memory
30.4 Summary and Discussion
References
The Authors

Computers are often subject to external attacks that aim to control software behavior. Typically, such attacks arrive as data over a regular communication channel and, once resident in program memory, trigger pre-existing, low-level software vulnerabilities. By exploiting such flaws, these low-level attacks can subvert the execution of the software and gain control over its behavior. The combined effects of these attacks make them one of the most pressing challenges in computer security. As a result, in recent years, many mechanisms have been proposed for defending against these attacks.

This chapter aims to provide insight into low-level software attack and defense techniques by discussing four examples that are representative of the major types of attacks on C and C++ software, and four examples of defenses selected because of their effectiveness, wide applicability, and low enforcement overhead. Attacks and defenses are described in enough detail to be understood even by readers without a background in software security, and without a natural inclination for crafting malicious attacks.

Throughout, the attacks and defenses are placed in perspective by showing how they are both facilitated by the gap between the semantics of the high-level language of the software under attack, and the low-level semantics of machine code and the hardware on which the software executes.

30.1 Background

Software vulnerabilities are software bugs that can be triggered by an attacker with possibly disastrous consequences. This introductory section provides more background about such vulnerabilities, why they are so hard to eliminate, and how they can be introduced in a software system. Both attacking such vulnerabilities, and defending against such attacks, depend on low-level details of the software and machine under attack, and this section ends with a note on the presentation of such low-level details in this chapter.
30.1.1 The Difficulty of Eliminating Low-Level Vulnerabilities

Figure 30.1 is representative of the attacks and defenses presented in this chapter. The attacks in Sect. 30.2 all exploit vulnerabilities similar to that in Fig. 30.1a, where a buffer overflow may be possible. For the most part, the defenses in Sect. 30.3 use techniques like those in Fig. 30.1b and prevent exploits by maintaining additional information, validating that information with runtime checks, and halting execution if such a check fails.

Unfortunately, unlike in Fig. 30.1, it is often not so straightforward to modify existing source code to use new, safer methods of implementing its functionality. For most code there may not be a direct correspondence between well-known, unsafe library functions and their newer, safer versions. Indeed, existing code can easily be unsafe despite not using any library routines, and vulnerabilities are often obscured by pointer arithmetic or complicated data-structure traversal. (To clarify this point, it is worth comparing the code in Fig. 30.1 with the code in Fig. 30.3, where explicit loops implement the same functionality.)

Furthermore, manual attempts to remove software vulnerabilities may give a false sense of security, since they do not always succeed and can sometimes introduce new bugs. For example, a programmer who intends to eliminate buffer overflows in the code of Fig. 30.1a might change the strcpy and strcat function calls as in Fig. 30.1b, but fail to initialize t to be the empty string at the start of the function. In this case, the strcmp comparison will be against the unmodified array t, if both strings a and b are longer than MAX_LEN.

Thus, a slight omission from Fig. 30.1b would leave open the possibility of an exploitable vulnerability as a result of the function reporting that the concatenation of the input strings is "abc", even in cases when this is false. In particular, this may occur when, on entry to the function, the array t contains "abc" as a residual data value from a previous invocation of the function.

Low-level software security vulnerabilities continue to persist due to technical reasons, as well as practical engineering concerns such as the difficulties involved in modifying legacy software. The state of the art in eliminating these vulnerabilities makes use of code review, security testing, and other manual software engineering processes, as well as automatic analyses that can discover vulnerabilities [30.1]. Furthermore, best practice also acknowledges that some vulnerabilities are likely to remain, and aims to make those vulnerabilities more difficult to exploit by applying defenses like those in this tutorial.

30.1.2 The Assumptions Underlying Software, Attacks, and Defenses

Programmers make many assumptions when creating software, both implicitly and explicitly. Some of these assumptions are valid, based on the semantics of the high-level language. For instance, C programmers may assume that execution does not start at an arbitrary place within a function, but at the start of that function.
int unsafe( char* a, char* b )
{
    char t[MAX_LEN];
    strcpy( t, a );
    strcat( t, b );
    return strcmp( t, "abc" );
}
(a) An unchecked C function

int safer( char* a, char* b )
{
    char t[MAX_LEN] = { '\0' };
    strcpy_s( t, _countof(t), a );
    strcat_s( t, _countof(t), b );
    return strcmp( t, "abc" );
}
(b) A safer version of the function

Fig. 30.1 Two C functions that both compare whether the concatenation of two input strings is the string "abc." The first, unchecked function (a) contains a security vulnerability if the inputs are untrusted. The second function (b) is not vulnerable in this manner, since it uses new C library functions that perform validity checks against the lengths of buffers. Modern compilers will warn about the use of older, less safe library functions, and strongly suggest the use of their newer variants
Programmers may also make questionable assumptions, such as about the execution environment of their software. For instance, software may be written without concurrency in mind, or in a manner that is dependent on the address encoding in pointers, or on the order of heap allocations. Any such assumptions hinder portability, and may result in incorrect execution when the execution environment changes even slightly.

Finally, programmers may make invalid, mistaken assumptions. For example, in C, programmers may assume that the int type behaves like a true, mathematical integer, or that a memory buffer is large enough for the size of the content it may ever need to hold. All of the above types of assumptions are relevant to low-level software security, and each may make the software vulnerable to attack.
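To make the first of these mistaken assumptions concrete, the small, self-contained C program below (an illustration added for this discussion, not one of the chapter's numbered examples) shows that int arithmetic is fixed-width machine arithmetic rather than mathematical arithmetic; note that signed overflow is undefined behavior in C, and the wrap-around indicated in the comment is merely what commonly happens on a 32-bit x86 build.

#include <stdio.h>
#include <limits.h>

int main( void )
{
    /* C's int is a fixed-width machine integer, not a mathematical
       integer: adding 1 to INT_MAX cannot produce a larger number.
       Signed overflow is undefined behavior; on a typical 32-bit x86
       build the result simply wraps around to INT_MIN, which is one
       way size and offset calculations become corrupted when they
       involve attacker-chosen values. */
    int big = INT_MAX;
    printf( "INT_MAX     = %d\n", big );
    printf( "INT_MAX + 1 = %d\n", big + 1 );  /* commonly prints INT_MIN */
    return 0;
}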
At the same time, attackers also make assumptions, and low-level software attacks rely on a great number of specific properties about the hardware and software architecture of their target. Many of these assumptions involve details about names and the meaning of those names, such as the exact memory addresses of variables or functions and how they are used in the software. These assumptions also relate to the software's execution environment, such as the hardware instruction set architecture and its machine-code semantics. For example, the Internet worm of 1988 was successful in large part because of an attack that depended on the particulars of the commonly deployed VAX hardware architecture, the 4 BSD operating system, and the fingerd service. On other systems that were popular at the time, that same attack failed in a manner that only crashed the fingerd service, due to the differences in instruction sets and memory layouts [30.2]. In this manner, attack code is often fragile to the point where even the smallest change prevents the attacker from gaining control, but crashes the target software – effecting a denial-of-service attack.

Defense mechanisms also have assumptions, including assumptions about the capabilities of the attacker, about the likelihood of different types of attacks, about the properties of the software being defended, and about its execution environment. In the attacks and defenses that follow, a note will be made of the assumptions that apply in each case. Also, many defenses (including most of the ones in this tutorial) assume that denial-of-service is not the attacker's goal, and halt the execution of the target software upon the failure of runtime validity checks.

30.1.3 The Presentation of Technical Details

The presentation in this chapter assumes a basic knowledge of programming languages like C, and their compilation, as might be acquired in an introductory course on compilers. For the most part, relevant technical concepts are introduced when needed.

As well as giving a number of examples of vulnerable C software, this chapter shows many details relating to software execution, such as machine code and execution stack content. Throughout, the details shown will reflect software execution on one particular hardware architecture – a 32-bit x86, such as the IA-32 [30.3] – but demonstrate properties that also apply to most other hardware platforms. The examples show many concrete, hexadecimal values and, in order to avoid confusion, the reader should remember that on the little-endian x86, when four bytes are displayed as a 32-bit integer value, their printed order will be reversed from the order of the bytes in memory. Thus, if the hexadecimal bytes 0xaa, 0xbb, 0xcc, and 0xdd occur in memory, in that order, then those bytes encode the 32-bit integer 0xddccbbaa.
30.2 A Selection of Low-Level Attacks on C Software

This section presents four low-level software attacks in full detail and explains how each attack invalidates a property of target software written in the C language. The attacks are carefully chosen to be representative of four major classes of attacks: stack-based buffer overflows, heap-based buffer overflows, jump-to-libc attacks, and data-only attacks.

No examples are given below of a format-string attack or of an integer-overflow vulnerability. Format-string vulnerabilities are particularly simple to eliminate [30.4]; therefore, although they have received a great deal of attention in the past, they are no longer a significant, practical concern in well-engineered software. Integer-overflow vulnerabilities [30.5] do still exist, and are increasingly being exploited, but only as a first step towards attacks like those described below. In this section, Attack 4 is one example where an integer overflow might be the first step in the exploit crafted by the attacker.
As further reading, the survey of Pincus and Baker gives a good general overview of low-level software attacks like those described here [30.6].

30.2.1 Attack 1: Corruption of a Function Return Address on the Stack

It is natural for C programmers to assume that, if a function is invoked at a particular call site and runs to completion without throwing an exception, then that function will return to the instruction immediately following that same, particular call site.

Unfortunately, this may not be the case in the presence of software bugs. For example, if the invoked function contains a local array, or buffer, and writes into that buffer are not correctly guarded, then the return address on the stack may be overwritten and corrupted. In particular, this may happen if the software copies to the buffer data whose length is larger than the buffer size, in a buffer overflow.

Furthermore, if an attacker controls the data used by the function, then the attacker may be able to trigger such corruption, and change the function return address to an arbitrary value. In this case, when the function returns, the attacker can direct execution to code of their choice and gain full control over subsequent behavior of the software. Figures 30.2 and 30.3 show examples of C functions that are vulnerable to this attack. This attack, sometimes referred to as return-address clobbering, is probably the best known exploit of a low-level software security vulnerability; it dates back to before 1988, when it was used in the fingerd exploit of the Internet worm. Indeed, until about a decade ago, this attack was seen by many as the only significant low-level attack on software compiled from C and C++, and stack-based buffer overflows were widely considered a synonym for such attacks. More recently, this attack has not been as prominent, in part because other methods of attack have been widely publicized, but also in part because the underlying vulnerabilities that enable return-address clobbering are slowly being eliminated (e.g., through the adoption of newer, safer C library functions).

To give a concrete example of this attack, Fig. 30.4 shows a normal execution stack for the functions in Figs. 30.2 and 30.3, and Fig. 30.5 shows an execution stack for the same code just after an overflow of the local array, potentially caused by an attacker that can choose the contents of the two string provided as input.

int is_file_foobar( char* one, char* two )


{
// must have strlen(one) + strlen(two) < MAX_LEN
char tmp[MAX_LEN];
strcpy( tmp, one );
strcat( tmp, two );
return strcmp( tmp, "file://foobar" );
}

Fig. 30.2 A C function that compares the concatenation of two input strings against “file://foobar.” This function contains
a typical stack-based buffer overflow vulnerability: if the input strings can be chosen by an attacker, then the attacker can
direct machine-code execution when the function returns

int is_file_foobar_using_loops( char* one, char* two )


{
// must have strlen(one) + strlen(two) < MAX_LEN
char tmp[MAX_LEN];
char* b = tmp;
for( ; *one != ’\0’; ++one, ++b ) *b = *one;
for( ; *two != ’\0’; ++two, ++b ) *b = *two;
*b = ’\0’;
return strcmp( tmp, "file://foobar" );
}

Fig. 30.3 A version of the C function in Fig. 30.2 that copies and concatenates strings using pointer manipulation and
explicit loops. This function is also vulnerable to the same stack-based buffer overflow attacks, even though it does not
invoke strcpy or strcat or other C library functions that are known to be difficult to use safely
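For contrast with the two vulnerable functions above, the sketch below shows one way to remove the overflow: check the combined input length before any copying is done. The function name, the MAX_LEN definition of 16 (the buffer size used in Fig. 30.4), and the -1 error return are illustrative choices, not part of the original example.

#include <string.h>

#define MAX_LEN 16

int is_file_foobar_checked( const char* one, const char* two )
{
    char tmp[MAX_LEN];
    /* Refuse inputs whose concatenation, plus the terminating '\0',
       cannot fit in tmp, so the copies below cannot overflow. */
    if( strlen(one) + strlen(two) >= MAX_LEN )
        return -1;
    strcpy( tmp, one );
    strcat( tmp, two );
    return strcmp( tmp, "file://foobar" );
}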

address content
0x0012ff5c 0x00353037 ; argument two pointer
0x0012ff58 0x0035302f ; argument one pointer
0x0012ff54 0x00401263 ; return address
0x0012ff50 0x0012ff7c ; saved base pointer
0x0012ff4c 0x00000072 ; tmp continues ’r’ ’\0’ ’\0’ ’\0’
0x0012ff48 0x61626f6f ; tmp continues ’o’ ’o’ ’b’ ’a’
0x0012ff44 0x662f2f3a ; tmp continues ’:’ ’/’ ’/’ ’f’
0x0012ff40 0x656c6966 ; tmp array: ’f’ ’i’ ’l’ ’e’
Fig. 30.4 A snapshot of an execution stack for the functions in Figs. 30.2 and 30.3, where the size of the tmp array
is 16 bytes. This snapshot shows the stack just before executing the return statement. Argument one is “file://”, and
argument two is “foobar,” and the concatenation of those strings fits in the tmp array. (Stacks are traditionally displayed
with the lowest address at the bottom, as is done here and throughout this chapter)

address content
0x0012ff5c 0x00353037 ; argument two pointer
0x0012ff58 0x0035302f ; argument one pointer
0x0012ff54 0x00666473 ; return address ’s’ ’d’ ’f’ ’\0’
0x0012ff50 0x61666473 ; saved base pointer ’s’ ’d’ ’f’ ’a’
0x0012ff4c 0x61666473 ; tmp continues ’s’ ’d’ ’f’ ’a’
0x0012ff48 0x61666473 ; tmp continues ’s’ ’d’ ’f’ ’a’
0x0012ff44 0x612f2f3a ; tmp continues ’:’ ’/’ ’/’ ’a’
0x0012ff40 0x656c6966 ; tmp array: ’f’ ’i’ ’l’ ’e’
Fig. 30.5 An execution-stack snapshot like that in Fig. 30.4, but where argument one is “file://” and argument two
is “asdfasdfasdfasdf.” The concatenation of the argument strings has overflowed the tmp array and the function return
address is now determined by the last few characters of the two string

machine code
opcode bytes assembly-language version of the machine code
0xcd 0x2e int 0x2e ; system call to the operating system
0xeb 0xfe L: jmp L ; a very short, direct infinite loop
Fig. 30.6 The simple attack payload used in this chapter; in most examples, the attacker’s goal will be to execute this ma-
chine code. Of these four bytes, the first two are an x86 int instruction which performs a system call on some platforms,
and the second two are an x86 jmp instruction that directly calls itself in an infinite loop. (Note that, in the examples, these
bytes will sometimes be printed as the integer 0xfeeb2ecd, with the apparent reversal a result of x86 little-endianness)

Of course, an attacker would choose their input such that the buffer overflow would not be caused by "asdfasdfasdfasdf," but another string of bytes. In particular, the attacker might choose 0x48, 0xff, and 0x12, in order, as the final three character bytes of the two argument string, and thereby arrange for the function return address to have the value 0x0012ff48. In this case, as soon as the function returns, the hardware instruction pointer would be placed at the second character of the two argument string, and the hardware would start executing the data found there (and chosen by the attacker) as machine code.

In the example under discussion, an attacker would choose their input data so that the machine code for an attack payload would be present at address 0x0012ff48. When the vulnerable function returns, and execution of the attack payload begins, the attacker has gained control of the behavior of the target software. (The attack payload is often called shellcode, since a common goal of an attacker is to launch a "shell" command interpreter under their control.)

In Fig. 30.5, the bytes at 0x0012ff48 are those of the second to fifth characters in the string "asdfasdfasdfasdf," namely 's', 'd', 'f', and 'a'. When executed as machine code, those bytes do not implement an attack. Instead, as described in Fig. 30.6, an attacker might choose 0xcd, 0x2e, 0xeb, and 0xfe as a very simple attack payload.
Thus, an attacker might call the operating system to enable a dangerous feature, or disable security checks, and avoid detection by keeping the target software running (albeit in a loop).

Return-address clobbering as described above has been a highly successful attack technique. For example, in 2003 it was used to implement the Blaster worm, which affected a majority of Internet users [30.7]. In the case of Blaster, the vulnerable code was written using explicit loops, much as in Fig. 30.3. (This was one reason why the vulnerability had not been detected and corrected through automatic software analysis tools, or by manual code reviews.)

Attack 1: Constraints and Variants

Low-level attacks are typically subject to a number of such constraints, and must be carefully written to be compatible with the vulnerability being exploited. For example, the attack demonstrated above relies on the hardware being willing to execute the data found on the stack as machine code. However, on some systems the stack is not executable, e.g., because those systems implement the defenses described later in this chapter. On such systems, an attacker would have to pursue a more indirect attack strategy, such as those described later, in Attacks 3 and 4.

Another important constraint applies to the above buffer-overflow attacks: the attacker-chosen data cannot contain null bytes, or zeros, since such bytes terminate the buffer overflow and prevent further copying onto the stack. This is a common constraint when crafting exploits of buffer overflows, and applies to most of the attacks in this chapter. It is so common that special tools exist for creating machine code for attack payloads that do not contain any embedded null bytes, newline characters, or other byte sequences that might terminate the buffer overflow (one such tool is Metasploit [30.8]).

There are a number of attack methods similar to return-address clobbering, in that they exploit stack-based buffer overflow vulnerabilities to target the function-invocation control data on the stack. Most of these variants add a level of indirection to the techniques described above. One notable attack variant corrupts the base pointer saved on the stack (see Figs. 30.4 and 30.5) and not the return address sitting above it. In this variant, the vulnerable function may return as expected to its caller function, but, when that caller itself returns, it uses a return address that has been chosen by the attacker [30.9]. Another notable variant of this attack targets C and C++ exception-handler pointers that reside on the stack, and ensures that the buffer overflow causes an exception – at which point a function pointer of the attacker's choice may be executed [30.10].

30.2.2 Attack 2: Corruption of Function Pointers Stored in the Heap

Software written in C and C++ often combines data buffers and pointers into the same data structures, or objects, with programmers making a natural assumption that the data values do not affect the pointer values. Unfortunately, this may not be the case in the presence of software bugs. In particular, the pointers may be corrupted as a result of an overflow of the data buffer, regardless of whether the data structures or objects reside on the stack, or in heap memory. Figure 30.7 shows C code with a function that is vulnerable to such an attack.

To give a concrete example of this attack, Fig. 30.8 shows the contents of the vulnerable data structure after the function in Fig. 30.7 has copied data into the buff array using the strcpy and strcat library functions. Figure 30.8 shows three instances of the data structure contents: as might occur during normal processing, as might occur in an unintended buffer overflow, and, finally, as might occur during an attack. These instances can occur both when the data structure is allocated on the stack, and also when it is allocated on the heap.

In the last instance of Fig. 30.8, the attacker has chosen the two input strings such that the cmp function pointer has become the address of the start of the data structure. At that address, the attacker has arranged for an attack payload to be present. Thus, when the function in Fig. 30.7 executes the return statement, and invokes s->cmp, it transfers control to the start of the data structure, which contains data of the attacker's choice. In this case, the attack payload is the four bytes of machine code 0xcd, 0x2e, 0xeb, and 0xfe described in Fig. 30.6, and used throughout this chapter.
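The reason the overflow reaches the cmp field follows directly from the structure's layout. The small program below (an illustration that assumes MAX_LEN is 16, as in Fig. 30.8, and a typical 32-bit x86 ABI) prints the offsets involved; with this layout, the 17th byte written into buff is the first byte of the function pointer.

#include <stdio.h>
#include <stddef.h>

#define MAX_LEN 16

typedef struct _vulnerable_struct
{
    char buff[MAX_LEN];
    int (*cmp)(char*,char*);
} vulnerable;

int main( void )
{
    /* On a typical 32-bit x86 ABI, cmp is placed at offset 16,
       directly after buff, so an overflow of buff overwrites it. */
    printf( "offset of buff = %zu\n", offsetof(vulnerable, buff) );
    printf( "offset of cmp  = %zu\n", offsetof(vulnerable, cmp) );
    printf( "size of struct = %zu\n", sizeof(vulnerable) );
    return 0;
}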
It is especially commonplace for C++ code to store object instances on the heap, and to combine – within a single object instance – both data buffers that may be overflowed and potentially exploitable pointers.

typedef struct _vulnerable_struct


{
char buff[MAX_LEN];
int (*cmp)(char*,char*);
} vulnerable;

int is_file_foobar_using_heap( vulnerable* s, char* one, char* two )


{
// must have strlen(one) + strlen(two) < MAX_LEN
strcpy( s->buff, one );
strcat( s->buff, two );
return s->cmp( s->buff, "file://foobar" );
}

Fig. 30.7 A C function that sets a heap data structure as the concatenation of two input strings, and compares the result
against “file://foobar” using the comparison function for that data structure. This function is vulnerable to a heap-based
buffer overflow attack if an attacker can choose either or both of the input strings

buff (char array at start of the struct) cmp


address: 0x00353068 0x0035306c 0x00353070 0x00353074 0x00353078
content: 0x656c6966 0x662f2f3a 0x61626f6f 0x00000072 0x004013ce
(a) A structure holding “file://foobar” and a pointer to the strcmp function

buff (char array at start of the struct) cmp


address: 0x00353068 0x0035306c 0x00353070 0x00353074 0x00353078
content: 0x656c6966 0x612f2f3a 0x61666473 0x61666473 0x00666473
(b) After a buffer overflow caused by the inputs “file://” and “asdfasdfasdf ”

buff (char array at start of the struct) cmp


address: 0x00353068 0x0035306c 0x00353070 0x00353074 0x00353078
content: 0xfeeb2ecd 0x11111111 0x11111111 0x11111111 0x00353068
(c) After a malicious buffer overflow caused by attacker-chosen inputs
Fig. 30.8 Three instances of the vulnerable data structure pointed to by s in Fig. 30.7, where the size of the buff
array is 16 bytes. Both the address of the structure and its 20 bytes of content are shown. In the first instance (a), the buffer
holds “file://foobar” and cmp points to the strcmp function. In the second instance (b), the pointer has been corrupted
by a buffer overflow. In the third instance (c), an attacker has selected the input strings so that the buffer overflow has
changed the structure data so that the simple attack payload of Fig. 30.6 will be executed

In particular, C++ object instances are likely to contain vtable pointers: a form of indirect function pointers that allow dynamic dispatch of virtual member functions. As a result, C++ software may be particularly vulnerable to heap-based attacks [30.11].

Attack 2: Constraints and Variants

Heap-based attacks are often constrained by their ability to determine the address of the heap memory that is being corrupted, as can be seen in the examples above. This constraint applies, in particular, to all indirect attacks, where a heap-based pointer-to-a-pointer is modified. Furthermore, the exact bytes of those addresses may constrain the attacker, e.g., if the exploited vulnerability is that of a string-based buffer overflow, in which case the address data cannot contain null bytes.

The examples above demonstrate attacks where heap-based buffer overflow vulnerabilities are exploited to corrupt pointers that reside within the same data structure or object as the data buffer that is overflowed. There are two important attack variants, not described above, where heap-based buffer overflows are used to corrupt pointers that reside in other structures or objects, or in the heap metadata.

In the first variant, two data structures or objects reside consecutively in heap memory, the initial one containing a buffer that can be overflowed, and the subsequent one containing a direct, or indirect, function pointer.
Heap objects are often adjacent in memory like this when they are functionally related and are allocated in order, one immediately after the other. Whenever these conditions hold, attacks similar to the above examples may be possible, by overflowing the buffer in the first object and overwriting the pointer in the second object.

In the second variant, the attack is based on corrupting the metadata of the heap itself through a heap-based buffer overflow, and exploiting that corruption to write an arbitrary value to an arbitrary location in memory. This is possible because heap implementations contain doubly linked lists in their metadata. An attacker that can corrupt the metadata can thereby choose what is written where. The attacker can then use this capability to write a pointer to the attack payload in the place of any soon-to-be-used function pointer sitting at a known address.
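The arbitrary write enabled by corrupt heap metadata can be illustrated with the classic unlink operation on such a doubly linked free list. The fragment below is a simplified sketch of that pattern only; the names and the chunk layout are illustrative and do not correspond to any particular allocator.

/* A free chunk as tracked in a doubly linked free list. */
typedef struct chunk
{
    struct chunk* fd;   /* forward link  */
    struct chunk* bk;   /* backward link */
} chunk;

/* Remove c from the free list.  If a heap-based buffer overflow lets
   an attacker control c->fd and c->bk, then the first assignment
   below stores the attacker-chosen value bk at the attacker-chosen
   address fd plus the offset of the bk field: an arbitrary write. */
void unlink_chunk( chunk* c )
{
    chunk* fd = c->fd;
    chunk* bk = c->bk;
    fd->bk = bk;
    bk->fd = fd;
}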
to be subsequently executed as machine code. Fig-
ure 30.9 shows a C function that is vulnerable to such
30.2.3 Attack 3: Execution of Existing an attack.
Code via Corrupt Pointers The function in Fig. 30.9 actually contains
a stack-based buffer overflow vulnerability that can
If software does not contain any code for a certain be exploited for various attacks, if an attacker is
functionality such as performing floating-point cal- able to choose the number of input integers, and
culations, or making system calls to interact with the their contents. In particular, attackers can perform
network, then the programmers may naturally as- return-address clobbering, as described in Attack 1.
sume that execution of the software will not result However, for this particular function, an attacker
in this behavior, or functionality. can also corrupt the comparison-function pointer
Unfortunately, for C or C++ software, this as- cmp before it is passed to qsort. In this case, the
sumption may not hold in the face of bugs and ma- attacker can gain control of machine-code execu-
licious attacks, as demonstrated by attacks like those tion at the point where qsort calls its copy of
in this chapter. As in the previous two examples of the corrupted cmp argument. Figure 30.10 shows
attacks, the attacker may be able to cause arbitrary the machine code in the qsort library function
behavior by direct code injection: by directly mod- where this, potentially corrupted function pointer
ifying the hardware instruction pointer to execute is called.
machine code embedded in attacker-provided input To give a concrete example of a jump-to-libc
data, instead of the original software. However, there attack, consider the case when the function in
are other means for an attacker to cause software to Fig. 30.9 is executed on some versions of the
exhibit arbitrary behavior, and these alternatives can Microsoft Windows operating system. On these
be the preferred mode of attack. systems, the qsort function is implemented
In particular, an attacker may find it preferable to as shown in Fig. 30.10 and the memory address
craft attacks that execute the existing machine code 0x7c971649 holds the four bytes of executable
of the target software in a manner not intended by machine code, as shown in Fig. 30.11.
its programmers. For example, the attacker may cor- On such a system, the buffer overflow may leave
rupt a function pointer to cause the execution of the stack looking like that shown in the “malicious
a library function that is unreachable in the origi- overflow contents” column of Fig. 30.12. Then, when
nal C or C++ source code written by the program- the qsort function is called, it is passed a copy
mers – and should therefore, in the compiled soft- of the corrupted cmp function-pointer argument,
ware, be never-executed, dead code. Alternatively, which points to a trampoline found within exist-
the attacker may arrange for reachable, valid ma- ing, executable machine code. This trampoline is

int median( int* data, int len, void* cmp )


{
// must have 0 < len <= MAX_INTS
int tmp[MAX_INTS];
memcpy( tmp, data, len*sizeof(int) ); // copy the input integers
qsort( tmp, len, sizeof(int), cmp ); // sort the local copy
return tmp[len/2]; // median is in the middle
}

Fig. 30.9 A C function that computes the median of an array of input integers by sorting a local copy of those integers.
This function is vulnerable to a stack-based buffer overflow attack, if an attacker can choose the set of input integers
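The comment in Fig. 30.9 states a precondition that the code itself never enforces. The sketch below shows a checked variant; the function name, the MAX_INTS value of eight (the size used in Fig. 30.12), and the -1 error return are illustrative, and the comparison-function parameter is declared with the type that qsort actually expects.

#include <stdlib.h>
#include <string.h>

#define MAX_INTS 8

int median_checked( int* data, int len,
                    int (*cmp)(const void*, const void*) )
{
    int tmp[MAX_INTS];
    /* Enforce the precondition instead of assuming it, so that an
       attacker-chosen len can no longer overflow the tmp buffer. */
    if( len <= 0 || len > MAX_INTS )
        return -1;
    memcpy( tmp, data, (size_t)len * sizeof(int) );  /* copy the input integers */
    qsort( tmp, (size_t)len, sizeof(int), cmp );     /* sort the local copy */
    return tmp[len/2];                               /* median is in the middle */
}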

...
push edi ; push second argument to be compared onto the stack
push ebx ; push the first argument onto the stack
call [esp+comp_fp] ; call comparison function, indirectly through a pointer
add esp, 8 ; remove the two arguments from the stack
test eax, eax ; check the comparison result
jle label_lessthan ; branch on that result
...

Fig. 30.10 Machine code fragment from the qsort library function, showing how the comparison operation is called
through a function pointer. When qsort is invoked in the median function of Fig. 30.9, a stack-based buffer overflow
attack can make this function pointer hold an arbitrary address

machine code
address opcode bytes assembly-language version of the machine code
0x7c971649 0x8b 0xe3 mov esp, ebx ; change the stack location to ebx
0x7c97164b 0x5b pop ebx ; pop ebx from the new stack
0x7c97164c 0xc3 ret ; return based on the new stack

Fig. 30.11 Four bytes found within executable memory, in a system library. These bytes encode three machine-code
instructions that are useful in the crafting of jump-to-libc attacks. In particular, in an attack on the median function
in Fig. 30.9, these three instructions may be called by the qsort code in Fig. 30.10, which will change the stack pointer
to the start of the local tmp buffer that has been overflowed by the attacker

The effect of calling the trampoline is to, first, set the stack pointer esp to the start address of the tmp array (which is held in register ebx), second, read a new value for ebx from the first integer in the tmp array, and, third, perform a return that changes the hardware instruction pointer to the address held in the second integer in the tmp array.

The attack subsequently proceeds as follows. The stack is "unwound" one stack frame at a time, as functions return to return addresses. The stack holds data, including return addresses, that has been chosen by the attacker to encode function calls and arguments. As each stack frame is unwound, the return instruction transfers control to the start of a particular, existing library function, and provides that function with arguments.

Figure 30.13 shows, as C source code, the sequence of function calls that occur when the stack is unwound. The figure shows both the name and address of the Windows library functions that are invoked, as well as their arguments. The effect of these invocations is to create a new, writable page of executable memory, to write machine code of the attacker's choice to that page, and to transfer control to that attack payload.

After the trampoline code executes, the hardware instruction pointer address is 0x7c809a51, which is the start of the Windows library function VirtualAlloc, and the address in the stack pointer is 0x0012ff10, the third integer in the tmp array in Fig. 30.12. As a result, when VirtualAlloc returns, execution will continue at address 0x7c80978e, which is the start of the Windows library function InterlockedExchange.

              normal        benign        malicious
              stack         overflow      overflow
  address     contents      contents      contents
0x0012ff38 0x004013e0 0x1111110d 0x7c971649 ; cmp argument
0x0012ff34 0x00000001 0x1111110c 0x1111110c ; len argument
0x0012ff30 0x00353050 0x1111110b 0x1111110b ; data argument
0x0012ff2c 0x00401528 0x1111110a 0xfeeb2ecd ; return address
0x0012ff28 0x0012ff4c 0x11111109 0x70000000 ; saved base pointer
0x0012ff24 0x00000000 0x11111108 0x70000000 ; tmp final  bytes
0x0012ff20 0x00000000 0x11111107 0x00000040 ; tmp continues
0x0012ff1c 0x00000000 0x11111106 0x00003000 ; tmp continues
0x0012ff18 0x00000000 0x11111105 0x00001000 ; tmp continues
0x0012ff14 0x00000000 0x11111104 0x70000000 ; tmp continues
0x0012ff10 0x00000000 0x11111103 0x7c80978e ; tmp continues
0x0012ff0c 0x00000000 0x11111102 0x7c809a51 ; tmp continues
0x0012ff08 0x00000000 0x11111101 0x11111101 ; tmp buffer starts
0x0012ff04 0x00000004 0x00000040 0x00000040 ; memcpy length argument
0x0012ff00 0x00353050 0x00353050 0x00353050 ; memcpy source argument
0x0012fefc 0x0012ff08 0x0012ff08 0x0012ff08 ; memcpy destination arg.

Fig. 30.12 The address and contents of the stack of the median function of Fig. 30.9, where tmp is eight integers in
size. Three versions of the stack contents are shown, as it would appear just after the call to memcpy: a first for input data
of the single integer zero, a second for a benign buffer overflow of consecutive integers starting at 0x11111101, and
a third for a malicious jump-to-libc attack that corrupts the comparison function pointer to make qsort call address
0x7c971649 and the machine code in Fig. 30.11

// call a function to allocate writable, executable memory at 0x70000000


VirtualAlloc(0x70000000, 0x1000, 0x3000, 0x40); // function at 0x7c809a51

// call a function to write the four-byte attack payload to 0x70000000


InterlockedExchange(0x70000000, 0xfeeb2ecd); // function at 0x7c80978e

// invoke the four bytes of attack payload machine code


((void (*)())0x70000000)(); // payload at 0x70000000

Fig. 30.13 The jump-to-libc attack activity caused by the maliciously corrupted stack in Fig. 30.12, expressed as C
source code. As the corrupted stack is unwound, instead of returning to call sites, the effect is a sequence of function calls,
first to functions in the standard Windows library kernel32.dll, and then to the attack payload

Finally, the InterlockedExchange function returns to the address 0x70000000, which at that time holds the attack payload machine code in executable memory.

(This attack is facilitated by two Windows particulars: all Windows processes load the library kernel32.dll into their address space, and the Windows calling convention makes library functions responsible for popping their own arguments off the stack. On other systems, the attacker would need to slightly modify the details of the attack.)

Attack 3: Constraints and Variants

A major constraint on jump-to-libc attacks is that the attackers must craft such attacks with a knowledge of the addresses of the target-software machine code that is useful to the attack. An attacker may have difficulty in reliably determining these addresses, for instance because of variability in the versions of the target software and its libraries, or because of variability in the target software's execution environment. Artificially increasing this variability is a useful defense against many types of such attacks, as discussed later in this chapter.

Traditionally, jump-to-libc attacks have targeted the system function in the standard system libraries, which allows the execution of an arbitrary command with arguments, as if typed into a shell command interpreter. This strategy can also be taken in the above attack example, with a few simple changes. However, an attacker may prefer indirect code injection, because it requires launching no new processes or accessing any executable files, both of which may be detected or prevented by system defenses.
For software that may become the target of jump-to-libc attacks, one might consider eliminating any fragment of machine code that may be useful to the attacker, such as the trampoline code shown in Fig. 30.11. This can be difficult for many practical reasons. For instance, it is difficult to selectively eliminate fragments of library code while, at the same time, sharing the code memory of dynamic libraries between their instances in different processes; however, eliminating such sharing would multiply the resource requirements of dynamic libraries. Also, it is not easy to remove data constants embedded within executable code, which may form instructions useful to an attacker. (Examples of such data constants include the jump tables of C and C++ switch statements.)

Those difficulties are compounded on hardware architectures that use variable-length sequences of opcode bytes for encoding machine-code instructions. For example, on some versions of Windows, the machine code for a system call is encoded using a two-byte opcode sequence, 0xcd, 0x2e, while the five-byte sequence 0x25, 0xcd, 0x2e, 0x00, and 0x00 corresponds to an arithmetic operation (the operation and eax, 0x2ecd, in x86 assembly code). Therefore, if an instruction for this particular and operation is present in the target software, then jumping to its second byte can be one way of performing a system call. Similarly, any x86 instruction, including those that read or write memory, may be executed through a jump into the middle of the opcode-byte sequence for some other x86 machine-code instruction.

Indeed, for x86 Linux software, it has recently been demonstrated that it is practical for elaborate jump-to-libc attacks to perform arbitrary functionality while executing only machine code found embedded within other instructions [30.12]. Much as in the above example, these elaborate attacks proceed through the unwinding of the stack, but they may also "rewind" the stack in order to encode loops of activity. However, unlike in the above example, these elaborate attacks may allow the attacker to achieve their goals without adding any new, executable memory or machine code to the target software under attack.

Attacks like these are of great practical concern. For example, the flaw in the median function of Fig. 30.9 is in many ways similar to the recently discovered "animated cursor vulnerability" in Windows [30.13]. Despite existing, deployed defenses, that vulnerability is subject to a jump-to-libc attack similar to that in the above example.

30.2.4 Attack 4: Corruption of Data Values that Determine Behavior

Software programmers make many natural assumptions about the integrity of data. As one example, an initialized global variable may be assumed to hold the same, initial value throughout the software's execution, if it is never written by the software. Unfortunately, for C or C++ software, such assumptions may not hold in the presence of software bugs, and this may open the door to malicious attacks that corrupt the data that determine the software's behavior.

Unlike the previous attacks in this chapter, data corruption may allow the attacker to achieve their goals without diverting the target software from its expected path of machine-code execution – either directly or indirectly. Such attacks are referred to as data-only attacks or non-control-data attacks [30.14]. In some cases, a single instance of data corruption can be sufficient for an attacker to achieve their goals. Figure 30.14 shows an example of a C function that is vulnerable to such an attack.

As a concrete example of a data-only attack, consider how the function in Fig. 30.14 makes use of the environment string table by calling the getenv routine in the standard C library. This routine returns the string that is passed to another standard routine, system, and this string argument determines what external command is launched. An attacker that is able to control the function's two integer inputs is able to write an arbitrary data value to a nearly arbitrary location in memory. In particular, this attacker is able to corrupt the table of the environment strings to launch an external command of their choice.

Figure 30.15 gives the details of such an attack on the function in Fig. 30.14, by selectively showing the address and contents of data and code memory. In this case, before the attack, the environment string table is an array of pointers starting at address 0x00353610. The first pointer in that table is shown in Fig. 30.15, as are its contents: a string that gives a path to the "all users profile." In a correct execution of the function, some other pointer in the environment string table would be to a string, such as SAFECOMMAND=safecmd.exe, that determines a safe, external command to be launched by the system library routine.

void run_command_with_argument( pairs* data, int offset, int value )


{
// must have offset be a valid index into data
char cmd[MAX_LEN];
data[offset].argument = value;
{
char valuestring[MAX_LEN];
itoa( value, valuestring, 10 );
strcpy( cmd, getenv("SAFECOMMAND") );
strcat( cmd, " " );
strcat( cmd, valuestring );
}
data[offset].result = system( cmd );
}

Fig. 30.14 A C function that launches an external command with an argument value, and stores in a data structure that
value and the result of the command. If the offset and value can be chosen by an attacker, then this function is vulnerable
to a data-only attack that allows the attacker to launch an arbitrary external command

address attack command string data as integers as characters


0x00354b20 0x45464153 0x4d4d4f43 0x3d444e41 0x2e646d63 SAFECOMMAND=cmd.
0x00354b30 0x20657865 0x2220632f 0x6d726f66 0x632e7461 exe /c "format.c
0x00354b40 0x63206d6f 0x3e20223a 0x00000020 om c:" >

address first environment string pointer


0x00353610 0x00353730

address first environment string data as integers as characters


0x00353730 0x554c4c41 0x53524553 0x464f5250 0x3d454c49 ALLUSERSPROFILE=
0x00353740 0x445c3a43 0x6d75636f 0x73746e65 0x646e6120 C:\Documents and
0x00353750 0x74655320 0x676e6974 0x6c415c73 0x7355206c Settings\All Us
0x00353760 0x00737265 ers

address opcode bytes machine code as assembly language


0x004011a1 0x89 0x14 0xc8 mov [eax+ecx*8], edx ; write edx to eax+ecx*8

Fig. 30.15 Some of the memory contents for an execution of the function in Fig. 30.14, including the machine code
for the data[offset].argument = value; assignment. If the data pointer is 0x004033e0, the attacker can
choose the inputs offset = 0x1ffea046 and value = 0x00354b20, and thereby make the assignment instruction
change the first environment string pointer to the “format” command string at the top

However, before reading the command string to launch, the machine-code assignment instruction shown in Fig. 30.15 is executed. By choosing the offset and value inputs to the function, the attacker can make ecx and edx hold arbitrary values. Therefore, the attacker can make the assignment write any value to nearly any address in memory, given knowledge of the data pointer. If the data pointer is 0x004033e0, then that address plus 8 · 0x1ffea046 is 0x00353610, the address of the first environment string pointer. Thus, the attacker is able to write the address of their chosen attack command string, 0x00354b20, at that location. Then, when getenv is called, it will look no further than the first pointer in the environment string table, and return a command string that, when launched, may delete data on the "C:" drive of the target system.
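The address arithmetic behind this corruption is easy to verify. The short program below (a sketch that assumes each element of the data array is 8 bytes, as implied by the scaled addressing in Fig. 30.15) reproduces the 32-bit wrap-around:

#include <stdio.h>
#include <inttypes.h>

int main( void )
{
    /* data[offset].argument = value compiles to a write at the
       address data + 8*offset (Fig. 30.15), computed here in
       32-bit arithmetic, which wraps around modulo 2^32. */
    uint32_t data   = 0x004033e0u;   /* assumed data pointer          */
    uint32_t offset = 0x1ffea046u;   /* attacker-chosen offset input  */
    uint32_t target = data + 8u * offset;
    printf( "0x%08" PRIx32 "\n", target );   /* prints 0x00353610 */
    return 0;
}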
ues. Therefore, the attacker can make the assignment only attack and the function in Fig. 30.14. First,
write any value to nearly any address in memory, note that there are multiple vulnerabilities that may
given knowledge of the data pointer. If the data allow the attacker to choose the offset integer
pointer is 0x004033e0, then that address plus input, ranging from stack-based and heap-based
80x1ffea046 is 0x00353610, the address of buffer overflows, through integer overflow errors,
the first environment string pointer. Thus, the at- to a simple programmer mistake that omitted
Second, note that although 0x1ffea046 is a positive integer, it effectively becomes negative when multiplied by eight, and the assignment instruction writes to an address before the start of the data array. Finally, note that this attack succeeds even when the table of environment strings is initialized before the execution starts, and the table is never modified by the target software – and when the table should therefore logically be read-only given the semantics of the target software.

Attack 4: Constraints and Variants

There are two major constraints on data-only attacks. First, the vulnerabilities in the target software are likely to allow only certain data, or a certain amount of data, to be corrupted, and potentially only in certain ways. For instance, as in the above example, a vulnerability might allow the attacker to change a single, arbitrary four-byte integer in memory to a value of their choice. (Such vulnerabilities exist in some heap implementations, as described for Attack 2 above; there, an arbitrary write is possible through the corruption of heap metadata, most likely caused by the overflow of a buffer stored in the heap. Many real-world attacks have exploited this vulnerability, including the GDI+JPEG attack on Windows [30.14, 15].)

Second, even when an attacker can replace any amount of data with arbitrary values, and that data may be located anywhere, a data-only attack will be constrained by the behavior of the target software when given arbitrary input. For example, if the target software is an arithmetic calculator, a data-only attack might only be able to cause an incorrect result to be computed. However, if the target software embeds any form of an interpreter that performs potentially dangerous operations, then a data-only attack could control the input to that interpreter, allowing the attacker to perform the dangerous operations. The system standard library routine is an example of such an interpreter; many applications, such as Web browsers and document viewers, embed other interpreters for scripting languages.

To date, data-only attacks have not been prominent. Rather, data corruption has been most frequently utilized as one step in other types of attacks, such as direct code injection, or a jump-to-libc attack. This may change with the increased deployment of defenses, including the defenses described below.

30.3 Defenses that Preserve High-Level Language Properties

This section presents, in detail, four effective, practical defenses against low-level software attacks on x86 machine-code software, and explains how each defense is based on preserving a property of target software written in the C or C++ languages. These defenses are stack canaries, non-executable data, control-flow integrity, and address-space layout randomization. They have been selected based on their efficiency and ease of adoption, as well as their effectiveness.

In particular, this section describes neither defenses based on instruction-set randomization [30.16], nor defenses based on dynamic information flow tracking, or tainting, or other forms of data-flow integrity enforcement [30.17, 18]. Such techniques can offer strong defenses against all the attacks in Sect. 30.2, although, like the defenses below, they also have limitations and counterattacks. However, these defenses have drawbacks that make their deployment difficult in practice.

For example, unless they are supported by specialized hardware, they incur significant overheads. On unmodified, commodity x86 hardware, defenses based on data-flow integrity may double the memory requirements, and may make execution up to 37 times slower [30.18]. Because these defenses also double the number of memory accesses, even the most heavily optimized mechanism is still likely to run software twice as slowly [30.17]. Such overheads are likely to be unacceptable in many scenarios, e.g., for server workloads where a proportional increase in cost may be expected. Therefore, in practice, these defenses may never see widespread adoption, especially since equally good protection may be achievable using a combination of the defenses described below.

This section does not attempt a comprehensive survey of the literature on these defenses. The survey by Younan, Joosen, and Piessens provides an overview of the state of the art of countermeasures for attacks like those discussed in this chapter [30.19, 20].

30.3.1 Defense 1: Checking Stack Canaries on Return Addresses

The C and C++ languages do not specify how function return addresses are represented in stack memory. Rather, these, and many other programming languages, keep abstract most elements of a function's invocation stack frame, in order to allow for portability between hardware architectures and to give compilers flexibility in choosing an efficient low-level representation.
This flexibility enables an effective defense against some attacks, such as the return-address clobbering of Attack 1.

In particular, on function calls, instead of storing return addresses directly onto the stack, C and C++ compilers are free to generate code that stores return addresses in an encrypted and signed form, using a local, secret key. Then, before each function return, the compiler could emit code to decrypt and validate the integrity of the return address about to be used. In this case, assuming that strong cryptography is used, an attacker that did not know the key would be unable to cause the target software to return to an address of their choice as a result of a stack corruption, even when the target software contains an exploitable buffer overflow vulnerability that allows such corruption.

In practice, it is desirable to implement an approximation of the above defense, and get most of the benefits without incurring the overwhelming cost of executing cryptography code on each function call and return.

One such approximation requires no secret, but places a public canary value right above function-local stack buffers. This value is designed to warn of dangerous stack corruption, much as a coal mine canary would warn about dangerous air conditions. Figure 30.16 shows an example of a stack with an all-zero canary value. Validating the integrity of this canary is an effective means of ensuring that the saved base pointer and function return address have not been corrupted, given the assumption that attacks are only possible through stack corruption based on the overflow of a string buffer. For improved defenses, this public canary may contain other bytes, such as newline characters, that frequently terminate the copying responsible for string-based buffer overflows. For example, some implementations have used the value 0x000aff0d as the canary [30.21].

Stack-canary defenses may be improved by including in the canary value some bits that should be unknown to the attacker. For instance, this may help defend against return-address clobbering with an integer overflow, such as is enabled by the memcpy vulnerability in Fig. 30.9. Therefore, some implementations of stack canary defenses, such as Microsoft's /GS compiler option [30.22], are based on a random value, or cookie.

Figure 30.17 shows the machine code for a function compiled with Microsoft's /GS option. The function preamble and postamble each have three new instructions that set and check the canary, respectively. With /GS, the canary placed on the stack is a combination of the function's base pointer and the function's module cookie. Module cookies are generated dynamically for each process, using good sources of randomness (although some of those sources are observable to an attacker running code on the same system). Separate, fresh module cookies are used for the executable and each dynamic library within a process address space (each has its own copy of the __security_cookie variable in Fig. 30.17). As a result, in a stack with multiple canary values, each will be unique, with more dissimilarity where the stack crosses module boundaries.

address content
0x0012ff5c 0x00353037 ; argument two pointer
0x0012ff58 0x0035302f ; argument one pointer
0x0012ff54 0x00401263 ; return address
0x0012ff50 0x0012ff7c ; saved base pointer
0x0012ff4c 0x00000000 ; all-zero canary
0x0012ff48 0x00000072 ; tmp continues ’r’ ’\0’ ’\0’ ’\0’
0x0012ff44 0x61626f6f ; tmp continues ’o’ ’o’ ’b’ ’a’
0x0012ff40 0x662f2f3a ; tmp continues ’:’ ’/’ ’/’ ’f’
0x0012ff3c 0x656c6966 ; tmp array: ’f’ ’i’ ’l’ ’e’

Fig. 30.16 A stack snapshot like that shown in Fig. 30.4 where a “canary value” has been placed between the tmp array
and the saved base pointer and return address. Before returning from functions with vulnerabilities like those in Attack 1,
it is an effective defense to check that the canary is still zero: an overflow of a zero-terminated string across the canary’s
stack location will not leave the canary as zero

function_with_gs_check:
; function preamble machine code
push ebp ; save old base pointer on the stack
mov ebp, esp ; establish the new base pointer
sub esp, 0x14 ; grow the stack for buffer and cookie
mov eax, [__security_cookie] ; read cookie value into eax
xor eax, ebp ; xor base pointer into cookie
mov [ebp-4], eax ; write cookie above the buffer
...
; function body machine code
...
; function postamble machine code
mov ecx, [ebp-4] ; read cookie from stack, into ecx
xor ecx, ebp ; xor base pointer out of cookie
call __security_check_cookie ; check ecx is cookie value
mov esp, ebp ; shrink the stack back
pop ebp ; restore old, saved base pointer
ret ; return

__security_check_cookie:
cmp ecx, [__security_cookie] ; compare ecx and cookie value
jnz ERR ; if not equal, go to an error handler
ret ; else return
ERR: jmp __report_gsfailure ; report failure and halt execution

Fig. 30.17 The machine code for a function with a local array in a fixed-size, 16-byte stack buffer, when compiled using the Windows /GS implementation of stack cookies in the most recent version of the Microsoft C compiler [30.22, 23]. The canary is a random cookie value, combined with the base pointer. This canary is placed on the stack above the stack buffer, just below the return address and saved base pointer, and is checked before either of those values is used, in case the local stack buffer has been overflowed
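To make the check concrete, the following C fragment sketches, by hand, the kind of canary test that these defenses insert automatically. It is only an illustration, not the code generated by /GS or the other implementations: the stack_cookie variable, its initial value, and the parse_into_buffer function are hypothetical, and standard C gives no portable way to force the canary local to sit between the buffer and the saved return address; real implementations rely on the compiler choosing that layout.

#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide cookie; implementations such as /GS initialize
   a similar value from a good source of randomness at process startup.   */
static size_t stack_cookie = 0x000aff0d;

void parse_into_buffer(const char* input)
{
  size_t canary = stack_cookie;      /* intended to sit just above tmp    */
  char tmp[16];

  strncpy(tmp, input, sizeof(tmp));  /* stand-in for the possibly unsafe  */
  tmp[sizeof(tmp) - 1] = '\0';       /* string copy of Attack 1           */

  /* ... use tmp ... */

  if (canary != stack_cookie)        /* check before the function returns */
    abort();                         /* corruption detected: halt         */
}

An overflow of tmp that reaches the saved base pointer or return address must first cross the canary’s location, so the check detects the corruption before a corrupt return address can be used.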

Defense 1: Overhead, Limitations, Variants, and Counterattacks

There is little enforcement overhead from stack canary defenses, since they are only required in functions with local stack buffers that may be overflowed. (An overflow in a function does not affect the invocation stack frames of functions it calls, which are lower on the stack; that function’s canary will be checked before any use of stack frames that are higher on the stack, and which may have been corrupted by the overflow.) For most C and C++ software this overhead amounts to a few percent [30.21, 24]. Even so, most implementations aim to reduce this overhead even further, by only initializing and checking stack canaries in functions that contain a local string char array, or meet other heuristic requirements. As a result, this defense is not always applied where it might be useful, as evidenced by the recent ANI vulnerability in Windows [30.13].

Stack canaries can be an efficient and effective defense against Attack 1, where the attacker corrupts function-invocation control data on the stack. However, stack canaries only check for corruption at function exit. Thus, they offer no defense against Attacks 2, 3, and 4, which are based on corruption of the heap, function-pointer arguments, or global data pointers.

Stack canaries are a widely deployed defense mechanism. In addition to Microsoft’s /GS, StackGuard [30.21] and ProPolice [30.24] are two other notable implementations. Given its simple nature, it is somewhat surprising that there is significant variation between the implementations of this defense, and these implementations have varied over time [30.22, 25]. This reflects the ongoing arms race between attackers and defenders. Stack canary defenses are subject to a number of counterattacks. Most notably, even when the only exploitable vulnerability is a stack-based buffer overflow, the attackers may be able to craft an attack that is not based on return-address clobbering. For example, the attack may corrupt a local variable, an argument, or some other value that is used before the function exits.

Also, the attacker may attempt to guess, or learn, the cookie values, which can lead to a successful attack given enough luck or determination. The success of this counterattack will depend on the exploited vulnerability, the attacker’s access to the target system, and the particulars of the target software. (For example, if stack canaries are based on random cookies, then the attacker may be able to exploit certain format-string vulnerabilities to learn which canary values to embed in the data of the buffer overflow.)

Due to the counterattack where attackers overwrite a local variable other than the return address, most implementations have been extended to reorder the organization of the stack frame.

Most details about the function-invocation stack frame are left unspecified in the C and C++ languages, to give flexibility in the compilation of those language aspects down to a low-level representation. In particular, the compiler is free to lay out function-local variables in any order on the stack, and to generate code that operates not on function arguments, but on copies of those arguments.

This is the basis of a variant of this countermeasure. In this defense, the compiler places arrays and other function-local buffers above all other function-local variables on the stack. Also, the compiler makes copies of function arguments into new, function-local variables that also sit below any buffers in the function. As a result, these variables and arguments are not subject to corruption through an overflow of those buffers.

The stack cookie will also provide detection of attacks that try to overwrite data of previous stack frames. Besides the guessing attack described earlier, two counterattacks still exist to this extended defense. In a first attack, an attacker can still overwrite the contents of other buffers that may be stored above the buffer that overflows. A second attack occurs when an attacker overwrites information of any other stack frames or other information that is stored above the current stack frame. If this information is used before the current function returns (i.e., before the cookie is checked), then an attack may be possible. An example of such an attack is described in [30.22]: an attacker would overwrite the exception-handler pointers, which are stored on the stack above the function stack frames. The attacker would then cause an exception (e.g., a stack overflow exception or a cookie mismatch exception), which would result in the attacker’s code being executed [30.10]. This specific attack was countered by applying Defense 3 to the exception handler.

30.3.2 Defense 2: Making Data not Executable as Machine Code

Many high-level languages allow code and data to reside in two, distinct types of memory. The C and C++ languages follow this tradition, and do not specify what happens when code pointers are read and written as data, or what happens when a data pointer is invoked as if it were a function pointer. This under-specification brings important benefits to the portability of C and C++ software, since it must sometimes run on systems where code and data memory are truly different. It also enables a particularly simple and efficient defense against direct-code-injection exploits, such as those in Attacks 1 and 2. If data memory is not executable, then Attacks 1 and 2 fail as soon as the hardware instruction pointer reaches the first byte of the attack payload (e.g., the bytes 0xfeeb2ecd described in Fig. 30.6, and used throughout this chapter). Even when the attacker manages to control the flow of execution, they cannot simply make control proceed directly to their attack payload. This is a simple, useful barrier to attack, which can be directly applied to most software, since, in practice, most software never treats data as code.

(Some legacy software will execute data as a matter of course; other software uses self-modifying code and writes to code memory as a part of regular, valid execution. For example, this behavior can be seen in some efficient, just-in-time interpreters. However, such software can be treated as a special case, since it is uncommon and increasingly rare.)

Defense 2: Overhead, Limitations, Variants, and Counterattacks

In its implementation on modern x86 systems, non-executable data has some performance impact because it relies on double-size, extended page tables. The NX page-table-entry bit, which flags memory as non-executable, is only found in PAE page tables, which are double the size of normal tables, and are otherwise not commonly used. The precise details of page-table entries can significantly impact the overall system performance, since page tables are a frequently consulted part of the memory hierarchy, with thousands of lookups a second and, in some cases, a lookup every few instructions. However, for most workloads, the overhead should be in the small percentages, and will often be close to zero.
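The protection itself is applied by the operating system’s memory manager, but the same page-level mechanism can be seen from user code. The sketch below, a minimal POSIX example that is not taken from this chapter, maps one page as readable and writable but not executable; any attempt to execute the bytes stored there, such as the commented-out indirect call, would then fault. The payload bytes are arbitrary and chosen only for illustration, and error handling is elided.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
  /* One page of data memory: readable and writable, but not executable. */
  unsigned char* page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return 1;

  /* Store attack-payload-like bytes into the page, as ordinary data. */
  const unsigned char payload[] = { 0xcd, 0x2e, 0xeb, 0xfe };
  memcpy(page, payload, sizeof(payload));

  /* Treating the data as code requires execute permission:
       void (*f)(void) = (void (*)(void))page;
       f();   faults here, since the page is not executable             */

  printf("payload stored at %p, but the page is not executable\n",
         (void*)page);
  return 0;
}

The permissions of an existing memory region can be changed in the same way with mprotect; operating systems apply the defense simply by mapping stacks and heaps without execute permission in the first place.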

Non-executable data defends against direct code injection attacks, but offers no barrier to exploits such as those in Attacks 3 and 4. For any given direct code-injection attack, it is likely that an attacker can craft an indirect jump-to-libc variant, or a data-only exploit [30.14]. Thus, although this defense can be highly useful when used in combination with other defenses, by itself, it is not much of a stumbling block for attackers.

On Microsoft Windows, and most other platforms, software will typically execute in a mode where writing to code memory generates a hardware exception. In the past, some systems have also generated such an exception when the hardware instruction pointer is directed to data memory, i.e., upon an attempt to execute data as code. However, until recently, commodity x86 hardware has only supported such exceptions through the use of segmented memory, which runs counter to the flat memory model that is fundamental to most modern operating systems. (Despite being awkward, x86 segments have been used to implement non-executable memory, e.g., stacks, but these implementations are limited, for instance in their support for multi-threading and dynamic libraries.)

Since 2003, and Windows XP SP2, commodity operating systems have come to support the x86 extended page tables where any given memory page may be marked as non-executable, and x86 vendors have shipped processors with the required hardware support. Thus, it is now the norm for data memory to be non-executable.

Indirect code injection, jump-to-libc attacks, and data-only attacks are all effective counterattacks to this defense. Even so, non-executable data can play a key role in an overall defense strategy; for instance, when combined with Defense 4 below, this defense can prevent an attacker from knowing the location of any executable memory bytes that could be useful to an attack.

30.3.3 Defense 3: Enforcing Control-Flow Integrity on Code Execution

As in all high-level languages, it is not possible for software written in the C and C++ languages to perform arbitrary control-flow transfers between any two points in its code. Compared to the exclusion of data from being executed as code, the policies on control flow between code are much more fine-grained.

For example, the behavior of function calls is only defined when the callee code is the start of a function, even when the caller invokes that code through a function pointer. Also, it is not valid to place a label into an expression, and goto to that label, or otherwise transfer control into the middle of an expression being evaluated. Transferring control into the middle of a machine-code instruction is certainly not a valid, defined operation, in any high-level language, even though the hardware may allow this, and this may be useful to an attacker (see Attack 3, p. 643).

Furthermore, within the control flow that a language permits in general, only a small fraction will, in fact, be possible in the semantics of a particular piece of software written in that language. For most software, control flow is either completely static (e.g., as in a C goto statement), or allows only a small number of possibilities during execution.

Similarly, for all C or C++ software, any indirect control transfers, such as through function pointers or at return statements, will have only a small number of valid targets. Dynamic checks can ensure that the execution of low-level software does not stray from a restricted set of possibilities allowed by the high-level software. The runtime enforcement of such a control-flow integrity (CFI) security policy is a highly effective defense against low-level software attacks [30.26, 27].

There are several strategies possible in the implementation of CFI enforcement. For instance, CFI may be enforced by dynamic checks that compare the target address of each computed control-flow transfer to a set of allowed destination addresses. Such a comparison may be performed by the machine-code equivalent of a switch statement over a set of constant addresses. Programmers can even make CFI checks explicitly in their software, as shown in Fig. 30.18. However, unlike in Fig. 30.18, it is not possible to write software that explicitly performs CFI checks on return addresses, or other inaccessible pointers; for these, CFI checks must be added by the compiler, or some other mechanism. Also, since the set of allowed destination addresses may be large, any such sequence of explicit comparisons is likely to lead to unacceptable overhead.

int is_file_foobar_using_heap( vulnerable* s, char* one, char* two )
{
    // ... elided code ...
    if( (s->cmp == strcmp) || (s->cmp == stricmp) ) {
        return s->cmp( s->buff, "file://foobar" );
    } else {
        return report_memory_corruption_error();
    }
}

Fig. 30.18 An excerpt of the C code in Fig. 30.7 with explicit CFI checks that only allow the proper comparison methods
to be invoked at runtime, assuming only strcmp and stricmp are possible. These CFI checks prevent the exploit on
this function in Attack 2

bool lt(int x, int y) {
  return x < y;
}
bool gt(int x, int y) {
  return x > y;
}
sort2(int a[ ], int b[ ], int len)
{
  sort( a, len, lt );
  sort( b, len, gt );
}

[Outline of the machine-code blocks for sort2(), sort(), and lt() (gt() is similar): sort2() makes two direct calls to sort, each followed by a block carrying CFI label 55, and ends with "ret ..."; sort() makes the computed call "call 17, R" through the comparison-function pointer in register R, has a block carrying CFI label 23 after that call site, and ends with "ret 55"; lt() and gt() begin with CFI label 17 and end with "ret 23".]

Fig. 30.19 Three C functions and an outline of their possible control flow, as well as how a CFI enforcement mechanism based on CFI labels might apply to the functions. In the outline, the CFI labels 55, 17, and 23 are found at the valid destinations of computed control-flow instructions; each such instruction is also annotated with a CFI label that corresponds to its valid destinations

One efficient CFI enforcement mechanism, described in [30.26], modifies, according to a given control-flow graph (CFG), both the source and destination instructions of computed control-flow transfers. Two destinations are equivalent when the CFG contains edges to each from the same set of sources. At each destination, a CFI label is inserted that identifies equivalent destinations, i.e., destinations with the same set of possible sources. The CFI labels embed a value, or bit pattern, that distinguishes each; these values need not be secret. Before each source instruction, a dynamic CFI check is inserted that ensures that the runtime destination has the proper CFI label.

Figure 30.19 shows a C program fragment demonstrating this CFI enforcement mechanism. In this figure, a function sort2 calls a qsort-like function sort twice, first with lt and then with gt as the pointer to the comparison function. The right side of Fig. 30.19 shows an outline of the machine-code blocks for these four functions and all control-flow-graph edges between them. In the figure, edges for direct calls are drawn as light, dotted arrows, edges from source instructions are drawn as solid arrows, and return edges as dashed arrows. In this example, sort can return to two different places in sort2. Therefore, there are two CFI labels in the body of sort2, and a CFI check when returning from sort, using 55 as the CFI label. (Note that CFI enforcement does not guarantee to which of the two call sites sort must return; for this, other defenses, such as Defense 1, must be employed.)

Also, in Fig. 30.19, because sort can call either lt or gt, both comparison functions start with the CFI label 17, and the call instruction, which uses a function pointer in register R, performs a CFI check for 17. Finally, the CFI label 23 identifies the block that follows the comparison call site in sort, so both comparison functions return with a CFI check for 23.

Figure 30.20 shows a concrete example of how CFI enforcement based on CFI labels can look, in the case of x86 machine-code software. Here, the CFI label 0x12345678 identifies all comparison routines that may be invoked by qsort, and the CFI label 0xaabbccdd identifies all of their valid call sites.

machine-code opcode bytes machine code in assembly


... ...
0x57 push edi
0x53 push ebx
0x8b 0x44 0x24 0x24 mov eax, [esp+comp_fp]
0x81 0x78 0xfc 0x78 0x56 0x34 0x12 cmp [eax-0x4], 0x12345678
0x75 0x13 jne cfi_error_label
0xff 0xd0 call eax
0x0f 0x18 0x80 0xdd 0xcc 0xbb 0xaa prefetchnta [0xaabbccdd]
0x83 0xc4 0x08 add esp, 0x8
0x85 0xc0 test eax, eax
0x7e 0x02 jle label_lessthan
... ...

Fig. 30.20 A version of Fig. 30.10, showing how CFI checks as in [30.26] can be added to the qsort library function
where it calls the comparison function pointer. Before calling the pointer, it is placed in a register eax, and a compari-
son establishes that the four bytes 0x12345678 are found immediately before the destination code, otherwise execution
goes to a security error. After the call instruction, an executable, side-effect-free instruction embeds the constant 0xaab-
bccdd; by comparing against this constant, the comparison function can establish that it is returning to a valid call site
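The label-based idea can also be approximated at the source level, although real CFI enforcement rewrites the machine code itself, as above. The sketch below is only a rough, hypothetical emulation: the label lives in a structure next to the target, rather than in the bytes immediately before the destination code, and the type name, label constant, and checked_call helper are invented for the example.

#include <stdio.h>
#include <stdlib.h>

#define CMP_LABEL 0x12345678u   /* label shared by all valid comparison routines */

typedef struct {
  unsigned int cfi_label;       /* plays the role of the bytes before the code   */
  int (*code)(int, int);        /* the comparison routine itself                 */
} labeled_cmp;

static int lt(int x, int y) { return x < y; }
static int gt(int x, int y) { return x > y; }

static const labeled_cmp lt_entry = { CMP_LABEL, lt };
static const labeled_cmp gt_entry = { CMP_LABEL, gt };

/* An indirect call site with its CFI check: the label reached through the
   (possibly corrupted) pointer must match before the call is performed.  */
static int checked_call(const labeled_cmp* target, int x, int y)
{
  if (target->cfi_label != CMP_LABEL)
    abort();                    /* CFI violation: halt execution */
  return target->code(x, y);
}

int main(void)
{
  printf("%d %d\n", checked_call(&lt_entry, 1, 2), checked_call(&gt_entry, 2, 1));
  return 0;
}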

This style of CFI enforcement has good performance, and also gives strong guarantees. By choosing the bytes of CFI labels carefully, so they do not overlap with code, even an attacker that controls all of data memory cannot divert execution from the permitted control-flow graph, assuming that data is also non-executable.

The CFI security policy dictates that software execution must follow a path of a control-flow graph, determined ahead of time, that represents all possible valid executions of the software. This graph can be defined by analysis: source-code analysis, binary analysis, or execution profiling. This graph does not need to be perfectly accurate, but needs only be a conservative approximation of the control-flow graph possible in the software, as written in its high-level programming language. To be conservative, the graph must err on the side of allowing all valid executions of the software, even if this may entail allowing some invalid executions as well. For instance, the graph might conservatively permit the start of a few-too-many functions as the valid destinations of a source instruction where a function pointer is invoked.

Defense 3: Overhead, Limitations, Variants, and Counterattacks

CFI enforcement incurs only modest overhead. With the CFI enforcement mechanism in [30.26], which instruments x86 machine code much as is shown in Fig. 30.20, the reported code-size increase is around 8%, and execution slowdown ranges from 0% to 45% on a set of processor benchmarks, with a mean of 16%. Even so, this overhead is significant enough that CFI enforcement has, to date, seen only limited adoption. However, a form of CFI is enforced by the Windows SafeSEH mechanism, which limits dispatching of exceptions to a set of statically declared exception handlers; this mechanism does not incur measurable overheads.

CFI enforcement offers no protection against Attack 4 or other data-only attacks. However, CFI can be a highly effective defense against all attacks based on controlling machine-code execution, including Attacks 1, 2, and 3.

In particular, CFI enforcement is likely to prevent all variants of Attack 3, i.e., jump-to-libc attacks that employ trampolines or opportunistic executable byte sequences such as those found embedded within machine-code instructions. This is the case even if CFI enforces only a coarse-grained approximation of the software control-flow graph, such as allowing function-pointer calls to the start of any function with the same argument types, and allowing functions to return to any of their possible call sites [30.26].

CFI enforcement mechanisms vary both in their mechanisms and in their policy. Some mechanisms establish the validity of each computed control transfer by querying a separate, static data structure, which can be a hash table, a bit vector, or a structure similar to multi-level page tables [30.28]. Other mechanisms execute the software in a fast machine-code interpreter that enforces CFI on control flow [30.29]. Finally, a coarse-grained form of CFI can be enforced by making all computed control-flow destinations be aligned on multi-word boundaries. (However, in this last case, any “basic block” is effectively a valid destination, so trampolines and elaborate jump-to-libc attacks are still feasible.) The complexity and overheads of these CFI mechanisms vary, but are typically greater than those of the CFI label mechanism described above.

In a system with CFI enforcement, any exploit that does not involve controlling machine-code execution is a likely counterattack; this includes not only data-only attacks, such as Attack 4, but also other, higher-level attacks, such as social engineering and flaws in programming interfaces [30.30]. In addition, depending on the granularity of the CFI enforcement policy, and how it is used in combination with other defenses, there may still exist possibilities for certain jump-to-libc attacks, for instance where a function is made to return to a dynamically incorrect, but statically possible, call site.

30.3.4 Defense 4: Randomizing the Layout of Code and Data in Memory

The C and C++ languages specify neither where code is located in memory, nor the location of variables, arrays, structures, or objects. For software compiled from these languages, the layout of code and data in memory is decided by the compiler and execution environment. This layout directly determines all concrete addresses used during execution; attacks, including all of the attacks in Sect. 30.2, typically depend on these concrete addresses.

Therefore, a simple, pervasive form of address “encryption” can be achieved by shuffling, or randomizing, the layout of software in the memory address space, in a manner that is unknown to the attacker. Defenses based on such address-space layout randomization (ASLR) can be a highly practical, effective barrier against low-level attacks. Such defenses were first implemented in the PaX project [30.31] and have recently been deployed in Windows Vista [30.23, 32].

ASLR defenses can be used to change the addresses of all code, global variables, stack variables, arrays, structures, objects, and heap allocations; with ASLR those addresses are derived from a random value, chosen for the software being executed and the system on which it executes. These addresses, and the memory-layout shuffling, may be public information on the system where the software executes. However, low-level software attacks, including most worms, viruses, adware, spyware, and malware, are often performed by remote attackers that have no existing means of running code on their target system, or of otherwise inspecting the addresses utilized on that system. To overcome ASLR defenses, such attackers will have to craft attacks that do not depend on addresses, or somehow guess or learn those addresses.

ASLR is not intended to defend against attackers that are able to control the software execution, even to a very small degree. Like many other defenses that rely on secrets, ASLR is easily circumvented by an attacker that can read the software’s memory. Once an attacker is able to execute even the smallest amount of code of their choice (e.g., in a jump-to-libc attack), it should be safely assumed that the attacker can read memory and, in particular, that ASLR is no longer an obstacle. Fortunately, ASLR and the other defenses in this chapter can be highly effective in preventing attackers from successfully executing even a single machine-code instruction of their choice.

As a concrete example of ASLR, Fig. 30.21 shows two execution stacks for the median function of Fig. 30.9, taken from two executions of that function on Windows Vista, which implements ASLR defenses. These stacks contain code addresses, including a function pointer and return address; they also include addresses in data pointers that point into the stack, and in the data argument which points into the heap. All of these addresses are different in the two executions; only the integer inputs remain the same.

On many software platforms, ASLR can be applied automatically, in a manner that is compatible even with legacy software. In particular, ASLR changes only the concrete values of addresses, not how those addresses are encoded in pointers; this makes ASLR compatible with common, legacy programming practices that depend on the encoding of addresses.

However, ASLR is both easier to implement, and is more compatible with legacy software, when data and code is shuffled at a rather coarse granularity.

stack one stack two


address contents address contents
0x0022feac 0x008a13e0 0x0013f750 0x00b113e0 ; cmp argument
0x0022fea8 0x00000001 0x0013f74c 0x00000001 ; len argument
0x0022fea4 0x00a91147 0x0013f748 0x00191147 ; data argument
0x0022fea0 0x008a1528 0x0013f744 0x00b11528 ; return address
0x0022fe9c 0x0022fec8 0x0013f740 0x0013f76c ; saved base pointer
0x0022fe98 0x00000000 0x0013f73c 0x00000000 ; tmp final four bytes
0x0022fe94 0x00000000 0x0013f738 0x00000000 ; tmp continues
0x0022fe90 0x00000000 0x0013f734 0x00000000 ; tmp continues
0x0022fe8c 0x00000000 0x0013f730 0x00000000 ; tmp continues
0x0022fe88 0x00000000 0x0013f72c 0x00000000 ; tmp continues
0x0022fe84 0x00000000 0x0013f728 0x00000000 ; tmp continues
0x0022fe80 0x00000000 0x0013f724 0x00000000 ; tmp continues
0x0022fe7c 0x00000000 0x0013f720 0x00000000 ; tmp buffer starts
0x0022fe78 0x00000004 0x0013f71c 0x00000004 ; memcpy length argument
0x0022fe74 0x00a91147 0x0013f718 0x00191147 ; memcpy source argument
0x0022fe70 0x0022fe8c 0x0013f714 0x0013f730 ; memcpy destination arg.

Fig. 30.21 The addresses and contents of the stacks of two different executions of the same software, given the same
input. The software is the median function of Fig. 30.9, the input is an array of the single integer zero, and the stacks
are snapshots taken at the same point as in Fig. 30.12. The snapshots are taken from two executions of that function on
Windows Vista, with a system restart between the executions. As a result of ASLR defenses, only the input data remains
the same in the two executions. All addresses are different; even so, some address bits remain the same since, for efficiency
and compatibility with existing software, ASLR is applied only at a coarse granularity
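The effect shown in Fig. 30.21 can be observed directly with a toy program such as the sketch below (not taken from this chapter), which prints the address of a code location, a stack variable, and a heap allocation. On a system with ASLR the printed values change from run to run; for the program’s own code this typically also requires the executable to be built as position-independent or, on Windows, linked with /DYNAMICBASE.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int on_stack = 0;
  void* on_heap = malloc(16);

  /* With ASLR, each of these addresses differs from one run to the next. */
  printf("code:  %p\n", (void*)&main);
  printf("stack: %p\n", (void*)&on_stack);
  printf("heap:  %p\n", on_heap);

  free(on_heap);
  return 0;
}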

For instance, software may simultaneously use more than a million heap allocations; however, on a 32-bit system, if an ASLR mechanism randomly spread those allocations uniformly throughout the address space, then only small contiguous memory regions would remain free. Then, if that software tried to allocate an array whose size is a few tens of kilobytes, that allocation would most likely fail, even though, without this ASLR mechanism, it might certainly have succeeded. On the other hand, without causing incompatibility with legacy software, an ASLR mechanism could change the base address of all heap allocations, and otherwise leave the heap implementation unchanged. (This also avoids triggering latent bugs, such as the software’s continued use of heap memory after deallocation, which are another potential source of incompatibility.)

In the implementation of ASLR on Windows Vista, the compilers and the execution environment have been modified to avoid obstacles faced by other implementations, such as those in the PaX project [30.31]. In particular, the software executables and libraries of all operating system components and utilities have been compiled with information that allows their relocation in memory at load time. When the operating system starts, the system libraries are located sequentially in memory, in the order they are needed, at a starting point chosen randomly from 256 possibilities; thus a jump-to-libc attack that targets the concrete address of a library function will have less than a 0.5% chance of succeeding. This randomization of system libraries applies to all software that executes on the Vista operating system; the next time the system restarts, the libraries are located from a new random starting point.

When a Windows Vista process is launched, several other addresses are chosen randomly for that process instance, if the main executable opts in to ASLR defenses. For instance, the base of the initial heap is chosen from 32 possibilities. The stacks of process threads are randomized further: the stack base is chosen from 32 possibilities, and a pad of unused memory, whose size is random, is placed on top of the stack, for a total of about 16 thousand possibilities for the address of the initial stack frame. In addition, the location of some other memory regions is also chosen randomly from 32 possibilities, including thread control data and the process environment data (which includes the table corrupted in Attack 4). For processes, the ASLR implementation chooses new random starting points each time that a process instance is launched.

An ASLR implementation could be designed to shuffle the memory layout at a finer granularity than is done in Windows Vista. For instance, a pad of unused memory could be inserted within the stack frame of all (or some) functions; also, the inner memory allocation strategy of the heap could be randomized. However, in Windows Vista, such an ASLR implementation would incur greater overhead, would cause more software compatibility issues, and might be likely to thwart mostly attacks that are already covered by other deployed defenses. In particular, there can be little to gain from shuffling the system libraries independently for each process instance [30.33]; such an ASLR implementation would be certain to cause large performance and resource overheads.

Defense 4: Overhead, Limitations, Variants, and Counterattacks

The enforcement overhead of ASLR defenses will vary greatly depending on the implementation. In particular, implementations where shared libraries may be placed at different addresses in different processes will incur greater overhead and consume more memory resources.

However, in its Windows Vista implementation, ASLR may actually slightly improve performance. This improvement is a result of ASLR causing library code to be placed contiguously into the address space, in the order that the code is actually used. This encourages a tight packing of frequently used page-table entries, which has performance benefits (cf. the page-table changes for non-executable data, discussed on p. 648).

ASLR can provide effective defenses against all of the attacks in Sect. 30.2 of this chapter, because it applies to the addresses of both code and data. Even so, some data-only attacks remain possible, where the attacks do not depend on concrete addresses, but rely on corrupting the contents of the data being processed by the target software.

The more serious limitation of ASLR is the small number of memory layout shuffles that are possible on commodity 32-bit hardware, especially given the coarse shuffling granularity that is required for efficiency and compatibility with existing software. As a result, ASLR creates only at most a few thousand possibilities that an attacker must consider, and any given attack will be successful against a significant (albeit small) number of target systems. The number of possible shuffles in an ASLR implementation can be greatly increased on 64-bit platforms, which are starting to be adopted. However, current 64-bit hardware is limited to 48 usable bits and can therefore offer at most a 64-thousand-fold increase in the number of shuffles possible [30.34].

Furthermore, at least on 32-bit systems, the number of possible ASLR shuffles is insufficient to provide a defense against scenarios where the attacker is able to retry their attack repeatedly, with new addresses [30.33]. Such attacks are realistic. For example, because a failed attack did not crash the software in the case of the recent ANI vulnerability in Windows [30.13], an attack, such as a script in a malicious Web page, could try multiple addresses until a successful exploit was found. However, in the normal case, when failed attacks crash the target software, attacks based on retrying can be mitigated by limiting the number of times the software is restarted. In the ASLR implementation in Windows Vista, such limits are in place for many system components.

ASLR defenses provide one form of software diversity, which has long been known to provide security benefits. One way to achieve software diversity is to deploy multiple, different implementations of the same functionality. However, this approach is costly and may offer limited benefits: its total cost is proportional to the number of implementations, and programmers are known to make the same mistakes when implementing the same functionality [30.35].

ASLR has a few counterattacks other than the data-only, content-based attacks, and the persistent guessing of an attacker, which are both discussed above. In particular, an otherwise harmless information-disclosure vulnerability may allow an attacker to learn how addresses are shuffled, and circumvent ASLR defenses. Although unlikely, such a vulnerability may be present because of a format-string bug, or because the contents of uninitialized memory are sent on the network when that memory contains residual addresses.

Another type of counterattack to ASLR defenses is based on overwriting only the low-order bits of addresses, which are predictable because ASLR is applied at a coarse granularity. Such overwrites are sometimes possible through buffer overflows on little-endian architectures, such as the x86. For example, in Fig. 30.21, if there were useful trampoline machine code to be found seven bytes into the cmp function, then changing the least-significant byte of the cmp address on the stack from 0xe0 to 0xe7 would cause that code to be invoked. An attacker that succeeded in such corruption might well be able to perform a jump-to-libc attack much like that in Attack 3.
(However, for this particular stack, the attacker would not succeed, since the cmp address will always be overwritten completely when the vulnerability in the median function in Fig. 30.9 is exploited.)

Despite the above counterattacks, ASLR is an effective barrier to attack, especially when combined with the defenses described previously in this section.

30.4 Summary and Discussion

The distinguishing characteristic of low-level software attacks is that they are dependent on the low-level details of the software’s executable representation and its execution environment. As a result, defenses against such attacks can be based on changing those details in ways that are compatible with the software’s specification in a higher-level programming language.

As in Defense 1, integrity bits can be added to the low-level representation of state, to make attacks more likely to be detected, and stopped. As in Defenses 2 and 3, the low-level representation can be augmented with a conservative model of behavior and with runtime checks that ensure execution conforms to that model. Finally, as in Defenses 1 and 4, the low-level representation can be encoded with a secret that the attacker must guess, or otherwise learn, in order to craft functional attacks.

However, defenses like those in this chapter fall far short of a guarantee that the software exhibits only the low-level behavior that is possible in the software’s higher-level specification. Such guarantees are hard to come by. For languages like C and C++, there are efforts to build certifying compilers that can provide such guarantees, for correct software [30.36, 37]. Unfortunately, even these compilers offer few, or no, guarantees in the presence of bugs, such as buffer-overflow vulnerabilities. Some compiler techniques, such as bounds checking, can reduce or eliminate the problem of buffer-overflow vulnerabilities. However, due to the existence of programmer-manipulated pointers, applying such checks to C is a hard problem. As a result, this type of checking comes at a hefty cost to performance, lacks scalability, or results in code incompatibility [30.38]. While recent advances have been made with respect to performance and compatibility, these newer approaches still suffer from scalability problems [30.39], or achieve higher performance by being less accurate [30.40]. These problems are the main reasons that this type of checking has not made it into mainstream operating systems and compilers.

Many of the bugs can also be eliminated by using other, advanced compiler techniques, like those used in the Cyclone [30.41], CCured [30.42], and Deputy [30.43] systems. But these techniques are not widely applicable: they require pervasive source-code changes, runtime memory-management support, restrictions on concurrency, and result in significant enforcement overhead.

In comparison, the defenses in this chapter have very low overheads, require no source code changes but at most re-compilation, and are widely applicable to legacy software written in C, C++, and similar languages. For instance, they have been applied pervasively to recent Microsoft software, including all the components of the Windows Vista operating system. As in that case, these defenses are best used as one part of a comprehensive software-engineering methodology designed to reduce security vulnerabilities.

Table 30.1 A table of the relationship between the attacks and defenses in this chapter. None of the defenses completely
prevent the attacks, in all of their variants. The first defense applies only to the stack, and is not an obstacle to the heap-
based Attack 2. Defenses 2 and 3 apply only to the control flow of machine-code execution, and do not prevent the
data-only Attack 4. When combined with each other, the defenses are stronger than when they are applied in isolation

                                          Return address    Heap function      Jump-to-libc     Non-control
                                          corruption (A1)   pointer            (A3)             data (A4)
                                                            corruption (A2)

Stack canary (D1)                         Partial defense                      Partial defense  Partial defense
Non-executable data (D2)                  Partial defense   Partial defense    Partial defense
Control-flow integrity (D3)               Partial defense   Partial defense    Partial defense
Address space layout randomization (D4)   Partial defense   Partial defense    Partial defense  Partial defense
Such a methodology should include, at least, threat analysis, design and code reviews for security, security testing, automatic analysis for vulnerabilities, and the rewriting of software to use safer languages, interfaces, and programming practices [30.1].

The combination of the defenses in this chapter forms a substantial, effective barrier to all low-level attacks; although, as summarized in Table 30.1, each offers only partial protection against certain attacks. In particular, they greatly reduce the likelihood that an attacker can exploit a low-level security vulnerability for purposes other than a denial-of-service attack. The adoption of these countermeasures, along with continuing research in this area, which further improves the protection offered by such countermeasures, and with improved programming practices, which aim to eliminate buffer overflows and other underlying security vulnerabilities, offers some hope that, for C and C++ software, low-level software security may become less of a concern in the future.

Acknowledgements Thanks to Martín Abadi for suggesting the structure of the original tutorial, and to Yinglian Xie for proofreading and for suggesting useful improvements to the exposition.

References

30.1. M. Howard, S. Lipner: The Security Development Lifecycle (Microsoft Press, Redmond, Washington 2006)
30.2. E.H. Spafford: The Internet worm program: An analysis, SIGCOMM Comput. Commun. Rev. 19(1), 17–57 (1989)
30.3. Intel Corporation: Intel IA-32 Architecture, Software Developer’s Manual, Volumes 1–3, available at http://developer.intel.com/design/Pentium/documentation.htm (2007)
30.4. C. Cowan, M. Barringer, S. Beattie, G. Kroah-Hartman, M. Frantzen, J. Lokier: FormatGuard: Automatic protection from printf format string vulnerabilities, Proc. 10th USENIX Security Symp. (2001) pp. 191–200
30.5. D. Brumley, T. Chiueh, R. Johnson, H. Lin, D. Song: Efficient and accurate detection of integer-based attacks, Proc. 14th Annual Network and Distributed System Security Symp. (NDSS’07) (2007)
30.6. J. Pincus, B. Baker: Beyond stack smashing: recent advances in exploiting buffer overruns, IEEE Secur. Privacy 2(4), 20–27 (2004)
30.7. M. Bailey, E. Cooke, F. Jahanian, D. Watson, J. Nazario: The blaster worm: Then and now, IEEE Secur. Privacy 03(4), 26–31 (2005)
30.8. J.C. Foster: Metasploit Toolkit for Penetration Testing, Exploit Development, and Vulnerability Research (Syngress Publishing, Burlington, MA 2007)
30.9. klog: The Frame Pointer Overwrite, Phrack 55 (1999)
30.10. D. Litchfield: Defeating the stack buffer overflow prevention mechanism of Microsoft Windows 2003 Server, available at http://www.nextgenss.com/papers/defeating-wk-stack-protection.pdf (2003)
30.11. rix: Smashing C++ VPTRs, Phrack 56 (2000)
30.12. H. Shacham: The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86), Proc. 14th ACM Conf. on Computer and Communications Security (CCS’07) (2007) pp. 552–561
30.13. M. Howard: Lessons learned from the Animated Cursor Security Bug, available at http://blogs.msdn.com/sdl/archive////lessons-learned-from-the-animated-cursor-security-bug.aspx (2007)
30.14. S. Chen, J. Xu, E.C. Sezer, P. Gauriar, R. Iyer: Non-control-data attacks are realistic threats, Proc. 14th USENIX Security Symp. (2005) pp. 177–192
30.15. E. Florio: GDIPLUS VULN – MS04-028 – CRASH TEST JPEG, full-disclosure at lists.netsys.com (2004)
30.16. G.S. Kc, A.D. Keromytis, V. Prevelakis: Countering code-injection attacks with instruction-set randomization, Proc. 10th ACM Conf. on Computer and Communications Security (CCS’03) (2003) pp. 272–280
30.17. M. Castro, M. Costa, T. Harris: Securing software by enforcing data-flow integrity, Proc. 7th Symp. on Operating Systems Design and Implementation (OSDI’06) (2006) pp. 147–160
30.18. J. Newsome, D. Song: Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software, Proc. 12th Annual Network and Distributed System Security Symp. (NDSS’07) (2005)
30.19. Y. Younan, W. Joosen, F. Piessens: Code injection in C and C++: a survey of vulnerabilities and countermeasures, Technical Report CW386 (Departement Computerwetenschappen, Katholieke Universiteit Leuven, 2004)
30.20. Y. Younan: Efficient countermeasures for software vulnerabilities due to memory management errors, Ph.D. Thesis (2008)
30.21. C. Cowan, C. Pu, D. Maier, H. Hinton, J. Walpole, P. Bakke, S. Beattie, A. Grier, P. Wagle, Q. Zhang: StackGuard: automatic adaptive detection and prevention of buffer-overflow attacks, Proc. 7th USENIX Security Symp. (1998) pp. 63–78
30.22. B. Bray: Compiler security checks in depth, available at http://msdn.microsoft.com/en-us/library/aa(vs.).aspx (2002)
30.23. M. Howard, M. Thomlinson: Windows Vista ISV Security, available at http://msdn.microsoft.com/en-us/library/bb.aspx (2007)

30.24. H. Etoh, K. Yoda: ProPolice: improved stack smashing attack detection, Trans. Inform. Process. Soc. Japan 43(12), 4034–4041 (2002)
30.25. M. Howard: Hardening stack-based buffer overrun detection in VC++ 2005 SP1, available at http://blogs.msdn.com/michael_howard/archive////hardening-stack-based-buffer-overrun-detection-in-vc-2005-sp1.aspx (2007)
30.26. M. Abadi, M. Budiu, Ú. Erlingsson, J. Ligatti: Control-flow integrity, Proc. 12th ACM Conf. on Computer and Communications Security (CCS’05) (2005) pp. 340–353
30.27. M. Abadi, M. Budiu, Ú. Erlingsson, J. Ligatti: A theory of secure control flow, Proc. 7th Int. Conf. on Formal Engineering Methods (ICFEM’05) (2005) pp. 111–124
30.28. C. Small: A tool for constructing safe extensible C++ systems, Proc. 3rd Conf. on Object-Oriented Technologies and Systems (COOTS’97) (1997)
30.29. V. Kiriansky, D. Bruening, S. Amarasinghe: Secure execution via program shepherding, Proc. 11th USENIX Security Symp. (2002) pp. 191–206
30.30. R.J. Anderson: Security Engineering: A Guide to Building Dependable Distributed Systems (John Wiley and Sons, New York, 2001)
30.31. PaX Project: The PaX Project, http://pax.grsecurity.net/ (2004)
30.32. M. Howard: Alleged bugs in Windows Vista’s ASLR implementation, available at http://blogs.msdn.com/michael_howard/archive////Alleged-Bugs-in-Windows-Vista__s-ASLR-Implementation.aspx (2006)
30.33. H. Shacham, M. Page, B. Pfaff, E-J. Goh, N. Modadugu, D. Boneh: On the effectiveness of address-space randomization, Proc. 11th ACM Conf. on Computer and Communications Security (CCS’04) (2004) pp. 298–307
30.34. Wikipedia: x86-64, http://en.wikipedia.org/wiki/X- (2007)
30.35. B. Littlewood, P. Popov, L. Strigini: Modeling software design diversity: A review, ACM Comput. Surv. 33(2), 177–208 (2001)
30.36. S. Blazy, Z. Dargaye, X. Leroy: Formal verification of a C compiler front-end, Proc. 14th Int. Symp. on Formal Methods (FM’06), Vol. 4085 (2006) pp. 460–475
30.37. X. Leroy: Formal certification of a compiler back-end, or: programming a compiler with a proof assistant, Proc. 33rd Symp. on Principles of Programming Languages (POPL’06) (2006) pp. 42–54
30.38. R. Jones, P. Kelly: Backwards-compatible bounds checking for arrays and pointers in C programs, Proc. 3rd Int. Workshop on Automatic Debugging (1997) pp. 13–26
30.39. D. Dhurjati, V. Adve: Backwards-compatible array bounds checking for C with very low overhead, Proc. 28th Int. Conf. on Software Engineering (ICSE’06) (2006) pp. 162–171
30.40. P. Akritidis, C. Cadar, C. Raiciu, M. Costa, M. Castro: Preventing memory error exploits with WIT, Proc. 2008 IEEE Symp. on Security and Privacy (2008) pp. 263–277
30.41. T. Jim, G. Morrisett, D. Grossman, M. Hicks, J. Cheney, Y. Wang: Cyclone: a safe dialect of C, USENIX Annual Technical Conf. (2002) pp. 275–288
30.42. G.C. Necula, S. McPeak, W. Weimer: CCured: Type-safe retrofitting of legacy code, Proc. 29th ACM Symp. on Principles of Programming Languages (POPL’02) (2002) pp. 128–139
30.43. F. Zhou, J. Condit, Z. Anderson, I. Bagrak, R. Ennals, M. Harren, G.C. Necula, E. Brewer: SafeDrive: Safe and recoverable extensions using language-based techniques, Proc. 7th USENIX Symp. on Operating Systems Design and Implementation (OSDI’06) (2006) pp. 45–60

The Authors

Úlfar Erlingsson is a researcher at Microsoft Research, Silicon Valley, where he joined in 2003.
Since 2008 he has also had a joint position as an Associate Professor at Reykjavik University,
Iceland. He holds an MSc from Rensselaer Polytechnic Institute, and a PhD from Cornell
University, both in Computer Science. His research interests center on software security and
practical aspects of computer systems, in particular distributed systems. Much of his recent
work has focused on the security of low-level software, such as hardware device drivers.

Úlfar Erlingsson
School of Computer Science
Reykjavík University
Kringlan 1
103 Reykjavík, Iceland
ulfar@ru.is

Yves Younan received a Master in Computer Science from the Vrije Universiteit Brussel
(Free University of Brussels) in 2003 and a PhD in Engineering Computer Science from
the Katholieke Universiteit Leuven in 2008. His PhD focussed on efficient countermeasures
against code injection attacks on programs written in C and C++. He is currently a post-
doctoral researcher at the DistriNet research group, at the Katholieke Universiteit Leuven,
where he continues the research in the area of systems security that was started in his PhD.

Yves Younan
Department of Computer Science
Katholieke Universiteit Leuven
Celestijnenlaan 200A, Leuven 3001
Belgium
Yves.Younan@cs.kuleuven.be

Frank Piessens is a Professor in the Department of Computer Science at the Katholieke Uni-
versiteit Leuven, Belgium. His research field is software security. His research focuses on the
development of high-assurance techniques to deal with implementation-level software vul-
nerabilities and bugs, including techniques such as software verification and run-time moni-
toring.

Frank Piessens
Department of Computer Science
Katholieke Universiteit Leuven
Celestijnenlaan 200A, Leuven 3001
Belgium
Frank.Piessens@cs.kuleuven.be
