Nothing Special   »   [go: up one dir, main page]

p1138 Cohen

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

Forensic Analysis of Windows User space


Applications through Heap allocations.

Michael Cohen
Google Inc.,
Brandschenkestrasse 110, Zurich, Switzerland.
Email: scudette@google.com.

Abstract—Memory analysis is now used routinely for incident be able to recognize repeating patterns of object reuse by the
response and forensic applications. Current memory analysis application.
techniques are very effective in finding kernel artifacts of signif-
icance to the forensic investigator. However, the analysis of user Our contributions in this paper fall into a number of areas.
space applications has not received enough attention so far. We First we have written specialized code to be able to acquire and
identify the lack of pagefile support in analysis and acquisition utilize the pagefile in the analysis of the Windows operating
as a major hurdle in the analysis of user space applications. We system. We find that the analysis of the pagefile is crucial to re-
present a set of patches to the Rekall Memory Forensic platform covering sufficient data from userspace applications. Secondly
that enable the analysis of pagefiles on all operating systems. We we wrote a set of plugins that enumerates all heap allocations
then continue by studying the process heaps, and in particular the
in a Windows application, and automatically identify cross
Windows userspace heap allocator. We present a set of plugins
to enumerate heap allocations and discover internal references. allocation references.
We demonstrate that using the heap allocations as a guide, it is Using these plugins it is possible to rapidly understand the
easier to reverse engineer user space private data structure simply
inter-relationship between allocations made by an application,
by observation. Finally, we apply the heap analysis technique to
study the allocations made by the windows DNS client cache. and thus write a parser for the data of relevance without access
to the original source code. We demonstrate this technique by
analyzing a windows application of forensic significance.
I. I NTRODUCTION
Memory analysis has gained popularity in recent years as A. What is the user space heap?
a powerful and effective method for obtaining forensically
relevant information from running computer systems. Modern operating systems make a distinction between
“user space” and “kernel space”. Kernel code runs at Ring
The widespread use and availability of powerful memory 0, while user mode code runs at Ring 3. The kernel provides
analysis frameworks such as Rekall [1] or Volatility [2] has general services to user mode code via a well defined system
enabled this trend. Forensically relevant information, such as call interface [5]. One of the services a kernel may provide is
IP addresses, recent connections and running processes can memory management for the process. A process can request
assist in rapid triaging of forensic acquisition and analysis. For the kernel to augment its address space with new memory or
closed source operating systems, such as Microsoft Windows, that an external file is to be mapped into it’s own address space.
much effort has been devoted into reverse engineering core The kernel keeps data structures that describe the address
parts of the operating systems in order to identify important space layout for a particular process, termed the VAD tree
artifacts in memory, such as important kernel structures. (Virtual Address Descriptor Tree) in order to rapidly manage
However, full reverse engineering is generally time con- the process’s page tables [6].
suming and requires highly specialized skills. As a result, the
The process may request the kernel to allocate new memory
analysis of application memory has been slower to develop,
to its address space by use of the VirtualAllocEx() system
since applications change more frequently and there are more
call. This function will cause the kernel to add an additional
forensically interesting applications. While some of the more
Virtual Address Descriptor to the VAD tree and manipulate
important user space applications have received some interest
the process’s page tables to allow the new memory to be
(e.g. the Service Control Manager [3], or the Windows Com-
accessible. The kernel is only able to extend a process’s address
mand Shell [4]), there remain many user space applications
space in multiples of whole pages.
which have not been analyzed sufficiently.
This paper focuses specifically on the analysis of user space For many applications, however, memory must be allocated
applications running under the windows operating system. We in much smaller sizes from a few bytes to large allocations.
consider some of the challenges encountered in this scenario For this reason, most applications use an internal heap imple-
and in particular examine ways to make the reversing of the mented typically by a library. The heap exposes an API where
application easier through deep inspection of the user mode the application can allocate small, arbitrary sized memory
heap. blocks (We shall term these allocations in this work). For
example, the ANSII C malloc()/free() API, or the C++ new()
By enumerating all user space heap allocations one gains APIs can be used to allocate arbitrarily sized allocations
a view of the application memory with sufficient context to efficiently. The library itself requests page sized allocations

978-1-4673-7194-0/15/$31.00 ©2015 IEEE 1138


3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

from the kernel to service these smaller requests, carving up into allocation sized units. Every time the process calls the
these larger areas into internally managed chunks. ‘malloc()‘ or ‘new()‘ function, the returned memory can be
viewed as a distinct allocation. Since typically applications
There are many heap implementations. Below we list some allocate memory consistently for their required use, this gives
of the more common implementations: the analyst a strong hint as to the purpose of the allocation.
1) Doug Lee’s dlmalloc [7] is one of the oldest im- Section IV details some examples where heap analysis can be
plementations of a general purpose memory alloca- utilized to recover data from user space applications.
tor. This has been adapted in ptmalloc2 which add
multithreading support [8]. Ptmalloc2 is commonly II. T HE M ICROSOFT H EAP A LLOCATOR
used in many Unix/Linux distributions, but it is
also available in windows. For example windows In this paper we focus on the Microsoft Heap allocator
programs compiled under mingw or cygwin may use which is the most widely used alloctor in Windows. The
ptmalloc2. allocator has been studied extensively by the information
2) The tcmalloc allocator is a high performance allocator security community, and a number of detailed accounts have
written by Google and released as an open source been published (most notably [10], [11], [12]).
alternative [9]. This allocator is typically used with The main interest in the allocator is from a security
the Chrome browser and other Google originated perspective. That is, how the allocator can be exploited as a
products. result of a heap overflow bug. While in order to successfully
3) The Microsoft Visual Studio heap implementation is and reliably exploit the heap allocator, deep understanding
the default implementation linked into binaries built of the allocator operations is required, the forensic analyst is
using the Microsoft Visual Studio Development en- merely interested in enumerating heap allocations.
vironment. All windows applications receive a single
Heap created by ntdll.dll. Heap implementations maintain data structures to allow
fast and efficient operation of the heap (e.g. lookaside lists,
In practice different heap implementations strike a balance fine grained thread locks) but precise knowledge of these
between the sometimes competing needs for security, perfor- mechanisms is not required in order to simply enumerate the
mance and memory efficiency. Ultimately the heap simply heap allocations. Most heaps maintain high level structures that
requests large contiguous regions of memory from the kernel, allow simple enumeration. Even though the literature contains
and subdivides them into smaller sized chunks. very detailed explanation of the heap operation, we include a
It is important to note that the specific heap implementation simple overview in this paper to describe the basic method for
used does not depend on the operating system. It is not the enumerating heap allocations.
case that all windows programs will use the Microsoft heap im- Some existing forensic frameworks do support limited anal-
plementation. For example, the Chrome browser uses tcmalloc ysis of heap allocations. For example the Volatility memory
even on Windows. Nevertheless, the Microsoft allocator is the analysis framework supports extracting edited text from the
most commonly used by Windows applications and therefore notepad applications [2], it does not however, support the
we focus on it in this work. full heap analysis - only the backend allocator. Additionally
The analysis in this work was performed on Windows 7 the existing implementation does not support the heap entry
AMD64 with Service Pack 1 running inside a Virtual Box encoding. Therefore it can not be used for recovering small al-
emulator with 1Gb RAM. locations which are typically handled by the low fragmentation
heap, or work with newer windows versions then XP. Similarly
[13] provides an overview of the basic heap operation but it
B. Why analyse the heap allocator?
does not utilize any front end heap implementations, making
From the kernel’s point of view the application simply it impossible to locate very small allocations.
has large memory regions allocated to it in it’s address space.
To our knowledge our implementation of front end heap
Current memory analysis tools are able to dump or view this
analysis is unique in a memory forensic tool. It, of course,
kernel’s view of memory, or even search the memory for
relies heavily on the detailed descriptions presented by the
certain signatures. However, this coarse view of a process’s
seminal work in [12].
memory is not aligned with how the process itself views
its own memory, and therefore lacks the context required to The following description applies to Windows 7 SP1 heap
properly interpret the data. implementation. Older implementations use slightly different
For example, a process might allocate different objects (e.g. details. Note that the actual heaps used in a process depend on
C++ objects or C structs) on the heap, and treat those as distinct the way the process has been built. Specifically with Microsoft
entities. If we simply examine process memmory as a large Visual Studio, it is possible to make a debug build in which
contiguous amorphous stream of bytes we lose the context of case the process will create debugging heaps with a different
the individaul allocations and must detect the beginning and layout then the regular production heaps. These debugging
end of each object, perhaps by developing validity tests or heaps contain larger heap metadata structs and the allocation
signatures [4]. This increases the likelihood of errors and false layouts are different. Most production software however, is
positives. not built with the debugging configuration. Note also that on
windows if a process is running under WoW64 emulation it
By having a detailed understanding of the heap implemen- will have a 32 bit heap implementation, even when running
tation, the forensic analyst is able to subdivide process memory on a 64 bit operating system.

1139
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

Kernel View: VAD 0x350000-0x35ffff Private RW 0x250000-0x34ffff Private RW a new Segment to be allocated using another VirtualAllocEx()
Backend Allocator:
call.
Segments/Allocations
The HEAP ENTRY data structure is encoded in practice,
Segment in order to make it difficult for heap overflow exploits. Before
parsing the struct it must be decoded by XORing it with the
Frontend Allocator (LFH):
constant given at the HEAP.Encoding field. I.e. the Heap
Subsegments/Allocations Sub Segment Encoding is different and random for each heap in the process.
Same size
allocations The backend allocations can be enumerated using this
simple algorithm (On Windows 7):
Fig. 1. A typical heap.
1) The pointer EPROCESS.Peb.ProcessHeaps points at
an array of pointers to HEAP structures.
A process may contain multiple heaps, with new heaps 2) For each HEAP structure, enumerate the segments
created via the CreateHeap() API. When a process is cre- by following the HEAP.SegmentListEntry list.
ated, it is given the first initial heap (created by ntdll.dll). 3) for each HEAP SEGMENT struct,
The process’s heaps are listed in the Process Environment HEAP SEGMENT.FirstEntry is the first
Block ( EPROCESS.Peb.ProcessHeaps) array which contains HEAP ENTRY header.
pointers to a HEAP header. Sometimes different components 4) Decode the HEAP ENTRY header by XOR with the
of the same process will build different heaps for different HEAP.Encoding field.
purposes. As we see in subsequent sections (Section IV), we 5) Find the next entry by adding the allocation size
can use this property to identify related data structures, since HEAP ENTRY.Size multiplied by the heap alloca-
all data from the same component will be grouped in a single tion granularity to this entry.
logical heap.
III. T HE W INDOWS PAGEFILE
The Windows heap allocator is divided into two parts [14]:
When we initially attempted to enumerate process heaps,
1) The backend allocator is responsible for requesting memory forensic tools were unable to process the pagefile
large contiguous memory blocks from the kernel us- or prototype PTEs correctly. We found that many pages
ing VirtualAllocEx(). These blocks (called Segments) in the heaps were invalid making it difficult to follow the
are divided into smaller chunks to service small allo- heap allocations and perform user space analysis. Clearly as
cations. Note that very large allocations are serviced illustrated in Figure 1, if pages are unreadable part way through
directly with VirtualAllocEx() by the backend. a heap segment it is impossible to follow the allocation list
2) The frontend allocator is used to improve perfor- completely to the end of the heap. Thus heap enumeration
mance for small allocations. For current windows ver- becomes incomplete.
sions the only front end implementation available is
Many of the existing plugins are designed to analyze kernel
the Low Fragmentation Heap (LFH) implementation
memory. Although technically the windows kernel is also
(and that is the only one we consider in this work). A
pageable, in practice many of the current plugins which ana-
LFH is only created when the implementation detects
lyze kernel structures are not particularly affected by paging,
the application will benefit from it, hence not all
even though the effects of paging on userspace analysis are
heaps contain a front end [12].
quite profound.
The LFH allocates larger blocks from the backend allocator The windows kernel can allocate memory from one of the
(termed Subsegments) and carves these into identically sized paged or non-paged pools for its own kernel data structures.
allocations. If an allocation request can be served from an We used the Rekall pool tracker plugin [1] to examine from
existing Subsegment, the Front End Heap serves it from this which pool different kernel allocations were made in practice.
Subsegment. Allocations with each LFH Subsegment are not To our surprise we found that most significant kernel struc-
coalesced hence this reduces heap fragmentation. tures are allocated in Nonpaged pools (Allocations such as
processes, threads, TCP connections, VAD entries and many
Figure 1 illustrates an example heap for a simple appli-
more). Some kernel allocations did come from Paged pool (and
cation. The Kernel is only aware of two VAD descriptors
would benefit from our pagefile support), such as the Registry
for large contiguous regions of private process memory. The
structures (Pool tags starting with CM) and NTFS structures,
heap’s backend creates these VAD regions (called Segments)
but on the whole, most plugins are using structures from non-
using VirtualAllocEx() calls.
paged pools.
The backend then divides the segments into single allo- This explains why the pagefile was not sorely missed
cations. Each allocation is preceeded with a HEAP ENTRY when analyzing kernel structures until now. However, when
data struct header (which is 16 bytes). The header contains the analyzing userspace memory, the page file plays a much greater
size of the allocation, and therefore the beginning of the next role.
allocation can be calculated. It is therefore possible to follow
the allocation from the start of the segment to the end in order The literature documents many researchers identifying the
to enumerate all allocations in this segment. When the heap pagefile as an important source of information. For example
can not fit an allocation into an existing Segment, it requests [15] notes a vast increase in the number of available pages

1140
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

Return the hardware page


when page translation considers pages in Transition and Pro- Read PTE Value
frame with the low 12 bits of
virtual address.
totype states. References are found in the literature as early
as [16] to implementations of pagefile support in memory True Valid PTE
pte.u.Hard. PTE.u.Hard.PageFrame *
analysis tools, but these claims were never followed through Valid 0x1000 | vadd & 0xFFF

with published code. False Transition PTE


True
The windows page file was studied in the past by [15] pte.u.Proto.
False
pte.u.Trans.
False
and some heuristics were developed by observation. This early Prototype Transition

analysis was confined to 32 bit versions of Windows XP, True


and does not include the analysis of file mappings (Section pte.u.Proto.
VAD PTE Pagefile PTE
objects). To our knowledge this analysis has not been not ProtoAddress
==
True True pte.u.Soft.
PageFileHigh
False
0xFFFFFFFF0 == 0
implemented in any current open source tools, before we 000

developed our patches. [3] claims that currently the Volatility Prototype PTE Retrieve PTE from VAD PTE.u.Soft.PageFileHigh *
memory forensic framework does not support analysis of page 0x1000 | vadd & 0xFFF

files. PF number = PTE.u.Soft.


PageFileLow

Resolve Prototype PTE


[14] explains how PTEs can be parsed in order to recover (Next Figure) PTE resolved from PageFile

memory from the page file. We have developed a set of patches


to the Rekall memory forensic tool to enable the acquisition
Fig. 2. Hardware PTE resolution algorithm.
and analysis of page-files. Detailed information about these
patches are published elsewhere [17] however, in this section
we outline a short summary of the changes as they pertain
specifically to the analysis of userspace applications. be used freely by the operating system, and therefore are OS
specific by nature.
A. Virtual Address Translation
B. OS specific address translation
The Windows operating system utilizes the CPU’s pro-
tected mode. In this mode all memory accesses made by the The hardware generates a page fault while translating a
CPU are done in Virtual Addresses. The hardware’s Memory page, by calling an interrupt into the OS page fault handler,
Management Unit (MMU) uses specially configured Page passing in the faulting PTE. At this point the page fault handler
Tables in order to translate a virtual address into a physical uses a number of flags to determine which state the PTE is in.
memory address (where data can be fetched) [14]. Each Windows uses the MMPT struct to describe the PTE which
process has its own set of page tables which are used to is actually a union of all the possible states the PTE can be
reference its own unique Process Address Space. in (The correct member of the union is chosen based on the
flags).
The details of this translation process are described else-
where [18], but for our purposes we simply point out that Figure 2 shows the algorithm used for resolution of the
the resolution process retrieves a Page Table Entry (PTE) PTE passed into the page fault handler (Sometimes termed the
from the page tables for each virtual page accessed. This PTE Hardware PTE). Our Rekall patches implement this algorithm
contains information encoded in various bitfields about where in order to emulate the page-fault handler and deduce the
the physical page corresponding with the virtual address may correct physical page to use.
be located in the physical memory. In the first stage the PTE might represent one of the
following states:
Note that the address translation process occurs automati-
cally by the MMU hardware. However, if the PTE has it’s least 1) Valid PTE: If Bit 0 is set the PTE refers to a page in
significant bit (bit 0) unset then the PTE is considered Invalid physical memory.
by the hardware, which generates a Page Fault interrupt for 2) If the ProtoType bit is unset and the Transition bit is
the kernel to resolve the page. set the page is in the Transition state. Its content is
A fundamental capability of memory analysis frameworks still valid and therefore we can directly read the data
is to emulate this address translation process in order to present from the image.
an abstraction for the process virtual address (So process 3) If both the ProtoType bit and the Transition bit are
memory can be examined). Earlier frameworks would strictly unset, the PTE refers to a Software PTE (i.e. the data
perform the translation made by the MMU but if the Valid bit exists in the pagefile). The offset into the pagefile can
was 0 would consider that the page is invalid and not available. be calculated from the PageFileHigh field. Except if
This is unfortunately insufficient since in practice many pages the offset into the pagefile is 0, in this case, the VAD
in process memory are invalid (from a hardware point of view), must be consulted and the ProtoType PTE recovered
but can still be resolved using the OS page-fault handler (See from the VAD and analyzed through the second stage
Figure 4). algorithm.
4) If the ProtoType bit is set, and the ProtoAddress
We have extended the Rekall memory forensic framework field is 0xFFFFFFFF0000 then the ProtoType PTE
with an OS specific Virtual Address Translation facility which must be fetched from the VAD entry corresponding
considers the special cases where the Valid flag is unset in a with the relevant address. The ProtoType PTE is
PTE [17]. In these cases the other bits of the PTE are able to then found by using the MMVAD.FirstPrototypePte

1141
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

Return the hardware page


Read Prototype PTE Value
frame with the low 12 bits of particular page was used. However, this example demonstrates
virtual address.
that the page file is extremely important to the analysis of
True Valid PTE user-space applications.
pte.u.Hard. PTE.u.Hard.PageFrame *
Valid 0x1000 | vadd & 0xFFF
It is therefore crucial that the pagefile be acquired at the
False Transition PTE
Return Zero Page
same time as the memory. Unfortunately, the pagefile is locked
for shared access when the system is running, thus it can not
True Demand Zero PTE True ordinarily be opened for reading. Despite this, some memory
pte.u.Proto.
False
pte.u.Trans.
False pte.u.Soft.
PageFileHigh
acquisition tools can still acquire the pagefile. For example
Prototype Transition
== 0
judging by the debugging output, [19] searches for the handle
True
Pagefile PTE False for the open pagefile in the System process (Process ID 4) and
then reuses that handle to read the pagefile.
File mapping PTE
PTE.u.Soft.PageFileHigh * Alternatively, it is possible to acquire the pagefile using
_SUBSECTION object 0x1000 | vadd & 0xFFF
instantiated at the fcat tool from the Sleuthkit suite of forensic applications
pte.u.Subsect.Subsection. PF number = PTE.u.Soft.
PageFileLow [20]. This tool parses the NTFS Master File Table (MFT)
PTE resolved from PageFile
and directly extracts the disk clusters that the pagefile resides
on [21]. This approach bypasses the regular kernel file lock
restrictions and enables reading the locked file.
Fig. 3. Prototype PTE resolution algorithm.

C. Smear and pagefile acquisition.

pointer. The retrieved PTE is then processed further While experimenting with the pagefile processing patches
as a ProtoType PTE in the second stage. we found an interesting effect of smear during acquisition.
Smear in memory acquisition is not a new issue [22], [23],
If a ProtoType PTE was retrieved in the first stage, it is but is typically measured in terms of total pages changed from
analyzed again to try to resolve the page. ProtoType PTEs are an “atomic” snapshot (for example obtained though virtual
allocated from system pool and are not part of the hardware machine introspection [24], [25]).
page tables (i.e. The MMU never reads a ProtoType PTE
However, we found that not all page changes are the same.
directly), but they share the same kernel structs and add some
In fact we identified a new type of smear dubbed “Pagetable
additional PTE states. Figure 3 illustrates the algorithm for
smear”, which depends on the order of acquisition. In order to
resolving a prototype PTE. One major difference from the
study the effects of this smear we wrote a program which al-
previous algorithms is the case where the prototype bit is set
locates 800mb of memory (called swapper.exe), marking each
in a ProtoType PTE. The PTE then represents a Subsection
page with a unique sequence number. We ran the program and
Object (i.e. a File Mapping). When the page fault handler
observed that from the total 800mb allocated, the program’s
encounters a Subsection PTE it simply reads the data from the
working set was 347mb prior to acquisition (i.e. 347mb were
file into a new physical page. However, in a memory analysis
present in memory or in the Valid state).
framework we are unable to resolve this page without access
to the corresponding disk image. We then acquired memory using the WinPmem 1.6.2
tool[1]. Afterwards we copied the pagefile from the system
Additionally if the ProtoType PTE is a Software PTE (i.e.
using the fls tool. Within seconds of the acquisition starting, the
refers to the pagefile) but the offset into the pagefile is 0,
working set of swapper.exe was trimmed to 188mb - possibly
the PTE is considered a “Demand Zero PTE”. In this case the
due to the IO operations in writing the image to disk.
kernel will re-purpose a zeroed page and remap it into memory
on demand. We then used our patches for OS specific page translation
to read the known pages from our test program. To our surprise
In order to evaluate the importance of the page file in the we found, among the known marked pages, some corrupted
analysis of usermode applications, we wrote a Rekall plugin pages - even some which appear to contain kernel data (which
called vadmap, that enumerates every process PTE (from the should never appear in a userspace program).
VAD tree) and determines it’s state. As an example, we have
used a memory image of a Windows 7 system and enumerated After further analysis we realized that the order of ac-
the heaps of the svchost.exe process (This process contains quisition matters - as the page tables are usually stored at
forensically significant information which will be analyzed in low memory addresses, in this case the kernel’s DTB was at
a later section). 0x187000 (About 1.6Mb into the image), the page tables will
be copied first into the image. The PTEs for the process might
If we only consider heap owned pages, there were 3286 still indicate that some pages are in a Valid (or Transition)
pages in total, of which 457 were valid and 144 were in state. However, by the time the acquisition tool gets to copy
Transition (i.e. 601 pages were immediately usable). However those physical pages into the disk, the kernel’s working set
there were 431 pages which only exist in the pagefile and trimmer might decide to trim the process’s working set so the
2254 Zero pages (I.e. unused by the heaps). Hence, in this PTEs will be in the Software PTE or ProtoType PTE states,
case, the pagefile actually contains over 40% of the usable while the actual pages are re-purposed.
pages in the process heaps. Clearly the exact ratio of pagefile
pages to valid pages depends greatly on external factors, such This type of smear is very dangerous since the Memory
as available memory, process activity and the time since the analysis framework has no idea that the pages are in any way

1142
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

invalid. It would simply go through its normal page translation **************************************************


0xfa8002b4f6c0 svchost.exe 1148
algorithm and use the pages indicated by the original PTE, Heap 4: 0x11d0000 (BACKEND)
Backend Info:
even though these pages are now no longer part of the process
working set. Thus an out of date pagetable page has much Segment End Length Data
------------- -------------- -------- ----
larger consequences to analysis than simply an out of date . 0x11d0040
.. 0x11d0a80
0x1250000 524224
0x11d12f0 2144 00 00 00 00 00 00 00 00 ........
kernel or process memory page. 00 01 00 00 00 00 00 00 ........
.. 0x11d12f0 0x11d1500 512 00 13 1d 01 00 00 00 00 ........
00 13 1d 01 00 00 00 00 ........
Similar effects have been noted (but not experimentally .. 0x11e27f0 0x11e2820 32 73 00 73 00 6c 00 2e 00 s.s.l...
67 00 73 00 74 00 61 00 g.s.t.a.
verified) by [26]. However, in that work the effects were .. 0x11e2820 0x11e2860 48 61 00 2d 00 30 00 30 00 a.-.0.0.
30 00 31 00 2e 00 61 00 0.1...a.
attributed to smear with respect to kernel swap data structures. # This looks like a DNS name.
.. 0x11e2860 0x11e28c0 80 73 00 73 00 6c 00 2d 00 s.s.l.-.
We have determined that this is not necessarily related to swap 62 00 69 00 6e 00 67 00 b.i.n.g.
specifically, or to kernel data structures, since the same effects .. 0x11e28c0 0x11e28f0 32 40 78
20 51
24
24
01
01
00
00
00
00
00
00
00
00
@x$.....
.Q$.....
can be observed for prototype pages and memory mapped .. 0x11e28f0 0x11e2930 48 40 29
80 28
1e
87
01
02
00
00
00
00
00
00
00
00
@)......
.(......
pages. In fact any difference between the acquisition times
Heap 12: 0x5100000 (LOW_FRAG)
of the PTEs and the pages they refer to can be classified as Backend Info:

“pagetable smear”. Segment End Length Data


------------- -------------- -------- ----
. 0x5100040 0x5110000 65472
Further research is required to determine the optimal order .. 0x5100a80 0x51012e0 2128 00 00 00 00 00 00 00 00 ........
of acquisition. Perhaps page tables should be acquired after .. 0x5107fc0 0x5108000 48
00 01 00 00 00
d0 df 36 05 00
00
00
00
00
00
00
........
..6.....
the rest of main memory. Perhaps they should they be acquired .. 0x5108000 0x5110000 32752
f8 00 10 05 00
00 00 00 00 00
00
00
00
00
00
00
........
........
twice - before and after, and the differences noted. . 0x5360040 0x5460000 1048512
00 00 00 00 00 00 00 00 ........

.. 0x5360070 0x536c4d0 50256 40 2a ef 02 00 00 00 00 @*......


ff ff ff ff 00 00 00 00 ........
.. 0x536c4d0 0x536c540 96 00 00 00 00 00 00 00 00 ........
IV. U SERSPACE H EAP A NALYSIS 10 c5 36 05 00 00 00 00 ..6.....
.. 0x536e000 0x5460000 991216 00 00 00 00 00 00 00 00 ........
00 00 00 00 00 00 00 00 ........
The following is an example of applying heap analysis to Low Fragmentation Front End Information:
reversing the windows DNS Client Resolver Cache. Although Entry Alloc Length Data
-------------- ---- ---- ----
we examine a single example here, the technique has been suc- 0x536c760 96 64 00 00 00 00 00 00 00 00
a0 c7 36 05 00 00 00 00
........
..6.....
cessfully applied to a number of other user space applications. 00 00 00 00 03 00 00 00
a0 3b 1e 01 00 00 00 00
........
.;......
00 00 00 00 00 00 00 00 ........
Windows implements a common caching resolver for DNS 00 00 00 00 00 00 00 00
74 00 2e 00 75 00 72 00
........
t...u.r.
client queries. Every program which queries for a DNS resolu- 73 00 2e 00 6d 00 69 00 s...m.i.
0x536c7c0 96 64 00 00 00 00 00 00 00 00 ........
tion will cause an entry to be added to this cache for a period 00 c8 36 05 00 00 00 00 ..6.....
00 00 00 00 03 00 00 00 ........
of time. Being able to inspect the cache is valuable from a c0 51 1e 01 00 00 00 00 .Q......
forensic perspective since DNS activity on the machine is a 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
........
........
strong signal for the presence of network connections (e.g. for 63 00 72 00 6c 00 2e 00
74 00 68 00 61 00 77 00
c.r.l...
t.h.a.w.
malware Command and Control connections).
An early implementation of a plugin to extract the DNS Fig. 4. Enumerating the heap of the svchost.exe process which is running
cache was published on the Volatility issue tracker [27], the dnsrslvr.dll DNS Client Resolver Cache. The process has 12 heaps in this
however the plugin only supported some versions of windows case but only the relevant heaps to the following discussion are shown.
XP 32 bit and used scanning techniques to traverse the heap
(and hence does not work on windows 7).
Additionally we use the show referrer alloc plugin to
The file dnsrslvr.dll describes itself as “DNS Caching search for all references to a memory address from heap
Resolver Service” in its PE Version strings, and is therefore allocations. Matching allocations are then displayed.
responsible for implementing the resolver cache. It is also a
Typically, userspace applications maintain data structures
service hence it is running in svchost.exe.
using pointers to other data structures in a consistent way.
We now apply our userspace heap analysis plugins to learn Given our knowledge of allocations, we are able to identify
more about this service. The first plugin we use is inspect heap similar objects and quickly identify the patterns which hint at
to learn about the allocations in all heaps. the data structures involved.
Figure 4 shows the output of the inspect heap plugin. The So to summarise the analysis process outlined in Figure 5:
svchost.exe process hosts many services, each implemented as
a dll at the same time. Therefore there are quite a number of 1) We identify a domain name in an allocation at
heaps present. The above figure only shows an extract from 0x11e2860, using the show allocation plugin we see
the full output. In it we notice clearly domain names such as that this is a single string (ssl-bing-com.a-0001.a-
ssl.bing.com. msedge.net).
2) Now we ask “which data structures refer to this
Figure 5 demonstrates the analysis process. We use the string?” Using the show referrer alloc plugin we see
show allocation plugin to locate the allocation which contains an allocation at 0x11e2b30. We shall name the data
an address of interest. The plugin dumps the data within this structure DNS RECORD. The string reference is at
allocation. The plugin also iterates over each 8 byte integer and offset 0x08 from the start of the struct.
attempts to resolve it back to a known allocation. The overall 3) We then ask, “which data structure points to the
effect is that we can identify pointers to other allocations. DNS RECORD?” The allocation at 0x1244fc0 looks

1143
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

remarkably like a DNS RECORD too, hence we iden- # Who refers to the DNS_RECORD? This looks a lot like another DNS_RECORD.
[1] output.elf.E01 23:59:33> show_referrer_alloc 0x11e2b30
tify a singly linked list at offset 0x00 of this struct. Address 0x1244fc0 is 0 bytes into allocation of
size 56 ( 0x1244fc0 - 0x1244ff8)
In fact each DNS RECORD is storing an additional Offset Data Comment
piece of information about the record. By observation -------------- --------------------------------- --------------------------
0x1244fb0 6f 00 67 00 6c 00 65 00 o.g.l.e. _HEAP_ENTRY
we can see the type of the struct is stored at offset 0x1244fb8 34 16 3e 52 1e 57 00 18 4.>R.W.. _HEAP_ENTRY
0x1244fc0 30 2b 1e 01 00 00 00 00 0+...... 0x11e2b30(72@0x11e2b30)
0x20 of this struct - Type 0x05 refers to a CNAME 0x1244fc8 30 54 24 01 00 00 00 00 0T$..... 0x1245430(40@0x1245430)
0x1244fd0 05 00 08 00 09 30 03 00 .....0..
record and type 0x1 refers to an A record. 0x1244fd8 73 17 04 00 01 00 00 00 s.......
0x1244fe0 d0 2a 1e 01 00 00 00 00 .*...... 0x11e2ad0(88@0x11e2ad0)
4) By repeating the show referrer alloc plugin we even-
# Who refers to that one? This does not look like a DNS_RECORD at all.
tually get to a struct which does not look like a [1] output.elf.E01 23:59:48> show_referrer_alloc 0x1244fc0
DNS RECORD at offset 0x5101f60. In fact we can Address
size 88 (
0x5101f78 is 24 bytes into allocation of
0x5101f60 - 0x5101fb8)
see it is located in a different heap altogether. The Offset Data Comment
-------------- --------------------------------- --------------------------
struct seems to contains a reference to the string 0x5101f58 13 71 d2 2f 89 53 00 16 .q./.S.. _HEAP_ENTRY
0x5101f60 00 00 00 00 00 00 00 00 ........
“ssl.bing.com” at offset 0x08. 0x5101f68 90 1f 10 05 00 00 00 00 ........ +0x40(0x5101f90)
0x5101f70 00 00 00 00 03 00 00 00 ........
5) Following the referrer of this struct seems to be a 0x5101f78 c0 4f 24 01 00 00 00 00 .O$..... 0x1244fc0(56@0x1244fc0)
0x5101f80 00 00 00 00 00 00 00 00 ........
large allocation with sparse pointers to similar such 0x5101f88 00 00 00 00 00 00 00 00 ........
structs mixed with NULL pointers. This looks like 0x5101f90 73 00 73 00 6c 00 2e 00 s.s.l...
0x5101f98 62 00 69 00 6e 00 67 00 b.i.n.g.
a hash table, therefore we name the previous struct 0x5101fa0 2e 00 63 00 6f 00 6d 00 ..c.o.m.

DNS HASHTABLE ENTRY. # The referrer of this record seems to be a hash table (following all non zero
# pointers yields similar records to the above DNS_HASHTABLE_RECORD.
[1] output.elf.E01 00:00:00> show_referrer_alloc 0x5101f60
Figure 7 illustrates the relationships between all the al- Address
size 1688 (
0x5101860 is 1392 bytes into allocation of
0x51012f0 - 0x5101988)
locations diagrammatically. We used this to write a new Offset Data Comment
-------------- --------------------------------- --------------------------
plugin: First locate the relevant heap from a reference inside 0x5101840 00 00 00 00 00 00 00 00 ........
0x5101848 00 00 00 00 00 00 00 00 ........
dnsrslvr.dll, then find the hashtable (which is the largest single 0x5101850 00 00 00 00 00 00 00 00 ........
0x5101858 00 00 00 00 00 00 00 00 ........
allocation). The members are then followed and printed in 0x5101860 60 1f 10 05 00 00 00 00 ‘....... 0x5101f60(88@0x5101f60)
0x5101868 00 00 00 00 00 00 00 00 ........
order. 0x5101870 00 00 00 00 00 00 00 00 ........
0x5101878 00 00 00 00 00 00 00 00 ........
0x5101880 e0 dc 36 05 00 00 00 00 ..6..... 0x536dce0(72@0x536dce0)
0x5101888 00 00 00 00 00 00 00 00 ........
[1] output.elf.E01 23:59:04> show_allocation 0x11e2860
Address 0x11e2870 is 0 bytes into allocation of
[1] output.elf.E01 00:00:24> show_allocation 0x536c9b0
size 88 ( 0x11e2870 - 0x11e28c8)
Address 0x536c9b0 is 0 bytes into allocation of
Offset Data Comment
size 88 ( 0x536c9b0 - 0x536ca08)
-------------- --------------------------------- --------------------------
Offset Data Comment
0x11e2860 2e 00 63 00 31 00 30 00 ..c.1.0. _HEAP_ENTRY
-------------- --------------------------------- --------------------------
0x11e2868 36 16 3e 50 18 57 00 1e 6.>P.W.. _HEAP_ENTRY
0x536c9a0 63 00 6f 00 6d 00 00 00 c.o.m... _HEAP_ENTRY
0x11e2870 73 00 73 00 6c 00 2d 00 s.s.l.-.
0x536c9a8 57 df 1f 73 6b 00 00 88 W..sk... _HEAP_ENTRY
0x11e2878 62 00 69 00 6e 00 67 00 b.i.n.g.
0x536c9b0 00 00 00 00 00 00 00 00 ........
0x11e2880 2d 00 63 00 6f 00 6d 00 -.c.o.m.
0x536c9b8 e0 c9 36 05 00 00 00 00 ..6..... +0x40(0x536c9e0)
0x11e2888 2e 00 61 00 2d 00 30 00 ..a.-.0.
0x536c9c0 00 00 00 00 03 00 00 00 ........
0x11e2890 30 00 30 00 31 00 2e 00 0.0.1...
0x536c9c8 d0 5d 24 01 00 00 00 00 .]$..... 0x1245dd0(56@0x1245dd0)
0x11e2898 61 00 2d 00 6d 00 73 00 a.-.m.s.
0x536c9d0 00 00 00 00 00 00 00 00 ........
0x11e28a0 65 00 64 00 67 00 65 00 e.d.g.e.
0x536c9d8 00 00 00 00 00 00 00 00 ........
0x11e28a8 2e 00 6e 00 65 00 74 00 ..n.e.t.
0x536c9e0 63 00 6c 00 69 00 65 00 c.l.i.e.
0x11e28b0 00 00 6f 00 6f 00 6b 00 ..o.o.k.
0x536c9e8 6e 00 74 00 73 00 33 00 n.t.s.3.
0x11e28b8 2e 00 63 00 6f 00 6d 00 ..c.o.m.
0x536c9f0 2e 00 67 00 6f 00 6f 00 ..g.o.o.
0x536c9f8 67 00 6c 00 65 00 2e 00 g.l.e...
# Lets check who refers to this allocation. We will call the referring struct
0x536ca00 63 00 6f 00 6d 00 00 00 c.o.m...
# DNS_RECORD.
[1] output.elf.E01 23:59:11> show_referrer_alloc 0x11e2870
Address 0x11e2b38 is 8 bytes into allocation of
size 72 ( 0x11e2b30 - 0x11e2b78)
Offset Data Comment Fig. 6. Analysis of allocations in svchost.exe heaps.
-------------- --------------------------------- --------------------------
0x11e2b20 00 00 00 00 00 00 00 00 ........ _HEAP_ENTRY
0x11e2b28 35 16 3e 53 1a 57 00 28 5.>S.W.( _HEAP_ENTRY
0x11e2b30 c0 21 87 02 00 00 00 00 .!...... 0x28721c0(56@0x28721c0)
0x11e2b38 70 28 1e 01 00 00 00 00 p(...... 0x11e2870(88@0x11e2870)
0x11e2b40 05 00 08 00 09 30 00 00 .....0..
0x11e2b48 73 17 04 00 01 00 00 00 s.......
0x11e2b50 30 28 1e 01 00 00 00 00 0(...... 0x11e2830(56@0x11e2830)
common data structures. This type of analysis can be utilized to
quickly extract forensically significant artifacts from memory
images.
Fig. 5. Analysis of allocations in svchost.exe heaps.

We then apply this analysis to study the internals of the


windows DNS client cache, producing a plugin to enumerate
V. C ONCLUSIONS AND FUTURE WORK
all cached entries - information which is forensically signifi-
The present work examined a novel method of user space cant for the detection of malware communication channels.
program analysis using the enumerations of heap allocations.
We present patches for the Rekall memory forensic tool that
enable the analysis of the pagefile in an operating system We also document a new type of memory acquisition
specific way. We find that the pagefile is crucial for the analysis smear called “Pagetable Smear”. This occurs mainly due to the
of user space programs, more so than kernel data structures. fact that many pages may change state during the acquisition
process, between the time the page table itself was imaged,
We present a set of plugins which enumerate program and the page referenced by the page-table was imaged. This
heap allocations and identify inter-relations between these type of smear might result in completely incorrect pages being
allocations. The plugins can be used to rapidly reverse engineer considered part of the process’s working set. Further research is
many user space programs by simply examining their in- required to establish the optimal acquisition order to minimize
memory data structures and identifying some of the more this kind of smear.

1144
3rd IEEE International Workshop on Security and Forensics in Communication Systems 2015

Private Heap of DNS Cache referred from dnsrslvr.dll


[9] S. Ghemawat and P. Menage, “TCMalloc : Thread-Caching Malloc,”
http://goog-perftools.sourceforge.net/doc/tcmalloc.html, 2015, [Online;
[DNS_HASHTABLE_ENTRY]
0x00 Next <DNS_HASHTABLE_ENTRY Pointer> accessed 09-Feb-2015].
0x08 Name <UnicodeString Pointer>
0x18 Record <DNS_RECORD Pointer> [10] C. Valasek and T. Mandt, “Windows 8 heap internals,” Black Hat USA,
Hash
table DNS DNS 2012.
(Alloc size Hashtable Hashtable
1688)
Entry Entry
[11] J. McDonald and C. Valasek, “Practical windows xp/2003 heap ex-
ploitation,” Black Hat USA, 2009.
DNS [DNS_RECORD] [12] C. Valasek, “Understanding the low fragmentation heap,” 2010.
0x00 Next <DNS_RECORD Pointer>
Record
0x08 Name <UnicodeString Pointer>
[Online]. Available: http://illmatics.com/Understanding the LFH.pdf
0x10 Type [Enumeration:Type]: (CNAME=5, A=1)
0x12 DataLength [unsigned short] [13] A. White, “Identifying the unknown in user space memory,” Ph.D.
DNS
Record
0x16 Data <UnicodeString Pointer> (CNAME record) dissertation, QUT, 2013. [Online]. Available: http://eprints.qut.edu.au/
<IPAddress> (A records)
64181/1/Andrew White Thesis.pdf
[1] output.elf.E01 11:28:41> dns_cache [14] M. Russinovich, D. Solomon, and A. Ionescu, Windows Internals, ser.
Name Record Type Data
------------------------------ -------------- ------- ----
Developer Reference. Pearson Education, 2012, no. pt. 1. [Online].
dl.google.com 0x00000536c710 HTABLE Available: https://books.google.ch/books?id=w65CAwAAQBAJ
. dl.google.com 0x000002872b30 CNAME dl.l.google.com
. dl.l.google.com 0x0000011e2a30 A 173.194.44.5 [15] J. D. Kornblum, “Using every part of the buffalo in windows memory
. dl.l.google.com 0x0000011e3880 A 173.194.44.3 analysis,” Digital Investigation, vol. 4, no. 1, pp. 24–29, 2007.
www.facebook.com 0x000005101f00 HTABLE
. www.facebook.com 0x0000011e3030 CNAME star.c10r.facebook.com [16] N. L. Petroni, A. Walters, T. Fraser, and W. A. Arbaugh, “Fatkit: A
ssl.bing.com 0x000005101f60 HTABLE
. ssl.bing.com 0x000001244fc0 CNAME ssl-bing-com.a-0001.a- framework for the extraction and analysis of digital forensic data from
msedge.net volatile system memory,” Digital Investigation, vol. 3, no. 4, pp. 197–
. ssl-bing-com.a-0001.a-msedg 0x0000011e2b30 CNAME a-0001.a-msedge.net
e.net 210, 2006.
. a-0001.a-msedge.net 0x0000028721c0 A 204.79.197.200
[17] M. Cohen, “Windows Virtual Address Translation and
the Pagefile.” http://rekall-forensic.blogspot.ch/2014/10/
Fig. 7. Final DNS Cache data diagram deduced by examining the allocations windows-virtual-address-translation-and.html, 2014, [Online; accessed
in Figure 5. Below is a sample of the output of the dns cache plugin. The 09-Feb-2015].
plugin indicates the address of the hash table entry and then follows each of [18] Intel, Intel 64 and IA-32 Architectures Developer’s Manual: Vol. 3A.
the DNS RECORD entries. Intel Corporation, 2015, vol. 3A, ch. Chapter 3.
[19] GMG Systems, Inc., “KnTTools with KnTList,” KnTToolswithKnTList,
2015, [Online; accessed 09-Feb-2015].
[20] B. Carrier, “The Sleuth Kit,” http://www.sleuthkit.org/sleuthkit/desc.
R EFERENCES php, 2015, [Online; accessed 09-Feb-2015].
[1] “Rekall memory forensic framework,” http://www.rekall-forensic.com/, [21] Brian Carrier, File system forensic analysis. Addison-Wesley Reading,
2015, [Online; accessed 09-Feb-2015]. 2005, vol. 3.
[2] “Volatility,” https://github.com/volatilityfoundation/volatility, 2015, [22] S. Vömel and F. C. Freiling, “Correctness, atomicity, and integrity:
[Online; accessed 09-Feb-2015]. defining criteria for forensically-sound memory acquisition,” Digital
[3] M. H. Ligh, A. Case, J. Levy, and A. Walters, The Art of Memory Investigation, vol. 9, no. 2, pp. 125–137, 2012.
Forensics: Detecting Malware and Threats in Windows, Linux, and Mac [23] S. Vömel and J. Stüttgen, “An evaluation platform for forensic memory
Memory, 1st ed. Wiley Publishing, 2014. acquisition software,” Digital Investigation, vol. 10, pp. S30–S40, 2013.
[4] R. M. Stevens and E. Casey, “Extracting windows command line details [24] B. Schatz, “Bodysnatcher: Towards reliable volatile memory acquisition
from physical memory,” digital investigation, vol. 7, pp. S57–S63, 2010. by software,” digital investigation, vol. 4, pp. 126–134, 2007.
[5] A. S. Tanenbaum, “Modern operating systems,” 2009. [25] A. Savoldi and P. Gubian, “Blurriness in live forensics: An intro-
[6] B. Dolan-Gavitt, “The vad tree: A process-eye view of physical mem- duction,” in Advances in Information Security and Its Application.
ory,” digital investigation, vol. 4, pp. 62–64, 2007. Springer, 2009, pp. 119–126.
[7] D. Lea, “A Memory Allocator,” http://g.oswego.edu/dl/html/malloc. [26] G. G. Richard and A. Case, “In lieu of swap: Analyzing compressed
html, 2000, [Online; accessed 09-Feb-2015]. ram in mac os x and linux,” Digital Investigation, vol. 11, pp. S3–S12,
[8] N. Douglas, “ptmalloc2 homepage,” http://www.nedprod.com/ 2014.
programs/Win32/ptmalloc2/, 2013, [Online; accessed 09-Feb-2015]. [27] phatbuck...@gmail.com, “Issue 124:add plugin to dump dns resolver
cache,” https://code.google.com/p/volatility/issues/detail?id=124#c12,
2013, [Online; accessed 09-Feb-2015].

1145

You might also like