Wednesday, August 20, 2008

Auditing the System Call Table

When malicious, kernel-level code is installed on the system, one action it may take is to hook various system services. What this means is that it takes some standard piece of operating system functionality and replaces it with its own code, allowing it to alter the way all other programs use the OS. For example, it may hook functions involved in opening registry keys, and modify their output so as to hide registry keys the rootkit uses. As system calls are the primary interface between user and kernel mode, the system call table is a popular place to do such hooking.

It's worth noting that many security products also make heavy use of hooking. One common example is antivirus software; among the many functions it hooks is NtCreateProcess (used, as the name suggests, to start a new process) so that it can do its on-demand scanning of any newly launched programs. For this reason, it's not safe to assume that any hooking of system calls is malicious; in fact, some of the most suspicious-looking things initially often turn out to be security software.

Still, it may be quite useful to be able to examine the system call table of a memory image during an investigation, in order to detect any hooks that shouldn't be there. To do this, we'll first look at how system calls work in Windows and lay out the data structures that are involved. I'll then describe a Volatility plugin that examines each entry in the system call table, gives its symbolic name, and then tells what kernel module owns the function it points to. If you want to skip the learning experience and get straight to the plugin, you can download it here and place it in your memory_plugins directory. You'll also need to get my library for list walking and place it in "forensics/win32".

If you look at any of the native API functions, like ZwCreateFile, you'll notice that they all look almost identical:
lkd> u nt!ZwCreateFile
nt!ZwCreateFile:
804fd724 b825000000 mov eax,25h
804fd729 8d542404 lea edx,[esp+4]
804fd72d 9c pushfd
804fd72e 6a08 push 8
804fd730 e83cf10300 call nt!KiSystemService (8053c871)
804fd735 c22c00 ret 2Ch
We see that the function just places the value 0x25 into eax, points edx at the arguments on the stack, and calls nt!KiSystemService. It turns out that this value, 0x25, is the system call number that corresponds to NtCreateFile.

Without going into too much detail about how KiSystemService works, the function essentially takes the value in the eax register, and then looks up that entry in a global system call table. The table contains function pointers to the actual kernel-land functions that implement that system call.

But, of course, the situation isn't quite as simple as that. In fact, Windows is designed to allow third party developers to add their own system calls. To support this, each _KTHREAD contains a member named ServiceTable which is a pointer to a data structure that looks like this:
typedef struct _SERVICE_DESCRIPTOR_TABLE {
SERVICE_DESCRIPTOR_ENTRY Descriptors[4];
} SERVICE_DESCRIPTOR_TABLE;

typedef struct _SERVICE_DESCRIPTOR_ENTRY {
PVOID KiServiceTable;
PULONG CounterBaseTable;
LONG ServiceLimit;
PUCHAR ArgumentTable;
} SERVICE_DESCRIPTOR_ENTRY;
As you can see, we can actually have up to four separate system service tables per thread! In practice, however, we only see the first two entries in this array filled in: the first one points to nt!KiServiceTable, which contains the functions that deal with standard OS functionality, and the second points to win32k!W32pServiceTable, which contains the functions for the GDI subsystem (managing windows, basic graphics functions, and so on). For system call numbers below 0x1000, the first table is used, while for the range 0x1000-0x1fff the second table is consulted (this may generalize to 0x2000-0x2fff and 0x3000-0x3fff for the remaining two entries, but I haven't tested it).
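As a rough illustration of that split, here is a minimal Python sketch that breaks a system call number into a table index and an entry index. It assumes the scheme generalizes by using bits 12-13 to pick one of the four descriptor entries and the low 12 bits as the index within it; as noted above, I have only seen the first two tables in use. The function name is made up for this example.

```python
def decode_syscall_number(number):
    """Split a system call number into (table index, entry index).

    Assumption: bits 12-13 select one of the four descriptor entries
    and the low 12 bits index into that table.
    """
    table = (number >> 12) & 0x3
    entry = number & 0xFFF
    return table, entry


print(decode_syscall_number(0x25))    # NtCreateFile: table 0, entry 0x25
print(decode_syscall_number(0x1001))  # table 1 (win32k), entry 1
```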

To take a look at the contents of these two tables, we can use the dps command in WinDbg, which takes a memory address and then attempts to look up the symbolic name of each DWORD starting at that address. To examine the full table, you should pass dps the number of DWORDS you want to examine -- the exact number will be the value found in the ServiceLimit member for the table you're interested in. For example:
lkd> dps nt!KiServiceTable L11c
805011fc 80598746 nt!NtAcceptConnectPort
80501200 805e5914 nt!NtAccessCheck
80501204 805e915a nt!NtAccessCheckAndAuditAlarm
80501208 805e5946 nt!NtAccessCheckByType
[...]
8050128c 8060be48 nt!NtCreateEventPair
80501290 8056d3ca nt!NtCreateFile
80501294 8056bc5c nt!NtCreateIoCompletion
[...]
Note that NtCreateFile is the 0x25th entry in the table, as we expected. On a system with no hooks installed, all functions in nt!KiServiceTable will point into the kernel (ntoskrnl.exe), and all functions in win32k!W32pServiceTable will be inside win32k.sys. If an entry points anywhere else, that function has been hooked.
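The check itself is simple enough to sketch in a few lines of Python. This is not the plugin's code, just an illustration of the idea: given a list of table entries and a list of loaded modules as (name, base, size) tuples, report every entry whose owner isn't the module we expect. The function names, module bases, and sizes below are made up for the example.

```python
def find_owner(address, modules):
    """Return the name of the module whose range contains address, or None."""
    for name, base, size in modules:
        if base <= address < base + size:
            return name
    return None


def find_hooks(entries, modules, expected_owner):
    """Return (index, address, owner) for entries not owned by expected_owner."""
    hooks = []
    for index, address in enumerate(entries):
        owner = find_owner(address, modules)
        if owner != expected_owner:
            hooks.append((index, address, owner))
    return hooks


# Hypothetical module list; bases and sizes are illustrative only.
modules = [
    ("ntoskrnl.exe", 0x804D7000, 0x214000),
    ("wpsdrvnt.sys", 0xF8740000, 0x10000),
]
entries = [0x805862DE, 0x8056FDED, 0xF87436F0]

for index, address, owner in find_hooks(entries, modules, "ntoskrnl.exe"):
    print("Entry 0x%04x: 0x%08x owned by %s" % (index, address, owner))
```

An owner of None means the pointer doesn't fall inside any known module at all, which is usually worth a closer look.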

The plugin for Volatility, then, works as follows. First, we go over each thread in each process, and gather up all distinct pointers to service tables. We examine all of them in case one thread has had its ServiceTable changed while the others remain untouched. Then we display each entry in each (unique) table, along with the name it usually has (in an unhooked installation), and what driver the function belongs to. Here's some sample output:
$ python volatility ssdt -f xp-laptop-2005-07-04-1430.img
Gathering all referenced SSDTs from KTHREADs...
Finding appropriate address space for tables...
SSDT[0] at 804e26a8 with 284 entries
Entry 0x0000: 0x805862de (NtAcceptConnectPort) owned by ntoskrnl.exe
Entry 0x0001: 0x8056fded (NtAccessCheck) owned by ntoskrnl.exe
Entry 0x0002: 0x8058945b (NtAccessCheckAndAuditAlarm) owned by ntoskrnl.exe
[...]
Entry 0x0035: 0xf87436f0 (NtCreateThread) owned by wpsdrvnt.sys
[...]
SSDT[1] at bf997780 with 667 entries
Entry 0x1000: 0xbf93517d (NtGdiAbortDoc) owned by win32k.sys
Entry 0x1001: 0xbf946c1f (NtGdiAbortPath) owned by win32k.sys
[...]
Here we can see that the NtCreateThread function has been hooked by wpsdrvnt.sys. A little Googling shows that this driver is a part of Sygate Personal Firewall -- as mentioned before, security products are the most common non-malicious software that hooks kernel functions.

In closing, I should mention one caveat to using this tool: at the moment, the names of the system calls are hardcoded with the values derived from WinDbg on Windows XP SP2. As demonstrated by the Metasploit System Call Table page, the order and number of entries in the system call table change between different versions of Windows, so make sure that you only analyze SP2 images with this plugin! As always, patches are welcome if you want to adapt this to deal with other versions of Windows.

Now go forth, and catch those rootkits!

Sunday, August 17, 2008

Introducing Volshell

This one's for all the command line lovers out there: I'm happy to release volshell, an interactive shell built on Python and designed with memory analysis research in mind. I gave a demo of this at my OMFW talk, "Interactive Memory Exploration with Volatility"; since it was more of a live demo, I don't have slides from that, but you can find my notes here. You should be able to follow the notes as a sort of walkthrough that will get you up and running with volshell, and introduce some of the more advanced features.

Briefly, here are some of the features of volshell:
  • Shell is a full Python interpreter, so all the power of Python can be leveraged.
  • Uses Volatility 1.3 object model for easy access to data structures in memory.
  • Can use iPython for the underlying shell if available, which enables some nice features.
  • Commands modelled after WinDbg.
  • Works with any memory image format that Volatility supports (dd, crash, vmem, hibernation file)
To use it, just download volshell.py and drop it in your memory_plugins directory in Volatility 1.3. Then start the shell with:

$ python volatility volshell -f $IMAGE

Enjoy!

Saturday, August 16, 2008

Linking Processes to Users

In the course of an investigation, it may be critical to be able to link up a process that's running to a particular user account. Particularly in a multi-user environment such as Windows Terminal Server, this isn't always as easy as checking who was logged in at the time.

Luckily, each process in Windows has an associated token, a chunk of metadata that describes what Security Identifier (SID) owns the process and what privileges have been granted to it. As Larry Osterman explains, a SID is essentially a unique ID that is assigned to a user or group, and is broken into several parts: the revision (currently always set to 1), the identifier authority (describing what authority created the SID, and hence how to interpret the subauthorities), and finally a list of subauthorities.

In general, when users see SIDs (which they rarely do), they are in what's called the Security Descriptor Definition Language (SDDL) form. This is a string that looks like:
S-1-5-21-1957994488-484763869-854245398-513

Here, "1" is the revision, "5" is the identifier authority, and the remaining portions are the subauthorities. The exact data structure for a SID is:
typedef struct _SID {
BYTE Revision;
BYTE SubAuthorityCount;
SID_IDENTIFIER_AUTHORITY IdentifierAuthority;
DWORD SubAuthority[ANYSIZE_ARRAY];
} SID, *PISID;

The SID_IDENTIFIER_AUTHORITY here is actually an array of 6 characters. However, at the moment only the final character is used. Osterman's article enumerates all the possible identifier authorities; for our purposes we will be focusing on the NT authority, which is {0,0,0,0,0,5}. This is the authority which describes accounts managed by the NT security subsystem.
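To make the layout concrete, here is a small Python sketch that turns the raw bytes of a SID into the string form shown earlier. It assumes the usual in-memory encoding: the 6-byte identifier authority is read as a big-endian number, and each subauthority is a little-endian DWORD. The function name is my own.

```python
import struct


def sid_to_string(data):
    """Convert the raw bytes of a _SID into its S-R-I-S-S... string form."""
    revision, count = struct.unpack_from("<BB", data, 0)
    # Six-byte identifier authority, most significant byte first.
    authority = int.from_bytes(data[2:8], "big")
    # SubAuthorityCount little-endian DWORDs follow the authority.
    subauthorities = struct.unpack_from("<%dI" % count, data, 8)
    parts = ["S", str(revision), str(authority)]
    parts.extend(str(sub) for sub in subauthorities)
    return "-".join(parts)


raw = (bytes([1, 5]) + bytes([0, 0, 0, 0, 0, 5]) +
       struct.pack("<5I", 21, 1957994488, 484763869, 854245398, 513))
print(sid_to_string(raw))
```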

So how can we get this information from a memory image? We start, as usual, by looking at the _EPROCESS structure. Inside it, we find the Token member at offset 0xc8. However, the member doesn't directly point to an object of type _TOKEN, as we might expect. Instead, it is described as an _EX_FAST_REF. These types of objects are basically an optimization used by Windows to store both a pointer to an object and the object's reference count, all in a single DWORD. In an _EX_FAST_REF, the low 3 bits are co-opted to encode the reference count of the object. To get the actual pointer, you can mask off the low 3 bits, like so:
token_address = proc.Token.Value & ~0x7
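
If you want both halves of the value, the same masking trick works in plain Python. This is just a sketch with a made-up function name and a made-up sample value:

```python
def split_fast_ref(value):
    """Split a 32-bit _EX_FAST_REF value into (pointer, count in low 3 bits)."""
    pointer = value & ~0x7
    ref_count = value & 0x7
    return pointer, ref_count


pointer, ref_count = split_fast_ref(0xE1B2C3D5)  # illustrative value only
print(hex(pointer), ref_count)
```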

Now, on to the token itself. Each token contains a list of user and group SIDs. The relevant members of the _TOKEN structure (for our immediate purpose) are UserAndGroupCount (unsigned long) and UserAndGroups (pointer to array of _SID_AND_ATTRIBUTES), at offsets 0x4c and 0x68, respectively. _SID_AND_ATTRIBUTES, in turn, contains a pointer to the SID itself and a DWORD of flags giving the SID's attributes (the meaning of which depends on the type of SID; for group SIDs, the flags can be found in winnt.h).
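
Walking that array looks roughly like the following Python sketch. To keep it self-contained it works on a flat buffer in which an address is simply an offset (a real image needs virtual address translation, which Volatility's address spaces handle), and it assumes the 32-bit layout in which each _SID_AND_ATTRIBUTES is a 4-byte pointer followed by a 4-byte flags field. The helper names are hypothetical; the two _TOKEN offsets are the ones given above.

```python
import struct

TOKEN_USER_AND_GROUP_COUNT = 0x4C  # offset of UserAndGroupCount in _TOKEN
TOKEN_USER_AND_GROUPS = 0x68       # offset of UserAndGroups in _TOKEN
SID_AND_ATTRIBUTES_SIZE = 8        # assumed: 4-byte Sid pointer + 4-byte Attributes


def read_u32(image, address):
    """Read a little-endian DWORD; address is just an offset in this sketch."""
    return struct.unpack_from("<I", image, address)[0]


def token_sid_entries(image, token_address):
    """Return a list of (sid_pointer, attributes) pairs for a token."""
    count = read_u32(image, token_address + TOKEN_USER_AND_GROUP_COUNT)
    array = read_u32(image, token_address + TOKEN_USER_AND_GROUPS)
    entries = []
    for i in range(count):
        entry = array + i * SID_AND_ATTRIBUTES_SIZE
        entries.append((read_u32(image, entry), read_u32(image, entry + 4)))
    return entries


# Build a tiny fake "image" with one token holding two entries.
image = bytearray(0x200)
struct.pack_into("<I", image, 0x10 + TOKEN_USER_AND_GROUP_COUNT, 2)
struct.pack_into("<I", image, 0x10 + TOKEN_USER_AND_GROUPS, 0x100)
struct.pack_into("<IIII", image, 0x100, 0x180, 0x7, 0x190, 0x0)
print(token_sid_entries(image, 0x10))
```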

Unfortunately, just having the SIDs by themselves may not be so meaningful to you. Actual account names would be better; luckily, these can be found by looking in the registry. The key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\ProfileList contains all the local user account SIDs on the machine, along with the location of their profile on disk. The username can usually be inferred from this; e.g. a profile directory of %SystemDrive%\Documents and Settings\Bob would imply that the username is Bob.
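
As a quick sketch of that inference: the profile location is stored in the ProfileImagePath value under each SID's subkey, and the last path component is usually the username. This is only a heuristic (profile directories can be renamed or carry a domain suffix), and the function name is mine.

```python
import ntpath


def username_from_profile(profile_path):
    """Guess a username from the last component of a profile path."""
    return ntpath.basename(profile_path.rstrip("\\"))


print(username_from_profile(r"%SystemDrive%\Documents and Settings\Bob"))
```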

Aside from the individual account SIDs, there are also a number of well-known group SIDs. These are SIDs that Microsoft has set aside for specific purposes, and will be the same on any Windows machine. A full list is available in KB243330, "Well-known security identifiers in Windows operating systems".

As a reward for sitting through all that dry description, here's a nice juicy tool that you can use to get started looking at the SIDs associated with a process in a memory image, written as a plugin for the just-released Volatility 1.3. You can download getsids.py here. To use it, just drop it in your plugins directory (memory_plugins) and run:
$ python volatility getsids -f xp-laptop-2005-07-04-1430.img
[...]
dd.exe (3300): S-1-5-21-1957994488-484763869-854245398-1006
dd.exe (3300): S-1-5-21-1957994488-484763869-854245398-513 (Domain Users)
dd.exe (3300): S-1-1-0 (Everyone)
dd.exe (3300): S-1-5-32-544 (Administrators)
dd.exe (3300): S-1-5-32-545 (Users)
dd.exe (3300): S-1-5-4 (Interactive)
dd.exe (3300): S-1-5-11 (Authenticated Users)
dd.exe (3300): S-1-5-5-0-49673 (Logon Session)
dd.exe (3300): S-1-2-0 (Users with the ability to log in locally)

Volatility 1.3 is out!

After tons of hard work by a lot of people (including me), Volatility 1.3 has been released to the world at large. AAron has a blog post up with all the juicy details, including a list of new features. For my part, I want to take this opportunity to focus on a couple of new things that I think are really cool. They're mostly developer-focused, so if you have no interest in adding new capabilities to Volatility, you can skip the rest of this entry and just head over to AAron's post to see all the new modules and functionality.

The first feature I want to point out is the new plugin system. Basically, rather than creating a new module and then editing vmodules.py to add new commands to Volatility, you can now just create a class extending forensics.commands.command with the code you want to run, drop it into a file, and put that file into the memory_plugins directory, and Volatility will pick it up and see it as a new command that can be run. This means that anyone can just give out a single file and allow anyone to use it in Volatility with minimal effort.

In addition, it removes the need to manually integrate new modules from contributors into the source tree; instead, plugins can be developed and distributed independently, without relying on the Volatility devs at all. I'm hoping that this capability will allow a cool little cottage industry of plugin development to form around Volatility, in much the same way that users of EnCase currently trade EnScripts.

Second, I want to describe the new object model. Volatility 1.3 contains a new way of working with data structures in memory dumps. Each data structure found in vtypes.py can now be instantiated at a given memory address as a full-fledged Python object, and the data inside it can be accessed using standard Python syntax. No need to use read_obj again! For example, to print the size of a process located at address 0x823c87c0, we can do:

eprocess = Object('_EPROCESS', 0x823c87c0,
addr_space, None, profile=Profile())
print eprocess.VirtualSize


In addition, each structure can be given object-specific behaviors by subclassing the main Object class. For example, the _UNICODE_STRING type's Buffer member is a pointer to an unsigned short. By creating a specific _UNICODE_STRING class (as is done in memory_objects/Windows/xp_sp2.py), we can cause the Buffer member to be returned to the user as a Python string with the correct string data, automatically translated from Unicode.
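
To give a feel for what that translation involves, here is a standalone Python sketch. It is not the Volatility implementation: it reads a 32-bit _UNICODE_STRING (USHORT Length, USHORT MaximumLength, 4-byte Buffer pointer) out of a flat buffer in which an address is just an offset, and decodes Length bytes of UTF-16LE data. The function name and the sample contents are made up.

```python
import struct


def read_unicode_string(image, address):
    """Decode the 32-bit _UNICODE_STRING at address into a Python string."""
    # Length is in bytes, not characters; MaximumLength is unused here.
    length, maximum_length, buffer = struct.unpack_from("<HHI", image, address)
    raw = image[buffer:buffer + length]
    return bytes(raw).decode("utf-16-le")


# Fake image: a _UNICODE_STRING at 0x8 whose Buffer points at 0x20.
image = bytearray(0x40)
text = "dd.exe".encode("utf-16-le")
struct.pack_into("<HHI", image, 0x8, len(text), len(text) + 2, 0x20)
image[0x20:0x20 + len(text)] = text
print(read_unicode_string(image, 0x8))
```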

Hopefully, with these new features, developing cool stuff for Volatility will be easier than ever. I know that for myself, I've found that it's now orders of magnitude faster to go from "Wouldn't it be neat if ..." to a full, working plugin. In the coming weeks, I'll be writing some posts introducing the new development features in more depth, so that as many people as possible can get involved. Remember, the power of Volatility is in its community!

Friday, August 15, 2008

Sorry for the Hiatus!

It's been quite a while since I wrote any new blog posts. This isn't entirely because I've been lazy; rather, I've picked up and relocated to sunny (and often hot and humid) Atlanta, Georgia to start the PhD program at Georgia Tech. I'm going to be working on lots of cool stuff with the Georgia Tech Information Security Center.


Now that I'm here and starting to get settled in, you can expect the blog posts to start up again. Particularly with the forthcoming release of Volatility 1.3, I'm going to have a lot of new plugins and functionality to blog about.


As a teaser, here are some of the things I've got in the works:


  • getsids.py -- get the SID (kind of like a user ID in unix) that owns each process

  • moddump.py -- extract loaded kernel modules from memory

  • unloaded_modules.py -- list recently unloaded kernel modules

  • ssdt.py -- show the System Service Descriptor Table, along with the kernel module that owns the memory. This can be used to detect hooking, legitimate and otherwise.

  • volshell.py -- an interactive shell designed for exploration of memory images (presented at OMFW; note that this is aimed mainly at memory forensics researchers)

  • windowlist.py -- extracts a list of window handles and titles by using some reverse-engineered GDI structures in the kernel



I'm planning on accompanying these with posts describing the technical details of how they work. Also, as soon as I get the code ported to 1.3, I'll be releasing the code I wrote to extract registry information (as presented in my DFRWS paper).


I also spoke with Michael Cohen, the creator of PyFlag, at DFRWS, and it sounds like he's interested in integrating in-memory registry support into PyFlag through Volatility. This will let users access the registry data in a memory dump through the PyFlag VFS, and perform queries and correlation on the registry data. This will be, I believe, "wicked awesome" (technical term).


Hopefully this has given you a taste of things to come, and gotten you good and excited about 1.3 (the amazing features of which I'll also be writing about soon). Stay tuned!