Posts

Showing posts from August, 2015

(Sys)Call Me Maybe: Exploring Malware Syscalls with PANDA

System calls are of great interest to researchers studying malware, because they are the only way that malware can have any effect on the world – writing files to the hard drive, manipulating the registry, sending network packets, and so on all must be done by making a call into the kernel. In Windows, the system call interface is not publicly documented, but there have been lots of good reverse engineering efforts, and we now have full tables of the names of each system call ; in addition, by using the Windows debug symbols, we can figure out how many arguments each system call takes (though not yet their actual types). I recently ran 24,389 malware replays under PANDA and recorded all the system calls made, along with their arguments (just the top-level argument, without trying to descend into pointer types or dereference handle types). So for each replay, we now have a log file that looks like: 3f9b2340 NtGdiFlush 3f9b2340 NtUserGetMessage 0175feac 00000000 00000000 000000

One Weird Trick to Shrink Your PANDA Malware Logs by 84%

When I wrote about some of the lessons learned from P ANDA Malrec 's first 100 days of operation , one of the things I mentioned was that the storage requirements for the system were extremely high. In the four months since, the storage problem only got worse: as of last week, we were storing 24,000 recordings of malware, coming in at a whopping 2.4 terabytes of storage. The amount of data involved poses problems not just for our own storage but also for others wanting to make use of the recordings for research. 2.4 terabytes is a lot, especially when it's spread out over 24,000 HTTP requests. If we want our data to be useful to researchers, it would be great if we could find better ways of compressing the recording logs. As it turns out, we can! The key is to look closely at what makes up a PANDA recording: The log of non-deterministic events (the -rr-nondet.log files) The initial QEMU snapshot (the -rr-snp files) The first of these is highly redundant and actually