The Coroner’s Toolkit

When a thief breaks into your home, you’re likely to feel victimized, vulnerable, and confused. You may wonder: What was taken? Will the house ever feel safe again? What can I do to protect myself from another intrusion?
When a malcontent breaks into, or cracks, your computer, your reactions are likely to be very much the same. What was taken? What was left behind? Is the computer safe to use? How can I keep my computer safer in the future?
While the latter question is important, the former three questions weigh more heavily immediately after a break-in. The suits and the geeks want an assessment as soon as possible, especially if the compromised system held critical information or served a critical purpose.
If you (unfortunately) find yourself heading up such an investigation, reach for The Coroner’s Toolkit (TCT, http://www.porcupine.org/forensics/tct.html), a collection of programs by Dan Farmer and Wietse Venema (of Postfix fame) that performs post mortem analysis of Linux and Unix systems. TCT may not finger the perp, but it can resurrect the dead.
Whoever quipped “An ounce of prevention is worth a pound of cure” must have been a system administrator. Many system problems can best be solved by preventing them from occurring. But because not all problems can be prevented, the next best practice is preparation. Indeed, the TCT documentation says, “TCT probably won’t help you out unless you’ve already looked at it, played with it, and know what [its] tools do, as well as what to expect from them.”
So, let’s take a look at TCT to prepare you for the time, perhaps inevitable, when you need to deploy it.

Readying the Patient

Installation of TCT is easy, clean, and straightforward on any of the supported systems. According to the TCT FAQ, TCT is supported on FreeBSD 2-4.*, OpenBSD 2.*, BSD/OS 2-4.*, SunOS 4-5.*, and Linux 2.*. The test machine ran Fedora Core 1 with a “vanilla” kernel (2.4.22-1.2115.nptl) and, because Perl is also needed, Perl 5.8.1-92.
Download the source package and unpack it. Then change to the resulting TCT directory (let’s call it $TCT_HOME) and type make. If your system is supported, the compilation should finish fairly quickly, leaving you with binaries in the $TCT_HOME/bin subdirectory. Unlike other packages, TCT has no make install step; you simply run the tools from the TCT bin directory. If you’d like, you can add that directory to your PATH and even set TCT_HOME in your environment.
The most significant tools in TCT include grave-robber, mactime, unrm, and lazarus.

Using grave-robber

TCT was written according to the “Order Of Volatility” (OOV) principle, which states that one should always collect more volatile information first. So, TCT looks at process and network state before examining disks, and it collects file time stamps before accessing any files.
OOV motivates the design of grave-robber, too. grave-robber runs a suite of sub-programs to capture forensic information about a system: it captures process and network information and the state of open files, and gathers data from directories. After a break-in, the data collected by grave-robber can be used to aid analysis. (The TCT database is stored in $TCT_HOME/data.)
Because grave-robber requires privileged access in some cases, it’s best to run the tool as root.
Figure One shows the kind of output you can expect when you run grave-robber for the first time. If executed with no arguments, grave-robber starts recursing through the file system at the root directory, /.
FIGURE ONE: grave-robber collects data when run the first time
# grave-robber
Starting preprocessing paths and filenames on fermi...
Processing $PATH elements...
/usr/kerberos/sbin
/usr/kerberos/bin
/usr/local/mozilla
/usr/local/sbin
/usr/local/bin
/sbin
/bin
/usr/sbin
/usr/bin
/usr/X11R6/bin
/usr/local/java/bin
/usr/local/firefox
/usr/local/thunderbird
/util
/root/bin
/usr/local/tct-1.15/
Processing dir /usr/local/tct-1.15
Processing dir /etc
Processing dir /bin
Processing dir /sbin
Processing dir /dev
Finished preprocessing your toolkit... you may now use programs or
examine files in the above directories
During execution, grave-robber gathers an enormous amount of baseline information: memory usage and allocation; unallocated filesystem space; the current output of netstat, route, arp, and other network tools; and all process data (via ps and lsof). grave-robber also runs stat and md5 on all files, and runs strings on directories and configuration information.
All of the collected data is stored for future reference. Information about how and when the data was collected is stored in $TCT_HOME/coroner.log, and error output is stored in $TCT_HOME/error.log. The actual collected data is stored in the $TCT_HOME/data directory, organized by hostname. In each hostname directory, you’ll find a file called body, which contains the actual data in ASCII text. Each host’s directory also contains a variety of other directories with additional historical system information, including open files, directory layouts, and encrypted copies of files.
Using grave-robber, a system administrator can craft a blueprint of how a system typically runs, and can then later compare that signature to the system after a crack. A system administrator can also run grave-robber after a crack of a critical system to attenuate the effects of some rogue or kitted process in memory.
Where should you begin your analysis after a crack? The first things to investigate are file access times. TCT provides the utility mactime to review which files have been accessed, modified, or created within a specific timeframe. For example, if you think that configuration files or sensitive data files have been modified or accessed, you can use mactime to find those files quickly.
Figure Two shows a simple example. The dot (.) limits mactime to the current working directory, and the -d option tells mactime to examine that directory directly rather than relying on a body file collected earlier by grave-robber.
FIGURE TWO: mactime reveals the truth about the last modified time
$ ls -l
total 4
-rw-r--r-- 1 mfrye mfrye 800 Oct 13 18:15 garble.c
$ mactime -d . 13oct2004
Oct 13 04 18:15:00 800 ma. -rw-r--r-- mfrye mfrye ./garble.c
Jan 02 05 22:50:10 800 ..c -rw-r--r-- mfrye mfrye ./garble.c
garble.c is a source code file that, for the sake of this example, was last modified by its developer on Oct 13 04. An ls -l looks OK, but mactime reveals the true modification time as Jan 02 05. Whoever modified the file tried to hide the mischief using touch -t or something similar. While other utilities can give you similar information, once your system is compromised, you can’t trust those utilities either.
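The distinction mactime exploits can be demonstrated with standard tools, entirely outside TCT: touch -t can forge a file’s modification time (mtime), but it cannot set the inode change time (ctime), which the kernel updates itself. A minimal sketch, assuming GNU stat on Linux:

```shell
# An intruder can forge mtime with touch -t, but not ctime.
tmp=$(mktemp)
touch -t 200410131815 "$tmp"     # back-date mtime to Oct 13 2004, 18:15
mtime=$(stat -c %Y "$tmp")       # forged modification time (epoch seconds)
ctime=$(stat -c %Z "$tmp")       # inode change time; touch cannot set this
[ "$ctime" -gt "$mtime" ] && echo "ctime exposes the forged mtime"
```

Because ctime still reflects the moment the forgery took place, the two timestamps disagree, which is exactly the discrepancy mactime surfaces.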
If you’re conducting an autopsy of a system that has definitely been cracked, but you have no idea which files have been tampered with, you can run mactime -R to perform a recursive scan.
Figure Three supposes that the system has been cracked and that trojans have been implanted. Assuming that no staff accessed the system after December 24, 2004, the command mactime -d /usr/local/tct-1.15 12/23/2004-1/2/2005 reveals that unauthorized access to several files occurred on Dec 27.
FIGURE THREE: Revealing unauthorized access
$ mactime -d /usr/local/tct-1.15 12/23/2004-1/2/2005
Dec 27 04 17:37:09 4096 ..c drwxr-xr-x 1001 1001 /code/garble/Date
4096 ..c drwxr-xr-x 1001 1001 /code/garble/docs
Dec 27 04 17:37:10 484 ..c -rw-r--r-- 1001 1001 /code/garble/TODO.before-next-release
1250 ..c -rw-r--r-- 1001 1001 /code/garble/additional-resources
15146 ..c -rw-r--r-- 1001 1001 /code/garble/help
4096 ..c drwxr-xr-x 1001 1001 /code/garble/src
1742 ..c -rw-r--r-- 1001 1001 /code/garble/bibliography
5 ..c -rw-r--r-- 1001 1001 /code/garble/patchlevel
4500 ..c -rw-r--r-- 1001 1001 /code/garble/TODO
26323 ..c -rw-r--r-- 1001 1001 /code/garble/CHANGES
11954 ..c -rw-r--r-- 1001 1001 /code/garble/LICENSE
1563 ..c -rw-r--r-- 1001 1001 /code/garble/quick-start
4096 m.c drwxr-xr-x 1001 1001 /code/garble/extras
Dec 27 04 21:20:06 4096 m.c drwxr-xr-x 1001 1001 /code/garble/conf
Dec 27 04 21:20:22 4096 m.c drwxr-xr-x 1001 1001 /code/garble/etc
Dec 27 04 21:20:25 4096 m.c drwxr-xr-x 1001 1001 /code/garble/lib
4096 m.c drwxr-xr-x 1001 1001 /code/garble/bin
Each file now needs to be verified. The files should be checked using mactime without the -d option to compare the last, true modification time of each file to what’s stored in the grave-robber database (assuming you ran grave-robber previously to establish a baseline). Once you’ve determined which files have been modified, you can restore the files from backups.
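The same baseline-and-verify idea can be sketched with plain md5sum, independent of TCT’s database (the file names here are illustrative):

```shell
# Record hashes while the system is known-good, verify after a crack.
cd "$(mktemp -d)"
echo "original contents" > garble.c
md5sum garble.c > baseline.md5       # baseline, taken before the break-in
echo "trojaned contents" > garble.c  # simulate tampering
md5sum -c baseline.md5 || echo "garble.c no longer matches the baseline"
```

grave-robber records md5 checksums for you as part of its sweep; the point is that the comparison only works if the baseline was captured before the compromise.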
But what if you find that files have been deleted? That’s a job for unrm.

Out With the New, In With the Old

Reconstructing data is a two-step process with TCT. The first step is recovering the data with a utility called unrm.
Unfortunately, unrm is not as straightforward as it may sound. unrm isn’t the opposite of rm; you can’t simply run unrm /home/mfrye/important to bring back a file you’ve just deleted. However, unrm may allow you to recover deleted files, provided it has enough disk space to work and you allow it enough time to perform its processing.
Let’s look at an example of how unrm works. Assume that your Linux box has been cracked and the intruder deleted hundreds of files. Further assume that you have a separate disk drive on the system with at least 2 GB free. Figure Four shows how to use unrm to try to recover the missing files.
FIGURE FOUR: Using unrm to recover deleted files
# mkdir /code/output
# mount /dev/hdb1 /code/output
# unrm /dev/hda2 > /code/output/unrm.out
unrm reads the existing disk, /dev/hda2, and writes the recovered data to a file on the second disk, /dev/hdb1, which is mounted at /code/output. It’s advisable to use a separate physical disk for the output, because unrm works by opening the named device and copying data blocks. By default, unrm copies unallocated data blocks only, so writing to the same disk could defeat the purpose of running unrm.
If you have only one disk in the machine you’re attempting to recover, it’s best to run shutdown immediately and move the compromised hard disk to another machine for processing.
The output of unrm is pure data. Depending on the size of the hard disk and the amount of mischief inflicted, the output of unrm can be small or very large.
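As a rough rule of thumb, the unrm output file can approach the size of the source filesystem’s unallocated space, so it pays to budget destination space up front. A small sketch using GNU df (here / stands in for the compromised filesystem):

```shell
# The unrm output is roughly as large as the unallocated space on the
# source filesystem; make sure the destination disk can hold it.
avail=$(df -B1 --output=avail / | tail -1 | tr -d ' ')
echo "budget roughly $avail bytes on the destination disk"
```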

Raising the Dead

Now on to the next step of reconstructing the lost data: using the aptly-named lazarus, a Perl script that makes sense of the unrm output.
lazarus examines the first 100 bytes of each block of data in the unrm output file, looking for signs of text. If a block turns out to be binary, lazarus uses file to identify it instead. Consecutive blocks of the same type are appended together on the assumption that they belong to the same file; when the block type changes, lazarus starts a new file. Thus, lazarus splits the output of unrm into many smaller chunks.
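lazarus itself is a Perl script; purely as a conceptual illustration of that classification step (not the real code), here is a shell sketch that chops raw data into fixed-size chunks and guesses each chunk’s type with file:

```shell
# NOT lazarus itself -- just the idea: chop raw data into fixed-size
# chunks and classify each chunk by its content.
cd "$(mktemp -d)"
printf 'plain ascii text here\n%.0s' $(seq 50) > unrm.out  # fake unrm output
split -b 1024 unrm.out chunk.     # fixed-size blocks, like lazarus reads
for c in chunk.*; do
    file -b "$c"                  # lazarus falls back on file(1) for binaries
done | sort -u
```

Adjacent chunks that classify identically would then be concatenated into one recovered file, which is why lazarus works best when the original files occupied contiguous blocks.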
lazarus creates two directories. The first directory is $TCT_HOME/blocks, and it stores the recovered data that lazarus has divided by type. The second directory is $TCT_HOME/www, which contains a series of HTML files that link to data in the first directory. These pages help you analyze the data.
For example, the commands in Figure Four produced the file /code/output/unrm.out. Continuing, the commands…
$ cd /code/output
$ lazarus -h unrm.out
… produce data in $TCT_HOME/blocks, HTML in $TCT_HOME/www, and a main HTML “index” page in /code/output/unrm.out.frame.html. lazarus, like unrm, takes several minutes, if not hours, to run, depending on the amount of data to be resurrected. (For reference, a 1.6 GB unrm output file took over four hours to process on a 450 MHz Pentium with 192 MB of RAM.)
Once lazarus finishes, load up the index file in a local browser window. Figure Five shows a sample of the lazarus output. The contents of the unrm.out file are shown as a map of your disk. Letters indicate type, as shown by the legend at the top of the page.
FIGURE FIVE: The output of lazarus, resurrector of files



While this HTML output gives you a good sense of where things are located in the unrm output and can be helpful when you’re looking for files based on their (former) locality, it’s not so useful for finding specific things. However, using a few regular expressions, you can find files based on the content of the files rather than locality.
Figure Six shows the output of a grep looking for files containing the string “trilug.” The -i flag ignores case.
FIGURE SIX: Searching through recovered blocks using grep
$ cd $TCT_HOME/blocks
$ grep -i trilug *
26461.m.txt:TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
26461.m.txt:TriLUG Organizational FAQ : http://trilug.org/faq/
26461.m.txt:TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
26461.m.txt:TriLUG PGP Keyring : http://trilug.org/~chrish/trilug.asc
If grep complains Argument list too long, try searches with a little more finesse and granularity. The find command can narrow things down further. For example:
$ find . -type f -name \*.txt -exec grep \
-C5 -i mfrye@trilug.org {} \;
>
> >
> >
> > Matt
From mfrye@trilug.org Fri Apr 9 12:59:17 2004
Subject: Re: My name misspelled in trilug
From: Matthew Frye <mfrye@trilug.org>
A handy trick for searching out source code is to search for a specific #include or function that the code uses.
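For instance, if the lost program included a distinctive header, a grep for that #include pinpoints the right blocks immediately. The block file names below are invented for illustration:

```shell
# Two fake "recovered" blocks; 00123.c.txt holds the C source we want.
cd "$(mktemp -d)"
printf '#include <stdio.h>\nint main(void) { return 0; }\n' > 00123.c.txt
printf 'unrelated recovered text\n' > 00124.t.txt
# List only the blocks containing the telltale #include:
grep -l '#include <stdio.h>' *.txt
```

The -l flag prints just the matching file names, which is usually all you need to map a search hit back to a block in the lazarus HTML.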
What if you’re looking for images or sound files? While these files are virtually impossible to search for using the same methods used to find text, the browser output finds these files and identifies them without much trouble. For example, Figure Seven shows an image recovered by unrm.
FIGURE SEVEN: Navigation links and a recovered file, courtesy of lazarus



lazarus’s HTML provides useful links for navigating your recovered files, as shown at the top of Figure Seven. “previous block of same type,” “next block of same type,” “previous block,” and “next block” can be used to navigate from file to file. If you’ve determined that you’re in the general area of recovered data that you need, these links give you granular access to that locality.
This is more useful than it sounds. For example, if you are looking for a particular image from a news web site, it would be difficult to find the image because TCT reassigns numbered names to each recovered image. However, if you recall that you read a story that accompanied the image, you can grep in the blocks directory for the text of that story and the image should be found within a few blocks in the recovered data map. Using the HTML, along with refined command line searches, you should be able to locate most any file. Just remember to give yourself enough disk space and time.
Depending on how much data unrm brings back and the size of the partition you’re working with, lazarus can sometimes encounter an unrm output file greater than 2 GB in size. These days, large file support and 64-bit operating systems make such files easier to deal with. However, should you find that lazarus can’t process your unrm output, use an operating system and Perl binary that can handle large file sizes when you do your recovery work.

Just Call Yourself “Quincy”

The Coroner’s Toolkit is a powerful set of utilities for investigating and undoing damage, intentional or accidental. grave-robber, mactime, unrm, and lazarus make for a powerful combination and allow a system administrator to develop a disaster plan with the capability of recovering lost data, a veritable holy grail in system administration.
Remember, though, that TCT is relatively unpolished and that many analysis tools still need to be written. TCT can spend a lot of time collecting data and you may still end up recovering nearly nothing of value. The best idea is to practice with TCT.
To hone your computer forensics skills further, review Wietse Venema and Dan Farmer’s handouts (http://www.porcupine.org/forensics/handouts.html) from a Computer Forensics Analysis class presented in August 1999.

Matt Frye is a system administrator and currently serves as Chairman of the North Carolina System Administrators, and Director of the Center for Nonprofit Computing. You can reach Matt at mattfrye@gmail.com.
