
Storage Monitoring via SystemTap

With storage becoming increasingly complex, being able to monitor what's happening with your servers has taken on a critical role. To truly understand what is happening with your storage you may need to monitor what is happening within the kernel.

The SystemTap scripting language is fairly easy to understand if you know a little C. Below is a snippet of a .stp file taken from here.

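# Running total of sys_sync calls seen so far
global count=0

# Fires each time the kernel function sys_sync is entered
probe kernel.function("sys_sync") {
  count++
  # pid() is a tapset function, so it needs parentheses
  printf("sys_sync called %d times, currently by pid %d\n", count, pid())
}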

This simple example just counts the number of times the kernel function sys_sync is called and prints the running total to stdout, so wherever you run stap you will see the output. Alternatively, you could send the output to a file, for example with stap's -o option. In addition, notice that the script also prints the process ID (pid) of the process that called the kernel function sys_sync. This can be very useful in tracking down processes that are causing problems.

Notice that the simple script attaches its handler to kernel.function("sys_sync"), so the handler runs whenever the kernel function sys_sync is called. Generically, these SystemTap constructs are called “probe points” or “probe patterns”. There are a number of these available, and some examples are listed below in Table 1.

Table 1 – SystemTap Probe Point Examples

Probe Type                                    Description
syscall.read                                  Triggered when entering the read() system call
syscall.close.return                          Triggered when returning from the close() system call
module("floppy").function("*")                Triggered when entering any function in the floppy module
module("ext3").function("ext3_file_write")    Triggered when entering the function “ext3_file_write” in the ext3 module
kernel.function("*@net/socket.c").return      Triggered when returning from any function in the file net/socket.c (this is in the kernel)
kernel.statement("*@kernel/sched.c:2917")     Triggered when hitting line 2917 of the file kernel/sched.c in the Linux kernel
timer.jiffies(1000)                           Causes the SystemTap probe to be fired every 1000 kernel jiffies
process("/bin/ls").function("*")              Triggered when entering any function in /bin/ls (but not its libraries or syscalls)
kernel.function("*init*")                     Triggered when entering any kernel function with “init” in its name
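As a rough sketch of how two kinds of probe points from Table 1 can work together, the script below pairs a system call probe with a timer probe. The array name read_calls, the 5-second interval (written with timer.s rather than timer.jiffies), and the top-10 cutoff are illustrative choices, not something taken from the scripts referenced in this article.

```systemtap
# Sketch: count read() system calls per process name, report every 5 seconds.
# read_calls, the interval, and the top-10 limit are illustrative assumptions.
global read_calls

# Fires on entry to the read() system call
probe syscall.read {
  read_calls[execname()]++
}

# Fires every 5 seconds
probe timer.s(5) {
  printf("--- read() calls in the last 5 seconds ---\n")
  # The trailing "-" sorts by count, largest first; "limit" caps the output
  foreach (proc in read_calls- limit 10) {
    printf("%-20s %d\n", proc, read_calls[proc])
  }
  # Start the next interval from zero
  delete read_calls
}
```

Saved to a file (say, readcount.stp, a made-up name), this would be run with stap readcount.stp, which typically needs root privileges, and stopped with Ctrl-C.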

There are also a large number of elements to the scripting language, as listed below in Table 2 (reproduced from here).

Table 2 – SystemTap Language Examples

Element                          Description
if (exp) {} else {}              Standard if-then-else statement
for (exp1 ; exp2 ; exp3 ) {}     “for” loop (almost exactly like C)
while (exp) {}                   “while” loop
do {} while (exp)                Pretty typical do-while loop
break                            Exit the innermost enclosing loop
continue                         Skip to the next iteration of the enclosing loop
next                             Return immediately from the probe handler
return                           Return a value from a function (much like C)
foreach (VAR in ARRAY) {}        Array iteration
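To make a few of these elements concrete, here is a minimal sketch that uses if, return, next, and foreach in one short script. The function name activity_label, the array write_calls, and the cutoff of 100 calls are hypothetical and chosen only for illustration.

```systemtap
# Sketch: if, return, next, and foreach in one script.
# activity_label, write_calls, and the 100-call cutoff are illustrative assumptions.
global write_calls

# Functions hand a value back with "return", much as in C
function activity_label(n) {
  if (n >= 100) {
    return "busy"
  }
  return "quiet"
}

probe syscall.write {
  # "next" leaves this probe handler early; here it skips SystemTap's own I/O helper
  if (execname() == "stapio") {
    next
  }
  write_calls[execname()]++
}

# After 10 seconds, print one line per process and stop the script
probe timer.s(10) {
  foreach (proc in write_calls) {
    printf("%-20s %6d writes (%s)\n", proc, write_calls[proc], activity_label(write_calls[proc]))
  }
  exit()
}
```

Note that return appears only inside the function, while next appears only inside a probe handler.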

SystemTap Example for NFS

There are example SystemTap scripts that can be found by a simple Google search. For example, this page has a number of scripts, including one for NFS that counts the number of NFS functions used by each process (pid). This script can be useful in finding applications that are really hitting NFS very hard.

But there are also SystemTap scripts that come as a “library”. These are referred to as “Tapsets”, and you can use them in your own scripts to provide functionality that you may need. There are two Tapsets for NFS: one for the server and one for the client. The NFS server Tapset, stapprobes.nfsd, has several defined probe points for NFSv2 and NFSv3 operations, plus (through nfsd.proc.compound) NFSv4.


  • nfsd.proc.lookup: This probe is triggered whenever a client opens or searches for a file on the server.
  • nfsd.proc.read: This probe is triggered whenever a client reads data from a file on the server.
  • nfsd.proc.write: This probe is triggered whenever a client writes data to a file on the server.
  • nfsd.proc.commit: This probe is triggered whenever a client does a commit operation.
  • nfsd.proc.create: This probe is triggered whenever a client creates a file on the server.
  • nfsd.proc.remove: This probe is triggered whenever a client removes a file on the server.
  • nfsd.proc.rename: This probe is triggered whenever a client renames a file on the server.
  • nfsd.proc.compound: This probe is triggered whenever the server receives an NFSv4 operation from a client.
  • nfsd.open: This probe is triggered whenever the NFS server opens a file.
  • nfsd.read: This probe is triggered whenever the NFS server reads a file.
  • nfsd.write: This probe is triggered whenever the NFS server writes to a file.
  • nfsd.commit: This probe is triggered whenever the NFS server commits to a file.
  • nfsd.lookup: This probe is triggered whenever the NFS server opens or searches for a file.
  • nfsd.create: This probe is triggered whenever the NFS server creates a file.
  • nfsd.createv3: This probe is triggered whenever a client creates a regular file or sets file attributes on the server side.
  • nfsd.unlink: This probe is triggered whenever a client removes a file or a directory on the server side.
  • nfsd.rename: This probe is triggered whenever a client renames a file on the server side.
  • nfsd.close: This probe is triggered whenever the NFS server closes a file.
  • nfsd.dispatch: This probe is triggered whenever the NFS server receives an NFS operation from a client.
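A server-side script built on these probe points could look like the sketch below, which tallies a few of the nfsd.proc probes and prints the totals every 10 seconds. It assumes the NFS server Tapset is installed and that these probe points resolve on your kernel, which varies between kernel and SystemTap versions. The array name nfsd_ops and the reporting interval are illustrative.

```systemtap
# Sketch: tally selected NFS server operations, report every 10 seconds.
# Assumes the nfsd Tapset probes resolve on this kernel; nfsd_ops is an illustrative name.
global nfsd_ops

probe nfsd.proc.lookup { nfsd_ops["lookup"]++ }
probe nfsd.proc.read   { nfsd_ops["read"]++ }
probe nfsd.proc.write  { nfsd_ops["write"]++ }
probe nfsd.proc.commit { nfsd_ops["commit"]++ }

probe timer.s(10) {
  printf("--- NFS server operations in the last 10 seconds ---\n")
  # Busiest operation first
  foreach (op in nfsd_ops-) {
    printf("%-10s %d\n", op, nfsd_ops[op])
  }
  delete nfsd_ops
}
```

The global is deliberately not given a short name such as count or size, because the Tapset probes set their own variables inside the handler and a script global with the same name could collide with them.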

There are also probes on the NFS client side in stapprobes.nfs:


  • nfs.fop.llseek: This probe is triggered whenever an llseek operation is performed on the NFS client side.
  • nfs.fop.llseek.return: This probe is triggered whenever an llseek operation has returned (client side NFS).
  • nfs.fop.read: This probe is triggered whenever a read operation happens on the NFS client side.
  • nfs.fop.read.return: This probe is triggered whenever a read operation returns (client side NFS).
  • nfs.fop.write: This probe is triggered whenever a write operation happens on the NFS client side.
  • nfs.fop.write.return: This probe is triggered whenever a write operation returns (client side NFS).
  • nfs.fop.aio_read: This probe is triggered whenever an aio_read operation happens on the NFS client side.
  • nfs.fop.aio_read.return: This probe is triggered whenever an aio_read operation returns (client side NFS).
  • nfs.fop.aio_write: This probe is triggered whenever an aio_write operation happens on the NFS client side.
  • nfs.fop.aio_write.return: This probe is triggered whenever an aio_write operation returns (client side NFS).
  • nfs.fop.mmap: This probe is triggered whenever an mmap file operation occurs on the NFS client side.
  • nfs.fop.open: This probe is triggered whenever there is a file open operation on the NFS client side.
  • nfs.fop.flush: This probe is triggered whenever there is a flush operation on the NFS client side.
  • nfs.fop.release: This probe is triggered whenever there is a page release operation on the NFS client side.
  • nfs.fop.fsync: This probe is triggered whenever there is an fsync operation on the NFS client side.
  • nfs.fop.lock: This probe is triggered whenever there is a file lock operation on the NFS client side.
  • nfs.fop.sendfile: This probe is triggered whenever a send file operation is done on the NFS client side.
  • nfs.fop.sendfile.return: This probe is triggered whenever a send file operation returns (NFS client side).
  • nfs.fop.check_flags: This probe is triggered whenever there is a check flag operation on the NFS client.
  • nfs.aop.readpage: This probe is triggered whenever an async read operation failed on the NFS client side.
  • nfs.aop.readpages: This probe is triggered whenever several pages are read (client side NFS).
  • nfs.aop.readpages.return: This probe is triggered whenever a read of several pages returns (NFS client side).
  • nfs.aop.set_page_dirty: This probe is triggered whenever “set dirty pages” happens on the NFS client side.
  • nfs.aop.writepage: This probe is triggered whenever the client writes a mapped page to the server.
  • nfs.aop.writepages: This probe is triggered whenever several dirty pages are written from the NFS client to the NFS server.
  • nfs.aop.prepare_write: This probe is triggered whenever a client prepares a page for writing on the client side.
  • nfs.aop.commit_write: This probe is triggered whenever a client commits a page to be written.
  • nfs.aop.release_page: This probe is triggered whenever a client releases a page.
  • nfs.proc.lookup: This probe is triggered whenever the client executes a search or open operation.
  • nfs.proc.read: This probe is triggered whenever the client synchronously reads a file from the server.
  • nfs.proc.read.return: This probe is triggered whenever a synchronous file read from the server returns (this is on the client).
  • nfs.proc.write: This probe is triggered whenever a client synchronously writes a file to the NFS server.
  • nfs.proc.write.return: This probe is triggered whenever a client returns from synchronously writing a file to the NFS server.
  • nfs.proc.commit: This probe is triggered whenever a client asks the server to write to disk the buffered data that the client previously wrote asynchronously (doesn’t exist in NFSv2).
  • nfs.proc.commit.return: This probe is triggered whenever the client returns from an nfs.proc.commit operation.
  • nfs.proc.read_setup: This probe is triggered whenever a client asynchronously reads a file from the server. This function is used to set up a read RPC task, not to do a real read operation.
  • nfs.proc.read_done: This probe is triggered whenever, on the client, a read reply is received or some read error occurs (timeout or socket shutdown).
  • nfs.proc.write_setup: This probe is triggered whenever the client asynchronously writes data to the server. This function is used to set up a write RPC task, not to do a real write operation.
  • nfs.proc.write_done: This probe is triggered whenever, on a client, a write reply is received or some write error occurs (timeout or socket shutdown).
  • nfs.proc.commit_setup: This probe is triggered whenever the client asynchronously does a commit operation. This function is used to set up a commit RPC task, not to do a real commit operation.
  • nfs.proc.commit_done: This probe is triggered whenever, on the client, a commit reply is received or some commit error occurs (timeout or socket shutdown).
  • nfs.proc.open: This probe is triggered whenever an open operation is done on the NFS client side; the nfs_open function is used to allocate file read/write context information.
  • nfs.proc.release: This probe is triggered whenever the client does a release operation.
  • nfs.proc4.handle_exception: This probe is triggered whenever error handling is encountered on an NFSv4 client (only exists in NFSv4).
  • nfs.proc.remove: This probe is triggered whenever the NFS client removes a file from the server.
  • nfs.proc.rename: This probe is triggered whenever the NFS client renames a file on the server.

It’s fairly obvious there are a large number of probe points for the NFS client.
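On the client, a per-process view is often the most useful one. The sketch below counts hits on three of the nfs.fop probe points by process name and pid, to help spot applications that are hitting NFS hard. Whether each of these probe points resolves depends on the kernel and SystemTap versions in use, so treat the probe list as an assumption. The array name nfs_calls and the interval are illustrative.

```systemtap
# Sketch: which processes are making the most NFS client file operations?
# Assumes these nfs.fop probe points resolve on this kernel; nfs_calls is an illustrative name.
global nfs_calls

probe nfs.fop.open, nfs.fop.read, nfs.fop.write {
  nfs_calls[execname(), pid()]++
}

probe timer.s(10) {
  printf("--- NFS client file operations in the last 10 seconds ---\n")
  # Sorted by call count, largest first, top 10 only
  foreach ([proc, id] in nfs_calls- limit 10) {
    printf("%-20s pid %-8d %d calls\n", proc, id, nfs_calls[proc, id])
  }
  delete nfs_calls
}
```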

This is just an example of what could be achieved with SystemTap using a combination of probe points on the client and the server. One could easily construct an NFS monitoring system based on these Tapsets. Using the same approach, one could also easily envision a comprehensive set of probes that gather information about the performance and health of the storage on a server or the clients. But I haven’t found any projects that do this in a collective manner. Hmm...

Summary

I’ve been looking for a comprehensive storage management/monitoring tool for some time without any luck. I can find parts of a total solution but nothing that spans the range of what I need. For example, previously I talked about a GUI tool for using LVM with some basic file system formatting capabilities. But one of the key elements I’m looking for, monitoring, wasn’t available.

But in my research I ran across SystemTap which allows you to dig deep into the kernel to gather information. To me this sounds like a great opportunity for easily developing a storage monitoring system that gets as far into the kernel as you want or need.

Even better news about SystemTap is that there are libraries, called Tapsets, that have already been developed to address various monitoring or data gathering needs. One of these Tapsets, which usually comes with SystemTap, is focused on NFS.

The NFS Tapset has two versions – one for the server and one for the client. Both are fairly comprehensive with a very long list of probe points (I hope you didn’t miss the long bullet lists in this article). You can use these Tapsets to gather an amazing amount of information about NFS on both the server and the client.

However, SystemTap, or an equivalent capability, is just part of the picture. What needs to be developed next is a set of SystemTap scripts that intelligently gather the information and use it. You don’t want to gather so much information that you end up with gigabytes of data to process every day. But you also want to make sure you are measuring useful probe points. You could easily construct these probes to gather information only when certain thresholds are surpassed, perhaps to catch misbehaving applications or applications that are using a great deal of the I/O capability. And finally, it would be very useful to gather this data and allow users to plot it to get a time history of various performance measures.
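As a small sketch of the threshold idea, the script below counts read() and write() system calls per process and, every 10 seconds, prints only the processes that reached a threshold, each with a timestamp so the output could later be plotted as a time history. The names io_calls and call_threshold, the value of 1000 calls, and the interval are all assumptions for illustration; a real monitor would need thresholds tuned to the workload.

```systemtap
# Sketch: report only processes whose read()/write() call count reaches a threshold.
# io_calls, call_threshold, and the numbers used are illustrative assumptions.
global io_calls
global call_threshold = 1000

probe syscall.read, syscall.write {
  io_calls[execname(), pid()]++
}

probe timer.s(10) {
  # Sorted largest first, so we can stop at the first entry below the threshold
  foreach ([proc, id] in io_calls-) {
    if (io_calls[proc, id] < call_threshold) {
      break
    }
    # One line per heavy process: timestamp (seconds), name, pid, call count
    printf("%d %s %d %d\n", gettimeofday_s(), proc, id, io_calls[proc, id])
  }
  delete io_calls
}
```

Because quiet intervals print nothing, the amount of data written stays small, and each output line is easy to feed into a plotting tool.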

If anyone is looking for a project...
