IO Profiling of Applications: strace_analyzer

In the last couple of articles we have talked about using strace to help examine the IO profile of applications (including MPI applications; think HPC). But strace output can contain hundreds of thousands of lines. In this article we talk about the using a tool called strace_analyzer to help sift through the strace output.

Strace is a very useful tool for examining the IO profile of applications, as it comes standard on every Linux distro. However, as we’ll see in this article, strace can produce hundreds of thousands of lines of output. Trying to develop statistics and trends from a files of this size is virtually impossible to do by hand.

In this article, we will take a look at a tool to do a statistical analysis on strace output: strace_analyzer. This tool can take an individual strace file that has been created with the “-T -ttt” options and produce a statistical analysis of the IO portion of the strace. It also produces data files and .csv (comma delimited files for spreadsheets) files that can be used for plotting.

Using the strace_analyzer

strace_analyzer is a Perl app that parses through the strace file looking for specific IO functions. When run, it looks for the following IO functions,

  • open
  • read
  • write
  • close
  • lseek
  • llseek
  • lseek64
  • stat
  • stat64
  • fstat
  • chmod
  • access
  • rename
  • mkdir
  • getdents
  • fcntl
  • unlink
  • fseek
  • rewind
  • ftell
  • fgetpos
  • fsetpos
  • fclose
  • fsync
  • creat
  • readdir
  • rewinddir
  • scandir
  • telldir
  • flock
  • lockf

Obviously this list does not include all IO functions but it is a work in progress.

The first step in using strace_analyzer is to create an strace output file using the “-T -ttt” options. I also used the “-o” option to send the data to a file. Then you can run the strace_analyzer to process the file with the following command:

./strace_analyzer_ng_0.08.pl -dat -short < strace.out > summary.out

There are several command line options with “-dat” and “-short” being used in the previous example. You can always type “./strace_analyzer_ng_0.08.pl -help” to get a listing of the options. For this example, “-dat” produces data files for plotting tools such grace or simple python codes that use matplotlib. The “-short” option produces a shorter summary output file that doesn’t contain all the details of a particular IO function.

When you run the strace_analyzer it also produces a file called “strace_analyzer.log” that some additional information information about the run, primarily for open() syscalls (more on that later). The code also produces a subdirectory called “RESULTS_DAT” where the .dat and .csv files are contained. Depending upon the application there can be a large number of files and some of them can be fairly large. For applications that have a large number of IO functions sometimes the .csv files cannot be loaded into a spreadsheet because there are too many rows. In addition, trying to plot the data in a spreadsheet can also be impossible because many spreadsheet programs have a limit on the number of data points that can be plotted.

The output from the strace_analyzer is divided into a few main groups.


  • Time statistics (e.g. how much wall clock time was spent doing IO, etc.)
  • IO syscall Count
  • Write Statistics
  • Read Statistics
  • Close Statistics
  • Open Statistics
  • lseek activity statistics
  • IOPS Statistics
  • File Statistics (on a per file basis)

Without the “-short” option you get a great more detail in each group. With the “-short” flag you get more of a summary.

Rather than walk through the strace_analyzer step by step and open myself to ridicule of my coding style, my choice of languages, my lack of using regular expressions for parsing, and other assorted programming criticisms, let’s apply the strace_analyzer to an example so we can see how the tool is used and what we can possibly learn by using it. By the way – I’m open to patches and criticism, but they should be constructive. if you don’t like my coding style that’s fine, but help me understand why a different coding style is better. Otherwise, talk to the hand, because the ears aren’t listening (and the eyes aren’t reading and the fingers are moving the file to the trash).

Examining Abaqus/Explicit

The application I will be examining is called Abaqus/Explicit and is developed, supported, and marketed by Simulia. It is what is called a finite element program that is used for examining dynamic events. For example, it can be used to examine how a container of laundry detergent behaves when it’s dropped from a grocery shelf onto the floor, or what happens to a cell phone when it’s dropped onto a floor, or when a bullet penetrates the wing of an aircraft, or, an even more fun analysis, how a diaper behaves when it is “loaded” to capacity. Abaqus/Explicit is one of the leading applications for dynamic analysis and it a common HPC application.

For this example, the e5 data set will be used. e5 is a simple benchmark where a stiffened steel plate is subjected to a high intensity blast load. The application with the e5 benchmark was run under the auspices of the HPC Advisory Council. It was run on the Osiris cluster that the HPC Advisory Council uses. It is a cluster of Dell SC 1435 servers with two Quad-Core AMD OpteronTM 2382 SE processors and a Mellanox ConnectX DDR InfiniBand fabric. The storage for the problem is fairly simple – the master node of the cluster had a simple SCSI hard drive that was exported via NFS over a GigE network to all of the compute nodes.

Abaqus/Explicit uses MPI so the application can run in parallel on multiple processors. For this example, it was run on two nodes with 8-cores each for a total of 16 processes. Each MPI process created a simple strace output file. strace_analyzer was then run on each process. The tables and plots below are from the output from strace_analyzer.

As mentioned previously, the first output group from the strace output is a timing summary. Table 1 below contains the Total Run Time in seconds of each of the 16 processes as well as the total time in seconds spent performing IO using the list of IO functions previously mentioned. The last column of data is the percentage of time the application spent doing IO relative to the total run time.

Table 1 – Time Statistics for the 16 Processes

Process Total Run
Time (secs)
IO Times
(secs)
% Time for
IO
12419 424.719 0.7889 0.185%
12420 425.292 0.0991 0.023%
12421 425.373 0.1222 0.028%
12422 425.433 0.1155 0.027%
12423 424.517 0.1023 0.024%
12424 425.291 0.1250 0.029%
12425 425.331 0.1293 0.030%
12427 425.291 0.1355 0.532%
14297 418.912 2.2275 0.532%
14298 418.827 2.3856 0.570%
14299 425.785 2.6418 0.620%
14300 425.112 2.3226 0.546%
14301 418.706 2.1369 0.510%
14302 424.769 2.1317 0.502%
14303 419.200 2.0271 0.486%
14304 418.868 1.8862 0.450%

Notice that the amount of time spent doing IO is very small compared to the total wall clock time. The large amount of time is 2.65 seconds or 0.62% of the total run time. It’s probably safe to say that this particular application is not IO bound.

The next output group is an IO syscall count. The output is a list of the IO commands the application is capable of recognizing and how many times these syscalls were used during the run of the application. Table 2 below contains the IO syscalls for all 16 processes.

Table 2 – IO syscall Count

Process access lseek fcntl stat unlink open close fstat read mkdir getdents write
12419 14 19,689 31 136 1 287 297 148 6,349 69 8 74,328
12420 5 3,150 5 78 0 241 243 103 1,584 0 8 1,870
12421 5 3,159 5 78 0 241 243 104 1,581 0 8 1,882
12422 5 3,155 5 78 0 241 243 104 1,579 0 8 1,880
12423 5 3,159 5 78 0 241 243 104 1,581 0 8 1,882
12424 5 3,149 5 78 0 241 243 104 1,577 0 8 1,876
12425 5 3,147 5 78 0 241 243 104 1,581 0 8 1,870
12427 5 3,149 5 78 0 241 243 104 1,583 0 8 1,870
14297 5 3,149 5 78 0 246 248 104 1,588 0 8 1,870
14298 5 3,147 5 78 0 246 248 104 1,586 0 8 1,870
14299 5 3,517 5 78 0 246 248 104 1,586 0 8 1,880
14300 5 3,160 5 78 0 246 248 104 1,587 0 8 1,882
14301 5 3,156 5 78 0 246 248 104 1,585 0 8 1,880
14302 5 3,163 5 78 0 246 248 104 1,588 0 8 1,884
14303 5 3,149 5 78 0 246 248 104 1,588 0 8 1,870
14304 5 3,147 5 78 0 246 248 104 1,586 0 8 1,870

This table gives us some interesting information. For example, the first process (12149) has 6 times more lseek() syscalls than any other process. It also has about 4 times the read() syscalls than other process. Perhaps more importantly, it has 40 times the number of write() syscalls than any other process.

In the world of MPI and parallel programs there are many ways to perform IO. One of the most classic methods is to have the first MPI process, called the rank-0 process since it is the first MPI process, do all of the reading and writing for the application. The other MPI processes receive/send their data from/to the rank-0. This approach ensures that the other MPI processes don’t overwrite the data from any other process.

From the results in Table 2 it appears that the first process (12419) is the rank-0 MPI process and it is doing the vast majority of the IO for the application. There is other output from strace_analyzer to help support this supposition as we’ll see later. But at this point it’s pretty obvious that process 12419 is the rank-0 process and is doing the vast majority of the reading and writing for the application.

If indeed process 12419 is the rank-0 process then why are the other processes doing so many read() and write() system calls? We’ll see more of that later and discover that read() and write() are very common system calls and are used in a variety of ways.

Another interesting observation is that for every 3.77 write syscalls there is an lseek() syscall. An lseek moves the file pointer within the file to a new location. Depending upon how far the file pointer is repositioned this means that any data stored in the drive cache or the VFS cache, or in the IO scheduler cache or any other cache is no longer useful. Basically it hurts throughput (sometimes even kills it).

The next output group contains the write syscall statistics. Below is a sample of the summary for process 12419:

----------------------
-- Write Statistics --
---------------------- 

-- File sizes for write() syscall --

IO Size Range         		     Number of syscalls
=======================================================
(  1)    0KB <   <    1KB		          12999
(  2)    1KB <   <    8KB		          61049
(  3)    8KB <   <   32KB		            268
(  4)   32KB <   <  128KB		              0
(  5)  128KB <   <  256KB		              0
(  6)  256KB <   <  512KB		              0
(  7)  512KB <   < 1000KB		              0
(  8) 1000KB <   <   10MB		              0
(  9)   10MB <   <  100MB		              0
( 10)  100MB <   <    1GB		              0
( 11)    1GB <   <   10GB		              0
( 12)   10GB <   <  100GB		              0
( 13)  100GB <   <    1TB		              0
( 14)    1TB <   <   10TB		              0 

-- WRITE SUMMARY --
Total number of Bytes written = 134,122,801 (134.122801  MB)
Number of Write syscalls =   74,328 

Average (mean) Bytes per syscall = 1,804.76345605253 (bytes) (0.00180476345605253  MB)
   Standard Deviation: 1,137.06808888182 (bytes) (0.00113706808888182  MB)
   Mean Absolute Deviation: 1,016.65380745704 (bytes) (0.00101665380745704  MB)
Median Bytes per syscall =         2,048 (bytes) (0.002048 MB)
   Median Absolute Deviation: 473.416424457721 (bytes) (0.000473416424457721  MB) 

Time for slowest write syscall (secs) = 0.050185
   Line location in file: 749638 

Smallest write syscall size: 1 (Bytes)
Largest write syscall size: 14336 (Bytes)

Comments on "IO Profiling of Applications: strace_analyzer"

neondiet

I really enjoyed that read: what a useful tool. I will definitely be using this from now on. Many thanks!

Reply
jirihorky

Hi,

very nice tool! I have recently written very similar tool to do IO profiling based on strace. It is a GUI application with a core written in C (for speed) and the GUI in Python. The IOprofiler has integrated histogram and access pattern plotting. However it doesn’t (yet) provide all the statistics this tool can.

Check it out here, if you are interested: http://code.google.com/p/ioapps/

It can also replay all the traces to become a standalone benchmark with the same IO worload pattern as your original application.

Jeff, I am sorry to not continue on your code, but at least the ioreplay part must have been written in C (and thus all file descriptors handling as well). I’ve also integrated some more logic to file descriptor-to-file name mappings.

Jiri Horky

Reply

Dear Jeffrey

Can I get a copy of python version, please?

Regards,

Yonny

> Versions and Source for Python
> Coming soon – if you need it now, please contact me.

Reply

thank you for share!

Reply

This is a message to the webmaster. Your website is missing out on at least 300 visitors per day. I have found a company which offers to dramatically increase your visitors to your site: http://fot.li/zn They offer 1,000 free visitors during their free trial period and I managed to get over 30,000 visitors per month using their services, you could also get lot more targeted traffic than you have now. Hope this helps :) Take care.

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

This is a topic which is close to my heart… Many thanks! Exactly where are your contact details though?|

Reply

I had been encouraged your blog through our cousin. I’m no longer beneficial no matter if that set up can be created by means of them seeing that nobody else understand these kinds of special regarding our issues. You have been amazing! Many thanks!

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine

Reply

Hello, just wanted to tell you, I loved this blog post. It was funny. Keep on posting!|

Reply

iNg58c Very good blog article. Want more.

Reply

Here are a number of the web-sites we advise for our visitors.

Reply

lSZJ79 Hey, thanks for the blog article.Really thank you! Will read on…

Reply

A5O4dS Thanks again for the article post.Really looking forward to read more. Fantastic.

Reply

dyg0vu Awesome blog post. Cool.

Reply

KpSWmg Really informative article.Thanks Again. Fantastic.

Reply

zOPHvd bjerchkhvtfr, [url=http://zqavheqtutnv.com/]zqavheqtutnv[/url], [link=http://iwnpwqufpsra.com/]iwnpwqufpsra[/link], http://plfpxzjhqspd.com/

Reply

Simply wish to say your article is as astonishing. The clearness in your post is simply great
and i can assume you are an expert on this subject. Well
with your permission let me to grab your RSS feed to keep updated with forthcoming
post. Thanks a million and please continue the gratifying work.

Reply

Hello there, I believe your website might be having internet browser compatibility issues. Whenever I take a look at your website in Safari, it looks fine however, if opening in I.E., it has some overlapping issues. I just wanted to give you a quick heads up! Apart from that, excellent site!|

Reply

Hi it’s me, I am also visiting this site daily, this website is
really pleasant and the users are in fact sharing good thoughts.

Reply

comment4, , 635, , heb,

Reply

Apart from the primary software, you can obtain loaded music creation lessons in every issues with audio generating much like the ProTools, Logic, keyboards, software instruments plus more. With this whole music manufacturing suite, you are going to stay in a position to get maximum knowledge employing only various sorts of instruments to develop music and beats easily. Also included are excellent quality instrumentals and beats that can assist you creating tracks on sonic producer easily.

Reply

I do not even know the way I ended up right here, but I thought this submit was once
great. I do not realize who you’re but certainly you are going to a famous blogger for those who
are not already. Cheers!

Reply

You have made various nice points there. I was able to looking about them and found most people will have the identical opinion along with your blog.

Reply

I found your blog using Google and I must say this is one of the most informative blogs We’ve read for a while. I most certainly will be sure I get back read your future posts.

Reply

I read this post completely concerning the comparison of latest and preceding technologies,
it’s awesome article.

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine
Cheap North Face Australia

Reply

Stiv Wilson, it would take 17% of the world’s oil tankers working full time to haul the 315 billion pounds of plastic bits out of the

Reply

Hi! I could have sworn I’ve visited this blog before
but after browsing through a few of the posts I realized it’s new to me.
Anyways, I’m definitely happy I came across it and I’ll be book-marking it and checking back regularly!

Reply

IO Profiling of Applications: strace_analyzer | Linux Magazine
affpojxqvol
ffpojxqvol http://www.g35b76au238vv44o3o4w6tz2bl73jo0us.org/
[url=http://www.g35b76au238vv44o3o4w6tz2bl73jo0us.org/]uffpojxqvol[/url]

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>