
FS_scan: Getting Detailed with Your Data

Need details on your file system's data? FS_scan allows you to dig deep into your storage, giving you the ability to perform trend analysis on the results.

Just last week we walked you through a new tool, agedu, that gives you a snapshot view of your file system. agedu produces a very nice graphical display that provides an overview of the age (either change or access time) and size of your data. However, there are times when you need or want more detail on the data that’s sitting in your storage. This time around we’ll look at a new tool, FS_scan, that does precisely that.

FS_scan allows you to recursively scan a directory tree to get a detailed view of your data. In particular, it tells you the dates and ages of your files, the average age of the files in a given directory, and the oldest files in the directory tree. It also produces a CSV file that you can open in a spreadsheet. With this information you can get a very detailed view of the state of your storage and perform a trend analysis on the resulting data (e.g., how fast is it changing? How often are files accessed? How often is data modified?).

Let’s dive in and see what our data’s doing.


When You Just Need More Details

Remember that when talking about the data on your storage there are three dates (or three ages) to consider: (1) the date last accessed, or the access age; (2) the date last modified, or the modify age; and (3) the date last changed, or the change age. Because all three dates or ages can be very important, it is difficult to quantify how data is being used on a file system. Agedu is a great tool for getting a quick glimpse of the access age or change age of the file system being examined, but it is only a glimpse of the state of the file system. If you want to create a more detailed report, or monitor the file system over time for a trend analysis, then you need more detailed information than agedu can provide at this time.
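To make the three ages concrete, here is a minimal sketch in Python 3 that reads the three timestamps of a single file and converts each one into an age in days. The function name file_ages and the choice of days as the unit are assumptions made for illustration; they are not taken from agedu or FS_scan.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400.0


def file_ages(path, now=None):
    """Return the access, modify, and change ages of path, in days."""
    if now is None:
        now = time.time()
    info = os.stat(path)
    return {
        "access": (now - info.st_atime) / SECONDS_PER_DAY,
        "modify": (now - info.st_mtime) / SECONDS_PER_DAY,
        "change": (now - info.st_ctime) / SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile() as handle:
        for name, days in sorted(file_ages(handle.name).items()):
            print("%s age: %.4f days" % (name, days))
```

Passing an explicit value for now keeps the ages of many files comparable, because every file is measured against the same reference time.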

One option for getting more detailed information is to use the stat command in Linux. It can be used to get the status of files or even of the file system. For example, the output from stat looks like the following:

$ stat *
  File: `~storage002.html'
  Size: 11472     	Blocks: 24         IO Block: 4096   regular file
Device: 811h/2065d	Inode: 3220767     Links: 1
Access: (0600/-rw-------)  Uid: ( 1000/laytonjb)   Gid: ( 1000/laytonjb)
Access: 2009-05-24 17:19:52.000000000 -0400
Modify: 2009-05-24 17:19:52.000000000 -0400
Change: 2009-05-24 17:19:52.000000000 -0400
  File: `storage002.html'
  Size: 11285     	Blocks: 24         IO Block: 4096   regular file
Device: 811h/2065d	Inode: 3220766     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/laytonjb)   Gid: ( 1000/laytonjb)
Access: 2009-05-24 17:13:13.000000000 -0400
Modify: 2009-05-24 16:02:27.000000000 -0400
Change: 2009-05-24 16:02:27.000000000 -0400

Or you can get a glimpse of the file system status using the “-f” option.

$ stat -f *
  File: "~storage002.html"
    ID: f11c91747fe09927 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 37263886   Free: 33094551   Available: 31216553
Inodes: Total: 9396224    Free: 9106153
  File: "storage002.html"
    ID: f11c91747fe09927 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 37263886   Free: 33094551   Available: 31216553
Inodes: Total: 9396224    Free: 9106153

Both options provide useful information. The first option, stat, gives the access, modify, and change dates for each file, as well as the uid, gid, size, and permissions. The second option, stat -f, gives additional information, including the file system type and the fundamental block size. However, if you want to use the stat command to gather detailed information, you will have to run these commands over the whole directory tree, parse the output, and assemble it into a usable form.

Python has a nice module, called the os module, that can easily walk a file system and gather virtually all of the same information that the stat command produces. Even better, this module is part of the Python standard library, so it is available wherever Python is installed, which includes most distributions. This can easily form the basis of a tool to walk a file system and gather detailed file information.
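As a rough illustration of that claim, the following Python 3 sketch collects the per-file fields shown by stat using os.stat, and the file system fields shown by stat -f using os.statvfs. The function names and dictionary keys are hypothetical. Note that os.statvfs is only available on Unix-like systems and does not report the file system type name that stat -f prints.

```python
import os


def describe_file(path):
    """Collect the per-file fields that the stat command prints."""
    info = os.stat(path)
    return {
        "size": info.st_size,
        "inode": info.st_ino,
        "links": info.st_nlink,
        "permissions": oct(info.st_mode & 0o7777),
        "uid": info.st_uid,
        "gid": info.st_gid,
        "atime": info.st_atime,
        "mtime": info.st_mtime,
        "ctime": info.st_ctime,
    }


def describe_filesystem(path):
    """Collect the file system fields that stat -f prints (Unix only)."""
    vfs = os.statvfs(path)
    return {
        "block_size": vfs.f_bsize,
        "fundamental_block_size": vfs.f_frsize,
        "blocks_total": vfs.f_blocks,
        "blocks_free": vfs.f_bfree,
        "blocks_available": vfs.f_bavail,
        "inodes_total": vfs.f_files,
        "inodes_free": vfs.f_ffree,
        "name_length": vfs.f_namemax,
    }


if __name__ == "__main__":
    for key, value in sorted(describe_file(".").items()):
        print("%s: %s" % (key, value))
    for key, value in sorted(describe_filesystem(".").items()):
        print("%s: %s" % (key, value))
```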

Python Modules to the Rescue

One of the functions in the os module is called “walk” (os.walk). This function allows you to easily walk a directory tree (i.e. examine the files recursively in a directory tree) and get information on the directories and the files. From the Python 2.6.2 documentation there is a simple example that has been modified and presented below.

#!/usr/bin/python

import os
from os.path import join, getsize

for root, dirs, files in os.walk('.'):
    print root, "consumes",
    print sum(getsize(join(root, name)) for name in files),
    print "bytes in", len(files), "non-directory files"

This quick code snippet displays the number of bytes taken by non-directory files in each directory under the starting directory (the current working directory). This simple snippet can form the basis of a script that walks through a directory tree and gathers information about the files. A quick note: this code snippet does not have any exception handling, and exceptions are quite possible in practice (for example, when a file is removed during the scan or cannot be read).
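One way to add that exception handling is sketched below in Python 3. The helper names report_walk_error and directory_sizes are made up for this example, and skipping unreadable entries with a message on standard error is just one possible policy.

```python
import os
import sys
from os.path import join, getsize


def report_walk_error(error):
    """Called by os.walk when a directory cannot be listed."""
    sys.stderr.write("skipping %s: %s\n" % (error.filename, error.strerror))


def directory_sizes(top):
    """Yield (directory, total_bytes, file_count) for each directory under top."""
    for root, dirs, files in os.walk(top, onerror=report_walk_error):
        total = 0
        counted = 0
        for name in files:
            try:
                total += getsize(join(root, name))
                counted += 1
            except OSError as error:
                # The file may have vanished, be a broken symlink, or be unreadable.
                sys.stderr.write("skipping %s: %s\n" % (join(root, name), error))
        yield root, total, counted


if __name__ == "__main__":
    for root, total, counted in directory_sizes("."):
        print("%s consumes %d bytes in %d non-directory files" % (root, total, counted))
```

By default os.walk silently ignores directories it cannot list; passing onerror makes those failures visible without stopping the scan.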

With the ability to walk a directory tree, you can open the files in each directory and gather statistics on each one. The os module also has a function called os.fstat that takes an open file descriptor and gives you most of the information that the stat command produces. Taking the previous example and extending it a bit results in the following example.

#!/usr/bin/python

import os
from os.path import join, getsize

for root, dirs, files in os.walk('.'):
    print root, "consumes",
    print sum(getsize(join(root, name)) for name in files),
    print "bytes in", len(files), "non-directory files"
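    for file in files:
        # Build the full path from the directory (root) and the file name (file)
        fileloc = join(root, file)
        # os.fstat works on a file descriptor, so open the file read-only first
        FILE = os.open(fileloc, os.O_RDONLY)
        junk = os.fstat(FILE)
        size = junk[6]     # size in bytes
        atime = junk[7]    # last access time, seconds since the epoch
        mtime = junk[8]    # last modify time, seconds since the epoch
        ctime = junk[9]    # last change time, seconds since the epoch
        uid = junk[4]      # owner's user id
        gid = junk[5]      # owner's group id
        print "   File: %s size: %s atime: %s mtime: %s ctime: %s" % (file,size,atime,mtime,ctime)
        os.close(FILE)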

In the second for loop, the full path to the file (fileloc) is created from the root of the directory tree (root) and the file name (file). Notice that the os.fstat function returns a tuple-like object of attributes. For example, it returns the access time (atime), the modify time (mtime), and the change time (ctime), which are all in seconds since the epoch. There are other attributes as well, including the size in bytes (size), the uid (uid), and the gid (gid).
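The numeric indices can be hard to read, and raw epoch seconds are not very friendly. As a hedged alternative, the Python 3 sketch below uses the named attributes of the stat result (st_size, st_atime, and so on) and formats each time as a local date. It calls os.lstat on the path, so the file never has to be opened. The function name describe and the date format are choices made for this example only.

```python
import os
import time

STAMP = "%Y-%m-%d %H:%M:%S"


def format_time(seconds):
    """Convert seconds since the epoch into a local date and time string."""
    return time.strftime(STAMP, time.localtime(seconds))


def describe(path):
    """Return a one-line summary of path using named stat attributes."""
    info = os.lstat(path)  # lstat reports on a symlink itself, not its target
    return "File: %s size: %d atime: %s mtime: %s ctime: %s" % (
        path,
        info.st_size,
        format_time(info.st_atime),
        format_time(info.st_mtime),
        format_time(info.st_ctime),
    )


if __name__ == "__main__":
    print(describe("."))
```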

The previous example serves as a quick introduction to what you can do with Python using the modules in the standard library. In particular the os module has many functions that are useful for getting detailed information about files.
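Putting the pieces together, the following Python 3 sketch walks a directory tree and writes one CSV row per file, which can then be opened in a spreadsheet. This is not FS_scan's actual code or output format; the column names in FIELDS and the function names are assumptions chosen purely for illustration.

```python
import csv
import os
import sys

# Hypothetical column layout, not the format FS_scan itself produces.
FIELDS = ["path", "size", "uid", "gid", "atime", "mtime", "ctime"]


def scan_rows(top):
    """Yield one row of stat data for every non-directory file under top."""
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # the file disappeared or is unreadable; skip it
            yield [path, info.st_size, info.st_uid, info.st_gid,
                   int(info.st_atime), int(info.st_mtime), int(info.st_ctime)]


def write_csv(top, out):
    """Write a header plus one row per file to the open text stream out."""
    writer = csv.writer(out)
    writer.writerow(FIELDS)
    count = 0
    for row in scan_rows(top):
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    top = sys.argv[1] if len(sys.argv) > 1 else "."
    write_csv(top, sys.stdout)
```

Saving the output of successive runs gives a series of snapshots that can be compared over time, which is the kind of data a trend analysis needs.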

