Data Reduction, Part 1

Roughly a year ago, we spent two months looking at logging web hits in MySQL, using Apache and mod_log_sql. In the October 2002 issue (available online at http://www.linux-mag.com/2002-10/lamp_01.html), we looked at what Chris Powell’s mod_log_sql does for you and tried a basic configuration. (After that article appeared, Chris released a new version that fixed a few bugs we discovered while writing it. Consider upgrading if you haven’t already.) Then in November 2002 (http://www.linux-mag.com/2002-11/lamp_01.html), we started building a basic web interface in PHP to present a view of the logged data. Using that framework, you could construct pages to list the most popular URIs, referers, and so on, all in real time. That, after all, is part of the beauty of mod_log_sql: you get the benefits of an SQL interface without any unnecessary delays.

At least that was the theory.

One thing we neglected to consider was the long-term cost of using mod_log_sql: what happens as you accumulate more and more data? Sure, we mentioned the possibility of adding indexes to make some queries faster, but indexes only help so much. The real problem is that the default logging format for mod_log_sql is kept very simple so that real-time logging stays efficient. The tradeoff is that it’s not space efficient. As time goes on, the logged data can become quite large, difficult to manage, and slow to query.

This month and next, let’s look at a recently implemented…
