Data Reduction, Part 2

Last month, we left off partway through a data reduction effort with ten million Apache log records in a single MySQL table that was taking up far too much disk space and memory. We analyzed the data, found ways to normalize the schema to reduce the space required, and created the new tables. Now let’s finish the job by creating a script that can intelligently move data from the old table into the new ones.

The Plan

Migrating data from the old table to the normalized set of tables should be simple and should not overly burden the database server. We should never do anything that locks the original table for more than a second or so. Otherwise, hits would not be logged in a timely fashion, and Apache processes would block while waiting for the database.

The method we’ll use is to migrate entries in chronological order, from oldest to newest, in small batches, pausing between batches to reduce the overall impact of converting the data. It will take a non-trivial amount of time, but it should be a smooth transition. To reduce the number of queries sent to MySQL, we’ll try to cache frequently used data in memory along the way.
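To make the plan concrete, here is a minimal sketch of the batch-and-cache loop. It is not Listing One, which is written in Perl. It uses Python, with an in-memory SQLite database standing in for MySQL so that it runs anywhere. The table and column names (`old_log`, `new_log`, `urls`, `agents`) and the helper functions are hypothetical, and the sketch assumes the old table has an auto-increment `id` that increases in chronological order.

```python
import sqlite3
import time


def create_schema(conn):
    """Create hypothetical old and normalized tables (all names are assumptions)."""
    conn.executescript("""
        CREATE TABLE old_log (id INTEGER PRIMARY KEY, ts INTEGER, url TEXT, agent TEXT);
        CREATE TABLE urls    (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
        CREATE TABLE agents  (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
        CREATE TABLE new_log (id INTEGER PRIMARY KEY, ts INTEGER,
                              url_id INTEGER, agent_id INTEGER);
    """)


def lookup_id(conn, cache, table, value):
    """Return the id for value in a lookup table, inserting the value if needed.

    Results are cached in memory so repeated values cost no extra queries.
    """
    key = (table, value)
    if key in cache:
        return cache[key]
    row = conn.execute(
        f"SELECT id FROM {table} WHERE value = ?", (value,)
    ).fetchone()
    if row is None:
        row_id = conn.execute(
            f"INSERT INTO {table} (value) VALUES (?)", (value,)
        ).lastrowid
    else:
        row_id = row[0]
    cache[key] = row_id
    return row_id


def migrate(conn, batch_size=1000, pause=1.0):
    """Copy rows from old_log to new_log, oldest first, in small batches."""
    cache = {}
    last_id = 0  # assumption: ids start above 0 and grow with time
    moved = 0
    while True:
        rows = conn.execute(
            "SELECT id, ts, url, agent FROM old_log "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for row_id, ts, url, agent in rows:
            url_id = lookup_id(conn, cache, "urls", url)
            agent_id = lookup_id(conn, cache, "agents", agent)
            conn.execute(
                "INSERT INTO new_log (id, ts, url_id, agent_id) VALUES (?, ?, ?, ?)",
                (row_id, ts, url_id, agent_id),
            )
        conn.commit()          # one short transaction per batch
        last_id = rows[-1][0]  # resume after the newest row just copied
        moved += len(rows)
        time.sleep(pause)      # give the server room between batches
    return moved
```

Calling `migrate(conn, batch_size=1000, pause=1.0)` on a connection prepared with `create_schema` copies every row and returns the number moved. Keeping each batch small keeps each read of the old table short, and the `cache` dictionary means a URL or user agent seen many times is looked up in the database only once.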

The code is included with this article as Listing One. Let’s walk through the code and see how it works.

The Code

Lines 1-10 tell Perl to enable warnings,…
