Scalable I/O on Clusters, Part I

Linux clusters have been so successful that they have spread worldwide through research labs, universities, and large industries that need an inexpensive source of high performance computing cycles. Developers and users have pushed the technology by scaling their applications to more and more processors so that larger problems can be solved more quickly. As a result, some applications on these clusters have become I/O bound: moving data to and from a large number of processors limits the application's performance.

Most Linux clusters use NFS (the Network File System) to share data among nodes and provide a consistent name space across all machines. As a result, parallel applications (executing simultaneously on multiple processors) typically read data from files stored on a single disk on a single server.

While NFS works well for small clusters (fewer than 32 nodes), its performance degrades rapidly once more than 64 nodes start making simultaneous I/O requests. These requests “saturate” the single NFS server that stores the files of interest.

What’s needed is a way of applying the Beowulf philosophy, which has worked well for computational scaling, to file systems by spreading the file serving workload across many disks, buses, and nodes. Enter PVFS, the Parallel Virtual File System (http://parlweb.parl.clemson.edu/pvfs/).
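To make the idea of spreading the workload concrete, here is a minimal sketch of round-robin striping, one common way to distribute a file across several servers. It is an illustration only, not PVFS's actual layout or API: the function names (locate, bytes_per_server), the 64 KB stripe size, and the server count of four are all assumptions chosen for the example.

```python
from collections import Counter
from typing import Tuple


def locate(offset: int, stripe_size: int, num_servers: int) -> Tuple[int, int]:
    """Map a byte offset in the logical file to (server index, local offset).

    Hypothetical helper: assumes fixed-size stripes dealt out to servers
    in round-robin order, starting at server 0.
    """
    if offset < 0 or stripe_size <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe = offset // stripe_size            # which stripe holds this byte
    server = stripe % num_servers             # round-robin server choice
    # Each server stores every num_servers-th stripe back to back.
    local_offset = (stripe // num_servers) * stripe_size + offset % stripe_size
    return server, local_offset


def bytes_per_server(start: int, length: int,
                     stripe_size: int, num_servers: int) -> Counter:
    """Count how many bytes of the read [start, start + length) each server serves."""
    load = Counter()
    pos, end = start, start + length
    while pos < end:
        server, _ = locate(pos, stripe_size, num_servers)
        # Advance to the end of the current stripe or the end of the request.
        chunk_end = min(end, (pos // stripe_size + 1) * stripe_size)
        load[server] += chunk_end - pos
        pos = chunk_end
    return load


if __name__ == "__main__":
    STRIPE = 64 * 1024   # assumed stripe size, for illustration only
    request = 8 * STRIPE  # a 512 KB read
    print("one server  :", dict(bytes_per_server(0, request, STRIPE, 1)))
    print("four servers:", dict(bytes_per_server(0, request, STRIPE, 4)))
```

With one server, the whole 512 KB request lands on a single machine, as it does with a lone NFS server. With four servers under the same assumptions, each one handles a quarter of the bytes, so the disks, buses, and network links of all four can work on the request at once.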

The Parallel Virtual File System (PVFS)

Developed at Clemson University and designed specifically for Linux clusters, PVFS provides network access to a “virtual” file system distributed across different disks on multiple independent servers or nodes. This is…
