Coping with Traffic

As the load on network services increases, you need to look for an effective way to spread that load across more than one server.

Linux has been highly regarded as a great platform on which to build network services such as Web sites and e-mail servers. It runs on inexpensive hardware and it is fast, stable, and easy to deploy. But out of the box, Linux isn’t equipped to handle heavy loads with mission-critical reliability. As organizations continue to deploy Linux servers in increasingly important roles, both scalability and reliability emerge as important issues.

If you need to service millions of dynamic Web clients per hour, or if you need to handle e-mail for millions of users, and do so with no downtime, you’ll quickly reach the point where a single computer just can’t keep up with the load. Although some Internet protocols can be scaled across multiple systems using simple DNS-based “load balancing,” many other cases require more involved work. And there are few (if any) standard Linux tools which make it easy to do.
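For example, round-robin DNS simply lists several address records under one name, and the name server hands them out in rotation. Here is a minimal sketch of a BIND-style zone fragment (the host names and addresses are made up for illustration):

; three Web servers answering for a single name via round-robin DNS
www    IN  A  192.0.2.10
www    IN  A  192.0.2.11
www    IN  A  192.0.2.12

This spreads requests around, but it offers no awareness of server load or server failure.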

In situations like this, clustering is a common solution. Clustering allows you to organize several computers into groups (called clusters) which appear to the user as if they are part of one big virtual computer. The clustering software is responsible for deciding which computer (or node) in the cluster should service a given request. Even if one node in the cluster dies, the other nodes will still continue accepting work. For more background on clustering, see “Linux Clustering in Depth” (http://www.linux-mag.com/2000-10/clustering_01.html) from our October 2000 issue.

It is worth noting that the term “cluster” is used to describe a number of ways to interconnect computers. LVS represents only one of these methods: a set of tools to allow a group of different Linux systems to transparently provide a service over a protocol that was designed to be hosted by a single server. This is completely different from “Beowulf clustering,” which is a set of tools and programming libraries that allows computations to be deliberately distributed over a number of systems. (The term ipvs — IP Virtual Server or IP Service — also refers to an LVS deployment.)

While there is no shortage of commercial clustering solutions currently available, the Linux Virtual Server (LVS) project has created a freely available solution that has matured quickly. LVS allows you to create service clusters in which multiple content servers are able to handle incoming requests. The actual “content” can be anything from HTTP to e-mail (SMTP, POP), a replicated database, a Web cache/proxy, telnet, chat servers, SSH, or just about any other IP-based service.

LVS Architecture

Figure One: A simple LVS cluster.

In an LVS deployment, there is a master computer, known as the director, and one or more content servers that provide the content or service(s). Incoming requests arrive at the director, which then uses any of several scheduling algorithms to route the requests to the content servers. Figure One illustrates a simple LVS deployment.

Depending upon which routing implementation is used (see the LVS Routing Implementations sidebar), the director may even be on a different physical network segment than the content servers.

Since the director might route a request to any of the content servers, it is necessary for all of those servers to provide consistent content. There are two basic approaches to maintaining consistency: replication (via rsync, cvs, rdist, etc.) or a shared filesystem such as NFS. Some newer systems are even using Storage Area Network (SAN) technologies for sharing filesystems across multiple servers (though that is beyond the scope of this article).
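For example, a periodic rsync push from a staging machine is often enough to keep static document roots in step. The paths below are hypothetical, and the host names are the content servers used later in this article:

# Push the master document root to each content server.
rsync -az --delete /var/www/ mandrake:/var/www/
rsync -az --delete /var/www/ slackware:/var/www/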

From an administrative point of view, the content servers are just “normal” servers that happen to listen to a special IP address (known as a Virtual IP or VIP) for their requests. This means you can administer them using the same tools that you’d otherwise use.

LVS Routing Implementations

There are three LVS configurations to choose from. Each has its benefits and drawbacks, but this freedom allows you to select the method that best supports your needs. The configurations vary in both how the requests flow from the client to the director to a content server, and in how the responses get from the content server back to the client.

Network Address Translation (NAT)

LVS via NAT allows you to use any operating system as a content server. The content servers may have addresses in private IP space, thus saving real-world IP addresses. However, LVS via NAT suffers a scalability problem under heavier loads because each request and response must travel through the director. This problem can be overcome through the use of multiple directors and round-robin DNS load balancing.
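One practical detail of the NAT approach is that replies must flow back through the director, which usually means pointing each content server's default route at the director's inside address. A sketch, using a hypothetical inside address of 10.0.0.1:

# On each content server (LVS via NAT): return traffic goes via the director.
route add default gw 10.0.0.1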

IP Tunnel

LVS via IP Tunnel does not suffer the scalability problem of NAT because response packets are sent directly to the requesting client and not routed back through the director. Some overhead is added with this method because of the IP tunnel itself. However, because a tunnel is used, the director and content servers can be placed on physically separate network segments.
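On the content-server side, tunneling requires each server to terminate the encapsulated packets addressed to the VIP, typically by bringing the VIP up on the tunl0 interface. A sketch, assuming the VIP used later in this article and a kernel with IP-in-IP tunneling support:

# On each content server (LVS via IP Tunnel): accept tunneled packets for the VIP.
ifconfig tunl0 192.168.1.110 netmask 255.255.255.255 up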

Direct Routing (DR)

LVS via DR offers the best of both worlds, providing the scalability of the IP Tunnel approach without its overhead. Using this method, requests come through the director, but responses travel back to the requestor directly, without going back through the director (unlike NAT). One disadvantage of LVS via DR is that the director and the content servers must be located on the same physical network segment.
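With direct routing, each content server must accept packets addressed to the VIP while leaving the director as the only machine that answers ARP requests for it. The usual approach is a host-scoped alias on the loopback interface, sketched below with the VIP used later in this article (depending on your kernel, the ARP issue may need extra handling, such as the LVS "hidden" patch):

# On each content server (LVS via DR): listen on the VIP via a loopback alias.
ifconfig lo:0 192.168.1.110 netmask 255.255.255.255 up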

Features and Benefits

Now that we have a basic idea of how LVS works, it’s helpful to understand how buzzword-compliant an LVS cluster can be.

High Availability

Even with a cluster of just one director and two content servers, LVS is a powerful and inexpensive way of ensuring that content remains available if a node fails. Adding a second director to the cluster ensures that the service continues even if the first director dies.


Scalability

Once a cluster is up and running, you can simply add more content servers to handle an increasing number of requests. If the incoming request load becomes too much for a single director to handle, you can always add a second one.


Easy Maintenance

When you need to update the software on the nodes in a cluster, there's no need for users to notice. You can simply remove content servers from the cluster (one at a time), perform upgrades and testing, and then re-add them when you are sure they're working as expected.

Load Balancing

LVS implements a growing number of scheduling algorithms in order to achieve optimal response times. See the LVS Scheduling Algorithms sidebar for a breakdown of the algorithms.

LVS Scheduling Algorithms

The options include standard round-robin and weighted round-robin scheduling, least connection, weighted least connection scheduling, and locality-based least-connection scheduling.


Round-Robin

The round-robin scheduling algorithm sends each incoming request to the next server in its list. Each content server will receive an equal number of requests.

Weighted Round-Robin

Each content server is assigned a weight (a number), which the director uses to decide how much of the request traffic should be sent to a given server. A server with a weight of 1 will receive roughly half as many requests as a server with a weight of 2.
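In practice, weights are assigned per server with ipvsadm's -w flag (ipvsadm itself is covered later in this article). A sketch with two hypothetical servers and a 2:1 split:

# Weighted round-robin: bigbox receives roughly twice smallbox's share.
ipvsadm -A -t 192.168.1.110:http -s wrr
ipvsadm -a -t 192.168.1.110:http -r bigbox:http -g -w 2
ipvsadm -a -t 192.168.1.110:http -r smallbox:http -g -w 1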


Least-Connection

The director sends network connections to the server with the least number of established connections. This algorithm is able to dynamically account for the fact that one server may be handling an excessive number of connections.

Weighted Least-Connection

Like least-connection scheduling, except that each server's active connection count is weighed against its assigned weight. The weights are often used to indicate the relative processing power of the servers in a cluster.

Destination Hashing

The director decides which server to use by looking up the destination IP address in a static hash table.

Source Hashing

The director decides which server to use by looking up the source IP address in a static hash table.


In addition to the methods listed above, there are two locality-based scheduling methods available. For a more complete description of LVS scheduling, see http://www.LinuxVirtualServer.org/scheduling.html/.


Installing LVS

LVS installation involves four basic steps:

  • Applying a patch to the kernel
  • Installing a Perl module
  • Installing the administrative software for the LVS
  • Making a configuration file

Each node in the cluster will need a patched kernel as well as the other software components.

Prior to beginning the installation and configuration of an LVS cluster, you should have the service running properly on all of the content servers. If you intend to run a Web cluster with Apache, make sure that Apache is running properly on each server. If the content server isn’t working properly, troubleshooting LVS will be far more difficult.
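A quick way to verify this for a Web cluster is to fetch a page directly from each content server before involving LVS at all. The loop below uses wget and the server names from our example cluster:

# Confirm each content server answers on its real address.
for host in mandrake slackware; do
    wget -q -O /dev/null http://$host/ && echo "$host OK" || echo "$host FAILED"
done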

The Net::DNS Perl module is required for the configuration file. If you do not already have Net::DNS installed, you should install it before proceeding.

The Net::DNS module can be obtained from the CPAN (Comprehensive Perl Archive Network) at http://www.cpan.org/. If you have a properly configured CPAN.pm, it is as simple as running:

perl -MCPAN -e shell

and typing

install Net::DNS

at the cpan> prompt.

The next step is to fetch the latest ipvs source code and kernel patch from http://www.LinuxVirtualServer.org/software/.

Next, you will need to apply the kernel patch:

cd /usr/src/linux
gunzip -c linux-X.X.X-ipvs-X.X.X.patch.gz | patch -p1

and configure the kernel using the method you are most comfortable with (make config, make menuconfig, or make xconfig). The Configuring the Kernel for LVS sidebar contains a list of the options you will need to enable during the configuration process.

(If you’re not familiar with rebuilding your kernel, see the Kernel HOWTO at http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html before you attempt to set up LVS.)

Configuring the Kernel for LVS

The following options must be enabled in your kernel:

Code maturity level options
[*] Prompt for development and/or incomplete code/drivers

Networking options
[*] Network firewalls

[*] IP: firewalling

[*] IP: masquerading

[*] IP: masquerading virtual server support
(12) IP masquerading table size (the Nth power of 2)
<*> IPVS: round-robin scheduling
<*> IPVS: weighted round-robin scheduling
<*> IPVS: least-connection scheduling
<*> IPVS: weighted least-connection scheduling
<*> IPVS: locality-based least-connection scheduling

[*] IP: aliasing support

Next, rebuild the kernel using your standard routine, which probably looks like this:

make bzImage
make modules
make modules_install

Copy the kernel to its new location, run /sbin/lilo, and then reboot your server.

If you build your kernels and modules on a workstation or internal system for distribution to your production servers, remember to archive the whole /lib/modules/your_kernel_version directory tree and install it along with your kernel and System.map files. It's also a good idea to keep a copy of your .config file with the kernel image and its modules; store it in /boot or wherever you keep your kernel images installed.
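A minimal sketch of that packaging step, with entirely hypothetical file and version names (substitute whatever version string your kernel actually uses):

# On the build machine: bundle kernel image, System.map, .config, and modules.
cd /
tar czf lvs-kernel.tar.gz boot/vmlinuz-lvs boot/System.map-lvs \
    boot/config-lvs lib/modules/2.2.19-lvs

# On each production server: unpack, then update the boot loader.
tar xzf lvs-kernel.tar.gz -C /
/sbin/lilo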

You should also consider adding your own unique identifier to the EXTRAVERSION field near the top of the kernel's top-level Makefile. This ensures that the module utilities will find and load the proper modules for the LVS-patched kernel.
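For example, the top of the kernel Makefile might look like this after the change (the version numbers will match your own source tree; the -lvs suffix is just an arbitrary identifier):

VERSION = 2
PATCHLEVEL = 2
SUBLEVEL = 19
EXTRAVERSION = -lvs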

With the kernel patching out of the way, it's now just a matter of compiling and installing ipvsadm, the administrative software for LVS. Simply run make and make install in the ipvsadm sub-directory of the LVS distribution.
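Assuming you unpacked the LVS distribution into its own directory, the commands look like this (make install typically puts the binary in /sbin/ipvsadm):

cd ipvsadm
make
make install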


Configuration

For our cluster, we will use Direct Routing (DR) mode. The first step in producing a custom configuration file is to edit or create a base configuration file. The contrib/config directory contains sample base configuration files, including lvs_dr.conf, which we'll be using. Figure Two shows what the file should contain for our installation.

Figure Two: LVS Direct Routing Configuration

Here is our example configuration file:

SERVICE=t http rr
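The single SERVICE line shown above is only part of the file; the sample lvs_dr.conf also defines the cluster type, the VIP, and the real servers, and the exact directive names vary between versions of the configure script. The following is only an illustrative sketch, assuming the VIP (192.168.1.110), a hypothetical director address of 192.168.1.1, and the server names (mandrake and slackware) that appear later in this article; check the comments in contrib/config/lvs_dr.conf for the authoritative syntax:

LVS_TYPE=VS_DR
INITIAL_STATE=on
VIP=eth0:110 192.168.1.110 255.255.255.255 192.168.1.110
DIRECTOR_INSIDEIP=eth0 192.168.1.1 192.168.1.0 255.255.255.0 192.168.1.255
SERVICE=t http rr mandrake slackware
SERVER_VIP_DEVICE=lo:0
SERVER_NET_DEVICE=eth0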

After editing the lvs_dr.conf file, you can use the configure script to produce a working rc-style script that can be used to start and stop LVS:

./configure lvs_dr.conf

The resulting rc.lvs_dr file should be placed in the appropriate directory (such as /etc/rc.d/init.d) for your system startup scripts. Until each server is rebooted, though, you’ll need to start LVS manually by running:

rc.lvs_dr start

The configuration process must be completed on each server in the cluster.


Once every server in the cluster has been configured, you can run /sbin/ipvsadm on the director to see a report on the current state of the cluster. Figure Three shows the output of ipvsadm on our sample cluster before any requests have been serviced.

Figure Three: ipvsadm Output on an Idle Cluster

IP Virtual Server version 1.0.2 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP  192.168.1.110:www rr
-> mandrake:www Route 1 0 0
-> slackware:www Route 1 0 0

To test the cluster, we'll need to connect from a computer outside the cluster itself. The easiest way to do that is to point a Web browser at the cluster's VIP (192.168.1.110) and see what happens. Once the connection is open, you can re-run ipvsadm on the director and you should see an Active Connection to one of the servers. The next connection that you open should go to the next server, and so on.
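If you prefer the command line, a short loop of requests against the VIP (using wget here; any HTTP client will do) makes the alternation easy to observe; the -L flag asks ipvsadm to list the current table, and -n keeps the output numeric:

# Make a few requests through the VIP, then inspect the connection counters.
for i in 1 2 3 4; do
    wget -q -O /dev/null http://192.168.1.110/
done
/sbin/ipvsadm -L -n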

Using ipvsadm

As we’ve seen, ipvsadm is the administrative tool used to maintain and configure LVS. With ipvsadm you can add, edit, or delete services within the cluster. Let’s look at adding a new service (SSH on TCP port 22) to our current setup.

To add the SSH service to our cluster, the command is:

ipvsadm -A -t 192.168.1.110:ssh -s rr

-A signifies that we’re adding a service.

-t indicates that the service is TCP-based, followed by the VIP and port number (or name, in this case) for the service.

-s specifies the scheduling mechanism, and rr means round-robin.

Next, we add the actual servers for this service:

ipvsadm -a -t 192.168.1.110:ssh -r mandrake:ssh -g
ipvsadm -a -t 192.168.1.110:ssh -r slackware:ssh -g

In this case:

-a tells ipvsadm that we’re adding a server to the cluster for a particular service.

-t has the same meaning as before.

-r indicates the server’s real IP address and the port number (or name of the service).

-g indicates the routing method (Direct Routing).
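If the cluster used one of the other forwarding methods, only that last flag would change: -i selects IP tunneling and -m selects masquerading (NAT). For instance, the NAT version of the first command above would be:

ipvsadm -a -t 192.168.1.110:ssh -r mandrake:ssh -m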

As Figure Four shows, we can run /sbin/ipvsadm again and see the new service with an active SSH connection (assuming you’ve also tested it).

Figure Four: ipvsadm Output with an Open SSH Connection

IP Virtual Server version 1.0.2 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP  192.168.1.110:www rr
-> mandrake:www Route 1 0 0
-> slackware:www Route 1 0 0
TCP  192.168.1.110:ssh rr
-> slackware:ssh Route 1 1 0
-> mandrake:ssh Route 1 0 0

To remove an individual server from the cluster (such as when you need to perform maintenance), simply use ipvsadm to delete the server:

ipvsadm -d -t 192.168.1.110:ssh -r mandrake:ssh

Note that the only difference is that we used -d instead of -a (and dropped the -g flag, which isn't needed when deleting).
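If you would rather drain a server gradually than cut it off, ipvsadm can also edit an existing entry; setting a server's weight to zero stops new connections from being scheduled to it while existing connections finish. A sketch (the -e flag edits an existing server entry):

ipvsadm -e -t 192.168.1.110:ssh -r mandrake:ssh -g -w 0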

And to remove the SSH service entirely:

ipvsadm -D -t 192.168.1.110:ssh

As you’d expect, the ipvsadm manual page contains information about all the various command-line options.

Going Forward…

While the LVS project has produced a very powerful set of tools for helping to scale Linux to handle larger workloads, it is not a silver bullet. There are applications for which LVS is not the answer.

For example, some dynamic Web sites make use of state information that is stored on the Web server as well as on the client (usually in the form of a cookie). If such an application is not modified to somehow share the state information with other servers in the cluster, users will likely notice strange behavior, because they are not guaranteed to talk to the same physical server each time they click a link on your Web site.
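LVS itself offers a partial workaround in the form of persistent connections: the -p flag tells the director to keep sending a given client to the same content server for a configurable timeout. That is stickiness rather than true state sharing, but it is often enough. A sketch, using our example VIP and a 10-minute timeout:

# Persistent HTTP service: a client sticks to one server for 600 seconds.
ipvsadm -A -t 192.168.1.110:http -s rr -p 600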

Looking past this minor issue, setup and administration of LVS is relatively easy. Additionally, it can operate with little overhead. Many network services can and do work well in an LVS cluster with few hassles. When you consider the fact that it’s free software, Linux Virtual Server is a great value.


Steve Suehring is a systems engineer at Voyager.net. He can be reached at suehring@voyager.net.
