Last month, we looked at MySQL’s new storage engine, NDB (also known as NDBCluster or MySQL Cluster). Now it’s time to look at the compilation, installation, and configuration process.
The Big Picture
Before we jump into the nuts and bolts, let’s review a bit of what you’re going to build. As you’ll recall from last month, a MySQL cluster avoids downtime by using several redundant machines, and harnesses that extra power for increased performance.
For the purposes of this column, let’s assume that you have four machines available, mysql[1-4].example.com. Each of the machines runs a single instance of the NDB server process (ndbd). One of the machines (mysql1) also runs the MySQL server process (mysqld), as well as the NDB management server (ndb_mgmd). The MySQL server is considered a node in the cluster, but since it’s using the NDB API to act as a client of the cluster, it’s known as an API node. Multiple API nodes can communicate with the cluster simultaneously.
We haven’t touched on the management server before. For the purposes of this discussion, its role is quite simple: the management server holds the master configuration file for the NDB nodes. When a node starts, it contacts the management server for information about the cluster. This simplifies the configuration process a bit, because most of the configuration data is kept on a single machine (which is backed up, of course!).
Traditional MySQL clients continue to communicate with the MySQL server and are completely oblivious to the number of NDB nodes and their configuration. That’s the beauty of this, really: applications don’t need to care what’s going on behind the MySQL server.
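For example, once the cluster is up, placing a table in it is just a matter of picking the NDB storage engine at creation time. (The table and column names here are invented for illustration.)

mysql> CREATE TABLE visitors (
    ->   id   INT NOT NULL PRIMARY KEY,
    ->   name CHAR(30)
    -> ) ENGINE=NDBCLUSTER;

Every subsequent SELECT, INSERT, and UPDATE against the table works just as it would with MyISAM or InnoDB; the clustering is invisible to the client.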
Getting the Source, Building the Code
As of this writing, MySQL 4.1 is a “gamma” release. That means it has successfully gone through the unstable “alpha” development phase and the bug finding and fixing “beta” phase. Gamma is close to a production release and may even be generally available as an official package by the time you read this.
However, in watching recent changes in the MySQL source code repository, the NDB code is clearly where the action is. Until 4.1 is a production ready release, your best bet is to build your own installation using the public MySQL source code.
First, you’ll need to download a copy of BitKeeper version 3, the same source code management (SCM) tool that Linus and friends use for the Linux kernel. You’ll find BitKeeper at http://www.bitmover.com/cgi-bin/download.cgi. You’ll also need to have reasonably recent copies of a few build tools on hand, including autoconf, automake, libtool, m4, and make. The good news is that you’re likely to already have those if you’re using a newer Linux distribution.
With BitKeeper installed, the first step is to clone the MySQL source tree from bkbits.net (a BitKeeper hosting service run by BitMover, the folks who develop BitKeeper):
$ bk clone bk://mysql.bkbits.net/mysql-4.1
Once BitKeeper finishes, enter the mysql-4.1 directory and perform a one-time ritual to generate the necessary build files:
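For 4.1-era BitKeeper trees, the ritual looks something like this (the exact sequence may vary slightly with your autotools versions):

$ bk -r get -Sq
$ aclocal
$ autoheader
$ libtoolize --automake --force
$ automake --force --add-missing
$ autoconf
$ ./configure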
The final command, ./configure, needs options that vary from site to site. You can run ./configure --help to produce a list of all the options. However, for your first build, it’s probably sufficient to supply an installation path and the flag that builds the NDB storage engine.
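For the setup described here, something like this should do (the /home/mysql prefix matches the paths used later in this article):

$ ./configure --prefix=/home/mysql --with-ndbcluster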
If all goes well, that’ll build you a fresh copy of MySQL that you can install with sudo make install. You’ll need to do that on each node in your cluster.
Next, there are two distinct types of configuration. Each node in the cluster (including the API nodes) must be given enough information to contact the management node. The management node, of course, needs to know everything else. So let’s work through it in that order.
Each node in the NDB cluster needs an Ndb.cfg file in a directory that NDB can use to store its data. For example, if you installed MySQL into /home/mysql, you could create /home/mysql/var/cluster/data and put the file there.
The file should contain only one line that looks something like this:
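nodeid=2;host=mysql1.example.com:2200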
The nodeid parameter specifies the unique ID number associated with this node. The host parameter tells the local NDB node how to get in touch with the management server to find the rest of the configuration. In this case, the management server is running on mysql1 and listening to TCP port 2200.
Management Server Configuration
The management server’s configuration file (config.ini) can be stored in the same directory as Ndb.cfg on mysql1. For our setup, it looks like Listing One.
Listing One: A sample NDB management server configuration file
# config.ini for a MySQL Cluster test install.
# 128MB indexes, 1.5GB data
# 2 replicas
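# (The hostnames, id numbers, and exact parameter spellings below are
# illustrative reconstructions; check them against the NDB documentation
# for your version before relying on this file.)

[COMPUTER DEFAULT]

[DB DEFAULT]
NoOfReplicas: 2
DataMemory: 1500M
IndexMemory: 128M
FileSystemPath: /home/mysql/var/cluster/data

[API DEFAULT]

[MGM DEFAULT]
ArbitrationRank: 1

[TCP DEFAULT]

[COMPUTER]
Id: 1
HostName: mysql1.example.com

[COMPUTER]
Id: 2
HostName: mysql2.example.com

[COMPUTER]
Id: 3
HostName: mysql3.example.com

[COMPUTER]
Id: 4
HostName: mysql4.example.com

[MGM]
Id: 1
ExecuteOnComputer: 1
PortNumber: 2200
LogDestination: SYSLOG:facility=syslog;FILE:filename=cluster.log

[DB]
Id: 2
ExecuteOnComputer: 1

[DB]
Id: 3
ExecuteOnComputer: 2

[DB]
Id: 4
ExecuteOnComputer: 3

[DB]
Id: 5
ExecuteOnComputer: 4

[API]
Id: 6
ExecuteOnComputer: 1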
There’s a lot there, so let’s look at it one piece at a time.
The COMPUTER DEFAULT section is empty. Typically, it’s where you specify options that apply to every computer specified in the COMPUTER section(s).
The DB DEFAULT section lists global defaults for each of the database nodes (the ndbd instances). Here, the file specifies two replicas, which means that every piece of data is mirrored (stored on two nodes). It also specifies the amount of data (1.5 GB) and index (128 MB) memory that each node should reserve. Unfortunately, NDB doesn’t dynamically share a single memory area between data and indexes, so it may take a bit of tinkering before you find the right balance for your application. We also specify the path on the file system where NDB should store its log files.
The API DEFAULT section, also empty, applies to all API nodes in the cluster.
The MGM DEFAULT section contains parameters for all management nodes. We have only one parameter, which is the recommended default value for a management node.
The TCP DEFAULT section tells NDB nodes how to communicate using TCP. It’s possible to use special third party network interfaces with NDB, but TCP will suit us just fine.
Then we have several COMPUTER specifications. Each machine participating in the cluster (running a database, API, or management node) should be specified here and given an id number.
The MGM section dictates where the management node runs, the TCP port it uses, and the types of logging it does. In our case, that’s both syslog and a plain text file.
That leaves the DB and API sections. Each one assigns a unique id number to a node and maps it to one of the computers from the COMPUTER sections.
Using that data, each node can extract its own configuration data while also gaining a global understanding of who else is part of the cluster.
Whew! With all that out of the way, it’s time to start up the cluster nodes and MySQL. That’s the easy part, which is good, because we’re just about out of space for this month.
To get things rolling, start up the management server (ndb_mgmd), the database nodes (ndbd), and then the API node (mysqld). You should already know how to handle the last one. If you built and installed MySQL and NDB from source, you’ll find both binaries in the libexec directory where you installed everything. In our case, that’s /home/mysql.
To start up the management server, run:
$ cd /home/mysql/var/cluster
$ ../../libexec/ndb_mgmd -c config.ini -d
And to start the database nodes, you’ll need to do this on each of the four machines:
$ cd /home/mysql/var/cluster
$ ../../libexec/ndbd -d
Next month we’ll look at monitoring the cluster nodes, NDB performance, and its features to help you decide when NDB is a good choice for your applications.
Jeremy Zawodny plays with MySQL by day and spends his spare time flying gliders in California. He is the author of High Performance MySQL, published by O’Reilly and Associates. Reach him at Jeremy@Zawodny.com.