dcsimg


Threads and Mutexes — Part II

Welcome to the second part of our look at programming with threads. In last month's column we talked about the functions that allow you to create and wait on threads. This month we're going to dive deeper into the problems that often arise when using threads to write concurrent programs. Before we begin that however, we'll return to the ticket agent example we looked at last month and discuss the solution to the problem of over-selling of tickets.

Welcome to the second part of our look at programming with threads. In last month’s column we talked about the functions that allow you to create and wait on threads. This month we’re going to dive deeper into the problems that often arise when using threads to write concurrent programs. Before we begin that however, we’ll return to the ticket agent example we looked at last month and discuss the solution to the problem of over-selling of tickets.

One quick note before we proceed: If you’re writing applications that use the pthreads libraries for Linux, you must include the option -lpthread as a flag to the compiler on the command line. This will tell the compiler and linker to look for the necessary functions in the pthreads library and link them into your application.

The Ticket Agency

As we saw last month, using threads allows you to have multiple pieces of code that share and operate on a common set of data. However, threads introduce additional complexity and create code that is often difficult to debug. Let’s return to our ticket agency example. Each ticket agent may be thought of as one thread. There is common data between the ticket agents — the number of tickets to be sold. Our goal is to sell all the tickets without selling too many. Each ticket agent can be modeled with the C function:


void* ticket_agent (void* foo)
{
while (total_sold < NUM_TICKETS)
{
if (sold_ticket ())
total_sold++;
}
return NULL;
}




Figure One: The ticket_agent() Function


int main ()
{
pthread_t agents[NUM_AGENTS];
void* return_val;
int i;
srand (time (0));
for (i = 0; i < NUM_AGENTS; i++)
pthread_create (&agents[i], NULL, ticket_agent, NULL);
for (i = 0; i < NUM_AGENTS; i++)
pthread_join (agents[i], &return_val);
printf (“%d\n”, total_sold);
return 0;
}

Then ticket agents can be created as threads in the main() function, as shown in Figure One. This program seems simple. Ticket agents should sell tickets until they’re all gone. Then, the main() function prints out the total number of tickets sold. However, if we actually run this program, we get some disturbing results. We need to add a few global variables:


#define NUM_TICKETS 10000000
#define NUM_AGENTS 4
int total_sold = 0;







feedback



       page 1 2 3 4 5 6   next >>

 

Linux Magazine /
March 2001 / COMPILE TIME
Threads and Mutexes: Part II




If we run the program ten times, we will get the following:


mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000002
mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000000
mordack:~> ./a.out
10000000
mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000001
mordack:~> ./a.out
10000001
mordack:~>

Notice that the program sold too many tickets eight out of the ten times! In fact, one of those times it sold two extra tickets. Automating this process and running the program 10,000 times yielded the following results:


mordack:~> ./a.pl
Over sold 0 exactly 3924 times.
Over sold 1 exactly 4735 times.
Over sold 2 exactly 1341 times.
Over sold 3 exactly 0 times.

This tells us that the program is only working correctly about 40 percent of the time. The rest of the time we are over-selling the number of tickets we have. Why?

Imagine, near the end of the program, when all but one of the tickets have been sold. One ticket agent, Bob, checks the value of the total_sold variable to make sure he can still sell a ticket. At that instant, the operating system’s thread scheduler decides it’s time for Bob to stop running and time for Sally to run for a while. Before Bob has had a chance to update the value of total_sold, Sally checks to make sure she can sell a ticket. Hence, both Bob and Sally can increase the value of total_sold when it is only one shy of the total number of tickets.

Race Conditions

Computer scientists refer to this problem as a “race condition.” The problem exists because multiple threads are racing to update the values of variables based on previous values of those variables. A race condition exists any time the speed of execution of a thread, or the way threads are scheduled, can alter the outcome of the application. Race conditions are amongst the most common and most difficult problems to track down.

In our ticket agent program, the race condition exists because each ticket agent assumes that once the value of total_sold in the while() loop has been checked, that value will remain unchanged throughout the body of the loop. However, since the thread scheduler may stop the currently running thread and allow another thread to execute during that time, this is not a valid assumption.

Fortunately, the pthreads library provides a mechanism for avoiding this problem — the “mutex,” which is an abbreviation for “mutual exclusion.” A mutex allows one thread to “lock” a given piece of code so that no other thread may execute until the piece of code holding the lock is finished. In the case of our ticket agency, Bob wanted to make sure that once he read the value of total_ sold, he could sell one ticket and know that he wasn’t selling a ticket that no longer existed. In other words, he wanted all of the code in his while() loop to be executed without the possibility of some other thread executing. Bob could guarantee that the value of total_sold would not change if a mutex were used.

First, let’s look at the pthreads library functions that enable you to use mutexes:


int pthread_mutex_lock(pthread_mutex_t* mutex);

intpthread_mutex_unlock(pthread_mutex_t*mutex);

These functions both return 0 on success and an error value on failure; see the manual page for more details. A thread obtains a mutex by calling the pthread_mutex_ lock() function. When one thread has the mutex, any other thread that calls the pthread_mutex_lock() function will be put into a temporary state of suspended animation. Threads that are put into a state of suspended animation are referred to as “blocked.”

Once the original thread calls the pthread_mutex_ unlock() function, pthread_mutex_lock() will return in one of the suspended threads that called it, and that thread will gain access to the mutex. By using the pthread_ mutex_lock() and pthread_mutex_unlock() functions around a given piece of code, you can guarantee that only one thread will be able to execute that code at any given time.

To create a mutex in Linux, simply declare it and assign it (in the declaration) as follows:

pthread_mutex_tmy_mutex= PTHREAD_MUTEX_INITIALIZER;

There also exists a function:


int pthread_mutex_init
pthread_mutex_t* mutex,const thread_mutexattr_t* mutexattr;

you can use to initialize locks (with more information as to the type of lock and how it behaves). For our example, however, PTHREAD_ MUTEX_INITIALIZER() is quite sufficient.

By placing a lock around the read and write of the total_ sold() variable, each ticket agent can make sure that no other ticket agent is editing that variable. This eliminates the race condition from the program. Let’s take a look at the new ticket_agent() function in Figure Two.




Figure Two: The ticket_agent() Function with mutex


pthread_mutex_t tickets_sold_lock = PTHREAD_MUTEX_INITIALIZER;

void* ticket_agent (void* foo)
{
int not_done = 1;
while (not_done)
{
pthread_mutex_lock (&tickets_sold_lock);
if (total_sold < NUM_TICKETS)
{
if (sold_ticket ())
total_sold++;
}
else
not_done = 0;
pthread_mutex_unlock (&tickets_sold_lock);
}
return NULL;
}

First of all, notice the calls to ptrhread_mutex_lock() and ptrhread_ mutex_unlock(). We needed to put these calls around the code that checks the value of total_sold and writes to the value of total_sold — and we needed to put them inside the while() loop.

If we had placed the call to pthread_mutex_lock() before the while() loop and the call to ptrhread_mutex_unlock() after the while() loop body, only one thread would have been able to sell tickets. It would acquire the lock before entering the while() loop, therefore causing all the other threads to block before entering their while() loops. It would only release the lock once it had exited the loop, after all the tickets had been sold. Therefore, in the modified version, we created our own local variable (not_done) to track when the while() loop should exit, and we moved the code to lock and unlock inside the loop. This way every thread gets a chance to sell tickets.

After running this program 1,000 times, we got the following results:


mordack:~> ./a.pl
Over sold 0 exactly 10000 times.
Over sold 1 exactly 0 times.
Over sold 2 exactly 0 times.
Over sold 3 exactly 0 times.

Note that we did not over-sell a ticket once in the 10,000 trial runs of the program. Although this does not guarantee that we do not have a race condition in our program, the results after the modification are much better than our original attempt.

The Dining Philosophers

Now that we’ve illustrated the usefulness of mutexes, let’s examine a program that requires their careful usage. A famous example in concurrent programming is the Dining Philosophers program. Imagine a group of philosophers sitting around a circular table. Philosophers only do two things — think and eat. When a philosopher thinks, he has no effect on the other philosophers. When he eats however, he can potentially create a problem for the philosophers on his left and right.

You see, there is only one plate of rice in the middle of the table and only one chopstick between each philosopher. Everyone can eat the rice, but in order to do that, a philosopher must be able to pick up his two adjacent chopsticks. Therefore, if one of the philosophers is eating, the philosophers to his right and left can’t eat until he puts down his chopsticks.

We can model this problem with a program that uses threads to represent the philosophers and an array of mutexes to represent the chopsticks. When a thread holds the lock for a chopstick it is equivalent to a philosopher using that chopstick to eat. The code for the program is shown in Figure Three .




Figure Three: Dining Philosophers — Part I


#include <pthread.h>
#include <stdio.h>

#define NUM_PHILOSOPHERS 5

pthread_mutex_t chopsticks[NUM_PHILOSOPHERS];

void think ()
{
/* Intentionally left blank. Simulate thinking. */
}

void eat ()
{
/* Intentionally left blank. Simulate eating. */
}

void* philosopher (void* number)
{
int my_num = *((int*)number);
while (1)
{
/* First we think. */
think ();

/* Grab the chopsticks to my left and to my right */
pthread_mutex_lock (&chopsticks[my_num]);
pthread_mutex_lock (&chopsticks[(my_num + 1) %
NUM_PHILOSOPHERS]);

/* Eat */
printf (“Philosopher %d eating!\n”, my_num);
eat ();
printf (“Philosopher %d done!\n”, my_num);

/* Put the chopsticks down. */
pthread_mutex_unlock (&chopsticks[(my_num + 1) %
NUM_PHILOSOPHERS]);
pthread_mutex_unlock (&chopsticks[my_num]);
}
return NULL;
}

int main ()
{
int i;
pthread_t phils[NUM_PHILOSOPHERS];
void* return_val;

for (i = 0; i < NUM_PHILOSOPHERS; i++)
pthread_mutex_init (&chopsticks[i], NULL);

for (i = 0; i < NUM_PHILOSOPHERS; i++)
pthread_create (&phils[i], NULL, philosopher, &i);

for (i = 0; i < NUM_PHILOSOPHERS; i++)
pthread_join (phils[i], &return_val);

return 0;
}

The main() function of this program creates all the philosopher threads and passes them a number, 0 through NUM_ PHILOSOPHERS – 1, so that the philosopher threads know which chopsticks to grab. This program seems simple enough, but there is a serious problem. If we run it we’ll see that a lot of philosophers eat, but at some point the program will hang. It will hang at a different point each time we run it.








feedback



<< prev   page 1 2 3 4 5 6   next >>

 

Linux Magazine /
March 2001 / COMPILE TIME
Threads and Mutexes: Part II




Deadlock

Consider the situation where every philosopher decides to eat at the same time. Because of the way we wrote this program, every philosopher will grab the chopstick to his right and then grab the chopstick to his left. However, imagine this scheduling of threads: Philosopher 0 grabs chopstick 0 and then gets swapped out, Philosopher 1 grabs chopstick 1 and then gets swapped out, and so on. Each philosopher will grab one chopstick and then wait to get the other one. However, since all the philosophers are waiting, and none are eating, they will continue to wait like that forever. This problem is called deadlock.

There is no blanket way to solve deadlock. However, there are two possible ways to eliminate the deadlock problem that exists in the Dining Philosophers program. (Why else would computer scientists like it so much?) The first solution involves using another mutex. You might have observed that the problem causing the deadlock was that a philosopher was able to grab one chopstick but then gets swapped out while another thread runs. This could be considered a race condition on the chopsticks. Each philosopher is racing to grab a second chopstick, assuming that after grabbing one chopstick, they will eventually be able to grab the other. As in the ticket agent problem, if we create a lock to make sure that only one philosopher attempts to grab chopsticks at a time, we can be sure that the philosopher will eventually get two chopsticks. The new code for the philosopher function is listed in Figure Four.




Figure Four: Dining Philosophers — Part II


pthread_mutex_t chopstick_lock = PTHREAD_MUTEX_INITIALIZER;

void* philosopher (void* number)
{
int my_num = *((int*)number);
while (1)
{
/* First we think */
think ();

/* First get the lock on grabbing chopsticks.
Then, grab the chopstick to my left and to my right */
pthread_mutex_lock (&chopstick_lock);

pthread_mutex_lock (&chopsticks[my_num]);
pthread_mutex_lock (&chopsticks[(my_num + 1) %
NUM_PHILOSOPHERS]);

/* Release the chopstick grabbing lock. */
pthread_mutex_unlock (&chopstick_lock);

/* Eat */
printf (“Philosopher %d eating!\n”, my_num);
eat ();
printf (“Philosopher %d done!\n”, my_num);

/* Put the chopsticks down */
pthread_mutex_unlock (&chopsticks[(my_num + 1) %
NUM_PHILOSOPHERS]);
pthread_mutex_unlock (&chopsticks[my_num]);
}
return NULL;
}

Now that we’ve guaranteed that only one philosopher can grab chopsticks at any given time, we know that the philosopher will eventually get two of them. Since he is the only one in that section, the others must be either eating or thinking.








feedback



<< prev   page 1 2 3 4 5 6   next >>

 

Linux Magazine /
March 2001 / COMPILE TIME
Threads and Mutexes: Part II





The other solution to the problem is a bit more subtle. You might have realized that the problem with the philosophers in the original program was that they each tried to get the chopstick on the right first. If even one of the philosophers were to try to get the chopstick on the left first, the circularity that was causing the deadlock would be broken. For this solution, we simply make half of the philosophers grab the chopstick on their right first and the other half grab the chopstick on their left first. The code for this is shown in Figure Five.




Figure Five: Dining Philosophers — Part III


void* philosopher (void* number)
{
int my_num = *((int*)number);
while (1)
{
/* First we think. */
think ();

/* If I’m an even philosopher, grab chopsticks one way. */
if (my_num % 2)
{
pthread_mutex_lock (&chopsticks[my_num]);
pthread_mutex_lock (&chopsticks[(my_num + 1) % NUM_PHILOSOPHERS]);
}
/* Otherwise, grab them the other way. */
else
{
pthread_mutex_lock (&chopsticks[(my_num + 1) % NUM_PHILOSOPHERS]);
pthread_mutex_lock (&chopsticks[my_num]);
}

/* Eat */
printf (“Philosopher %d eating!\n”, my_num);
eat ();
printf (“Philosopher %d done!\n”, my_num);

/* Put the chopsticks down */
pthread_mutex_unlock (&chopsticks[(my_num + 1) % NUM_PHILOSOPHERS]);
pthread_mutex_unlock (&chopsticks[my_num]);
}
return NULL;
}

No Easy Answers

As we said last month, writing bug-free programs with threads is difficult. Much care must be taken to insure correct behavior. There really are no easy fixes. The use of mutexes allows you to avoid race conditions, but unfortunately, as you can see in the case of the dining philosophers, mutexes introduce the possibility of creating a deadlock in your program. Unlike race conditions, which usually cause incorrect results, deadlocks cause the program to hang entirely, which makes them absolutely no fun to debug. You now hopefully have a better understanding of the issues involved in programming with threads and the tools necessary for writing effective applications with them.



Benjamin Chelf is an author and engineer at CodeSourcery, LLC. He can be reached at chelf@codesourcery.com.

Fatal error: Call to undefined function aa_author_bios() in /opt/apache/dms/b2b/linux-mag.com/site/www/htdocs/wp-content/themes/linuxmag/single.php on line 62