Parallel Programming in C for the Transputer
© D. Thiébaut, 1995




EXERCISES


 
7-1

Implement a short program that uses the input and output managers of this section, and that maintains common buffer of 1000 integer-size packets. Create two additional tasks, Sink and Source, that will absorb and generate packets for the output and input managers, respectively. Make Sink and Source high priority tasks so that you can precisely control their swap time. Make them process packets in bursts of random length, and of random occurrence, so that InManager and OutManager will be forced to swap at different times relative to each other. The main loop of Source, for example, could look something like this:

while (1)
{
	NoPackets = rand()%MAXBURST;
	for (i=0; i<NoPackets; i++)
		ChanOutInt(InC, i);
	ProcWait(rand()%MAXDELAY);
}

Can you make your program deadlock? How often does this occur?

.


 


7-4 Why not a certainty?

You may wonder why we spent all this time computing a probability of occurrence. After all, if the code is flawed, won't it fail automatically? The answer is that it won't, and this is the source of much frustration when debugging parallel programs. Their behavior is not repeatable, and bugs may not always show up right away. A short example will illustrate this phenomenon. Assume that you launch a program on one transputer where the main function "procpars" two low-priority parallel tasks, Task1 and Task2. Depending on when you actually press the Enter key after the ld-net command, the two tasks will interleave in time differently respective to each other. Figure 7-2 shows two of many possible scenarios. In both cases the first three actions are the same:

Task1 is represented as the dark gray rectangles, and starts right away. But here the two scenario differs. Because the swap times are defined by timer events, Task1 can continue only up until the next quantum swap. So the amount of processing done by Task1 in its first quantum is directly related to the exact time at which the Enter key is pressed[1]. As a result, the amount of processing done by Task1 in its first quantum is unpredictable, and unrepeatable!


Figure 7- 2: Launching of the same program at two different instants in time.


EXERCISES


 
7-2

If you worked out Exercise 5-2, then very likely you have found out that when several tasks are launched simultaneously, the first task of the list never enjoys a first full quantum. The reason is exemplified in Figure 7-2.

But there are ways to make sure that all tasks will always start with a full (or very close to full) quantum. Can you think of a way to control the tasks so that their interleaved execution with respect to each other is always the same?

Assume that Task1 and Task2 in the original code are launched by main() as follows:

ProcPar(Task1P, Task2P, NULL);
.

 


7-5 Detecting Deadlocks: The Interleave Problem

Deadlocks occur in the context of shared resources or shared data because the action of modifying a shared item is comprised of several instructions or statements, and when two or more processes happen to access the same shared item simultaneously, the instructions of the two concurrent accesses have to execute in an interleaved fashion, resulting in havoc.

In our Enqueue/Dequeue example, the shared item is the queue. The deadlock occurs when the dequeue and enqueue operations, both operations that modify the queue, are interleaved, as we saw in Table 7-1.

To detect whether a particular section of code is prone to deadlocks, we use Lewis and Rewini's interleave matrix [LEWI92]. The matrix is an easy tool to analyze the behavior of two tasks whose execution is interleaved. The rows of the matrix correspond to statements or instructions executed by one task, while the columns correspond to the actions performed by the other task. Each statement is shown as a bar. We indicate that a particular section of a task is being executed by putting an arrow across the bar.

The rules for filling in the matrix are simple:

Interleave Matrix Rule 1

The only directions acceptable for arrows crossing the rows and columns of the interleave matrix are to the right and down.

Rule 1 simply means that since the tasks are run in a multitasking environment, the processor can execute the instructions or statements of only one task at a time. The arrows represent execution time spans. Since the horizontal axis represent processor time spent on one task, and the vertical axis time spent on the other task, arrows can go only in one direction at a time, and not diagonally. This would imply that the processor was able to run the code of two distinct tasks at once, which is not possible in a multitasking environment.

Interleave Matrix Rule 2

A path consists of a series of consecutive arrows, going horizontally, vertically, or in mixed directions. All paths must exit the interleave matrix, either past the last row or past the last column.

This rule simply states that if the processor starts running a task whose statements are shown horizontally or vertically, then it should be expected to finish the task eventually. Because the execution of the two tasks may be interleaved, it might be that the processor will start executing one task, then switch to the other, then switch back again, and so on, resulting in arrows going right, down, then right again.

Interleave Matrix Rule 3

The interleave matrix may contain cells that cannot be entered either from the left or from the top. Such cells correspond to code sections executed by the two tasks that perform unprotected access to the same shared resources. We will refer to these cells as forbidden.

When two task must modify a shared resource, they must invariably go through the process of

Therefore, if two tasks need to modify the same resource, and if both use the method above, then it is not possible for both of them to interleave any of these three steps. We will indicate the presence of such cells in the matrix by filling the squares of the matrix.

Interleave Matrix Rule 4

The state of key variables is shown inside each cell of the matrix.
This is useful for monitoring key variables (usually shared variables) as a function of time, as the tasks progress through their interleaved execution.

Interleave Matrix Results

The goal of the interleave matrix is to explore each possible path through the matrix and to make sure that none of them violates any of the rules.
If all paths exit the matrix without crossing over forbidden cells, then there is no possible deadlock.
If there exists one path that never leaves the matrix, then the code can experience deadlocks.
If there exists one path that crosses a forbidden region, then the resources are incorrectly locked, which may lead to false results and/or subsequent deadlocks.

Let's take a look at the equivalence matrix of the InManager and OutManager tasks of Example 3. It is shown below in Figure 7-3.


Figure 7- 3: Interleave matrix for In/Out Managers example.

The top of each column corresponds to a statement of the OutManager. We only selected the statements of major interest, and grouped the others under the terms CS#3 and CS#4 for Critical Section number 3 and 4, respectively. The term critical section indicates an area of the code where the access to the shared resource is privileged. The head of each row represents the major statements of the InManager task. CS#1 and CS#2 are the critical sections of that task.

We have shown one path, marked by a series of horizontal arrows. As this path crosses straight over all the columns of the matrix, it represents the execution of OutManager, one major statement at a time, without any interleaving with InManager. We see this because the path is above the first row of the matrix, indicating that InManager never started in that scenario.

The F and B symbols show the state of the Front and Back indexes. When they are plain, Front and Back are not locked by their semaphore. When crossed by a slash, then their use is locked by their associated semaphore. For example, Back goes from plain to crossed over the second column because the execution of the SemP(BackSem) statement makes its use priviledged to OutManager.

The forbidden cells are the gray-filled areas. For example, the gray cell marked with an asterisk is forbidden because it shouldn't be possible to enter it from the top or from the left. Entering it from the top would mean that OutManager had issued successfully a SemP(FrontSem) statement, and that InManager had just issued, also successfully, the same statement. This shouldn't be possible because SemP and SemV are implemented so as to specifically prevent that kind of behavior.

This first path wasn't much of a challenge. Let's take a look at what happens when the two tasks are interleaved:


Figure 7- 4: A path leading to a deadlock.

We see here a fully interleaved execution of the two tasks. First horizontal arrow: OutManager starts and executes the while statement. Then a task switch occurs and InManager executes its while statement: the second arrow changes direction and goes down. OutManager resumes and executes its SemP(BackSem) statement. The path continues until we reach a cell that we cannot leave because both Front and Back are locked, and to go right or down would require locking one of them again while it is still locked. This is not possible, and, as a result, the arrows return back to the cell. InManager and OutManager are deadlocked.

The interleave matrix is a simple tool for analyzing the interleave problem created by several tasks or processes accessing shared data. In cases where more than two tasks are involved, a multi-dimensional cube can be constructed.

Solution

The queue was the data structure the access to which we were trying to protect, but since two indexes Front and Back controlled its modification, the error was to issue locks to the shared variables at different point of the computation. Since the queue was the data structure shared, we should have used only one lock and activated it right at the beginning of InManager and OutManager.

Rule 5

When a data structure is shared by several tasks that have modify privilege over it, one lock only should be established on the structure, before the first shared member of the structure is modified. The lock should be released after the last modification is carried out on the structure.

The rule states that we should have used only one semaphore, as shown in the code below.

void InManager(Process *P,  Channel *InC)
{
	while (1)
	{
		/*---wait til there's room ---*/
		while (QueueFull)
			ProcReschedule();

		/*---lock use of queue ---*/
		SemP(QueueSem);
		/*---Receive pckt in buffer ---*/
		ChanIn(InC,
				&(Buffer[Front]),
				sizeof(Buffer[0]));
		Empty = 0;
		Front = (Front+1)%MAXBUFSIZE;

		if (Front==Back)
{ Front = 0; Back = 0; QueueEmpty = 1; } /*---release Back and Front ---*/ SemV(QueueSem); } }
void OutManager(Process *P, Channel *OutC)
{
	while (1)
	{
		/*---wait til there's a pckt---*/
		while (QueueEmpty)
			ProcReschedule();

		/*---lock use of Back ---*/
		SemP(QueueSem);
		/*---Send out pckt ---*/
		ChanOut(OutC, 
			&(Buffer[Back]),
			sizeof(Buffer[0]);
		Full = 0;
		Back = (Back+1)%MAXBUFSIZE;

		if (Back==Front)
		{
			Front = 0;
			Back  = 0;
			QueueFull  = 1;
		}
		/*---release Back and Front ---*/
		SemV(QueueSem);
	}
}

Figure 7- 5: Interleave matrix for Input/Output Managers with one semaphore only.

The resulting interleave matrix now becomes much simpler, as Figure 7-5 shows. The action of updating Front and Back during the Enqueue and dequeue operations have become atomic, removing the trap created by the two forbidden regions from the matrix. Atomic actions here refers to actions accessing shared resources that cannot be interleaved with other actions accessing the same resources. Of course both managers will block on the ChanIn and ChanOut operations, but while they are blocked, no other enqueue or dequeue operations can take place.

This is the easiest solution, but it may not be as efficient as it could be. The reason is that in more general cases, the amount of work done between the time the semaphore is locked and then released could be fairly substantial, and locking the whole data structure for such a long time could reduce the performance of the application by blocking other tasks.

In some cases, we can revert back to the first two-semaphore approach, but setup the second lock request in such a way that if the second semaphore cannot be locked, then the task should release both semaphores and try again from the beginning. For example, the algorithm for implementing the new, safe version of the InManager should read as follows:


S1:	wait until the queue is not full.
S2:	lock semaphore guarding Front
S3:	get new packet and store it in Buffer[Front]
S4:	update Front and Empty status
S5:	if semaphore guarding Back is locked then 
		begin
		release lock on Front 
		wait some (short) random amount of time
		wait until locks can be established on both Front and Back
		end
S6:	once Front and Back are locked, update Full, Front and Back.
S7:	release locks on Front and Back.		
			
The advantage of this method is that Step S5 prevents the Input and Output Managers to both hold one of the two indexes and to deadlock when they try to lock the other. The first of the two managers to discover that it cannot lock the second index will release the first index it holds, and will try to lock both of them again. Both manager will then be executing Step S5 of the algorithm, and the random wait will allow one of them to succeed before the other. But there is still a potential deadlock condition within Step 5. That deadlock condition is that locking two separate semaphores in succession is not an atomic action, and therefore interleave execution could happen, resulting in another deadlock condition. Will we ever be able to avoid deadlocks altogether in this example?! Yes, by shifting gears and boosting the priority of the task to High when the two locks are attempted. Here's how:

/* InManager is about to execute Step S5 */
if (BackSem.use)
{
     /* Back is already locked. Release Front */
     HSemV(FrontSem);

     /* wait some random period of time 
        (we assume srand() already called */
     ProcWait(rand()%MAXTIMEOUT);

     /* Atomic action starts... */
     ProcToHigh();

     /* if both semaphores can be taken, lock them */
     if ((FrontSem.use==0)&&(BackSem.use==0))
     {
          HSemP(FrontSem);
          HSemP(BackSem);
     }
     
     /* End of atomic action */
     ProcToLow();
}

Two comments are worth making here. A quick glimpse at the conc.h header file shows that semaphores are defined as three-member structures, the first one of which, use, reports the status of the semaphore: 0 not locked, !0 locked. So testing the semaphore without blocking is trivial. The second comment is about the introduction of the HSemP and HSemV functions. These Logical Systems functions are needed here because we access the semaphores from tasks bearing different priority levels, since we originally assumed that InManager and OutManager were low priority tasks, and that they lock their first semaphores at low priority. Because the same semaphores can be locked when the task has shifted to high priority, we need to use the mixed-priority semaphore functions HSemP and HSemV.

This leads us to the second rule for implementing deadlock-free accesses to shared resources.

Rule 6

When modification of a shared resource requires the use of several successive locks, each lock should be carried out only if its successful operation is assured. If not, then all locks already acquired should be released, and the execution should not be allowed to continue until they can be all acquired again, including the new lock, in an atomic action.

This rule is definitely more complex than the first one, and it may not always be possible to implement it, in which case Rule 1 should be used.




[Previous] [HOME] [NEXT]