Nonetheless, let's take a look at a simple example and see how some Logical Systems C functions can be used to recover from (very frequent) deadlocks.
However, when a transputer counts the number of neighbors for the cells that lie on a border common with another transputer, some exchange must be made between the two transputers, so that an accurate count of neighbors can be made. Typically, each transputer will send to the other a vector corresponding to the current population of cells on the border. Figure 7-8 illustrates the exchange that must take place between each transputer, every generation.
We will make two simplifications here. The first one is that we assume that the transputers have special rules for dealing with the cells located on the boundaries of the full grid. Our second assumption is a simplifying one: we will assume that the exchange of information is done in such a way that transputers diagonally apart will get the status of each other's inside cell. The four cells at the intersection of the inside row and inside column of each transputer depend on the status of the other three, and hence there should also be "diagonal" information transfer, which we do not consider here. Instead, we are interested in exchange of data between the transputers, which we sketch as follows.
S1: GenerationCount <- 0 S2: Get column and Row from neighboring transputers S3: Send row and column to neighboring transputers S4: Compute new generation S5: GenerationCount <- GenerationCount + 1 S6: if (GenerationCount < MAXGENERATION) goto Step S2This six-step algorithm is thus run by each transputer.
From what we have seen in this chapter, you may already see that we blatantly forgot to implement any synchronization between the exchange of data. Clearly, if all four start by waiting for a column or a row from their neighbor, then the program will go nowhere... Except if we write the code keeping in mind that deadlocks will occur. The idea is to use communication functions that can time-out. LS C supports several of them, among which we have ChanInTimeFail, ChanOutTimeFail, and ProcTimerAlt. These functions work similarly to their counterpart ChanIn, ChanOut, and ProcAlt, but take an extra argument that represents a time in the future. When the timer associated with the current priority reaches the value stored in that argument, the function is aborted. If the task was inactive, blocked on a deadlocked channel, then the timer will awaken the task and bring it back into the queue of active tasks.
A quick example will illustrate how ChanInTimeFail can be used this way.
#define TIMEOUT 1000
int SafeChanInInt(Channel *chan, int *data)
{
int i;
i = ChanInTimeFail(chan, data, sizeof(int), Time()+TIMEOUT);
return i;
}
The ChanInTimeFail function, whose syntax chart is given below, is set up to receive an integer from the channel chan, and to store it in (*data). However, if the ChanInTimeFail function fails to return before a thousand timer ticks have elapsed (64 ms), then the channel transfer will be aborted, and the value 1 will be stored in i. If the transfer completes within 1000 ticks, then the function is successful, and 0 is stored in i.
int ChanInTimeFail(Channel *, void *, int, int)
Let's take a look at the code for the four transputers now. It contains only the code relative to the data exchange, as we left out the maintenance of the grid of cells.
/* =======================================================================
timeout.c
DESCRIPTION:
Simulates some of the data exchanges required in Conway's game of
life. the program simulates the exchange of a full vector (column
or row) of the grid of cells maintained by each transputer by
sending a single integer. This could be a bit-map representing 32
cells for example).
The program will deadlock as no effort has been made to synchronize
the exchange of information in a dead-lock free fashion. the
program will recover from the deadlocks, however, by using the
timeout feature of the ProcAltTimerList function.
TO COMPILE AND RUN:
make -f timeout
ld-net timeout
ASSOCIATED NIF-FILE
buffer_size 200;
host_server CIO.EXE;
level_timeout 400;
decode_timeout 2000;
1, timeout, R0, 0 , 2[0] , 3[1] , ;
2, timeout, R1, 1[1] , 4[0] , , ;
3, timeout, R4, 4[1] , 1[2] , , ;
4, timeout, R2, 2[1] , 3[0] , , ;
====================================================================== */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "conc.h"
/* =========================== DEFINITIONS ============================ */
#define NO_NODES 4 /* number of transputers */
#define MINTIMEOUT 400 /* minimum timeout period */
#define TIMEOUT (Time()+_node_number*MINTIMEOUT)
#define NO_EXCHANGES 100 /* Total # of data-exchanges or */
/* generations */
#define LeftChanIn LINK1IN
#define RightChanIn ((_node_number==1)? LINK2IN: LINK0IN)
#define LeftChanOut LINK1OUT
#define RightChanOut ((_node_number==1)? LINK2OUT: LINK0OUT)
/* =========================== PROTOTYPES ============================= */
void ExchangeData(void);
/* ============================= GLOBALS ============================== */
int NoTimeOuts = 0;
/* -------------------------------------------------------------------- */
/* MAIN */
/* -------------------------------------------------------------------- */
main()
{
int i, StartTime, EndTime;
srand(Time()); /* initialize random # generator */
StartTime = Time(); /* keep track of processor time */
/*--- simulates computation of several generations ---*/
for (i = 0; i<NO_EXCHANGES; i++)
ExchangeData();
EndTime = Time();
/*--- display elapsed time ---*/
if (_node_number==1)
printf("Elapsed time: %6.3f seconds\n%d timeout(s)\n\n",
1.0*(EndTime-StartTime)*64/1000000, NoTimeOuts);
}
/* -------------------------------------------------------------------- */
/* EXCHANGEDATA */
/* function used to exchange inside columns and inside rows with */
/* neighboring transputers */
/* -------------------------------------------------------------------- */
void ExchangeData(void)
{
int GotLeft = 0, /* keep track of exchanges already */
GotRight = 0, /* performed. */
SentLeft = 0,
SentRight = 0;
int index, dummy, vector;
Channel *ChanList[3]; /* use list in ProcAlt function */
ChanList[0] = LeftChanIn;
ChanList[1] = RightChanIn;
ChanList[2] = NULL;
/*--- wait until 4 data exchanges are performed ---*/
while (GotLeft+GotRight+SentLeft+SentRight<4)
{
/*--- if there's still something to receive wait a while and
listen if its coming on a link ---*/
if (GotLeft+GotRight<2)
{
index = ProcTimerAltList(TIMEOUT, ChanList);
switch (index)
{
case 0: /*--- something on first channel in list ---*/
vector = ChanInInt(ChanList[0]);
if (!GotLeft)
GotLeft++;
else
GotRight++;
/*--- remove 1st channel from list ---*/
ChanList[0] = ChanList[1];
ChanList[1] = NULL;
break;
case 1: /*--- something on second channel in list ---*/
vector = ChanInInt(ChanList[1]);
GotRight++;
/*--- remove 2nd channel from list ---*/
ChanList[1] = NULL;
break;
case -1: /*--- time-out occurred ---*/
NoTimeOuts++;
break;
}
}
/*--- simulate sending a column or row ---*/
vector = _node_number;
/*--- if we haven't sent anything yet... ---*/
if (SentRight+SentLeft==0)
{
/*--- choose one neighbor at random and send it vector ---*/
if (rand()%2)
{
ChanOutInt(RightChanOut, vector);
SentRight++;
}
else
{
ChanOutInt(LeftChanOut, vector);
SentLeft++;
}
continue;
/*--- then go back to receive mode ---*/
}
/*--- we already have sent to the "left" neighbor... ---*/
if (!SentRight)
{
/*--- then send to the right neighbor ---*/
ChanOutInt(RightChanOut, vector);
SentRight++;
}
/*--- we already have sent to the "right" neighbor... ---*/
if (!SentLeft)
{
/*--- then send to the left neighbor ---*/
ChanOutInt(LeftChanOut, vector);
SentLeft++;
}
}
}
Why utilize a time-out period that is a function of the node number and not a constant used uniformly by the four transputers? If we had selected the same constant for all three transputers, then we would have created a scenario where the transputers would keep "missing" each other. Because the transputers are loaded at different time, they will actually start the ProcTimerAlt function at different times, but because they always exchange four pieces of information between each other, they would quickly synchronize themselves, and start a pattern where they would timeout for the ProcTimerAlt function at the same time, then move on to the next section where they would enter the ChanOut statements which are not deadlock-resistant. This would create a deadlock. But trying to fix this problem by using ChanOutTimeFail functions with identical time-out intervals would not solve the problem. The reason is that all four transputers would switch to input mode at the same time, time-out of this mode at the same time, go into output mode (ChanOutTimeFail) at the same time, and then time-out, still at the same time, ad infinitum. They would never reach a point where one would be in input mode while its neighbor would be in output mode. For this reason the transputers must use different timeout periods.
Another option would be to adopt the same system used in computer packet-switched networks where several sources colliding while trying to establish a communication will wait a random period of time and try again. The randomness practically eliminates the chances that they will collide again. We could have used this method as well and made each transputer wait for random periods of time before timing out (see Exercise 7-9).
Running on four 20 MHz T805, the above code times at 2.560 seconds. For 100 simulated generations, this is awfully slow. A synchronized version of the program, written so that it is deadlock-free boasts an execution time approximately 50 times faster on the same four transputers (see Exercise 7-10). The number of time-outs reported by the program is 99. Since the time-outs occur only in the ProcAlt call, and since two calls to ProcAlt must be made before receiving the two vectors needed for each generation, this means that one data exchange (four vectors) experiences an average of one time-out, and that this time-out lasts (2.560-0.051)/99 = 0.025 seconds, or 0.025/0.000064 = 390.625 ticks. About the size of MINTIMEOUT. This makes sense. Node 1 has the shortest time-out period, and will always be the first one to "unblock" a lock, hence we can expect an average delay per unblocked deadlock of 400 ticks.
Reducing the MINTIMEOUT constant could reduce the total execution time, but may also result in too short a period of time to insure that the 4 transputers remain synchronized once unblocked.
| 7-7 |
Modify the code of the timeout.c program listed previously in this chapter, and change the TIMEOUT macro to
#define TIMEOUT (Time()+MINTIMEOUT) Run the program on four transputers connected in a square mesh. Why isn't the program running? . |
| 7-8 |
Change the ChanOut calls to ChanOutTimeFail, and modify the macro TIMEOUT as shown in the previous exercise. Run the program on a network of four transputers. Why isn't the program still running? |
| 7-9 |
Modify the TIMEOUT macro so that it yields out a random time in the future, between Time()+MINTIMEOUT and Time()+MAXTIMEOUT. What is the smallest MINTIMEOUT value you can use that allows the program to always finish its 100 generations? |
| 7-10 |
Modify the timeout.c program and rearrange the exchange of information between the four transputers, so that deadlock-free communication can be established without introducing any time-out functions. Time the execution of your program and compare it to that of the original program. |
ChanInChanFail(channel *c, void *buffer, int size, channel *timeoutchan)
ChanOutChanFail(channel *c, void *buffer, int size, channel *timeoutchan)
The timeoutchan channel is directed at the task that calls these ChanFail functions, and is used to unblock the process that called ChanInChanFail, or ChanOutChanFail.
Assume, for example, that some task T1 regularly sends out information over some channel c to another task T2, and that it is feared that T2 may deadlock, bringing T1 in the lock by not responding to T1's ChanIn operations. One way to detect the deadlock and to recover from it is to create a new task T3, running in parallel to T1 (very likely but not necessarily on the same transputer), to "watch over" T1.
Time-Out Rule |
Note that when a communication function is aborted, either by a time-out
condition, or by the activation of a watch-dog channel, data may get lost.
The reason is that if the transfer of information had already started, then it
is stopped and no attempt is made by the timing-out agents to let it continue
until the end. When dealing with hard channels, it will be impossible to find
out if any bytes of information had been transferred before the transmission
was aborted. Therefore, time-out delays should be selected long-enough that the probability of a time-out to occur during a transmission be as small as possible. |
In Feitelson's scheme, one parallel task on the processor would be dedicated as a watch-dog task monitoring three counters. The first one, Total, represents the number of tasks running on that processor. This counter does not include the Watch-dog task, and is incremented every time a new task is created, while decremented every time a task finishes. The second counter, Stuck, represents the number of tasks that are blocked, or active in a state where they cannot contribute to the computation. A task stuck in a while loop waiting for a queue to become non empty would be a good example of such a case. Typically a task would increment the stuck counter before performing a blocking operation:
/*--- update counter in critical section ---*/ ProcToHigh(); Stuck++; ProcToLow(); /*--- perform blocking operation ---*/ ChanOut(...); /*--- wake up, and update counter again ---*/ ProcToHigh(); Stuck--; ProcToLow();
or before a busy-wait loop:
/*--- update counter in critical section ---*/
ProcToHigh();
Stuck++;
ProcToLow();
/*--- busy wait ---*/
while (Queue.Empty)
ProcReschedule();
/*--- update counter again ---*/
ProcToHigh();
Stuck--;
ProcToLow();
Note that the update of the counters must be done in a critical section, since Stuck is shared and updated by many tasks. The watch-dog task keeps comparing the Stuck counter to the Total counter. When the two are equal, one processor has nothing to do. When all the processors in the system reach this state, the last processor to actually reach the state should initiate a deadlock recovery operation. Feitelson assumes a shared memory system, so that individual processors can set shared bits to report their condition. This way, the last processor in the pool to reach the deadlock condition will detect this situation by setting the last bit in the pool, and will therefore be entitled to start the recovery. Using virtual channels and a protocol where watch-dog tasks keep on passing a token message representing the status of the system, a similar algorithm is manageable.
In the domain of data communication, deadlock arise from the creation of cycles in the resource or wait-for graphs. Preventing cycles from occurring is by far the most efficient way to deal with deadlocks. Several methods are available, and can be implemented in LS C.
In situations where deadlocks are allowed to occur, fail-safe communication functions can be use to recover from deadlock conditions. These communication functions (ChanInTimeFail, ChanInChanFail, ChanOutTimeFail, ChanOutChanFail) allow communicating tasks to exit from their blocked state on a time-out condition, or when a specific channel (watch-dog channel) receives information.