Parallel Programming in C for the Transputer
© D. Thiébaut, 1995




EXERCISES


 
7-2

Write the complete code for InManager and OutManager using the second rule of deadlock free code.

Test your program by simulating several hundred concurrent enqueue and dequeue operations.


 


A final note about deadlocks on shared resources is a propos. In this section we concentrated on semaphores, as their use is well documented in operating systems and parallel contexts. The transputer, however, does not provide atomic instructions for testing and modifying semaphores in one step. Logical Systems C provides these functions by using the high priority level as a way to create atomicity. In some circumstances, it could be just as simple for a programmer to use the same tactic to create critical sections when simple operations on shared data are required. Could we have done so with our queue managers example? The answer is no. Can you see why?

The reason is that if we had removed the semaphores and encapsulated the code of the InManager and OutManager between a pair of ProcToHigh() and ProcToLow() calls, we could still have had an interleave problem, because the ChanIn and ChanOut operations are blocking, and we couldn't have prevented one task to start before the other one had switched back to low-priority. This leads us to Rule 7 for implementing deadlock free accesses to shared resources.

Rule 7

When the access to the shared structure does not involve any communication primitives that are blocking, then the use of semaphores can be replaced be the creation of section run in high-priority mode. If, however, communication primitives are an integral part of the access to the shared resource, then semaphores must be employed.

7-6 Semaphores and Locks

We have been using semaphores until now by assuming a superficial knowledge of their inner working. Now is a good time to discuss semaphores and locks in general. Semaphores are locks. A lock is usually associated with a resource, such as shared data, and is a mechanism for preventing parallel tasks or processes to access that resource in parallel, and in particular, to interleave their access. Since the parallel accesses to the resource are prohibited, another way of defining the lock is to say that it enforces serialization of access.

For a lock to work properly, it must exhibit three essential properties: safety, fairness, and being live.

Safety

We say that a lock is safe if the resource it protects can never be left in an indeterminate state. If N parallel tasks access the shared resource protected by a safe lock, then the result of the N accesses must be predictable. We may not be able to know the order in which the tasks are granted access to the resource, nor the intermediate state of the resource during the accesses, but the final state must be predictable.

For example, assume that two tasks, T1 and T2, use a safe lock to access a counter count, following the algorithm below:

S1 lock(count) S2 localcounter <-- count S3 count <-- count+1 S4 unlock(count)

Assume furthermore that count contains 11 when the two tasks access it in parallel and modify it. Which task will first be granted access to count is unknown, and hence we do not know which task will read the value 11 and which will read the value 12, but we can be sure that after both tasks are done, count will contain the value 13. We refer to the uncertainty that may exist during the parallel access of the locked variable as underterminism. The fact that the contents of count after the two accesses is known is referred to as determinism.

Fairness

We may not know the order in which the tasks will be granted access to the shared resource, but we want all of them to be granted access eventually, and we want all tasks to be granted access once before any tasks can be granted their second access. If a lock exhibits this property, it is said to be fair.

Being live

The last property that a lock should support is related to fairness. The lock must be live, so that a task or process cannot be asked to wait indefinitely on a lock. The idea is that if a task has requested access to the resource by trying the lock and found the resource locked, then there shouldn't be a possibility for that task to be denied access to the resource by other tasks that may either currently be waiting on the same lock, or that may arrive in the future. Usually, if fairness is implemented, locks can be made live relatively easily.

Semaphores are safe, fair and live locks, with two operations SemP and SemV controlling their locking and unlocking, respectively. The transputer does not support semaphores in hardware, and the SemP and SemV operations are only provided as facilities by Logical Systems. Although the exact details of their implementation is proprietary and protected, we can safely make several assumptions about how the three properties have been implemented.

First the SemP operation must test the status of the Semaphore variable, and if it is in the unlock state, it must lock it, and associate the successful result of the test to that task. Because the transputer does not support such Test-and-Set operations, LS C must use high priority to prevent interleaving of these actions among several tasks.

Second, if a task attempts to lock a semaphore and finds it locked, it should not be allowed to continue and should stop until it can be granted the locked resource. Blocking the task or forcing it to reschedule is a simple solution in this case. Because the lock must be fair, a separate queue of the tasks that are waiting on the same resource is required. This queue should contain only tasks that have tried to lock that resource, and should be maintained in a first-come-first-serve basis to implement fairness and liveliness. Therefore each semaphore is associated with a queue of inactive tasks that are waiting to access the shared resource.

The SemV operation is performed when the task that had gain control of the shared resource is releasing its priviledged access. If other tasks had tried the lock during the critical section, then one of them must be selected and given the control of the lock. Since we assume that a queue is keeping track of them, then a dequeue operation is performed and the task put back on the list of active tasks.


EXERCISES


 
7-4

Sketch out a pseudo code description of the internal operations of SemP and SemV. In particular, explore some of the ways that can be used for SemP, which is a function called by a task, to block that task and enqueue it in a separate queue. In particular, address the following points.

  • How can SemP "block" the task that just called it?

  • Describe a possible data structure for the queue associated with each semaphore. What should each element of the queue contain?

  • Assuming that some application is using only one semaphore and multiple tasks, show a block diagram representing the transputer queues containing the active and inactive tasks, and the queue associated with the semaphore. Are these queues three separate entities?

.

 

 
7-5
Implement your own rudimentary semaphore facility, with two functions, MySemP and MySemV, for locking the semaphores, and for releasing them. In this scheme, a semaphore will simply be a collection of two integers acting as the tickets used at some grocery stores. You may have encountered this system at your supermarket. You arrive at the deli counter and find a dispenser of tickets, each one printed with a number larger than that of the previous ticket by one. You grab a ticket and wait for the clerk to call out your number. Upon finishing serving the current client, the clerk calls out the next number in sequence.

Just as with the deli system where you keep checking your number against that of the currently served client, your semaphore system will keep tasks waiting on the semaphore circulating in the round-robin queue of active tasks, and allowing them to check their numbers from time to time.

Identify the role of the two integers.

Give an algorithmic description of the MySemP and MySemV operations, and specify how they control the two integers.

How do you avoid tasks waiting on a semaphore from wasting too much processing power?

Show that your system implements a locking of semaphores that is safe, live, and fair.

Implement your system in LS C and test it. Verify that it works!


 


7-7 Deadlocks in Communication

We now move to the problem of deadlocking due to the routing of packets in the network. Examples 1 and 2 at the beginning of this chapter showed two examples of deadlocks. Many conditions can bring an application to a stop due to communication deadlocks. Here are some of them.

These are but a few examples of possible deadlock conditions. Does it mean that each one must be addressed separately, and that we must have design rules for our programs that take care of all possible cases? Fortunately, the answer is no.

For the first case explored above, the error is often a programmer's mistake and can be detected with the help of a debugger (the tasks will be both shown as blocked on a ChanIn or ChanOut operation), or by a sophisticated deadlock detection scheme.

The second and third cases are related to each other, in the sense that the failure in both cases is created by the fact that all the tasks create a cycle of requests, and that they lock each other in a system that is closed, so that no single task can exit the deadlock condition until the others in the cycle do.

This cycle formation is the major source of deadlocks in communication, and we will see that several solutions are available for preventing them.

Deadlock condition

Deadlocks may appear in communication structures containing at least one cycle.

First, let's define what we mean by cycles. To return to Example 1 at the beginning of the chapter, we can represent the computation/communication structure of the program as a graph, where the tasks become the nodes of a graph, and their interdependency introduced by the ChanIn/ChanOut primitives become the arcs linking the nodes. This graph is often referred to as a wait-for graph [FEIT91], since it shows the relationship between tasks waiting on one another. Figure 7-6 shows this graph for the Compute/Relay pair.

Figure 7-6: cycle introduced in Compute-Relay example.

Indeed, the figure contains a cycle including the Compute and Relay tasks. Example 2 also exhibits a cycle. Although the actual factor causing the deadlock is different in both cases, the reason why the deadlock can occur lies in the presence of the cycle. It is this type of cycle that we will have to detect or prevent. Of the two, prevention is the simplest method to use, since it can be done staticly, at design time. Detection on the other hand implies recovery, which assumes that the parallel application can be subjected to deadlock and that it will have to detect them, break them, and restart applications without, one hopes, any loss of data. We will not cover this dynamic aspect of deadlock recovery, which is beyond the scope of this book. If you are interested in more information on this subject, refer to[CHAN83], [ELM86], [KNAP88] and [NATA86].




[Previous] [HOME] [NEXT]