We see that a 5 ms interval length provides the best execution time, 250.94 seconds, compared to 295.89 seconds for the asymptotic bounds for longer delays.
| 10-4 |
The tuning presented above shows how we optimized the program for one value of N: N=27. We didn't optimize the program for a range of values of N. Tune the program for all the values of N between 12 and 28, and tabulate the results. (Run the program several times for each (N,delay) pair, and record the average execution time.)
N Best update delay
12 ?
...
28 ?
Can you find a relationship between N and the best delays? If so, express it as a function of N that can be implemented in C, and implement it in the N-queen program so that the program selects the best delay as a function of N automatically (if you can't find a mathematically representable relationship, use a look-up table). Verify that your program yields good performance by running it several times for each value of N in the range 12 to 28. . |
| 10-5 |
The right-hand side of the curves in Figure 10-2 seem to indicate that
it is faster to find the first solution for a board of size
N=2n+1 than for a board of size N=2n. If this is
the case, and if the solution happens to have the first queen in the leftmost
column of the top row, then we have the unexpected result that a solution for a
problem of size 2n can be found faster if we first consider the problem
of size 2n+1. The requirement that the first queen be located in the
leftmost column allows us to simply throw it away, and so discarding the first
row and first column. The remaining board area has size N by N,
and contains one queen on each row and each column, forming a solution of size
N. This property was noticed by Stone and Stone [STON86] in their work
on the N-queen problem, when the next position of the queen is not
restricted to be in the next row (see Exercise 9-3), and explored in details by
Rao and Kumar [RAO93]. Modify the program so that it first tries to find a solution where the first queen is in the top left corner of a board of size N by N, when N=2n+1, and investigate if such is the case for all reasonable values of N (i.e. values that will not require an unreasonable amount of time on your transputer system). Can solutions for N=2n be found just as fast when the first queen is forced to be in the upper left-corner as in the general case? Tune your program so that it automatically detects the parity of N and selects the fastest method to find a solution for that board size. Test your program over a reasonable range of values of N and show the improvement brought by your tuning.
|
| 10-6 |
Implement the load-balancing scheme that was introduced in Figure
10-Error! Bookmark not defined.. This second method of load-balancing
is not centralized and requires near-neighbor balancing. First connect all the
nodes by a virtual ring and implement a load-balancing where nodes exchange
information with their left and right neighbors. Measure the performance of
your program for N ranging from 12 to 27, for an average balancing delay
of 1 ms. Tune your program in terms of its load-balancing delay and plot the
resulting best execution times as a function of N. Now change the virtual network from a ring to a toroidal mesh (or a to some other two-dimensional networks in case you do not have a sufficient number of transputers, and repeat the same steps. Compare the results obtained with a ring and with a toroidal mesh.
|
| 10-7 |
Implement the heuristic introduced in Figure 10-4 and compare it, in terms of performance, to the Robin-Hood scheme. |
Its used should also incorporate some of the techniques we saw in previous chapters to maintain high efficiency, such as double buffering, allowing processors to continue doing useful work when packets of work are shifted through the network.