Next: Conclusion and future work Up: A Distributed LISP-STAT Environment Previous: Encoding data

Example

Having implemented the communications primitives, we were in a position to investigate some form of distribution. As a useful demonstrator and tutorial application, we chose to implement a farm model. A master process is given a list of work items that need to be done. It then waits for requests from workers and delivers an item from the list to each requesting worker. The worker performs the required task on the supplied data and responds with the result. The master process collects the results into a list and returns.
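The farm pattern above can be sketched as follows. This is a minimal Python illustration rather than LISP-STAT code, and the work list, task function, and worker count are purely illustrative; in our system the workers would be separate LISP-STAT processes communicating through the primitives described earlier.

```python
# Sketch of a farm: a master hands out work items on request and
# collects the results into a list (Python illustration only).
import queue
import threading

def farm(work_items, task, n_workers=4):
    """Master: queue the work, let workers pull items, gather results."""
    work = queue.Queue()
    for i, item in enumerate(work_items):
        work.put((i, item))          # index each item so results keep order
    results = [None] * len(work_items)

    def worker():
        while True:
            try:
                i, item = work.get_nowait()   # worker requests an item
            except queue.Empty:
                return                        # no work left; worker exits
            results[i] = task(item)           # perform the task, record result

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(farm([1, 2, 3, 4], lambda x: x * x))  # → [1, 4, 9, 16]
```

Because each item carries its index, the result list comes back in the original order even though workers finish in arbitrary order.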

This is a relatively straightforward method of distributing work, and for certain applications it gives a useful speedup. The communications overhead is typically very low, since the problem is split into independent tasks requiring minimal data transfer at the beginning and end. Coarse-grained computations like this, with little synchronisation, are ideal for distribution.

The example used was the estimation of $\pi$ by picking random points in a unit square and testing whether they lie within an enclosed quadrant of a circle of unit radius. The square can be divided into smaller regions and each region assigned to a process, so the only communications are sending the coordinates of a region and returning the ratio of points inside the circle to the total number of points sampled. An alternative is to assign a number of iterations to each processor, letting each processor pick random points across the whole square. Either way, this example is well suited to distribution.
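The second scheme, assigning a number of iterations to each processor, might look like this. Again a Python sketch rather than LISP-STAT; the seeds, iteration counts, and number of tasks are arbitrary choices for illustration.

```python
# Monte Carlo estimation of pi, split into independent worker tasks.
import random

def count_inside(n_points, seed):
    """Worker task: sample n_points in the unit square and count those
    falling inside the quarter circle of unit radius."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside          # a single integer travels back to the master

# Master: assign an iteration count to each of four "workers" (run
# sequentially here) and combine their counts.
n_per_task = 25000
counts = [count_inside(n_per_task, seed) for seed in range(4)]
pi_hat = 4.0 * sum(counts) / (4 * n_per_task)
print(pi_hat)  # close to 3.14
```

Note how little data crosses the master/worker boundary: an iteration count and a seed out, a single count back, which is why the communications overhead stays negligible.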

It is evident that this system cannot perform well for distributed applications where there is a substantial amount of communication between separate processes. As recognised previously, the applications that perform well under this system are those involving a high degree of computation that can be done in parallel. Many statistical applications match this model, typically involving intensive simulation: for example, estimating p-values using Monte Carlo methods, Gibbs sampling using multiple Markov chains, computer-intensive resampling methods, and jackknife methods.
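As an illustration of Monte Carlo p-value estimation in the farm style, each worker could run an independent batch of permutations and return only a count. This is a hypothetical Python sketch: the two samples, batch size, and seeds are invented for the example.

```python
# Permutation-test p-value estimated by farming out batches of permutations.
import random
import statistics

def perm_count(x, y, n_perm, seed):
    """Worker task: count permutations whose absolute mean difference is
    at least as extreme as the observed one."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    obs = statistics.mean(x) - statistics.mean(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (statistics.mean(pooled[:len(x)])
                - statistics.mean(pooled[len(x):]))
        if abs(diff) >= abs(obs):
            extreme += 1
    return extreme         # only an integer returns to the master

# Master: four independent batches of 500 permutations each.
x = [5.1, 4.9, 5.3, 5.0]
y = [4.8, 5.2, 5.1, 4.7]
counts = [perm_count(x, y, 500, seed) for seed in range(4)]
p_value = sum(counts) / 2000
```

Each batch is independent given its seed, so the tasks need no synchronisation, which is exactly the coarse-grained structure the farm model exploits.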


Danius Michaelides
6/1/1998