PARILLU - Parallel simulation of global illumination

Home

Next: Reference Implementation: Hierarchical Radiosity
Prev: On The Partitionability Of Hierarchical Radiosity

Locality Preserving Dynamic Load Balancing With Provably Small Overhead

We describe a kind of an orthogonal recursive bisection clustering. We show that the dynamic adaption of the clustering to changing load situations involves only small overhead. As a spatial clustering it is well suited to applications with local communication.

We will describe a procedure that rebalances a database of geometric objects ``on the fly'' concurrently with an ongoing application. We describe procedures for imbalance detection and for locality preserving rebalancing. The estimated amortized computational time complexity of these procedures is shown to be small for every single dynamic task update.

We consider an application as a set of points in Rk representing tasks and/or data items. We store the points in a binary search tree of limited depth with p leaves. Let us interpret the inner nodes as scales measuring the imbalance between the two children. If we could guarantee that at any time every inner node of the tree is balanced, then all processors get roughly the same amount of objects. We present a distributed imbalance detector that signals when the tree needs to be repartitioned. We describe a procedure that is applied recursively to all inner nodes of the tree and is responsible for rebalancing all inner nodes. We will examine the complexity of the above mechanisms for detecting and rebalancing an imbalanced distribution of objects. Fortunately, we will see that, after rebalancing is complete, several updates are possible without any rebalancing being necessary. We quantify this number and charge the communication cost of a rebalancing operation to the updates. As a result we show that every single update has a small amortized time complexity.

We have tested the load balancing algorithm for a simple application that treats a set of objects in k-dimensional space. We parameterized the application by realistic parameters, such as the load pattern, the number of objects, and the workload per task. We show that for near-practice-parameterizations our load balancer achieves good speedups.

Next: Reference Implementation: Hierarchical Radiosity
Prev: On The Partitionability Of Hierarchical Radiosity

Robert Garmann, February 22, 2002