Abstract:

W. Dittrich, D. Hutchinson and A. Maheshwari, “Blocking in Parallel Multisearch Problems”, Theory of Computing Systems, 34(2): 145-189, 2001 (invited papers from the 1998 ACM-SPAA conference).
 

External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Block-wise access to data

is a central theme in the design of efficient EM algorithms. A similar requirement arises in the transmission of data between processors in high performance parallel computation systems, for which block-wise communication is a crucial issue.

 

We consider {\em multisearch} problems, where a large number of queries are to be simultaneously processed and satisfied by navigating through large data structures on parallel computers. Our examples originate as algorithms for parallel machines, and we adapt them to the EM situation where the queries and data structure are considered to be much larger than the size of the available internal memory.

 

This paper presents techniques to achieve blocking for I/O as well as for communication in multisearch on the BSP and EM-BSP models. We describe improvements to the \emph{1-optimal} BSP* multisearch algorithm of \cite{BDM97} which permit larger search trees to be handled.

 

In the area of EM algorithms new algorithms for multisearch in balanced trees are described. For search trees of size $O(n\log n)$ where $n$ is the number of queries, we obtain a work-optimal, parallel, randomized EM multisearch algorithm whose I/O and communication time are smaller, asymptotically, than the computation time. We obtain a deterministic version via a similar technique. These algorithms are obtained via the simulation techniques of \cite{DDH97, dittrich:thesis, DDHM99, DDHM99a, hutchinson:thesis}.

 

For larger trees we describe a parallel, EM algorithm which is \emph{1-optimal} considering computation, communication, and I/O (for suitable parameter constellations) plus I/O-optimal.  We give a lower bound to the number of I/O operations required for filtering $n$ queries through a binary or multi-way search tree of size $m$ when $m\geq n^{2+\epsilon}$, for a constant $\epsilon>0$.