Abstract:

D. Hutchinson, A. Maheshwari, J.-R. Sack, and R. Velicescu, Early Experiences in Implementing the Buffer Tree,  Workshop on Algorithmic Engineering (WAE'97), Venice, Italy, September 1997.
 

Computer processing speeds are increasing rapidly due to the evolution of faster chips, parallel processing of data, and more efficient software. Users today have access to an unprecedented amount of high quality, high resolution data through various technologies. This is resulting in a growing demand for higher performance input and output mechanisms in order to pass huge data sets from the {\em external memory~(EM)}, or disk system, through the relatively small main memory of the computer and back again. In recent years, research into external memory algorithms has been growing to keep pace with the demand for innovation in this area. EM algorithms for individual problems have been developed but few general purpose EM tools have been designed. A fundamental tool is the buffer tree, an external version of the (a,b)-tree. It can be used to satisfy a number of EM requirements such as sorting, priority queues, range searching, etc. in a straightforward and I/O-optimal manner. In this paper we describe an implementation of a buffer tree. We describe benchmarking tests which lead to an experimental determination of certain parameter values different from those originally suggested in the design of the data structure. We describe implementations of two algorithms based on the buffer tree: an external memory treesort, and an external memory priority queue. Our initial experiments with buffer tree sort for large problem sizes indicate that this algorithm easily outperforms similar algorithms based on internal memory techniques. With some tuning of the buffer tree parameters we are able to obtain performance consistent with theoretical predictions for the range of problem sizes tested. We include comparisons with TPIE Merge Sort. We conclude that (a) the buffer tree as a generic data structure appears to perform well in theory and practice, and (b) measuring I/O efficiency experimentally is an important topic that merits further attention.



Last modified: Wed Oct 27 17:30:24 EDT 1999