D. Hutchinson, A. Maheshwari, J.-R. Sack, and R. Velicescu, Early
Experiences in Implementing the Buffer Tree, Workshop on Algorithmic
Engineering (WAE'97), Venice, Italy, September 1997.
Computer processing speeds are increasing rapidly due to the evolution
of faster chips, parallel processing of data, and more efficient software.
Users today have access to an unprecedented amount of high quality, high
resolution data through various technologies. This is resulting in a growing
demand for higher performance input and output mechanisms in order to pass
huge data sets from the {\em external memory~(EM)}, or disk system, through
the relatively small main memory of the computer and back again. In recent
years, research into external memory algorithms has been growing to keep
pace with the demand for innovation in this area. EM algorithms for individual
problems have been developed but few general purpose EM tools have been
designed. A fundamental tool is the buffer tree, an external version of
the (a,b)-tree. It can be used to satisfy a number of EM requirements such
as sorting, priority queues, range searching, etc. in a straightforward
and I/O-optimal manner. In this paper we describe an implementation of
a buffer tree. We describe benchmarking tests which lead to an experimental
determination of certain parameter values different from those originally
suggested in the design of the data structure. We describe implementations
of two algorithms based on the buffer tree: an external memory treesort,
and an external memory priority queue. Our initial experiments with buffer
tree sort for large problem sizes indicate that this algorithm easily outperforms
similar algorithms based on internal memory techniques. With some tuning
of the buffer tree parameters we are able to obtain performance consistent
with theoretical predictions for the range of problem sizes tested. We
include comparisons with TPIE Merge Sort. We conclude that (a) the buffer
tree as a generic data structure appears to perform well in theory and
practice, and (b) measuring I/O efficiency experimentally is an important
topic that merits further attention.