Gerth Stølting Brodal
January 1997
We study the design of efficient data structures. In particular we focus on the design of data structures where each operation has a worst case efficient implementations. The concrete problems we consider are partial persistence, implementation of priority queues, and implementation of dictionaries.
The first problem we consider is how to make bounded in-degree and out-degree data structures partially persistent, i.e., how to remember old versions of a data structure for later access. A node copying technique of Driscoll et al. supports update steps in amortized constant time and access steps in worst case constant time. The worst case time for an update step can be linear in the size of the structure. We show how to extend the technique of Driscoll et al. such that update steps can be performed in worst case constant time on the pointer machine model.
We present two new comparison based priority queue implementations, with
the following properties. The first implementation supports the
operations FINDMIN, INSERT and MELD in worst case
constant time and DELETE and DELETEMIN in worst case time
. The priority queues can be implemented on the pointer
machine and require linear space. The second implementation achieves the
same worst case performance, but furthermore supports DECREASEKEY
in worst case constant time. The space requirement is again linear, but
the implementation requires auxiliary arrays of size
. Our
bounds match the best known amortized bounds (achieved by respectively
binomial queues and Fibonacci heaps). The data structures presented are
the first achieving these worst case bounds, in particular supporting
MELD in worst case constant time. We show that these time bounds
are optimal for all implementations supporting MELD in worst case
time o(n). We also present a tradeoff between the update time and the
query time of comparison based priority queue implementations. Finally
we show that any randomized implementation with expected amortized cost
t comparisons per INSERT and DELETE operation has expected
cost at least
comparisons for FINDMIN.
Next we consider how to implement priority queues on parallel (comparison
based) models. We present time and work optimal priority queues for the
CREW PRAM, supporting FINDMIN, INSERT, MELD, DELETEMIN, DELETE and DECREASEKEY in constant time with
processors. Our implementation is the first supporting all
of the listed operations in constant time. To be able to speed up
Dijkstra's algorithm for the single-source shortest path problem we
present a different parallel priority data structure. With this
specialized data structure we give a parallel implementation of
Dijkstra's algorithm which runs in O(n) time and performs
work on a CREW PRAM. This represents a logarithmic factor improvement for
the running time compared with previous approaches.
We also consider priority queues on a RAM model which is stronger than
the comparison model. The specific problem is the maintenance of a set
of n integers in the range under the operations INSERT, DELETE, FINDMIN, FINDMAX and PRED
(predecessor query) on a unit cost RAM with word size w bits. The RAM
operations used are addition, left and right bit shifts, and bit-wise
boolean operations. For any function f(n) satisfying
, we present a data structure supporting FINDMIN and FINDMAX in worst case constant time, INSERT
and DELETE in worst case O(f(n)) time, and PRED in worst
case
time. This represents the first priority queue
implementation for a RAM which supports INSERT, DELETE and
FINDMIN in worst case time
-- previous bounds were
only amortized. The data structure is also the first dictionary
implementation for a RAM which supports PRED in worst case
time while having worst case
update time.
Previous sublogarithmic dictionary implementations do not provide for
updates that are significantly faster than queries. The best solutions
known support both updates and queries in worst case time
.
The last problem consider is the following dictionary problem over binary strings. Given a set of n binary strings of length m each, we want to answer d-queries, i.e., given a binary query string of length m to report if there exists a string in the set within Hamming distance d of the query string. We present a data structure of size O(nm) supporting 1-queries in time O(m) and the reporting of all strings within Hamming distance 1 of the query string in time O(m). The data structure can be constructed in time O(nm). The implementation presented is the first achieving these optimal time bounds for the preprocessing of the dictionary and for 1-queries. The data structure can be extended to support the insertion of new strings in amortized time O(m)
Available as PostScript,
PDF.