Succinct Data Structures
We have seen trade-offs between space and time: the faster we
want an operation to run, the more space we need.
However, sometimes the opposite is true: we can make our
operations faster by fitting our data into less space.
How small?
There is an information-theoretic lower bound on the number of
bits required to store a piece of data.
Here is an example we are already familiar with. Suppose we want
to store one integer, which could be any value in the universe
U = {0,1,...,u−1}.
Since there are u distinct possible values, we need to be able to
create u distinct binary sequences. Therefore ⌈lg(u)⌉ bits are
required to store any one of these values.
1/8
Succinct Data Structures
Definitions. Let A be a collection of data. Let opt be the
A
information theoretic lower bound on the number of bits required
to store A.
A data structure that is used to store the data A is called:
• compact if it requires O(opt ) bits;
A
• succinct if it requires opt +o(opt ) bits;
A A
• implicit if it requires opt +O(1) bits.
A
Any such data structure must use at least opt bits to store the
A
data. The extra bits may store indices, pointers, or other data
required to support operations.
2/8
Binary Rank and Select
The Scenario
We will consider a very simple universe: U = {0,1}. That is, each
individual piece of data is a single bit.
Let A be a bit array of length n. Assume that A uses n bits of
space, and that any entry A can be accessed in O(1) time. The
i
entries in A are not in sorted order.
Assume that A is static: the contents will not change.
A 0 0 1 1 1 0 1 1 0 1
0 1 2 3 4 5 6 7 8 9
3/8
Binary Rank and Select
Recall the rank and select operations as defined on a dictionary
S ⊆ U.
• For any x ∈ U, rank(x) returns the number of entries y ∈ S
such that y ≤ x.
• select(i) returns the entry y ∈ S that would have index i in a
sorted list of the entries in S.
We will define variants of these operations on the bit array A.
• rank (i) returns the number of 0’s in A ,...,A
0 0 i
• rank (i) returns the number of 1’s in A ,...,A
1 0 i
• select (i) returns the index of the ith 0 in A (if any)
0
• select (i) returns the index of the ith 1 in A (if any)
1
4/8
Binary Rank and Select
A 0 0 1 1 1 0 1 1 0 1
0 1 2 3 4 5 6 7 8 9
QUICK EXERCISE
One way to guarantee O(1) implementations of rank , rank ,
0 1
select and select is to pre-tabulate all of the answers.
0 1
i 0 1 2 3 4 5 6 7 8 9
A 0 0 1 1 1 0 1 1 0 1
i
rank (i)
0
rank (i)
1
select (i)
0
select (i)
1
Complete the above table for the example array A. As a function
of n, how many bits are required by this look-up table?
5/8
Binary Rank and Select
A 0 0 1 1 1 0 1 1 0 1
0 1 2 3 4 5 6 7 8 9
Answer
i 0 1 2 3 4 5 6 7 8 9
A 0 0 1 1 1 0 1 1 0 1
i
rank (i) 1 2 2 2 2 3 3 3 4 4
0
rank (i) 0 0 1 2 3 3 4 5 5 6
1
select (i) − 0 1 5 8 − − − − −
0
select (i) − 2 3 4 6 7 9 − − −
1
6/8
Binary Rank and Select
In general, Θ(n) entries are required, each of which is an integer
from 0 to n. The number of bits is Θ(nlog(n)).
Since only Θ(n) bits are required to store n binary digits, the data
structure in which the array A is augmented with a complete
look-up table is not compact (or succinct).
Question. Can we construct a succinct data structure that
supports binary rank and select in O(1) time?
7/8
Succinct Binary Rank
VERY QUICK EXERCISE. Argue that it is sufficient to
implement only one of rank (i) and rank (i).
0 1
Answer
For any i, rank (i) is the number of 0’s in entries A ,...,A , and
0 0 i
rank (i) is the number of 1’s in entries A ,...,A . It follows that
1 0 i
rank (i)+rank (i) = i +1,
0 1
and therefore (for example)
rank (i) = i +1−rank (i).
0 1
Once we find rank (i), we can also find rank (i) using O(1)
1 0
additional calculations.
8/8