Succinct Binary Rank
A 0 0 1 1 1 0 1 1 0 1
0 1 2 3 4 5 6 7 8 9
QUICK EXERCISE. Consider a block of size k. This block
consists of k bits.
• How many possible blocks of size k are there in total?
• How many bits are required to store the rank of one entry
within a block of size k?
• How many possible inputs to the rank () function are there,
1
assuming we only calculate rank relative to a block of size k?
• Suppose we create a look-up table containing the answer to
every possible rank query in every possible block. How small
does k have to be to ensure that the total size of this lookup
table is o(n)?
1/10
Succinct Binary Rank
Answer
• There are 2k possible binary sequences of length k.
• A rank can take on any value from 1 to k. Storing one rank
requires ⌈lg(k +1)⌉ bits.
• Since the block has size k, there are k possible inputs to the
rank () function, namely 1,...,k.
1
• The space required by this look-up table is
2k ×k ×⌈lg(k +1)⌉.
We require a value of k such that this expression is in o(n).
2/10
Succinct Binary Rank
It is sufficient to ensure that 2k ∈ O(n1−ϵ). For example,
k = 1 lg(n) results in a total size of
2
√ 1 (cid:24) (cid:18) 1 (cid:19)(cid:25)
n× lg(n)× lg lg(n)+1 ∈ o(n).
2 2
Aside: this technique of storing every block × every answer is
called the four Russians technique.
Observations so far:
• From the top down, we need blocks of size k ∈ ω(log(n)).
• From the bottom up, we need blocks of size k < lg(n).
Clearly we can’t have both at once.
Final attempt: create multiple layers of blocks.
3/10
Succinct Binary Rank
Top layer. Split up the overall array of length n into blocks of size
(cid:6) lg2(n)(cid:7)
.
Precompute the 1-rank of each entry a that is the first entry in a
i
block.
Denote the 1-rank of the entry at the start of block j by r′(j).
There are n/(cid:6) lg2(n)(cid:7) r′(j) values in total
block 0 block 1 block 2 block 3
r′(0) r′(1) r′(2) r′(3)
A
0 lg2(n) 2lg2(n) 3lg2(n) n−1
Since the length of the blocks is
(cid:6) lg2(n)(cid:7)
∈ ω(log(n)), the total
space required to store the r′(j) values is o(n).
4/10
Succinct Binary Rank
Middle layer. Take each block and split it up into subblocks of
size
(cid:6)1 lg(n)(cid:7)
.
2
For each entry a that is the first entry in a subblock, precompute
i
the 1-rank of a relative to the first entry in a ’s block.
i i
Denote the relative rank of the kth subblock within the jth block
by r′′(k,j).
r′(j)
subblock 0 subblock 1 subblock 2
r′′(0,j) r′′(1,j) r′′(2,j)
block j
0 1 2lg(n) lg(n)
indexrelativetostartofblockj
5/10
Succinct Binary Rank
QUICK EXERCISE. Approximately how many subblocks are
there in A? How many bits are required to store one r′′(k,j)
value? Show that only o(n) bits are required to store all of the
r′′(k,j) values.
Answer
• To within rounding annoyances, there are 2n/lg(n) subblocks
in total, and therefore there are 2n/lg(n) r′′(k,j) values to be
stored.
• One r′′(k,j) value is a number between 1 and lg2(n). It
requires
(cid:6) lg(lg2(n)+1)(cid:7)
∈ Θ(lglg(n)) bits.
• The total number of bits for all of the r′′(k,j) values is
(cid:18) (cid:19)
2n lglg(n)
×Θ(lglg(n)) = Θ n× = o(n).
lg(n) lg(n)
6/10
Succinct Binary Rank
Bottom layer. Create a lookup table containing every possible
subblock of size 1 lg(n) and every possible rank query. As
2
previously shown, this entire lookup table requires o(n) space.
Let s be the sequence of 1 lg(n) bits that make up a particular
2
subblock. Denote the 1-rank of an index ℓ within this particular
subblock by r′′′(s,ℓ).
QUICK EXERCISE. Now put it all together. Suppose rank (i) is
1
called on this data structure. How would you determine the answer
in O(1) time?
7/10
Succinct Binary Rank
Answer
• Let j =
i/(cid:6) lg2(n)(cid:7)
. Then j is the number of the block
containing i. Look up r′(j).
• Let k′ = i mod (cid:6) lg2(n)(cid:7) . Then k′ is the index of i relative to
the first entry in block j.
• Let k =
k′/(cid:6)1 lg(n)(cid:7)
. Then k is the number of the subblock
2
containing i. Look up r′′(k,j).
• Let ℓ = k′ mod (cid:6)1 lg(n)(cid:7) . Then ℓ is the index of i relative to
2
the first entry in subblock k of block j.
• Let s be the sequence of binary digits that make up subblock
k of block j. Look up r′′(s,ℓ).
• Finally, rank (i) = r′(j)+r′′(k,j)+r′′′(ℓ,s).
1
8/10
Succinct Binary Rank
Time and space?
• Since there are O(1) calculations and table look-ups, this
implementation of rank () is O(1).
1
• Since the three lookup tables each require only o(n) space,
the total auxiliary space required is o(n).
• Consequently, this is a succinct data structure that allows for
an O(1) implementation of rank ().
1
A succinct data structure that supports an O(1) implementation of
select () can be constructed using “similar” techniques (for a
1
generous definition of “similar”).
9/10
Application: Succinct Encoding of Binary Trees
Suppose we want to encode the structure (and not the contents)
of a binary tree with n nodes. How could we do so using a
sequence of bits?
A node can have two children, a left child only, a right child only,
or no children (4 options in total). We could encode each scenario
using a string of 2 bits, then list them in order row by row from
left to right.
10/10