Skip to content

Longest common prefix in binary representation

We are given a undirected tree with N (1 to N) nodes rooted at node 1. Every node has a value assigned with it, represented by array – A[i] where i:[1:N].

We need to answer Q queries of type : -> V X : longest length of the common prefix between value V and any ancestor of node X including X, in their binary representation of 62-bit length.

Common prefix between 2 numbers is defined as:

Example :

4: 0..................0100 (62-bit binary representation)
6: 0..................0110 
Considering both as 62-bit in it's binary representation. 
Longest length of the common prefix is: 60 (as 60 left most bits are same.)

Now we are given the N (num nodes), edges, nodes values (A[i]) and queries, and we need to answer each query in optimal time.

Constrains :

N <= 10^5, number of nodes 
A[i] <= 10^9, value of each node
Q <= 10^5 ,number of queries
Edge[i] = (i, j) <= N

Approach :

  1. Create tree and track the immediate parent of each node.
  2. for Each Query : [V, X], traverse each node n(in the path from X to root) and XOR each node’s values with V and find the most significant set bit for each of the XOR operation and pick the minimum one among all of them.
  3. So the result for Query : [V, X] : 62 – (1 + Step-2 result).

Is there any other efficient way to solve this problem? As the above approach in worst case takes O(n^2) time.



You can solve this problem in O((N+Q) log N) time using fully persistent binary search trees.

A “persistent” data structure is one that preserves the previous version when it’s modified. “Fully persistent” means that the previous versions can be modified as well. Often, fully persistent data structures are implemented as purely functional data structures.

You need a binary search tree. The classic example is Okasaki’s red-black trees, but you could port any BST implementation from any purely functional language.

With this kind of data structure, your problem is easy to solve.

  1. Create a singleton tree for the root that contains only the root value.
  2. For each child, create a new version from its parent by adding the child’s value.
  3. Continue in BFS or DFS order until you have a version of the tree for every node that contains all of its ancestor’s values. This will require O(N log N) space and time all together.
  4. For each query [v,x], then, get the tree for node x and find the largest value <= x, and the smallest value >= x. This takes O(log N) time.
  5. The ancestor with the longest common prefix will be one of the the values you found. Check them both by XORing them with v and choosing the smallest result. Then binary search (or some faster bit-hack method) to find the position of the left-most bit.

NOTE: The above discussion assumes that you meant it when you said “we need to answer each query in optimal time”.

If you can process the queries out of order, then you don’t need persistent trees. You can just use a single regular BST that you’d find in your language library, because you don’t need all the trees at once.

Walk though the graph in pre-order, adjusting the tree for each node as you find it, and then process all the queries that specifically target that node.

8 People found this is helpful