Skip to content
Advertisement

Counting Inversions using TreeSet Java

I’m using TreeSet for solving counting inversions problem. I’m using following approach which uses gnu_pbds which works in O(logn) time.

Algorithm

  1. Insert the first element of the array in the Ordered_Set.
  2. For all the remaining element in arr[] do the following:
  • Insert the current element in the Ordered_Set.
  • Find the number of element strictly less than current element + 1 in Ordered_Set using function order_of_key(arr[i]+1).
  • The difference between size of Ordered_Set and order_of_key(current_element + 1) will given the inversion count for the current element.

You can read more about this algorithm here.

For order_of_key method I’m using TreeSet‘s headset(k) method for calculation.

My code

JavaScript

Respective C++ Code

JavaScript

Observations

  • my solution works for all test cases but for some trivial test cases it returns the 1- inversion count. If I changed the working of order_of_key to the following it breaks some other test cases.
JavaScript
  • but the gnu_pbds works for all the time.

Please help me fix this code using TreeSet. Also if I’m missing the some another points respect to pbds order_of_key method then let me know.

Advertisement

Answer

The order_of_key function returns the number of elements less than the provided key. In your C++ code, you add the element to the set and then call order_of_key element plus one, in order to include the element you just added.

Preserving this arrangement in the Java code, the way to implement something like order_of_key faithfully using a Java TreeSet is to call

JavaScript

or simply

JavaScript

because either of these returns the size of the head set containing elements strictly less than k. There’s no need to do a contains-check first. When I run this using the input array from your C++ code, I get the same intermediate results and final result for the number of inversions: 6.

Note that Java’s TreeSet also includes the ability to create the tail-set, that is, the elements that are greater than the provided element. This turns out to be (in my opinion) a much simpler way to compute the number of inversions. Prior to adding an element, the size of the tail-set that is larger than the current element is the number of inversions relative to this element. Since we want the tail-set to be strictly larger than the current element, we use tailSet(k, false). We can then do this for each element in the input list:

JavaScript

UPDATE 2020-06-24

The above code works only for input with unique values. If the input has duplicate values, it doesn’t work. I note that in the C++ code, a tree that uses the less_equal<int> comparison function preserves duplicates, while using less<int> compresses out duplicates.

The reason that keeping duplicates is important is that each element — even if it’s a duplicate — can count as an inversion. Thus, input of [2, 2, 1] is considered to have two inversions. A Java TreeSet compresses out duplicates, so we have to do some additional work to preserve them.

One way to allow “duplicate” int values is to make them unique somehow. This can be done by creating a new object to contain the int value paired with a counter that’s always incremented. Duplicate values are made unique since they’ll have different counter values. Here’s a class that does that:

JavaScript

Note that the compareTo method here compares both the value and the “uniq” counter value, so creating multiple UniqInt instances will differ from each other and will be totally ordered. Once we have this class, we essentially do the same thing, except that we keep track of UniqInt instead of Integer objects in the TreeSet.

JavaScript

For the input provided in the comment (which includes duplicate values),

84, 2, 37, 3, 67, 82, 19, 97, 91, 63, 27, 6, 13, 90, 63, 89, 100, 60, 47, 96, 54, 26, 64, 50, 71, 16, 6, 40, 84, 93, 67, 85, 16, 22, 60

this gives the expected result of 290.

This is a bit fragile, however. Note that creating a new UniqInt with the same value always creates an instance that is greater than any existing one because the counter is always incremented. (Until you’ve created 2^31 of them.) When the new instance is created, its tail-set will never include any of the duplicate values already in the TreeSet. This is probably reasonable for this small example, but if this were part of a larger system, I’d think more carefully about how to get the correct head-set or tail-set relative to some value, without having to rely on the most recently created UniqInt being greater than all previous ones.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement