Why does wait(100) cause synchronized method to fail in multi threaded?

Tags: ,



I am referencing from Baeldung.com. Unfortunately, the article does not explain why this is not a thread safe code. Article

My goal is to understand how to create a thread safe method with the synchronized keyword.

My actual result is: The count value is 1.

package NotSoThreadSafe;

public class CounterNotSoThreadSafe {
  private int count = 0;
  public int getCount() { return count; }
  // synchronized specifies that the method can only be accessed by 1 thread at a time.
  public synchronized void increment() throws InterruptedException { int temp = count; wait(100); count = temp + 1; }
}

My expected result is: The count value should be 10 because of:

  1. I created 10 threads in a pool.
  2. I executed Counter.increment() 10 times.
  3. I make sure I only test after the CountDownLatch reached 0.
  4. Therefore, it should be 10. However, if you release the lock of synchronized using Object.wait(100), the method become not thread safe.
package NotSoThreadSafe;

import org.junit.jupiter.api.Test;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import static org.junit.jupiter.api.Assertions.assertEquals;

class CounterNotSoThreadSafeTest {
  @Test
  void incrementConcurrency() throws InterruptedException {
    int numberOfThreads = 10;
    ExecutorService service = Executors.newFixedThreadPool(numberOfThreads);
    CountDownLatch latch = new CountDownLatch(numberOfThreads);
    CounterNotSoThreadSafe counter = new CounterNotSoThreadSafe();
    for (int i = 0; i < numberOfThreads; i++) {
      service.execute(() -> {
        try { counter.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
        latch.countDown();
      });
    }
    latch.await();
    assertEquals(numberOfThreads, counter.getCount());
  }
}

Answer

This code has both of the classical concurrency problems: a race condition (a semantic problem) and a data race (a memory model related problem).

  1. Object.wait() releases the object’s monitor and another thread can enter into the synchronized block/method while the current one is waiting. Obviously, author’s intention was to make the method atomic, but Object.wait() breaks the atomicity. As result, if we call .increment() from, let’s say, 10 threads simultaneously and each thread calls the method 100_000 times, we get count < 10 * 100_000 almost always, and this isn’t what we’d like to. This is a race condition, a logical/semantic problem. We can rephrase the code… Since we release the monitor (this equals to the exit from the synchronized block), the code works as follows (like two separated synchronized parts):
    public void increment() { 
        int temp = incrementPart1(); 
        incrementPart2(temp); 
    }
    
    private synchronized int incrementPart1() {
        int temp = count; 
        return temp; 
    }
    
    private synchronized void incrementPart2(int temp) {
        count = temp + 1; 
    }

and, therefore, our increment increments the counter not atomically. Now, let’s assume that 1st thread calls incrementPart1, then 2nd one calls incrementPart1, then 2nd one calls incrementPart2, and finally 1st one calls incrementPart2. We did 2 calls of the increment(), but the result is 1, not 2.

  1. Another problem is a data race. There is the Java Memory Model (JMM) described in the Java Language Specification (JLS). JMM introduces a Happens-before (HB) order between actions like volatile memory write/read, Object monitor’s operations etc. https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5 HB gives us guaranties that a value written by one thread will be visible by another one. Rules how to get these guaranties are also known as Safe Publication rules. The most common/useful ones are:

So, to have the counter correctly (as JMM has defined) visible, we must make it volatile

private volatile int count = 0;

or do the read over the same object monitor’s synchronization

public synchronized int getCount() { return count; }

I’d say that in practice, on Intel processors, you read the correct value without any of these additional efforts, with just simple plain read, because of TSO (Total Store Ordering) implemented. But on a more relaxed architecture, like ARM, you get the problem. Follow JMM formally to be sure your code is really thread-safe and doesn’t contain any data races.



Source: stackoverflow