Skip to content
Advertisement

Is replacing one java object reference with another considered thread safe, or are there potential synchronicity issues I’m overlooking?

I’m rolling my own simple caching solution to improve the UX of a lengthy lookup. The basic outline is that I have a class handling all accountId lookups, and another class called AccountListCache with two variables: a timestamp indicating when it was created, and a List object containing a whole bunch of accountId objects. When the lookup is invoked, that class will first check if its cache object has a timestamp from within the past X minutes. If so, it serves the cached results, and if not it refreshes the cache and serves the new list.

My plan is to have a scheduled task that regularly refreshes the cache by creating a new instance of the AccountListCache, populating its list/timestamp, and then re-assigning the variable in the lookup class. My boss brought up concerns about thread safety, but I’m thinking this should be thread safe already.

The old list and the new list are separate objects contained within separate instances of the same class, and the actual update call that matters would just be changing where the reference is pointing in memory. That should be an effectively instantaneous/atomic operation, right?

If it’s not thread safe, what does a failure/conflict look like, and how might I resolve it?

Advertisement

Answer

Yes, since one thread is writing data to be read by other threads, there are concurrency issues here.

While writing and reading object references are atomic, in the sense that “tearing” is guaranteed not to occur (unlike double or long types), there is no guarantee that writes will ever be visible to other threads unless a memory barrier is used.

In this case, at a minimum, I would make the AccountListCache field in your lookup class volatile. This will ensure that any state of the cache modified before it is assigned to the field is visible to any thread that reads the field.


Update:

… I’d like to understand both why it fails and how, because I suspect I’m about to face trickier problems in the [near] future.

The authoritative answer to these questions is the Java Language Specification, §17.4 First, you should understand the concepts of “synchronizes-with” (§17.4.4) and “happens-before.” (§17.4.5) Then you can refer to the list of synchronization points and happens-before relationships to design concurrent interactions that behave correctly.

In your case, one thread (“the writer”) needs to to populate a list, then assign it to a field of another object, and have another thread (“the reader”) read that field and access the list. Using the happens-before relationships in §17.4.5, we can construct this (partial) chain (where “hb(x, y)” means that x happens-before y):

  • If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
    • x—the cache is initialized (writer)
    • y—the reference to the cache is written to a field (writer)
  • If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
    • x—the reference to the cache is read from a field (reader)
    • y—the content of the cache is accessed (reader)

You might think this is sufficient. But note the key words “of the same thread.” This permits operations to be re-ordered if they do not change the view of the world from that thread. In your case, perhaps the field referencing the cache might be assigned before fresh items are added to the new cache instance. That change wouldn’t be visible inside the thread, but other threads could start reading the cache before it’s populated.

When you make the cache field volatile, you introduce a “synchronizes-with” relationship between the write and subsequent reads, because “[a] write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread.”

By making the field volatile, we can add more steps to the chain:

  • If an action x synchronizes-with a following action y, then we also have hb(x, y).
    • x—the cache is assigned to the volatile field (writer)
    • y—the cache is read from the volatile field (reader)
  • If hb(x, y) and hb(y, z), then hb(x, z).
    • x—the cache is initialized (writer)
    • y—the cache is assigned (writer)
    • z—the cache is read (reader)

Therefore, we have proved that the cache is initialized by the writer before it is read by the reader. Without the synchronizes-with relationship, that happens-before relationship is missing.

User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement