Apache Ignite: Caches unusable after reconnecting to Ignite servers



I am using Apache Ignite as a distributed cache and I am running into some fundamental robustness issues. If our Ignite servers reboot for any reason, it seems to break all of our Ignite clients, even after the Ignite servers come back online.

This is the error the clients see when interacting with caches after the servers reboot and the clients reconnect:

Caused by: org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): <redacted>

My expectation is that the Ignite clients would reconnect to the Ignite servers and continue working once the servers are online. From what I’ve read, thick clients should do this, but I don’t see it happening. Why is the cache still considered to be stopped?

We are using Ignite 2.7.6 with Kubernetes IP finder.

Answer

It looks like you are using a stale cache proxy.
If you are using an in-memory cluster and created the cache dynamically from a client, that cache disappears when the cluster restarts.

The following code, executed from a client against an in-memory cluster, will generate an exception when the cluster restarts, if the cache in question is not part of a server config, but created dynamically on the client.

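       Ignition.setClientMode(true);
       Ignite ignite = Ignition.start();

       // Dynamically created cache: it lives only as long as the in-memory cluster does.
       IgniteCache<Integer, Integer> cache = ignite.getOrCreateCache("mycache");

       int counter = 0;
       while (true) {
           try {
               cache.put(counter, counter);
               System.out.println("added counter: " + counter);
               counter++;
           } catch (Exception e) {
               // After a cluster restart, every put on the old proxy ends up here.
               e.printStackTrace();
           }
       }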

This generates:

java.lang.IllegalStateException: class org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): mycache
    at org.apache.ignite.internal.processors.cache.GridCacheGateway.enter(GridCacheGateway.java:164)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1555)
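
Since the problem only arises when the cache is not part of a server config, one option is to declare it on the server side so that it exists again as soon as the servers restart. Below is a minimal sketch of that, assuming a server node started from Java code; the class name `ServerNode` is made up for illustration, and discovery settings (such as the Kubernetes IP finder) are left out. The client may still need to re-obtain its cache proxy after reconnecting.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical server bootstrap class, for illustration only.
public class ServerNode {
    public static void main(String[] args) {
        // Declare the cache statically so each server start brings it up again.
        CacheConfiguration<Integer, Integer> cacheCfg = new CacheConfiguration<>("mycache");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(cacheCfg);
        // Discovery configuration is omitted here; add your own IP finder setup.

        Ignite server = Ignition.start(cfg);
        System.out.println("Caches on start: " + server.cacheNames());
    }
}
```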

You need to watch for disconnect events or exceptions.

See: https://ignite.apache.org/docs/latest/clustering/connect-client-nodes

IgniteCache cache = ignite.getOrCreateCache(cachecfg);

try {
    cache.put(1, "value");
} catch (CacheException e) { // javax.cache.CacheException wraps the disconnect exception
    if (e.getCause() instanceof IgniteClientDisconnectedException) {
        IgniteClientDisconnectedException cause = (IgniteClientDisconnectedException) e.getCause();

        cause.reconnectFuture().get(); // Wait until the client is reconnected.
        // proceed

If this is a persistent cluster consisting of multiple baseline nodes, you should also wait until the cluster is activated.
See: https://ignite.apache.org/docs/latest/clustering/baseline-topology

  while (!ignite.cluster().active()) {
      System.out.println("Waiting for activation");
      Thread.sleep(5000);
  }

After reconnecting, you might need to reinitialize your cache proxy:

        cache = ignite.getOrCreateCache(cachecfg);
    }
}
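
The "reinitialize the proxy and try again" idea can be sketched in a library-agnostic way, as below. This is only a sketch under assumptions: `RefreshingHandle` is a hypothetical helper, not an Ignite class, and it assumes that a stale proxy shows up as the `IllegalStateException` seen in the stack trace above. With Ignite, the factory would be something like `() -> ignite.getOrCreateCache(cachecfg)`. It does not wait for the client to reconnect; that still goes through `reconnectFuture()` as shown earlier.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Ignite): keeps a proxy and re-creates it when it goes stale. */
final class RefreshingHandle<P> {
    private final Supplier<P> factory; // e.g. () -> ignite.getOrCreateCache(cachecfg)
    private P current;

    RefreshingHandle(Supplier<P> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Runs op on the current proxy; on IllegalStateException refreshes the proxy and retries. */
    synchronized <R> R call(Function<P, R> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IllegalStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.apply(current);
            } catch (IllegalStateException e) {
                // Assumption: a stopped cache surfaces as IllegalStateException, as in the trace above.
                last = e;
                current = factory.get(); // obtain a fresh proxy before the next attempt
            }
        }
        throw last; // every attempt failed
    }
}
```

With an Ignite cache as `P`, a call might look like `handle.call(c -> { c.put(1, "value"); return null; }, 3)`. Keeping the proxy inside one object means callers never hold on to a stale reference themselves.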


Source: stackoverflow