I’m monitoring a production system with AppDynamics, and the system just slowed to a crawl and almost froze up. Just prior to this event, AppDynamics shows all GC activity (minor and major alike) flatlining for several minutes…and then coming back to life.
Even during periods of ultra-low load on the system, we still see our JVMs doing some GC activity. We’ve never had it totally flatline and drop to 0.
Also, the network I/O flatlined at the same instant as the GC/memory flatline.
So I ask: can something at the system level cause a JVM to freeze, or cause its garbage collection to hang/freeze? This is on a CentOS machine.
Answer
Does your OS have swapping enabled?
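On Linux (the question mentions CentOS), one way to check is to read `/proc/meminfo`, which reports `SwapTotal` and `SwapFree` in kB. Here is a minimal sketch, assuming that file is readable; the class and method names (`SwapCheck`, `parseKb`) are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class SwapCheck {
    // Returns the value in kB of a /proc/meminfo field such as "SwapTotal",
    // or -1 if the field is not present in the given lines.
    public static long parseKb(List<String> lines, String field) {
        String prefix = field + ":";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                // Lines look like "SwapTotal:       2097148 kB"
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo"); // Linux only
        if (!Files.isReadable(meminfo)) {
            System.out.println("/proc/meminfo is not available on this OS");
            return;
        }
        List<String> lines = Files.readAllLines(meminfo);
        long total = parseKb(lines, "SwapTotal");
        long free = parseKb(lines, "SwapFree");
        System.out.println("Swap enabled: " + (total > 0));
        System.out.println("Swap in use (kB): " + (total - free));
    }
}
```

If swap is enabled and the "in use" figure keeps growing while the JVMs are running, that would be consistent with the theory below.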
I’ve noticed HUGE problems with Java once it fills up all the RAM on an OS with swapping enabled. It will actually devastate Windows systems, effectively locking them up and forcing a reboot.
My theory is this:
- The OS’s RAM gets nearly full.
- The OS requests memory back from Java.
- This triggers Java into a full GC to attempt to release memory.
- The full GC touches nearly every piece of the VM’s memory, even items that have been swapped out.
- The system tries to swap that data back into memory for the VM (on a system that is already out of RAM).
- This keeps snowballing.
At first it doesn’t affect the system much, but if you try to launch an app that wants a bunch of memory it can take a really long time, and your system just keeps degrading.
Multiple large VMs can make this worse. I run 3 or 4 huge ones, and my system now starts to seize up when I get over 60-70% RAM usage.
This is conjecture, but it describes the behavior I’ve seen after days of testing.
The effect is that all the swapping seems to “prevent” GC. More accurately, the OS spends most of the GC time swapping, which makes the JVM look like it’s hanging and doing nothing during GC.
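One way to look for this from inside the JVM is the standard `java.lang.management` API, which exposes per-collector collection counts and cumulative collection time. The sketch below is only an illustration (the class name `GcWatch` and the one-second polling interval are my own choices). The idea is that a large jump in GC time with few collections completed would suggest each collection is running very slowly, and the polling thread itself will stall if the whole JVM is frozen.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcWatch {
    // Sums cumulative GC time (ms) across all collectors.
    // Collectors that report -1 (undefined) are skipped.
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) total += t;
        }
        return total;
    }

    // Sums the number of collections across all collectors, skipping undefined values.
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) total += c;
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long prevTime = totalGcTimeMillis();
        long prevCount = totalGcCount();
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1000); // arbitrary polling interval for this sketch
            long time = totalGcTimeMillis();
            long count = totalGcCount();
            System.out.println("collections: +" + (count - prevCount)
                    + ", GC time: +" + (time - prevTime) + " ms");
            prevTime = time;
            prevCount = count;
        }
    }
}
```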
A fix: set -Xmx to a lower value, dropping it until you leave enough room to avoid swapping. This has always fixed my problem; if it doesn’t fix yours, then I’m wrong about the cause of your problem 🙂
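As a rough starting point for picking that value, the arithmetic can be sketched like this. All the numbers and names here (`HeapBudget`, the 2 GB OS reserve, the 512 MB per-JVM non-heap allowance) are assumptions for illustration, not measurements; you would still lower the result until swapping stops.

```java
public class HeapBudget {
    // Largest per-JVM heap (in MB) such that all JVMs, their non-heap overhead,
    // and a reserve for the OS still fit in physical RAM.
    public static long maxHeapMb(long physicalMb, long reservedForOsMb,
                                 int jvmCount, long nonHeapPerJvmMb) {
        if (jvmCount <= 0) {
            throw new IllegalArgumentException("jvmCount must be positive");
        }
        long available = physicalMb - reservedForOsMb - (long) jvmCount * nonHeapPerJvmMb;
        return Math.max(0L, available / jvmCount);
    }

    public static void main(String[] args) {
        // Hypothetical machine: 16 GB RAM, 2 GB kept for the OS,
        // 3 JVMs, 512 MB of non-heap memory assumed per JVM.
        long heapMb = maxHeapMb(16384, 2048, 3, 512);
        System.out.println("Candidate flag: -Xmx" + heapMb + "m");

        // What the currently running JVM was actually given:
        long currentMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Current max heap (MB): " + currentMb);
    }
}
```

Keep in mind that a JVM uses more than its heap (thread stacks, class metadata, direct buffers), which is why the sketch subtracts a per-JVM allowance before dividing.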