I’m doing some research regarding how HotSpot performs garbage-collection and/or heap-compaction while JNI code is running.
It appears to be common knowledge that objects can be moved at any time in Java. I’m trying to understand definitively whether JNI code is subject to the effects of garbage collection. There are a number of JNI functions that explicitly prevent garbage collection, such as GetPrimitiveArrayCritical. It makes sense that such a function exists if the references are indeed volatile. However, it makes no sense if they are not.
There seems to be a substantial amount of conflicting information on this subject and I’m trying to sort it out.
JNI code runs in a safepoint and can continue running, unless it calls back into Java or calls some specific JVM methods, at which point it may be stopped to prevent leaving the safepoint (thanks Nitsan for the comments).
What mechanism JVM use to block threads during stop-the-world pause
The above makes me think that garbage-collection is going to run concurrently with JNI code. That can’t be safe, right?
To implement local references, the Java VM creates a registry for each transition of control from Java to a native method. A registry maps nonmovable local references to Java objects, and keeps the objects from being garbage collected. All Java objects passed to the native method (including those that are returned as the results of JNI function calls) are automatically added to the registry. The registry is deleted after the native method returns, allowing all of its entries to be garbage collected.
https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html#wp16789
Okay, so the local references are nonmovable, but that doesn’t say anything about compaction.
The JVM must ensure that objects passed as parameters from Java™ to the native method and any new objects created by the native code remain reachable by the GC. To handle the GC requirements, the JVM allocates a small region of specialized storage called a “local reference root set”.
A local reference root set is created when:
- A thread is first attached to the JVM (the “outermost” root set of the thread).
- Each J2N transition occurs.
The JVM initializes the root set created for a J2N transition with:
- A local reference to the caller’s object or class.
- A local reference to each object passed as a parameter to the native method.
New local references created in native code are added to this J2N root set, unless you create a new “local frame” using the PushLocalFrame JNI function.
Okay, so IBM stores the passed objects in the local reference root set, but it doesn’t discuss memory compaction. This just says that the objects won’t be garbage-collected.
The GC might, at any time, decide it needs to compact the garbage-collected heap. Compaction involves physically moving objects from one address to another. These objects might be referred to by a JNI local or global reference. To allow compaction to occur safely, JNI references are not direct pointers to the heap. At least one level of indirection isolates the native code from object movement.
If a native method needs to obtain direct addressability to the inside of an object, the situation is more complicated. The requirement to directly address, or pin, the heap is typical where there is a need for fast, shared access to large primitive arrays. An example might include a screen buffer. In these cases a JNI critical section can be used, which imposes additional requirements on the programmer, as specified in the JNI description for these functions. See the JNI specification for details.
- GetPrimitiveArrayCritical returns the direct heap address of a Java™ array, disabling garbage collection until the corresponding ReleasePrimitiveArrayCritical is called.
- GetStringCritical returns the direct heap address of a java.lang.String instance, disabling garbage collection until ReleaseStringCritical is called.
Okay, so IBM basically says that the JNI passed objects COULD be moved at any time! How about HotSpot?
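To make the indirection described in the IBM text concrete, here is a rough C++ sketch. It is a toy model, not J9 or HotSpot code, and every name in it (ToyHeap, ToyObject, handle_survives_compaction) is invented for illustration. The assumption is that a handle behaves like an index into a table that never moves, while the objects behind the table can be relocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: none of these names are JNI or JVM identifiers.
struct ToyObject { int payload; };

class ToyHeap {
public:
    ToyHeap() = default;
    ToyHeap(const ToyHeap&) = delete;
    ToyHeap& operator=(const ToyHeap&) = delete;
    ~ToyHeap() { for (ToyObject* slot : slots_) delete slot; }

    // Allocate an object and return a handle: the index of a table slot,
    // not the address of the object itself.
    std::size_t allocate(int payload) {
        slots_.push_back(new ToyObject{payload});
        return slots_.size() - 1;
    }

    // Simulated compaction: copy every object to a new address and
    // update its slot. Handles (slot indices) are left untouched.
    void compact() {
        for (ToyObject*& slot : slots_) {
            ToyObject* moved = new ToyObject(*slot);
            delete slot;
            slot = moved;
        }
    }

    // Follow the one level of indirection from handle to current address.
    ToyObject* resolve(std::size_t handle) const { return slots_.at(handle); }

private:
    std::vector<ToyObject*> slots_;
};

// Returns true if the object changed address while the handle kept working.
bool handle_survives_compaction() {
    ToyHeap heap;
    std::size_t handle = heap.allocate(42);
    std::uintptr_t before = reinterpret_cast<std::uintptr_t>(heap.resolve(handle));
    heap.compact();
    ToyObject* after = heap.resolve(handle);
    return before != reinterpret_cast<std::uintptr_t>(after) && after->payload == 42;
}
```

Calling handle_survives_compaction() shows the point of the quoted passage: the raw address changes, but code that only holds the handle never notices. A raw pointer taken before compact() would dangle, which is the situation the critical functions exist to deal with.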
GetArrayElements family of functions are documented to either copy arrays, or pin them in place (and, in so doing, prevent a compacting garbage collector from moving them). It is documented as a safer, less-restrictive alternative to GetPrimitiveArrayCritical. However, I’d like to know which VMs and/or garbage collectors (if any) actually pin arrays instead of copying them.
Aleksandr seems to think that the only safe way to access the memory of passed objects is through Get<PrimitiveType>ArrayElements or GetPrimitiveArrayCritical.
Trent’s answer was less than exciting.
At least in current JVM’s (i have not checked to see how far back this was backported), CMS GC, since it’s non-moving is not affected by JNI critical sections (modulo that non stop-world compaction can occur if there is a concurrent mode failure; in that case the allocating thread must stall until the critical section is cleared; this latter kind of stall is likely to be much rarer than the slow-path direct allocation in old gen pathology that you might see more frequently). Note that direct allocation in old gen is not only slow in and of itself (a first-order performance impact) but can in turn cause more tenuring (because of so-called nepotism), as well as slower subsequent scavenges because of more dirty cards needing scanning (both of the latter being second-order effects).
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2007-December/000074.html
This email on the OpenJDK mailing list seems to say that the Concurrent Mark Sweep (CMS) GC is non-moving.
https://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All
This post about G1 mentions that it does compact the heap but not much specifically about moving data.
Since the IBM documentation suggests that objects can be moved by compaction at any time, we need to figure out WHY the HotSpot JNI functions are safe at all. Presumably they must enter some safe state that prevents concurrent memory effects if compaction is indeed happening while the JNI code is running.
Now, I’ve been following the HotSpot code as best I can. Let’s take a look at GetByteArrayElements. It seems logical that the method must ensure that the pointer is correct before copying the elements. Let’s try to find out how.
Here is the macro for GetByteArrayElements
    #ifndef USDT2
    #define DEFINE_GETSCALARARRAYELEMENTS(ElementTag,ElementType,Result, Tag) \
    JNI_QUICK_ENTRY(ElementType*, \
              jni_Get##Result##ArrayElements(JNIEnv *env, ElementType##Array array, jboolean *isCopy)) \
      JNIWrapper("Get" XSTR(Result) "ArrayElements"); \
      DTRACE_PROBE3(hotspot_jni, Get##Result##ArrayElements__entry, env, array, isCopy); \
      /* allocate an chunk of memory in c land */ \
      typeArrayOop a = typeArrayOop(JNIHandles::resolve_non_null(array)); \
      ElementType* result; \
      int len = a->length(); \
      if (len == 0) { \
        result = (ElementType*)get_bad_address(); \
      } else { \
        result = NEW_C_HEAP_ARRAY_RETURN_NULL(ElementType, len, mtInternal); \
        if (result != NULL) { \
          memcpy(result, a->Tag##_at_addr(0), sizeof(ElementType)*len); \
          if (isCopy) { \
            *isCopy = JNI_TRUE; \
          } \
        } \
      } \
      DTRACE_PROBE1(hotspot_jni, Get##Result##ArrayElements__return, result); \
      return result; \
    JNI_END
Here is the macro for JNI_QUICK_ENTRY
    #define JNI_QUICK_ENTRY(result_type, header) \
    extern "C" { \
      result_type JNICALL header { \
        JavaThread* thread=JavaThread::thread_from_jni_environment(env); \
        assert( !VerifyJNIEnvThread || (thread == Thread::current()), "JNIEnv is only valid in same thread"); \
        ThreadInVMfromNative __tiv(thread); \
        debug_only(VMNativeEntryWrapper __vew;) \
        VM_QUICK_ENTRY_BASE(result_type, header, thread)
I have followed every function in here and have yet to see any kind of mutex or memory synchronizer. The only thing I could not follow was __tiv, which does not seem to have a definition anywhere I could find.
- Could someone explain to me why JNI interface methods such as GetByteArrayElements are safe?
- While we’re at it, can anyone find where the JNI call transitions from VM back to Native when JNI_QUICK_ENTRY exits?
Answer
How JNI methods work in HotSpot JVM
- Native methods may run concurrently with VM operations, including GC. They are not stopped at safepoints.
- GC may move Java objects even if they are referenced from a running native method. A jobject handle is not a raw address into the heap, but rather one more level of indirection: consider it a pointer into a non-movable array of object references. Whenever an object is moved, the corresponding array slot is updated, but the pointer to this slot remains the same. That is, the jobject handle remains valid.
- Every time a native method calls a JNI function, it checks whether the JVM is in the safepoint state. If it is (e.g. GC is running), the JNI function blocks until the safepoint operation is completed.
- During the execution of JNI functions like GetByteArrayElements, the corresponding thread is marked as _thread_in_vm. A safepoint cannot be reached while there are running threads in this state. E.g. if GC is requested during the execution of GetByteArrayElements, GC will be delayed until the JNI function returns.
- Thread state transition magic is performed by the line you’ve noticed: ThreadInVMfromNative __tiv(thread). Here __tiv is just an instance of the class. Its only purpose is to automatically call the ThreadInVMfromNative constructor and destructor.
- The ThreadInVMfromNative constructor calls transition_from_native, which checks for a safepoint and suspends the current thread if needed. The ~ThreadInVMfromNative destructor switches back to the _thread_in_native state.
- GetPrimitiveArrayCritical and GetStringCritical are the only JNI functions that provide raw pointers into the Java heap. They prevent GC from starting until the corresponding Release function is called.
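The constructor/destructor trick can be sketched in isolation. What follows is a simplified model, not HotSpot source: ToyThread, ToyInVMFromNative and toy_jni_function are hypothetical names, and the real transition_from_native does considerably more (in particular, it can block). The sketch only shows how a stack-allocated guard object flips the thread state on scope entry and flips it back on scope exit, which is also where the return to native state happens when the function body ends.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative names only; these are not the HotSpot definitions.
enum class ToyState { InNative, InNativeTrans, InVM };

struct ToyThread {
    ToyState state = ToyState::InNative;
    std::vector<std::string> log;   // records each transition for inspection
};

// Plays the role that ThreadInVMfromNative plays in the macro above.
class ToyInVMFromNative {
public:
    explicit ToyInVMFromNative(ToyThread& thread) : thread_(thread) {
        thread_.state = ToyState::InNativeTrans;
        thread_.log.push_back("native_trans");
        // Assumption: the real code checks for a pending safepoint here
        // and may suspend the thread before continuing.
        thread_.state = ToyState::InVM;
        thread_.log.push_back("in_vm");
    }
    ~ToyInVMFromNative() {
        thread_.state = ToyState::InNative;
        thread_.log.push_back("in_native");
    }
    ToyInVMFromNative(const ToyInVMFromNative&) = delete;
    ToyInVMFromNative& operator=(const ToyInVMFromNative&) = delete;

private:
    ToyThread& thread_;
};

// Stand-in for the body of a JNI function such as GetByteArrayElements.
int toy_jni_function(ToyThread& thread) {
    ToyInVMFromNative tiv(thread);          // constructor: enter the VM state
    assert(thread.state == ToyState::InVM); // heap access would happen here
    return 7;
}   // destructor of tiv runs here: back to InNative
```

After toy_jni_function(t) returns, t.state is InNative again and t.log holds the three transitions in order, even though the function body never mentions the transition back. That is why no explicit "leave VM" call shows up when reading the macro.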
Thread state transition when calling a JNI function from native code
state = _thread_in_native;
    Native method may run concurrently with GC
JNI function is called
state = _thread_in_native_trans;
    GC cannot start at this point
If VM operation is in progress, block until it completes
state = _thread_in_vm;
    Safe to access heap
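The two rules in this sequence (a JNI call waits while a VM operation is in progress, and a VM operation waits while a thread is in the VM) can be modeled with a small single-threaded sketch. This is an assumption-laden toy, not the HotSpot protocol: ToyVM and its methods are invented names, and where the real JVM would block a thread, the sketch returns a "would block" or "delayed" result so that it stays deterministic.

```cpp
#include <cassert>

// Toy model with invented names. Real HotSpot blocks the calling thread
// instead of returning a status value.
enum class EntryResult { EnteredVM, WouldBlock };

class ToyVM {
public:
    // A "GC" may begin only when no thread is currently in the VM state.
    // Returns false when the GC has to be delayed.
    bool try_begin_safepoint() {
        if (threads_in_vm_ > 0) return false;
        safepoint_active_ = true;
        return true;
    }

    void end_safepoint() { safepoint_active_ = false; }

    // Models a native thread calling a JNI function:
    // in_native -> in_native_trans -> (check for safepoint) -> in_vm.
    EntryResult call_jni_function() {
        if (safepoint_active_) return EntryResult::WouldBlock;
        ++threads_in_vm_;
        return EntryResult::EnteredVM;
    }

    // Models the JNI function returning: in_vm -> in_native.
    void return_from_jni_function() { --threads_in_vm_; }

private:
    int threads_in_vm_ = 0;
    bool safepoint_active_ = false;
};
```

A short walk through the model: a call to call_jni_function() on a fresh ToyVM enters the VM; try_begin_safepoint() then reports that the GC is delayed; after return_from_jni_function() the safepoint can begin, and a JNI call made during it reports WouldBlock until end_safepoint() is called. At no point are "GC running" and "thread accessing the heap" both true, which is the property that makes the memcpy in GetByteArrayElements safe.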