I had expected that the Java Virtual Machine would use a simple lookup table for normal method invocations: the object contains a pointer to a table holding the addresses of all methods its class implements (including methods inherited from superclasses), a specific method is simply represented by an index into that table, and the JVM looks up the address at that index and jumps to it.
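That lookup-table model can be sketched in plain Java (a purely hypothetical illustration of the idea, not how any real JVM lays out memory — the slot constants and tables here are invented names):

```java
import java.util.List;
import java.util.function.Function;

public class VTableSketch {
    // Hypothetical: each method signature gets a fixed slot number at class-load time.
    static final int SPEAK = 0;

    // One table per class. A subclass would copy its parent's table and
    // overwrite only the slots it overrides.
    static final List<Function<Object, String>> ANIMAL_VTABLE =
            List.of(o -> "generic animal noise");           // slot 0: speak()
    static final List<Function<Object, String>> DOG_VTABLE =
            List.of(o -> "woof");                           // slot 0: speak(), overridden

    // invokevirtual under this model: one table load, one indexed load, one call.
    static String invokeVirtual(List<Function<Object, String>> vtable,
                                int slot, Object receiver) {
        return vtable.get(slot).apply(receiver);
    }

    public static void main(String[] args) {
        // Dispatch picks the table from the receiver's class; no searching.
        System.out.println(invokeVirtual(DOG_VTABLE, SPEAK, new Object())); // woof
    }
}
```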
But the JVM specification instead prescribes a long and complex procedure for looking up the correct method at run time (see the official spec):
> The unsigned indexbyte1 and indexbyte2 are used to construct an index
> into the run-time constant pool of the current class [which] must be a
> symbolic reference to a method, which gives the name and descriptor
> of the method as well as a symbolic reference to the class in which
> the method is to be found. The named method is resolved. […] then
> the invokevirtual instruction proceeds as follows.
>
> If C contains a declaration for an instance method m that overrides
> (§5.4.5) the resolved method, then m is the method to be invoked, and
> the lookup procedure terminates.
>
> Otherwise, if C has a superclass, this same lookup procedure is
> performed recursively using the direct superclass of C; the method to
> be invoked is the result of the recursive invocation of this lookup
> procedure.
I would expect this long and complex procedure to take a long time. Because it is done for every normal method call, almost all of the time in JVM-based programs would be spent in this procedure.
Is this really how it is implemented in the real (Oracle) JVM? Or does the JVM do some JIT-style compilation down to a lookup table? Is there a description of how a concrete JVM actually implements this?
There is nothing in the Java Language Specification or the Java Virtual Machine Specification that prescribes any particular implementation strategy. Every implementor is free to choose any implementation strategy they want, as long as the result is the same as if they had implemented the algorithm described in the spec.
In other words, the algorithm in the spec describes the end result but not the recipe.
The simplest and most obvious possible optimization is to just stupidly perform the algorithm as described, but cache the result instead of throwing it away.
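That caching idea, in its crudest form, looks something like this (a global map keyed by class and method name — purely illustrative; real VMs cache per call site in far cheaper representations):

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedLookup {
    // Cache: (receiver class, method name) -> resolved method.
    static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    // Fast path: one map probe; the slow spec-style walk runs at most once per key.
    static Method resolve(Class<?> c, String name) {
        return CACHE.computeIfAbsent(c.getName() + "#" + name, k -> slowLookup(c, name));
    }

    // The full walk up the superclass chain, as in the spec.
    static Method slowLookup(Class<?> c, String name) {
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            try {
                return k.getDeclaredMethod(name);
            } catch (NoSuchMethodException ignored) { }
        }
        throw new IllegalStateException(name + " not found on " + c);
    }

    public static void main(String[] args) throws Exception {
        Method m = resolve(String.class, "length");
        System.out.println(m.invoke("hello")); // 5 — and the next resolve() hits the cache
    }
}
```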
Most modern high-performance JVMs are derived from Smalltalk VMs and use techniques that were invented in the 1980s by the Smalltalk community.
Eclipse OpenJ9 started its life as IBM J9, which in turn is derived from the IBM VisualAge for Java Universal Virtual Machine (which was capable of seamlessly executing a mix of JVM byte code and Smalltalk byte code), which in turn was based on the IBM VisualAge for Smalltalk VM.
Oracle HotSpot is based on the Animorphic Smalltalk VM by LongView, which in turn is based on the Self VM. (The Animorphic Smalltalk VM was also the original basis for Google’s V8 ECMAScript engine.)
Azul Zing is derived from HotSpot. Oracle Labs Maxine RVM was developed by some old Smalltalk and Self developers based on ideas from the Klein VM (an experimental meta-circular Self VM written in Self).
Some of the best-known techniques for eliminating the overhead of dynamic runtime virtual message dispatch are:

- Devirtualization – turning dynamic runtime virtual message dispatch into static method lookup
- Callsite Customization – compiling multiple different versions of the code, each for a specific receiver type
- Dynamic Type Feedback
- Dynamic Type Inference
- Inline Caching – remembering the result of the lookup the last time this particular call site was executed
- Speculative Inlining
- various other forms of Adaptive Optimization
[You will note that almost all of the original sources for these techniques are for either Self or Smalltalk. The two major reasons are that Self pioneered a lot of these techniques, and that Smalltalk and the Smalltalk VM were a major influence on Java and the JVM.]
The JVMs I am most familiar with (Eclipse OpenJ9, Oracle HotSpot, Oracle Labs Maxine RVM, Azul Zing) implement most of the above.
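To make one of these concrete: a monomorphic inline cache can be sketched in plain Java (all names here are mine — real VMs do this in compiled machine code, not with reflection):

```java
import java.lang.reflect.Method;

public class InlineCacheSketch {
    // One cache per call site: remember the last receiver class and the
    // target that the full lookup produced for it.
    static class CallSiteCache {
        Class<?> cachedClass;
        Method cachedTarget;

        Object invoke(Object receiver, String name) throws Exception {
            Class<?> c = receiver.getClass();
            if (c != cachedClass) {               // cache miss: run the slow lookup once
                cachedTarget = c.getMethod(name); // getMethod searches superclasses itself
                cachedClass = c;
            }
            return cachedTarget.invoke(receiver); // cache hit: one compare, then the call
        }
    }

    public static void main(String[] args) throws Exception {
        CallSiteCache site = new CallSiteCache();
        System.out.println(site.invoke("abc", "length"));   // miss, resolves: 3
        System.out.println(site.invoke("hello", "length")); // same class, hit: 5
    }
}
```

As long as a call site keeps seeing the same receiver class — which is the common case in practice — dispatch collapses to a single class-pointer comparison.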
The invokedynamic bytecode, introduced into the JVM Specification in Java 7, gives programmers access to the above optimizations while letting them supply their own method lookup algorithm in place of the one hard-coded into the JVM. This makes it possible to create high-performance implementations on top of the JVM for languages whose method lookup algorithm is incompatible with Java's.
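The machinery behind invokedynamic is exposed directly through the java.lang.invoke API; a minimal sketch of a re-linkable call site, the primitive an invokedynamic instruction is bootstrapped with:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class IndySketch {
    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType type = MethodType.methodType(int.class); // ()int after binding the receiver

        // A call site whose target the language runtime can swap at will.
        MutableCallSite site = new MutableCallSite(
                lookup.findVirtual(String.class, "length", type).bindTo("hello"));
        MethodHandle invoker = site.dynamicInvoker();

        System.out.println((int) invoker.invokeExact()); // 5

        // A custom "lookup algorithm" simply re-links the same site elsewhere;
        // existing invokers pick up the new target.
        site.setTarget(lookup.findVirtual(String.class, "hashCode", type).bindTo("hello"));
        System.out.println((int) invoker.invokeExact());
    }
}
```

This is the hook that languages like JRuby and Groovy use: their own resolution rules run once at link time, and the JVM's JIT then optimizes the installed method handle like any ordinary call.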