I’d expect this code to throw a ClassCastException:
public class Generics {
    public static void main(String[] args) {
        method(Integer.class);
    }

    public static <T> T method(Class<T> t) {
        return (T) new String();
    }
}
But it doesn’t. Casting String to T doesn’t fail, until I use the returned object somehow, like:
public class Generics {
    public static void main(String[] args) {
        method(Integer.class).intValue();
    }

    public static <T> T method(Class<T> t) {
        return (T) new String();
    }
}
Background: I created a class which uses JAXB to unmarshal an XML file. Its unmarshal method looks like this:
public static <T> T unmarshal(File file, Class<? extends T> clazz)
Depending on whether the root element is an anonymous type or not, either a T or a JAXBElement is returned. A JAXBElement, of course, can’t be cast to T. In my unit test, where I only called unmarshal() without doing anything with the result, everything worked fine. In the actual code, it failed.
Why doesn’t it fail directly? Is this a bug? If not, I’d like to understand why.
Answer
Basically, because of type erasure, Java will perform a type-check at the call-site whenever you make use of the fact that T is something specific.
But things aren’t quite that simple, unfortunately.
The other answers are incorrect when they say T being Object is the reason you don’t get a ClassCastException.
Let’s test the theory and manually choose T to be Integer:
Generics.<Integer>method(Integer.class);
When I run this, it still doesn’t fail.
Java does infer T to be Integer here; otherwise method(Integer.class).intValue() would be a compile-time error, because chained calls do not inform type inference.
So what is going on?
Note that when it does fail, it never fails inside method; it always fails inside main.
Due to type erasure, method basically ends up without any generics information after compilation. The return type ends up being Object, the parameter type is the raw type Class, and the cast inside the method is simply removed, because a cast would be a no-op in the absence of any generics information.
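To make the erasure concrete, here is a rough hand-written approximation of what method looks like once the generics are gone. The names ErasedSketch and erasedMethod are made up for illustration, and this is a sketch of the erased shape, not literal compiler output:

```java
class ErasedSketch {
    // Generic version, as in the question.
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String(); // unchecked cast: nothing is verified at runtime
    }

    // Hand-written approximation of the same method after erasure (hypothetical name):
    // T becomes Object, Class<T> becomes the raw type Class, and the cast is gone.
    @SuppressWarnings("rawtypes")
    static Object erasedMethod(Class t) {
        return new String();
    }

    public static void main(String[] args) {
        // Both results are only stored as Object, so nothing checks them.
        Object a = method(Integer.class);
        Object b = erasedMethod(Integer.class);
        System.out.println(a.getClass() + " / " + b.getClass());
    }
}
```

Both calls hand back a String, which is why nothing inside method can ever throw.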
You can see this when checking the bytecode of the callsite:
0: ldc #7 // class java/lang/Integer
2: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
                                                       ^^^^^^^^^^^^^^^^^ return type
When calling a method in the bytecode, the return type ends up as part of the method’s “name”, if you will.
Exploring further (using Compiler Explorer), we find that a modified main method produces the following bytecode [1] for the first four lines:
Main.<Integer>method(Integer.class);
    0: ldc #7 // class java/lang/Integer
    2: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    5: pop

Object o = Main.<Integer>method(Integer.class);
    6: ldc #7 // class java/lang/Integer
    8: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    11: astore_1

Main.<Integer>method(Integer.class).intValue();
    12: ldc #7 // class java/lang/Integer
    14: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    17: checkcast #7 // class java/lang/Integer
    20: invokevirtual #15 // Method java/lang/Integer.intValue:()I
    23: pop

Integer i = Main.<Integer>method(Integer.class);
    24: ldc #7 // class java/lang/Integer
    26: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    29: checkcast #7 // class java/lang/Integer
    32: astore_2
For each line, I have added the corresponding bytecode interspersed with the Java code.
Compare the bytecode for the different lines. Note how, for the last two, Java inserts a checkcast instruction after the call to method, i.e. after invokestatic. This performs a type-check on the returned value, which is on top of the stack at that point. Since it’s a String and it’s checked against Integer, you get a ClassCastException.
It does not do that for the first two lines, which don’t rely on the result being an Integer.
This is why your code fails only when you actually use the result like you do.
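The same four statements can also be exercised at runtime. The sketch below wraps each one in its own method and reports whether it throws; CheckcastDemo and its helper names are hypothetical, and the results in the comments are what follows from the bytecode shown above for that javac build, not something the language guarantees:

```java
class CheckcastDemo {
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String();
    }

    // One method per statement from the bytecode listing above.
    static void discardResult()  { CheckcastDemo.<Integer>method(Integer.class); }
    static void storeAsObject()  { Object o = CheckcastDemo.<Integer>method(Integer.class); }
    static void callIntValue()   { CheckcastDemo.<Integer>method(Integer.class).intValue(); }
    static void storeAsInteger() { Integer i = CheckcastDemo.<Integer>method(Integer.class); }

    // Runs the action and reports whether it threw a ClassCastException.
    static boolean throwsCce(Runnable action) {
        try {
            action.run();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsCce(CheckcastDemo::discardResult));  // false: no checkcast
        System.out.println(throwsCce(CheckcastDemo::storeAsObject));  // false: no checkcast
        System.out.println(throwsCce(CheckcastDemo::callIntValue));   // true: checkcast to Integer
        System.out.println(throwsCce(CheckcastDemo::storeAsInteger)); // true: checkcast to Integer
    }
}
```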
I would have assumed that Java inserts this cast whenever you make use of the fact that T is Integer, to verify as best it can that method actually did return something of type T, and to fail early.
Here’s another example:
Main.<Integer>method(Integer.class).toString();
    33: ldc #7 // class java/lang/Integer
    35: invokestatic #9 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    38: checkcast #7 // class java/lang/Integer
    41: invokevirtual #19 // Method java/lang/Integer.toString:()Ljava/lang/String;
    44: pop
The compiler knows that the .toString() call is being placed on something of type Integer, so it emits a virtual call directly to Integer’s version of this method. Of course, the compiler needs to insert a check to ensure the returned value (which could be anything at runtime) conforms to Integer, so it inserts another checkcast instruction.
However, even when using a class that doesn’t override Object’s toString, Java still inserts a checkcast:
Main.<Main>method(Main.class).toString();
    45: ldc #6 // class Main
    47: invokestatic #3 // Method method:(Ljava/lang/Class;)Ljava/lang/Object;
    50: checkcast #6 // class Main
    53: invokevirtual #7 // Method java/lang/Object.toString:()Ljava/lang/String;
    56: pop
Even though the call targets a method that exists on every object, and the bytecode effectively uses Object as the static receiver type (java/lang/Object.toString), Java still inserts a checkcast.
When we cast the returned value to Object by ourselves, however, Java does not add any checkcast whatsoever and the call can go through.
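As a small sketch of that difference (ObjectCastDemo, viaInteger and viaObject are hypothetical names, and the behaviour described in the comments is what the compiler examined here produced):

```java
class ObjectCastDemo {
    @SuppressWarnings("unchecked")
    static <T> T method(Class<T> t) {
        return (T) new String();
    }

    // The receiver is statically an Integer, so a checkcast to Integer guards the call.
    static String viaInteger() {
        return ObjectCastDemo.<Integer>method(Integer.class).toString();
    }

    // The result is explicitly widened to Object first; no checkcast was emitted,
    // so the call dispatches to String.toString() on the empty string.
    static String viaObject() {
        return ((Object) ObjectCastDemo.<Integer>method(Integer.class)).toString();
    }

    public static void main(String[] args) {
        System.out.println("viaObject returned: \"" + viaObject() + "\"");
        try {
            viaInteger();
        } catch (ClassCastException e) {
            System.out.println("viaInteger threw ClassCastException");
        }
    }
}
```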
Let’s step back a little and think about what we’ve been doing. We’re not looking at Java per se; we’ve been looking at bytecode.
Java is defined by the Java Language Specification. I’d expect to find some kind of rule that describes when this type check is done and when it isn’t.
Unfortunately, I’ve been unable to find anything about these inserted type-checks in the spec.
Others have looked, too, several years after you stumbled across this.
If it is truly unspecified, whenever I said “Java does/doesn’t insert a checkcast” above, I should probably have said “this particular compiler” instead of “Java” and what we’ve been looking at might technically just be an implementation detail (as of yet).
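Since the call-site check may be a compiler detail, one way to avoid depending on it is to make the check explicit inside the generic method with Class.cast, which throws the ClassCastException right there. A minimal sketch, where FailFastDemo, unchecked and checked are hypothetical names and the value parameter stands in for whatever an unmarshaller might hand back:

```java
class FailFastDemo {
    // Unsafe variant, same idea as in the question: the cast is erased.
    @SuppressWarnings("unchecked")
    static <T> T unchecked(Class<T> type, Object value) {
        return (T) value;
    }

    // Checked variant: Class.cast verifies the runtime type inside the method.
    static <T> T checked(Class<T> type, Object value) {
        return type.cast(value);
    }

    public static void main(String[] args) {
        // No exception here: the result is only stored as Object.
        Object silent = unchecked(Integer.class, "not an integer");
        System.out.println("unchecked returned a " + silent.getClass().getSimpleName());

        try {
            // Fails even though the result is never used.
            checked(Integer.class, "not an integer");
        } catch (ClassCastException e) {
            System.out.println("checked failed inside the method");
        }
    }
}
```

With the checked variant, a unit test that only calls the method and ignores the result would already fail.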
[1] Running some variant of JDK 17.0.0