In Java, verify all methods in a class path that are called actually exist within that classpath [closed]


Given a classpath (e.g. a set of jar files), I would like to know whether any of these jar files makes a method call (ignoring reflection) to a method that doesn’t exist within the classpath.

For example, if I had only foo.jar on my classpath, and it contains a class that calls com.bar.Something#bar(String), and that method did not exist in foo.jar, then I would be told that the method doesn’t actually exist.

Answer

I am not aware of any tool that does this, and the JVM will not do it for you: it does not blindly load every class on its classpath at boot. It loads whatever you told it is the main class and, as part of loading any class, that class’s superclass and the interfaces it implements. The other types a class mentions (field types, method return types, method parameter types, method exception types, and anything needed to execute a statement) are generally loaded only once the code that needs them is actually verified or run. In other words, java (the VM) loads lazily, so a missing method only surfaces (typically as a NoSuchMethodError) when the offending call is actually executed. You cannot use it for this purpose.

What you can do is rather involved. Let’s first tighten what you’re asking for:

  1. Given a ‘set of source jars’ (source), verify each class file contained within.
  2. To verify a class, find all method and field accesses contained within all classes within source, and ensure that the mentioned field/method access actually exists, by comparing against a ‘set of target jars’ (target). Source and target may or may not be the same. For convenience you may wish to silently extend target to always include source.
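The comparison in step 2 could be sketched roughly as below. This is only an illustration under assumptions of my own: ReferenceChecker, ClassInfo and the "owner.name:descriptor" string format are hypothetical names, and the target index is assumed to have been built already from the target jars. It walks superclasses and interfaces, because a reference to a method on a class is also satisfied when a supertype declares that method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReferenceChecker {
    /** Hypothetical summary of one class found in the target jars. */
    static final class ClassInfo {
        final String superName;        // internal name of the superclass, or null
        final List<String> interfaces; // internal names of directly implemented interfaces
        final Set<String> members;     // declared members as "name:descriptor"

        ClassInfo(String superName, List<String> interfaces, Set<String> members) {
            this.superName = superName;
            this.interfaces = interfaces;
            this.members = members;
        }
    }

    /** True if owner, or one of its supertypes present in the target, declares the member. */
    static boolean exists(Map<String, ClassInfo> target, String owner, String member) {
        ClassInfo info = target.get(owner);
        if (info == null) return false;                 // owner class itself is missing
        if (info.members.contains(member)) return true; // declared directly
        if (info.superName != null && exists(target, info.superName, member)) return true;
        for (String itf : info.interfaces) {
            if (exists(target, itf, member)) return true;
        }
        return false;
    }

    /** Returns the references ("owner.name:descriptor") that cannot be found in the target. */
    static List<String> missing(Map<String, ClassInfo> target, List<String> refs) {
        List<String> result = new ArrayList<>();
        for (String ref : refs) {
            int dot = ref.indexOf('.'); // internal class names use '/', so the first '.' ends the owner
            if (!exists(target, ref.substring(0, dot), ref.substring(dot + 1))) {
                result.add(ref);
            }
        }
        return result;
    }
}
```

Note that this sketch ignores access modifiers and static versus instance mismatches, and it reports anything whose owner is not in the index as missing. So the target would also need to cover the JDK’s own classes, or such references would have to be filtered out first.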

Any attempt to use the VM’s classloading abilities (e.g. you load classes in with reflection directly) is problematic: That will run static initializers and who knows what kind of nasty side-effects that’s going to have. It’ll also be incredibly slow. Not a good idea.

What you’d want is not to rely on the VM itself, and to handroll your own code to do this; after all, class files are just files, you can read them, parse them, and take action based on their contents. Jar files can be listed and their contents can be read, from within java code – not a problem.
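As a small illustration of the "jar files can be listed" part, here is a minimal sketch using only java.util.jar from the standard library. The class and method names (JarClassLister, classEntries) are made up for this example. It only collects entry names; reading the bytes of an entry would go through JarFile#getInputStream.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class JarClassLister {
    /** Returns the entry names of all class files in the jar, e.g. "com/bar/Something.class". */
    static List<String> classEntries(Path jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                // Nothing is class-loaded here; entries are treated as plain files.
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```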

The class file format is well described in the JVM Specification but is a very complicated format. I strongly suggest you use existing libraries that can read it. ASM comes to mind.

In practice, any method invocation is encoded in a class file using one of a few ‘INVOKE’ opcodes: normal method calls are INVOKEVIRTUAL or INVOKEINTERFACE, static methods are INVOKESTATIC, and constructors and super calls are INVOKESPECIAL. Field accesses (you did not mention this, but if you’re going to verify the existence of referenced entities, surely you’d also want to take fields into account) are GETFIELD and PUTFIELD for instance fields, and GETSTATIC and PUTSTATIC for static fields.

However, none of these opcodes directly encodes in full what it refers to. Instead, each encodes merely a small index number, which is to be looked up in the class file’s constant pool, where you find a fully qualified specification of the method or field being referred to. For example, a call to ArrayList’s ‘ensureCapacity’ method is represented by a method-ref constant that, via a class constant and a name-and-type constant, ultimately points at three string constants: "java/util/ArrayList", "ensureCapacity", and the descriptor "(I)V". (In a descriptor, I is class-file-ese for the primitive int type, and the V after the parentheses is the return type; V is class-file-ese for void.)
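To get a feel for the descriptor notation, the JDK can decode descriptors itself via java.lang.invoke.MethodType. The sketch below is for illustration only (DescriptorDemo and describe are names I made up). Note that fromMethodDescriptorString resolves the classes named in the descriptor, so it is fine for playing with JDK types but is not what you would use inside the actual checker.

```java
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

final class DescriptorDemo {
    /** Turns a method descriptor such as "(I)V" into a readable form such as "(int) -> void". */
    static String describe(String descriptor) {
        // A null loader means the system class loader is used to look up the types.
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        List<String> params = new ArrayList<>();
        for (Class<?> param : type.parameterList()) {
            params.add(param.getName());
        }
        return "(" + String.join(", ", params) + ") -> " + type.returnType().getName();
    }
}
```

For instance, describe("(I)V") gives "(int) -> void", and describe("(Ljava/lang/String;I)Z") gives "(java.lang.String, int) -> boolean".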

Therefore, there is an easy shortcut and there is no need to parse the bytecode contained in a class file. Just scan the constant pool – all you need to do is verify that every method and field ref in the constant pool is referring to an actual existing method or field.
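To show that scanning the constant pool really is not much work, here is a rough sketch that does it with nothing but DataInputStream, without ASM. The names (ConstantPoolScanner, memberRefs) and the "owner.name:descriptor" output format are my own choices, and the tag numbers are the ones from the class file format chapter of the JVM Specification. Treat it as a starting point rather than a finished tool.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolScanner {
    /** Returns every field and method reference in the constant pool as "owner.name:descriptor". */
    static List<String> memberRefs(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        in.readUnsignedShort();             // minor version
        in.readUnsignedShort();             // major version
        int count = in.readUnsignedShort(); // valid indices are 1 .. count-1
        int[] tag = new int[count];
        String[] utf8 = new String[count];
        int[] first = new int[count];       // first index operand of an entry
        int[] second = new int[count];      // second index operand of an entry
        for (int i = 1; i < count; i++) {
            tag[i] = in.readUnsignedByte();
            switch (tag[i]) {
                case 1: utf8[i] = in.readUTF(); break;           // Utf8
                case 3: case 4: in.readInt(); break;             // Integer, Float
                case 5: case 6: in.readLong(); i++; break;       // Long, Double take two slots
                case 7: case 8: case 16: case 19: case 20:       // Class, String, MethodType, Module, Package
                    first[i] = in.readUnsignedShort(); break;
                case 9: case 10: case 11:                        // Fieldref, Methodref, InterfaceMethodref
                case 12: case 17: case 18:                       // NameAndType, Dynamic, InvokeDynamic
                    first[i] = in.readUnsignedShort();
                    second[i] = in.readUnsignedShort();
                    break;
                case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                default: throw new IOException("unknown constant pool tag " + tag[i]);
            }
        }
        List<String> refs = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            if (tag[i] >= 9 && tag[i] <= 11) {
                String owner = utf8[first[first[i]]];   // ref -> Class -> Utf8
                String name = utf8[first[second[i]]];   // ref -> NameAndType -> name
                String desc = utf8[second[second[i]]];  // ref -> NameAndType -> descriptor
                refs.add(owner + "." + name + ":" + desc);
            }
        }
        return refs;
    }
}
```

Fed the bytes of a class that calls ArrayList’s ensureCapacity, this should report "java/util/ArrayList.ensureCapacity:(I)V" among its results. One thing to watch for: the owner can be an array type (calling clone() on an array produces one), so the checker needs to special-case those.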

With sufficient knowledge of the class file internals (I covered most of what you need to know here already), and some basic experience with the ASM library, you should be able to write something like this yourself, using ASM, in a day or so. If this is all Greek to you, it may take closer to a week, but probably no more than that; a medium-sized project at most.

Hopefully these are enough pointers for you to figure out where to go from here, or at the very least, to know what it would take and what you may want to search the web for if you don’t want to write it yourself but still hold out hope that someone already did the work and published it as an open source library someplace.

NB: There are also dynamic invocations (INVOKEDYNAMIC), which are a lot more complicated, but by their nature you can’t statically verify these, so presumably the fact that you can’t meaningfully check INVOKEDYNAMIC-based invocations is not relevant here. Similarly, any java code that uses the java.lang.reflect API obviously doesn’t use any of this stuff, and cannot, mathematically provably even, be verified in this fashion. Thus, no need to worry about doing the impossible.



Source: stackoverflow