ANTLR AST visitor that returns different data types

I finished converting the ANTLR CST into an AST and created a Visitor<T> interface that lets me visit all my AST nodes. The main issue I’m having is that some visits should return different data types, and I’m not sure how to go about this. For example, for simple arithmetic operations I want to return a double from the respective visit methods, but string operations require their nodes to return a string. Since my visit methods all use a generic type T, I tried making a class called Result with two child classes, DoubleResult and StringResult, so that my visitor can return either a string or a double during visits, but it seems bad and is full of casts and type checks. Is there a better way to do this?

Here’s some example code:

public class ExpressionVisitor implements Visitor<Result> {
    ...
    public Result visit(BinaryExpression node) {
        // Here the node's left or right operand can be a StringResult,
        // so I'd have to do instanceof and type checks.
    }

    public Result visit(StringExpression node) {
        // here i'd return a StringResult specifically
    }
}

Edit: The goal is to support string arithmetic as in Python, for example 2*"hello".

But then the binary expression visit method would have to check which operand is a string (left or right), and most of the other visit methods would need to check for and handle either a DoubleResult or a StringResult accordingly. Is there a cleaner way of achieving the same thing?

Answer

As you have no doubt discovered, the Visitor class that ANTLR generates for you is a generic class: you must specify the expected return type, and every visit method must return that type.

There are a few ways you can address your issue. (It definitely comes up in dynamically typed languages).

For a simple language I put together for a tutorial, I just defined a Value class that could hold any of the dynamic types that I allowed for. I had getters/setters for each type as well as an is*() method for each type. My expression Visitor just returned this dynamic type.
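A minimal sketch of what such a Value class might look like, assuming only two dynamic types (double and string). The class name, the factory methods, and the times() helper are hypothetical and not part of ANTLR; this version is also immutable, so it has no setters, unlike the class described above.

```java
// Hypothetical dynamic value holder for an interpreter with two types.
final class Value {
    private final Object payload; // always either a Double or a String

    private Value(Object payload) { this.payload = payload; }

    static Value of(double d) { return new Value(Double.valueOf(d)); }
    static Value of(String s) { return new Value(s); }

    boolean isDouble() { return payload instanceof Double; }
    boolean isString() { return payload instanceof String; }

    double asDouble() {
        if (!isDouble()) throw new IllegalStateException("not a number: " + payload);
        return (Double) payload;
    }

    String asString() {
        if (!isString()) throw new IllegalStateException("not a string: " + payload);
        return (String) payload;
    }

    // Multiplication with Python-like string repetition (2 * "hello").
    // The repeat count is truncated to an int; this is an assumption of the sketch.
    Value times(Value other) {
        if (isDouble() && other.isDouble()) return of(asDouble() * other.asDouble());
        if (isDouble() && other.isString()) return of(repeat(other.asString(), (int) asDouble()));
        if (isString() && other.isDouble()) return of(repeat(asString(), (int) other.asDouble()));
        throw new IllegalArgumentException("cannot multiply two strings");
    }

    private static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(s);
        return sb.toString();
    }
}
```

With the type dispatch kept inside Value, a binary-expression visit method could reduce to something like `return left.times(right);`, so the casts and instanceof checks live in one place instead of in every visit method.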

Note: prior to execution, I ran a semantic validation Listener that used a stack-based approach. Literals and variables simply pushed their type onto the stack. As I exited an expression, I popped the types of its child expressions (pushed while visiting the children), validated that they were compatible, and then pushed the resulting type. Any issue I encountered was reported as an error message before execution. This way I knew that, at execution time, I’d always have the correct type. (The same could be done at runtime; this was just the solution I chose.)
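The stack-based validation could be sketched roughly as follows. This is not a real ANTLR listener: the exit* methods are hypothetical stand-ins for the exit callbacks a generated listener would provide, and calling them in post-order simulates what a tree walker would do. The type rules (number times string is allowed, string times string is not) are assumptions taken from the 2*"hello" example in the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical static types for the sketch.
enum ExprType { NUMBER, STRING, ERROR }

class TypeChecker {
    private final Deque<ExprType> types = new ArrayDeque<>();
    final List<String> errors = new ArrayList<>();

    // Literals simply push their own type.
    void exitNumberLiteral() { types.push(ExprType.NUMBER); }
    void exitStringLiteral() { types.push(ExprType.STRING); }

    // On exiting a multiplication, pop both operand types, validate them,
    // and push the type of the result.
    void exitMultiply() {
        ExprType right = types.pop();
        ExprType left = types.pop();
        if (left == ExprType.ERROR || right == ExprType.ERROR) {
            types.push(ExprType.ERROR); // already reported further down the tree
        } else if (left == ExprType.NUMBER && right == ExprType.NUMBER) {
            types.push(ExprType.NUMBER);
        } else if (left == ExprType.STRING && right == ExprType.STRING) {
            errors.add("cannot multiply two strings");
            types.push(ExprType.ERROR);
        } else {
            types.push(ExprType.STRING); // number * string repeats the string
        }
    }

    // Type of the most recently completed expression, or null if none.
    ExprType resultType() { return types.peek(); }
}
```

If errors is empty after the walk, the evaluating visitor can assume every operand has a compatible type and skip most of its own checks.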

Another potentially handy approach is to realize that, since you are in control of tree navigation when using Visitors, there’s nothing to say that you have to have a single Visitor that you use for the full tree. You can have multiple Visitors (each with its own type) and choose which Visitor to use to visit your child nodes. For example, I visited statement nodes with a Visitor that returned a NULL type, but visited all expression nodes with a Visitor that returned my dynamic Value type.

There’s no real reason that you can’t have as many Visitors as you want, each with a different type. It’s not uncommon for Visitors to be stateless, so you can just re-use them, hopping back and forth between the appropriate type for visiting each child. (Of course, this assumes that, from the parent node, you can determine which Visitor type you should use to visit each child node.)
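A rough sketch of the multiple-Visitor idea on a tiny hand-written AST. All node and visitor names here are hypothetical, and the expression Visitor returns a plain Double to keep the example short; in practice it would return the dynamic value type. The statement Visitor returns Void and hands its expression child to the other Visitor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST and visitor interface; not generated by ANTLR.
interface Node { <T> T accept(AstVisitor<T> v); }

interface AstVisitor<T> {
    // Defaults let each visitor implement only the node kinds it handles.
    default T visit(NumberLiteral n) { throw new UnsupportedOperationException(); }
    default T visit(Multiply n) { throw new UnsupportedOperationException(); }
    default T visit(PrintStatement n) { throw new UnsupportedOperationException(); }
}

final class NumberLiteral implements Node {
    final double value;
    NumberLiteral(double value) { this.value = value; }
    public <T> T accept(AstVisitor<T> v) { return v.visit(this); }
}

final class Multiply implements Node {
    final Node left, right;
    Multiply(Node left, Node right) { this.left = left; this.right = right; }
    public <T> T accept(AstVisitor<T> v) { return v.visit(this); }
}

final class PrintStatement implements Node {
    final Node expression;
    PrintStatement(Node expression) { this.expression = expression; }
    public <T> T accept(AstVisitor<T> v) { return v.visit(this); }
}

// Stateless expression visitor: every visit returns a Double.
final class ExpressionEvaluator implements AstVisitor<Double> {
    @Override public Double visit(NumberLiteral n) { return n.value; }
    @Override public Double visit(Multiply n) {
        return n.left.accept(this) * n.right.accept(this);
    }
}

// Statement visitor: returns Void and switches visitors for expression children.
final class StatementExecutor implements AstVisitor<Void> {
    private final ExpressionEvaluator expressions = new ExpressionEvaluator();
    final List<String> output = new ArrayList<>(); // collected instead of printed

    @Override public Void visit(PrintStatement n) {
        Double result = n.expression.accept(expressions);
        output.add(String.valueOf(result));
        return null;
    }
}
```

The parent node decides which Visitor a child gets: PrintStatement knows its child is an expression, so it passes the ExpressionEvaluator rather than itself.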

Minor nit… ANTLR4 produces Parse Trees (not ASTs)
