
Sandboxed Java scripting replacement for Nashorn

I’ve been using Nashorn for awk-like bulk data processing. There is a lot of incoming data, arriving row by row, and each row consists of named fields. The rows are processed by user-defined scripts that are stored externally and can be edited by users. The scripts are simple, like if( c>10) a=b+3, where a, b and c are fields of the incoming rows. The amount of data is really huge. The code looks like this (an example to show the use case):

    ScriptEngine engine = new NashornScriptEngineFactory().getScriptEngine(
            new String[]{"-strict", "--no-java", "--no-syntax-extensions", "--optimistic-types=true"},
            null,
            scr -> false);

    CompiledScript cs;
    Invocable inv=(Invocable) engine;
    Bindings bd=engine.getBindings(ScriptContext.ENGINE_SCOPE);

    bd.remove("load");
    bd.remove("loadWithNewGlobal");
    bd.remove("exit");
    bd.remove("eval");
    bd.remove("quit");

    String scriptText=readScriptText();

    cs = ((Compilable) engine).compile("function foo() {\n"+scriptText+"\n}");
    cs.eval();


    Map params=readIncomingData();

    while(params!=null)
    {
        Map<String, Object> res = (Map) inv.invokeFunction("foo", params);
        writeProcessedData(res);
        params=readIncomingData();
    }

Now Nashorn is obsolete and I’m looking for alternatives. I’ve been googling for a few days but haven’t found an exact match for my needs. The requirements are:

  • Speed. There’s a lot of data, so it has to be really fast. I therefore assume precompilation is a must
  • Must work under Linux/OpenJDK
  • Must support sandboxing, at least for data access and code execution

Nice to have:

  • Simple, C-like syntax (not Lua ;)
  • Support for sandboxing CPU usage

So far I have found that Rhino is still alive (last release dated 13 Jan 2020), but I’m not sure whether it is still supported or how fast it is. As I remember, one of the reasons Java switched to Nashorn was speed, and speed is very important in my case. I also found J2V8, but it does not support Linux. GraalVM looks like a bit of overkill, and I haven’t yet worked out how to use it for such a task. Maybe I need to explore further to see whether it is suitable, but it looks like a complete JVM replacement that cannot be used as a library.

It doesn’t necessarily have to be JavaScript; maybe there are other alternatives. Thank you.

Answer

GraalVM’s JavaScript can be used as a library, with its dependencies obtained like any other Maven artifact. While the recommended way to run it is to use the GraalVM distribution, there are explanations of how to run it on OpenJDK.

You can restrict what a script has access to, such as Java classes, thread creation, and so on.

From the documentation:

The following access parameters may be configured:

* Allow access to other languages using allowPolyglotAccess.
* Allow and customize access to host objects using allowHostAccess.
* Allow and customize host lookup to host types using allowHostLookup.
* Allow host class loading using allowHostClassLoading.
* Allow the creation of threads using allowCreateThread.
* Allow access to native APIs using allowNativeAccess.
* Allow access to IO using allowIO and proxy file accesses using fileSystem.
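As a rough illustration of how those parameters map onto the polyglot API, here is a minimal sketch that denies all of them and then runs a row-processing script like the one in the question. It assumes the GraalVM polyglot and JavaScript artifacts are on the classpath. The class name SandboxedRowScript, the helper methods, and the convention that the script reaches fields through a `row` parameter are my own choices for the example, not anything prescribed by GraalVM. Note that the builder method for host lookup is called allowHostClassLookup.

```java
import java.util.HashMap;
import java.util.Map;

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.HostAccess;
import org.graalvm.polyglot.PolyglotAccess;
import org.graalvm.polyglot.Value;
import org.graalvm.polyglot.proxy.ProxyObject;

// Hypothetical class name, used only for this sketch.
public class SandboxedRowScript {

    // Builds a JavaScript-only context with each access parameter explicitly denied.
    // Most of these are already the defaults; they are spelled out for clarity.
    static Context newSandboxedContext() {
        return Context.newBuilder("js")
                .allowPolyglotAccess(PolyglotAccess.NONE)   // no other languages
                .allowHostAccess(HostAccess.NONE)           // no access to host object members
                .allowHostClassLookup(className -> false)   // no Java.type(...) lookups
                .allowHostClassLoading(false)
                .allowCreateThread(false)
                .allowNativeAccess(false)
                .allowIO(false)  // newer releases prefer allowIO(IOAccess.NONE)
                .build();
    }

    // Wraps the user script in a function once; the returned Value can be executed per row.
    // Assumption of this sketch: the script uses `row.x` and the function returns row.a.
    static Value compile(Context context, String scriptText) {
        return context.eval("js",
                "(function (row) {\n" + scriptText + "\nreturn row.a;\n})");
    }

    public static void main(String[] args) {
        try (Context context = newSandboxedContext()) {
            Value fn = compile(context, "if (row.c > 10) row.a = row.b + 3;");

            Map<String, Object> row = new HashMap<>();
            row.put("a", 0);
            row.put("b", 4);
            row.put("c", 11);

            // ProxyObject exposes the map's entries as properties of a script object.
            int a = fn.execute(ProxyObject.fromMap(row)).asInt();
            System.out.println("a = " + a);
        }
    }
}
```

The function is evaluated once and the resulting Value is reused for every row, which plays the same role as the CompiledScript in the question. One thing to watch for: values the script writes back into a ProxyObject.fromMap map may be stored as polyglot Value objects rather than plain Java types, which is why this sketch reads the result from the function's return value.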

It is also several times faster than Nashorn. Some measurements can be found, for example, in this article:

GraalVM CE provides performance comparable or superior to Nashorn with 
the composite score being 4 times higher. GraalVM EE is even faster.