Spark (Java) – DataFrame groupBy with multiple aggregations?

I’m trying to write a groupBy in Spark with Java. In SQL, the query would look like this:

SELECT id, count(id) AS count, max(date) AS maxdate
FROM table
GROUP BY id;

But what is the Spark/Java equivalent of this query? Suppose the variable table is a DataFrame, to keep the relation to the SQL query clear. I’m thinking of something like:

table = table.select(table.col("id"), (table.col("id").count()).as("count"), (table.col("date").max()).as("maxdate")).groupby("id")

This is obviously incorrect, since aggregate functions like count or max can’t be called on columns, only on DataFrames. So how is this done in Spark with Java?

Thank you!

Answer

You can do this with the helper functions in org.apache.spark.sql.functions:

import org.apache.spark.sql.functions;

table.groupBy("id").agg(
    functions.count("id").as("count"),
    functions.max("date").as("maxdate")
).show();
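To make the semantics concrete outside of Spark, the same group-by aggregation can be sketched with plain Java streams. This is not the Spark API, just an illustration of what count(id) and max(date) per group compute; the Row record and sample data are made up for the example:

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByDemo {
    // A row of the hypothetical "table": an id plus a date.
    record Row(int id, LocalDate date) {}

    public static void main(String[] args) {
        List<Row> table = List.of(
                new Row(1, LocalDate.of(2020, 1, 5)),
                new Row(1, LocalDate.of(2020, 3, 9)),
                new Row(2, LocalDate.of(2019, 12, 1)));

        // count(id) per group, mirroring GROUP BY id
        Map<Integer, Long> counts = table.stream()
                .collect(Collectors.groupingBy(Row::id, Collectors.counting()));

        // max(date) per group: keep the later date when two rows share an id
        Map<Integer, LocalDate> maxDates = table.stream()
                .collect(Collectors.toMap(Row::id, Row::date,
                        (a, b) -> a.isAfter(b) ? a : b));

        System.out.println(counts.get(1) + " " + maxDates.get(1)); // 2 2020-03-09
    }
}
```

The Spark version above does the same work, but distributed: groupBy("id") partitions the rows and agg(...) folds each aggregate over its partition.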


Source: stackoverflow