Context
I want to iterate over a Spark Dataset and update a HashMap for each row. Here is the code I have:

Issue
The foreach doesn’t iterate at all: the lambda is never executed, and I don’t know why. I implemented it as indicated here: How to traverse/iterate a Dataset in Spark Java? At the end,
Tag: apache-spark-dataset
Data type mismatch while transforming data in spark dataset
I created a Parquet structure from a CSV file using Spark: I’m reading the Parquet structure and trying to transform the data in a Dataset: Unfortunately, I get a data type mismatch error. Do I have to assign data types explicitly?

17/04/12 09:21:52 INFO SparkSqlParser: Parsing command: SELECT *, md5(station_id) as hashkey FROM tmpview
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve