ClassNotFoundException: Failed to find data source: bigquery

I’m trying to load data from Google BigQuery into Spark running on Google Dataproc (I’m using Java). I tried to follow the instructions here:

I get the error: “ClassNotFoundException: Failed to find data source: bigquery.”

My pom.xml looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    ...

</project>
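The dependency itself did not survive in the question, but for context, the Dataproc BigQuery connector for Spark is usually pulled in with a declaration along these lines (the Scala suffix and version number here are assumptions — check the connector’s release page for the ones matching your Spark version):

```xml
<!-- Sketch of a typical spark-bigquery-connector dependency.
     The _2.12 suffix and the version are assumptions; match them
     to your cluster's Spark/Scala versions. -->
<dependency>
    <groupId>com.google.cloud.spark</groupId>
    <artifactId>spark-bigquery-with-dependencies_2.12</artifactId>
    <version>0.36.1</version>
</dependency>
```

Note that declaring the dependency only makes the classes available at compile time; as the answer below explains, it must also be present on the cluster at runtime.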
After adding the dependency to my pom.xml, Maven downloaded a lot of artifacts while building the .jar, so I think I have the correct dependency. However, Eclipse is also warning me that “The import is never used”.

This is the part of my code where I get the error:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class Main {

    public static void main(String[] args) {

        SparkSession spark = SparkSession.builder()
                .getOrCreate();

        // The ClassNotFoundException is thrown on the line below
        Dataset<Row> data = spark.read().format("bigquery")
                .load();
    }
}

I think you only added the BQ connector as a compile-time dependency, but it is missing at runtime. You need to either build an uber jar that includes the connector in your job jar (the doc needs to be updated), or include the connector when you submit the job with gcloud dataproc jobs submit spark --properties
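The second option might look like the following. This is a sketch, not a verified command for your cluster: the cluster name, main class, jar path, and connector version are all placeholders, and `spark.jars.packages` is a standard Spark property that tells the driver to fetch the listed Maven coordinates at job startup.

```shell
# Submit the job and have Spark resolve the BigQuery connector at runtime.
# Cluster name, class, jar path, and connector version are placeholders.
gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --class=Main \
    --jars=target/my-job.jar \
    --properties=spark.jars.packages=com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.36.1
```

With this approach the connector does not need to be bundled into your own jar at all; alternatively, building an uber jar (e.g. with the Maven Shade plugin) bakes it in and avoids the runtime download.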

User contributions licensed under: CC BY-SA