
Snowplow Data Processing from PubSub to Java API

I am using Snowplow for behavioral data tracking. I can load the data from Pub/Sub into BigQuery using the open-source Snowplow loader (and mutator), but I would like to consume the data from Pub/Sub in a Java API directly.

However, the data from Pub/Sub arrives as an unstructured string without a schema. It uses “\t” as the delimiter and “{}” blocks to hold some schemas, so formatting it may require string processing.

Is there a better way to decode the data from Pub/Sub in a Java API than writing complex string processing? Thank you!


Snowplow maintains a number of so-called ‘analytics SDKs’ that let you transform the enriched hybrid tsv + JSON format into plain JSON that can then be used in downstream applications.
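
To illustrate what that hybrid format looks like, here is a minimal sketch in plain Java. It is not the SDK and not a substitute for it: the class name, the four column names and the sample line are hypothetical, and it assumes the embedded JSON contains no raw tab characters. A real enriched event has a much longer fixed column list, which the analytics SDKs handle for you.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnrichedLineSketch {

    // Hypothetical column names, for illustration only. The real enriched
    // event has a much longer, fixed column list known to the analytics SDKs.
    static final List<String> COLUMNS = List.of("app_id", "platform", "event", "contexts");

    /** Splits one tab-separated line and pairs each value with its column name. */
    static Map<String, String> toFields(String line) {
        // A limit of -1 keeps trailing empty columns, which split() drops by default.
        String[] values = line.split("\t", -1);
        if (values.length != COLUMNS.size()) {
            throw new IllegalArgumentException(
                "Expected " + COLUMNS.size() + " columns but got " + values.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < values.length; i++) {
            fields.put(COLUMNS.get(i), values[i]);
        }
        return fields;
    }

    /** Rough check for a column that carries an embedded JSON object. */
    static boolean looksLikeJson(String value) {
        String trimmed = value.trim();
        return trimmed.startsWith("{") && trimmed.endsWith("}");
    }

    public static void main(String[] args) {
        // Made-up sample line: three plain columns and one JSON column.
        String line = "shop\tweb\tpage_view\t{\"schema\":\"example\"}";
        toFields(line).forEach((name, value) ->
            System.out.println(name + " = " + value + (looksLikeJson(value) ? " (JSON)" : "")));
    }
}
```

The `-1` limit matters because enriched lines often have empty trailing columns; without it the column count would not line up with the names.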

For Java, your best bet would probably be the Scala Analytics SDK.

There are also SDKs for .NET, Go, JavaScript and Python.