
Fetching specific fields from an S3 document

I am using AWS Java SDK in my application to talk to one of my S3 buckets which holds objects in JSON format.

A document may look like this:


Now, for a certain document, let's say document1, I need to fetch only the values of fields a and b instead of fetching the entire document.

This sounds like it wouldn't be possible, because S3 buckets can hold any type of object, not just JSON.

Is this achievable, though?


Answer

That's actually doable with S3 Select. You can run selects like the one you've described, but only for particular formats: JSON, CSV, and Parquet.

Imagine a data.json file in the so67315601 bucket in eu-central-1:


First, learn how to select the fields via the S3 Console. Use “Object Actions” → “Query with S3 Select”:
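The query you type into that console dialog is a plain SQL expression, and the same string is what you would later hand to the SDK (for instance via SelectObjectContentRequest.setExpression in SDK 1.x). Here is a minimal sketch of building one for the fields a and b from the question. The class S3SelectExpression and its selectFields helper are hypothetical names, not part of any AWS SDK, and the sketch assumes the fields are top-level keys with simple, trusted names:

```java
import java.util.List;
import java.util.stream.Collectors;

final class S3SelectExpression {

    // Hypothetical helper: turns ["a", "b"] into "SELECT s.a, s.b FROM S3Object s".
    // Field names are concatenated verbatim, so pass only trusted identifiers.
    static String selectFields(List<String> fields) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("at least one field is required");
        }
        String columns = fields.stream()
                .map(field -> "s." + field)
                .collect(Collectors.joining(", "));
        return "SELECT " + columns + " FROM S3Object s";
    }

    public static void main(String[] args) {
        // Prints: SELECT s.a, s.b FROM S3Object s
        System.out.println(selectFields(List.of("a", "b")));
    }
}
```

You can paste the printed expression into the console dialog first to check that it returns what you expect before wiring it into code.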



AWS Java SDK 1.x

Here is the code to do the select with AWS Java SDK 1.x:


The output is:

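With JSON output serialization, S3 Select separates records with a newline by default, and SDK 1.x hands you the result as a plain InputStream (getPayload().getRecordsInputStream()). Here is a rough sketch of turning such a stream into a list of records. SelectRecords and readRecords are made-up names, and the bytes in main are placeholder values standing in for a real response, not actual query output:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SelectRecords {

    // Reads newline-delimited records from the stream, skipping empty lines.
    // The stream is closed when reading finishes.
    static List<String> readRecords(InputStream in) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    records.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real call would pass the SDK's records stream instead.
        byte[] sample = "{\"a\":1,\"b\":2}\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readRecords(new ByteArrayInputStream(sample)));
    }
}
```

Each element of the returned list is one JSON record that you can pass to the JSON parser of your choice.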

AWS Java SDK 2.x

The code for the AWS Java SDK 2.x is more involved. Refer to this ticket for more information.
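Part of what makes it more involved is that SDK 2.x exposes S3 Select through the asynchronous client, so results arrive as a series of events rather than one stream. According to the S3 API reference, a single records message may carry one record, several records, or only part of a record. Here is a small sketch of collecting such chunks safely. RecordCollector is a hypothetical class using only the JDK, and in real code each chunk would presumably come from something like RecordsEvent.payload().asByteArray():

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RecordCollector {

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Called once per records event; a chunk may end mid-record or mid-character.
    void onRecords(byte[] chunk) {
        buffer.write(chunk, 0, chunk.length);
    }

    // Decodes only after all chunks have arrived, so a UTF-8 sequence
    // split across two events is rejoined before decoding.
    String result() {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder record containing a two-byte character (e with acute accent).
        String record = "{\"a\":\"\u00e9\"}\n";
        byte[] all = record.getBytes(StandardCharsets.UTF_8);

        RecordCollector collector = new RecordCollector();
        // Split inside the two-byte character to mimic an awkward event boundary.
        collector.onRecords(Arrays.copyOfRange(all, 0, 7));
        collector.onRecords(Arrays.copyOfRange(all, 7, all.length));

        System.out.println(collector.result().equals(record));
    }
}
```

Decoding each chunk to a String on its own would be simpler, but it can corrupt characters that straddle two events, which is why the sketch buffers raw bytes first.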


As you see, it’s possible to make S3 selects programmatically!

You might be wondering what @AWSClient and @ExtendWith( S3.class ) are.

They come from aws-junit5, a small library that injects AWS clients into your tests and can greatly simplify them. I am the author. The usage is really simple, so try it in your next project!

User contributions licensed under: CC BY-SA