JPA performance – SELECT DISTINCT and/or Java Set

Question

I have a Spring Boot application with a JpaRepository. I am trying right now to obtain all the unique values for a certain field in my db. Let's suppose I have an object Foo with an Integer PK id and a non-unique Integer field bar. I want to get all the unique values for bar and I'm wondering whic…

Accepted Answer

Option 1 or 2 is best for performance; option 3 returns every bar value from the table, duplicates included. My guess is that you don't see the duplicated values because you map the result to a Set, which cannot contain duplicates. As for option 1 compared to option 2, I would say it really depends on what you will be using these collections for. For example, see this question: "Performance and Memory allocation comparison between List and Set". I would go with option 2, so that the filtering happens in the database and the code makes it clear that there should be no duplicates in this collection.
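To illustrate the point about the Set hiding duplicates, here is a minimal plain-Java sketch. It does not touch a database: `allBars()` is a hypothetical stand-in for the rows a non-DISTINCT query would return, and the class and method names are made up for this example. The JPQL string is only an assumption about what the database-side variant might look like for the Foo entity from the question; in Spring Data JPA it would typically be placed in an `@Query` annotation on a repository method.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctBarDemo {

    // Assumed JPQL for the database-side approach; Foo and bar are the names from the question.
    static final String DISTINCT_QUERY = "SELECT DISTINCT f.bar FROM Foo f";

    // Stand-in for a non-DISTINCT query result: one value per row, duplicates included.
    static List<Integer> allBars() {
        return List.of(3, 1, 3, 2, 1);
    }

    // Client-side deduplication: copying into a Set silently drops the duplicates,
    // but only after every row has already been fetched.
    static Set<Integer> uniqueBars(List<Integer> bars) {
        return new LinkedHashSet<>(bars);
    }

    public static void main(String[] args) {
        List<Integer> rows = allBars();
        Set<Integer> unique = uniqueBars(rows);
        // prints "5 rows fetched, 3 unique values"
        System.out.println(rows.size() + " rows fetched, " + unique.size() + " unique values");
    }
}
```

The result looks the same either way, but with the Set-only approach all five rows cross the wire before being reduced to three, whereas a DISTINCT query would let the database send back only the three unique values.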
