Skip to content
Advertisement

Need to insert 100000 rows in mysql using hibernate in under 5 seconds

I am trying to insert 100,000 rows in a MYSQL table under 5 seconds using Hibernate(JPA). I have tried every trick hibernate offers and still can not do better than 35 seconds.

1st optimisation : I started with IDENTITY sequence generator which was resulting in 60 seconds to insert. I later abandoned the sequence generator and started assigning the @Id field myself by reading the MAX(id) and using AtomicInteger.incrementAndGet() to assign fields myself. That reduced the insert time to 35 seconds.

2nd optimisation : I enabled batch inserts, by adding

<prop key="hibernate.jdbc.batch_size">30</prop> <prop key="hibernate.order_inserts">true</prop> <prop key="hibernate.current_session_context_class">thread</prop> <prop key="hibernate.jdbc.batch_versioned_data">true</prop>

to the configuration. I was shocked to find that batch inserts did absolutely nothing to decrease insert time. It was still 35 seconds!

Now, I am thinking about trying to insert using multiple threads. Anyone has any pointers? Should I have chosen MongoDB?

Below is my configuration: 1. Hibernate configuration `

JavaScript

`

  1. Entity configuration :

`

JavaScript

`

  1. Persisting code (service) :

`

JavaScript

` 4. Dao class :

`

JavaScript

`

Logs: `

JavaScript

… …

JavaScript

Advertisement

Answer

After trying all possible solutions I finally found a solution to insert 100,000 rows under 5 seconds!

Things I tried:

1) Replaced hibernate/database’s AUTOINCREMENT/GENERATED id’s by self generated ID’s using AtomicInteger

2) Enabling batch_inserts with batch_size=50

3) Flushing cache after every ‘batch_size’ number of persist() calls

4) multithreading (did not attempt this one)

Finally what worked was using a native multi-insert query and inserting 1000 rows in one sql insert query instead of using persist() on every entity. For inserting 100,000 entities, I create a native query like this "INSERT into MyTable VALUES (x,x,x),(x,x,x).......(x,x,x)" [1000 row inserts in one sql insert query]

Now it takes around 3 seconds for inserting 100,000 records! So the bottleneck was the orm itself! For bulk inserts, the only thing that seems to work is native insert queries!

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement