
Java Large Files Disk IO Performance

I have two files (2 GB each) on my hard disk and want to compare them with each other:

  • Copying the original files with Windows explorer takes approx. 2-4 minutes (that is reading and writing – on the same physical and logical disk).
  • Reading with java.io.FileInputStream twice and comparing the byte arrays on a byte per byte basis takes 20+ minutes.
  • The java.io.BufferedInputStream buffer is 64 KB; the files are read in chunks and then compared.
  • Comparison is done in a tight loop over the two byte arrays.
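A tight byte-by-byte comparison loop of the kind described above might look like the following minimal sketch. The class and method names are hypothetical; this is not the original poster's code.

```java
// Hypothetical helper; names are illustrative only.
final class TightLoopCompare {
    /**
     * Compares the first len bytes of a and b.
     * Returns the index of the first differing byte, or -1 if none differs.
     */
    static int firstMismatch(byte[] a, byte[] b, int len) {
        for (int i = 0; i < len; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

A loop like this is typically cheap compared to the disk reads that feed it.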

What can I do to speed this up? Is NIO supposed to be faster than plain streams? Is Java unable to use DMA/SATA technologies, making slow OS API calls instead?

EDIT:
Thanks for the answers. I did some experiments based on them. As Andreas showed:

streams or nio approaches do not differ much.
More important is the correct buffer size.

This is confirmed by my own experiments. As the files are read in big chunks, even additional buffers (BufferedInputStream) do not help. Optimising the comparison is possible, and I got the best results with 32-fold unrolling, but the time spent in comparison is small compared to the disk reads, so the speedup is small. Looks like there is nothing I can do ;-(
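The unrolled loop itself is not shown in the post. As a different, swapped-in technique (not the 32-fold unrolling described above), Java 9 and later provide `java.util.Arrays.mismatch`, which leaves the choice of comparison strategy to the JDK. A minimal sketch, assuming Java 9+ and a hypothetical wrapper name:

```java
import java.util.Arrays;

// Hypothetical wrapper; requires Java 9 or later for the ranged Arrays.mismatch.
final class ChunkCompare {
    /**
     * Returns the index of the first differing byte within the first len bytes
     * of a and b, or -1 if those ranges are identical.
     */
    static int firstMismatch(byte[] a, byte[] b, int len) {
        return Arrays.mismatch(a, 0, len, b, 0, len);
    }
}
```

Since the comparison time is small next to the disk reads, this is unlikely to change the overall runtime much; it mainly keeps the code short.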


Answer

I tried out three different methods of comparing two identical 3.8 GB files with buffer sizes between 8 KB and 1 MB. The first method used just two buffered input streams.
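A minimal sketch of what a two-stream comparison could look like is given here. It is not the code that was measured; the class name, method names, and the `bufferSize` parameter are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the buffered-stream approach.
final class StreamCompare {
    /** Reads until buf is full or the stream ends; returns the number of bytes read. */
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    /** Returns true if both files have identical content. */
    static boolean sameContent(Path p1, Path p2, int bufferSize) throws IOException {
        try (InputStream in1 = new BufferedInputStream(Files.newInputStream(p1), bufferSize);
             InputStream in2 = new BufferedInputStream(Files.newInputStream(p2), bufferSize)) {
            byte[] b1 = new byte[bufferSize];
            byte[] b2 = new byte[bufferSize];
            while (true) {
                int n1 = fill(in1, b1);
                int n2 = fill(in2, b2);
                if (n1 != n2) {
                    return false; // one file ended before the other
                }
                if (n1 == 0) {
                    return true; // both streams exhausted without a difference
                }
                for (int i = 0; i < n1; i++) {
                    if (b1[i] != b2[i]) {
                        return false;
                    }
                }
            }
        }
    }
}
```

The `fill` helper matters because a single `read` call may return fewer bytes than requested, which would otherwise misalign the two chunks.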

The second approach uses a thread pool that reads in two different threads and compares in a third one. This got slightly higher throughput at the expense of high CPU utilisation. Managing the thread pool adds a lot of overhead with those short-running tasks.
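The following is a simplified sketch of the threaded idea, not the measured code. It assumes a two-thread pool for the reads and, unlike the description above, compares on the calling thread rather than in a third pool thread. All names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: both files are read in parallel, one chunk at a time.
final class ThreadedCompare {
    /** Reads until buf is full or the stream ends; returns the number of bytes read. */
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }

    static boolean sameContent(Path p1, Path p2, int bufferSize)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try (InputStream in1 = Files.newInputStream(p1);
             InputStream in2 = Files.newInputStream(p2)) {
            byte[] b1 = new byte[bufferSize];
            byte[] b2 = new byte[bufferSize];
            while (true) {
                // One short-running task per chunk and file; this is where
                // the scheduling overhead mentioned above comes from.
                Future<Integer> f1 = pool.submit(() -> fill(in1, b1));
                Future<Integer> f2 = pool.submit(() -> fill(in2, b2));
                int n1 = f1.get();
                int n2 = f2.get();
                if (n1 != n2) {
                    return false;
                }
                if (n1 == 0) {
                    return true;
                }
                for (int i = 0; i < n1; i++) {
                    if (b1[i] != b2[i]) {
                        return false;
                    }
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Submitting a task per chunk keeps the sketch short, but with small buffers the task hand-off cost can rival the read itself.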

The third approach uses NIO, as posted by laginimaineb.
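laginimaineb's code is not reproduced here. A minimal NIO sketch, assuming `FileChannel` with direct `ByteBuffer`s and hypothetical class and method names, might look like this:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a channel-based comparison.
final class NioCompare {
    /** Fills buf from the channel until it is full or EOF, then flips it for reading. */
    private static void fill(FileChannel ch, ByteBuffer buf) throws IOException {
        buf.clear();
        while (buf.hasRemaining() && ch.read(buf) >= 0) {
            // keep reading until the buffer is full or the channel is exhausted
        }
        buf.flip();
    }

    static boolean sameContent(Path p1, Path p2, int bufferSize) throws IOException {
        try (FileChannel c1 = FileChannel.open(p1, StandardOpenOption.READ);
             FileChannel c2 = FileChannel.open(p2, StandardOpenOption.READ)) {
            if (c1.size() != c2.size()) {
                return false; // different lengths can never match
            }
            ByteBuffer b1 = ByteBuffer.allocateDirect(bufferSize);
            ByteBuffer b2 = ByteBuffer.allocateDirect(bufferSize);
            while (true) {
                fill(c1, b1);
                fill(c2, b2);
                // ByteBuffer.equals compares the remaining bytes of both buffers.
                if (!b1.equals(b2)) {
                    return false;
                }
                if (!b1.hasRemaining()) {
                    return true; // both channels exhausted without a difference
                }
            }
        }
    }
}
```

The size check up front avoids reading anything when the lengths already differ.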

As you can see, the general approach does not differ much. More important is the correct buffer size.

What is strange is that I read 1 byte less using threads. I could not spot the error, though.


the code used:
