Hashing runs out of memory and gets slower and slower over time

I have a GUI desktop application that generates different types of hashes (for example, MD5) for files and directories. Recently, while testing with a 1 GB test file, I noticed that hashing gets slower and slower over time. The first hash of the 1 GB file takes about 2 seconds, but later runs on exactly the same file take about 76 seconds.

To demonstrate the problem, I have created sample code that anyone can try (for repeatability). It has two key steps: (1) it reads the file into a byte array, and (2) it generates the hash of that byte array. (The real program has several switches and if-else statements, for example to decide whether the input is a file or a directory, and a lot of JavaFX GUI elements are involved.)

I’ll show that even this simplified code becomes 8 times slower after repeating it 5 times! From what I have read on multiple forums, the cause is probably a memory leak, excessive memory consumption, or something similar. What I would like is to free the memory between cycles, so that every run takes only as long as the first one (2 seconds).

The mentioned sample code is the following:

The console output, where you can see the increasing time:

Answer

Your problem is probably that you keep a reference to the loaded file contents somewhere, so the garbage collector cannot free them.

A better approach for an operation like calculating a hash over a big file is not to load everything into memory, but to read and process the file piece by piece:
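As a rough illustration of the piece-by-piece idea, here is a minimal sketch in Java. It is not the code from the question or the original answer. The class name `ChunkedHash`, the helper `hashFile`, and the 8 KB buffer size are assumptions chosen for this example; only `MessageDigest` and `Files.newInputStream` are standard JDK APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChunkedHash {

    // Hypothetical helper: hashes a file without ever holding
    // the whole file in memory. Returns the digest as lowercase hex.
    public static String hashFile(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192]; // assumed chunk size, reused for every read

        try (InputStream in = Files.newInputStream(file)) {
            int read;
            // read() returns the number of bytes read, or -1 at end of file
            while ((read = in.read(buffer)) != -1) {
                // feed only the bytes that were actually read
                digest.update(buffer, 0, read);
            }
        }

        // convert the final digest bytes to a hex string
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Repeats the hash five times and prints the time of each run,
    // mirroring the repeated-run measurement described in the question.
    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]);
        for (int i = 1; i <= 5; i++) {
            long start = System.nanoTime();
            String md5 = hashFile(file, "MD5");
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("Run " + i + ": " + md5 + " (" + millis + " ms)");
        }
    }
}
```

Because the only long-lived allocation is the small buffer, memory use should stay roughly constant regardless of file size, and nothing from a previous run needs to be freed before the next one starts.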

User contributions licensed under: CC BY-SA