
Reading in large text files, problems with the garbage collector and Scanner object

I am writing a program that needs to read in very large files (about 150 MB of text). I run into an out-of-memory error when I try to read files larger than 50 MB. Here is an excerpt from my code.

if (returnVal == JFileChooser.APPROVE_OPTION) {
        file = fc.getSelectedFile();
        gui.setTitle("Fluent Helper - " + file.toString());
        try{
            scanner = new Scanner(new FileInputStream(file));
            gui.getStatusLabel().setText("Reading Faces...");
            while(scanner.hasNext()){
                count++;
                if(count<1000000){
                    System.gc();
                    count = 0;
                }
                readStr = scanner.nextLine()+ "\n";
                if(readStr.equals("#\n")){
                    isFaces = false;
                    gui.getStatusLabel().setText("Reading Cells...");
                }else if(isFaces){
                    faces.add(new Face(readStr));
                }else{
                    cells.add(new Cell(readStr));
                }
            }
        }catch (Exception e){
            e.printStackTrace();
        }finally{
            try{
                scanner.close();
            }catch(Exception e){
                e.printStackTrace();
            }
        }
        System.out.println("file selected");
    } else {
        System.out.println("file not selected");
    }

The small block that calls the garbage collector after an arbitrary number of reads is something I added to solve the memory problem, but it doesn't work. Instead, the program hangs and never gets to the cells portion of the file (which should happen in less than a second). Here is the block.

                if(count<1000000){
                    System.gc();
                    count = 0;
                }

My guess is that the Scanner's pointer is getting garbage collected or something similar, but I really don't have any clue. Launching the program with a larger heap is not really an option for me, because the program should be usable by people without much computer knowledge.

I would like a way to read the file in without a problem, whether through memory management, fixing the Scanner, or a more efficient means of reading the file. Thanks, everyone.


Answer

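The GC runs automatically when required, so calling it yourself will just slow down your application. In your code, count is reset to 0 inside the block, so the check count<1000000 is true again on the next iteration and System.gc() ends up being called for every line read. That is the likely reason the program appears to hang.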
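The read loop can be written without any manual System.gc() call. Below is a minimal sketch, assuming (as in the question) that the faces section ends at a line containing only #. The class name SectionReader is hypothetical, and the lines are stored as plain strings because the Face and Cell constructors are not shown in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the input into "faces" and "cells" lines.
final class SectionReader {
    final List<String> faces = new ArrayList<>();
    final List<String> cells = new ArrayList<>();

    static SectionReader read(Reader in) {
        SectionReader result = new SectionReader();
        boolean isFaces = true;
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            // readLine() strips the line terminator, so no "\n" is appended.
            while ((line = reader.readLine()) != null) {
                if (line.equals("#")) {
                    isFaces = false;          // separator: switch to cells
                } else if (isFaces) {
                    result.faces.add(line);   // in the question: new Face(line)
                } else {
                    result.cells.add(line);   // in the question: new Cell(line)
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

It could be called as SectionReader.read(new java.io.FileReader(file)). This removes the per-line GC cost, but it does not reduce how much data is retained, so a file that does not fit in the heap will still fail.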

The problem is the amount of data you are retaining:

                faces.add(new Face(readStr));
            }else{
                cells.add(new Cell(readStr));

Together, these objects exceed your maximum heap size. Can you try running with -mx1g (a 1 GB maximum heap, equivalent to -Xmx1g) to see if this makes a difference?
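To see how large the maximum heap actually is on a given machine, the program can ask the JVM at run time. This is a rough sketch; the class name HeapCheck and the comparison rule are assumptions for illustration. The rule is deliberately conservative: if the file alone is at least as large as the heap, retaining all of its contents cannot fit. A smaller file may still fail because of per-object overhead.

```java
// Hypothetical helper for comparing a file size against the JVM heap limit.
final class HeapCheck {
    // Maximum amount of memory the JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Conservative check: true when the file by itself is at least as
    // large as the heap, so keeping all of it in memory cannot work.
    static boolean exceedsHeap(long fileBytes, long maxHeapBytes) {
        return fileBytes >= maxHeapBytes;
    }
}
```

For example, HeapCheck.exceedsHeap(file.length(), HeapCheck.maxHeapBytes()) could be evaluated before reading, so that the user gets a clear message instead of an OutOfMemoryError.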

BTW: Why are you adding a \n to the end of each line?

User contributions licensed under: CC BY-SA