Nested While loop not counting two files correctly

I am working on a program that takes a file in from the command line. This file is then compared to a text file to see if it has certain words in it. If it does, I want to increment a counter each time one of those words is found.

I thought I was on the right track, using while loops to read each file through to the end, and using contains to check whether a word exists in both files.

However, when I output the number, it is just the total number of words in the text file! I’m not sure why this is. I am new to Java, so this is something I’m not too comfortable with. Any help would be appreciated.

    String fname = args[0];      // input file of text
    String words1;
    String words2;
    int numWords = 0;            // total number of words

    FileInputStream fileKeywords = new FileInputStream("Keywords.txt");
    Scanner keywords = new Scanner(fileKeywords);
    keywords.useDelimiter("[^a-zA-Z']+");  // delimiter: any run of characters other than letters and '

    FileInputStream fileJava = new FileInputStream(args[0]);
    Scanner java = new Scanner(fileJava);
    java.useDelimiter("[^a-zA-Z']+");  // delimiter: any run of characters other than letters and '

    while (java.hasNext()) {
      words1 = java.next();
      while (keywords.hasNext()) {
        words2 = keywords.next();
        if (words2.contains(words1)) {
          numWords++;
        }
      }
    }

Answer

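words2 is a single word, and contains here is the ‘string contains’ method, which checks for a substring. For example, "Hello".contains("ell") is true. Your intent is that words2 holds all the keywords, but it doesn’t. On top of that, the keywords scanner is used up after the first pass of the outer loop, so the inner loop only ever runs for the first word of the input file.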
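To make the difference concrete, here is a small sketch comparing a substring check on a String with a membership check on a collection. The class name ContainsDemo, its two helper methods, and the sample words are made up for illustration; they are not from your program.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ContainsDemo {
    // Substring test: true when needle occurs anywhere inside haystack.
    public static boolean substringMatch(String haystack, String needle) {
        return haystack.contains(needle);
    }

    // Membership test: true only when word equals one element of the set.
    public static boolean wholeWordMatch(Set<String> words, String word) {
        return words.contains(word);
    }

    public static void main(String[] args) {
        // Hypothetical keyword list, for illustration only.
        Set<String> keywords = new HashSet<>(Arrays.asList("int", "for"));

        System.out.println(substringMatch("Hello", "ell"));         // true
        System.out.println(substringMatch("interface", "int"));     // true
        System.out.println(wholeWordMatch(keywords, "interface"));  // false
        System.out.println(wholeWordMatch(keywords, "int"));        // true
    }
}
```

So with String.contains, a keyword like "interface" would count as a hit for the word "int", which is probably not what you want.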

That while loop? Just delete it and start over:

  • First, read ALL keywords (yes, this involves one single non-nested while loop) and store them in a data type that is good at this. HashSet seems best, but if that scares you, ArrayList will do the job all right if the keywords list is less than ~1000 or so words.
  • Then loop, once, through your ‘second scanner’, the one scanning the input file, (java seems like a really, really bad name for this – variable names are very important). Make sure the contains call you invoke is on some collection type (ArrayList or HashSet), not on a string.
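Put together, the two steps could look roughly like the sketch below. It is one possible way to do it, not the only one. The class name KeywordCounter and the methods readWords and countKeywords are names I invented; the Keywords.txt file name and the delimiter pattern are taken from your code. It assumes you want exact, case-sensitive whole-word matches.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class KeywordCounter {
    // Same pattern as in the question: any run of characters
    // other than letters and apostrophes separates two words.
    private static final String DELIMITER = "[^a-zA-Z']+";

    // Step 1: read every word from the scanner into a set (one plain loop).
    public static Set<String> readWords(Scanner in) {
        in.useDelimiter(DELIMITER);
        Set<String> words = new HashSet<>();
        while (in.hasNext()) {
            words.add(in.next());
        }
        return words;
    }

    // Step 2: walk the input once and count the words that are in the set.
    public static int countKeywords(Scanner in, Set<String> keywords) {
        in.useDelimiter(DELIMITER);
        int count = 0;
        while (in.hasNext()) {
            if (keywords.contains(in.next())) {  // Set.contains, not String.contains
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        try (Scanner keywordScanner = new Scanner(new FileInputStream("Keywords.txt"));
             Scanner sourceScanner = new Scanner(new FileInputStream(args[0]))) {
            Set<String> keywords = readWords(keywordScanner);
            System.out.println(countKeywords(sourceScanner, keywords));
        }
    }
}
```

Each file is read exactly once, so there is no scanner to run dry halfway through, and a HashSet lookup stays fast even when the keyword list grows.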