I’m doing an exercise where I need to count how many times each word appears in a text, and I also need to print the lines on which each word appears.
Text example:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
This is my method to find and count every word:
public void findWords() {
    try {
        File myObj = new File("path\\text.txt");
        Scanner myReader = new Scanner(myObj);
        while (myReader.hasNextLine()) {
            String text = myReader.nextLine();
            final String lowerText = text.toLowerCase();
            // Split on runs of non-word characters (the backslash must be escaped in a Java string)
            final String[] split = lowerText.split("\\W+");
            System.out.println("Output: ");
            for (String s : split) {
                // Entries already counted are set to null below, so skip them
                if (s == null) {
                    continue;
                }
                int count = 0;
                for (int i = 0; i < split.length; i++) {
                    final boolean sameWorld = s.equals(split[i]);
                    if (sameWorld) {
                        count = count + 1;
                        split[i] = null;
                    }
                }
                System.out.println(s + " " + count);
            }
        }
        myReader.close();
    } catch (FileNotFoundException e) {
        System.out.println(e);
    }
}
The current output is something like this:
Output:
lorem 1
ipsum 1
dolor 1
sit 1
amet 1
consectetur 1
adipiscing 1
....
I want it to also show which line each word is on, like this:
Output:
lorem 1 - line 1
ipsum 1 - line 1
...
To make it clearer, the word “ut” appears 3 times in 2 different lines and the output should look like this:
ut 3 - line 1 2
Answer
Would it be possible to create an object to handle each word? It could have a String for the text of the word, an array of ints for the lines it appears on, and an int for the frequency of its occurrences in the text. To track the line, you could keep a counter variable that is incremented inside the while loop.
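A word object along those lines might look roughly like the sketch below. The names WordEntry and addOccurrence are made up for illustration, and a List<Integer> stands in for the int array so it can grow; it also assumes line numbers are recorded in increasing order, as they would be when reading a file top to bottom.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical word object: the text, its frequency, and the lines it occurs on. */
final class WordEntry {
    private final String text;
    private int frequency;
    private final List<Integer> lines = new ArrayList<>();

    WordEntry(String text) {
        this.text = text;
    }

    /** Records one occurrence of the word on the given 1-based line. */
    void addOccurrence(int lineNumber) {
        frequency++;
        // Lines arrive in increasing order, so comparing with the last
        // stored line is enough to avoid listing the same line twice.
        if (lines.isEmpty() || lines.get(lines.size() - 1) != lineNumber) {
            lines.add(lineNumber);
        }
    }

    int getFrequency() {
        return frequency;
    }

    /** Formats the entry in the style asked for, e.g. "ut 3 - line 1 2". */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(text)
                .append(' ').append(frequency).append(" - line");
        for (int line : lines) {
            sb.append(' ').append(line);
        }
        return sb.toString();
    }
}
```

Inside the reading loop, a counter incremented once per nextLine() call would supply the lineNumber argument.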
If you went that route, you might be better off using a HashMap and a TreeSet to store the word objects, and then print them out in some order.
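One possible way to put that together is sketched below. It is a simplification rather than a definitive solution: instead of word objects it keeps two HashMaps keyed by the word (one for counts, one holding a TreeSet of line numbers), and it prints in alphabetical order, which is just one choice of ordering. The class name WordLineIndex and the method index are hypothetical, and the sample text in main is made up for the demo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeSet;

final class WordLineIndex {

    /** Reads all lines and returns one "word count - line n m" string per word. */
    static List<String> index(Scanner reader) {
        Map<String, Integer> counts = new HashMap<>();
        Map<String, TreeSet<Integer>> lines = new HashMap<>();
        int lineNumber = 0; // counter for the line currently being read

        while (reader.hasNextLine()) {
            lineNumber++;
            for (String word : reader.nextLine().toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split can yield "" when a line starts with punctuation
                }
                counts.merge(word, 1, Integer::sum);
                // TreeSet keeps line numbers sorted and drops duplicates
                lines.computeIfAbsent(word, k -> new TreeSet<>()).add(lineNumber);
            }
        }

        List<String> report = new ArrayList<>();
        // Copying the keys into a TreeSet gives alphabetical output order
        for (String word : new TreeSet<>(counts.keySet())) {
            StringBuilder sb = new StringBuilder(word)
                    .append(' ').append(counts.get(word)).append(" - line");
            for (int line : lines.get(word)) {
                sb.append(' ').append(line);
            }
            report.add(sb.toString());
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up two-line sample; a Scanner over a File works the same way
        String sample = "Ut enim ad minim, ut labore.\nNisi ut aliquip.";
        index(new Scanner(sample)).forEach(System.out::println);
    }
}
```

Because the maps live outside the while loop, counts accumulate across the whole file instead of being reset for every line, which is what allows a word such as "ut" to be reported once with all of its lines.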