How can I split a string without knowing the split characters a priori?

For my project I have to read various input graphs. Unfortunately, the input edges do not all have the same format: some are comma-separated, others are tab-separated, and so on. For example:

File 1:

123,45
67,89
...

File 2:

123    45
67    89
...

Rather than handling each case separately, I would like to automatically detect the split characters. Currently I have developed the following solution:

String str = "123,45";
String splitChars = "";
for (int i = 0; i < str.length(); i++) {
    if (!Character.isDigit(str.charAt(i))) {
        splitChars += str.charAt(i);
    }
}

String[] endpoints = str.split(splitChars);

Basically, I take the first row, collect all of its non-numeric characters, and use the resulting string as the split pattern. Is there a cleaner way to do this?

Answer

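Split the string on \D+, which means one or more non-digit characters. In a Java string literal the backslash must itself be escaped, so the pattern is written as "\\D+".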

Demo:

import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        // Test strings
        String[] arr = { "123,45", "67,89", "125      89", "678 129" };
        for (String s : arr) {
            // "\\D+" is the regex \D+ (one or more non-digits); a single backslash would not compile
            System.out.println(Arrays.toString(s.split("\\D+")));
        }
    }
}

Output:

[123, 45]
[67, 89]
[125, 89]
[678, 129]
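
For reading graph edges, the same split can be wrapped in a small helper that returns the endpoints as numbers. The following is only a sketch: the class name EdgeParser and the method parseEdge are hypothetical (they are not part of the original answer), and it assumes vertex IDs are non-negative integers that fit in an int. It also filters out the empty first token that split produces when a line starts with a delimiter.

```java
import java.util.Arrays;

class EdgeParser {
    // Hypothetical helper: parses one edge line into its integer endpoints.
    // Assumes non-negative integer vertex IDs separated by any non-digit characters.
    static int[] parseEdge(String line) {
        // Split on runs of one or more non-digit characters
        String[] tokens = line.trim().split("\\D+");
        return Arrays.stream(tokens)
                .filter(t -> !t.isEmpty())    // a leading delimiter yields an empty first token
                .mapToInt(Integer::parseInt)  // throws NumberFormatException if an ID overflows int
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseEdge("123,45")));
        System.out.println(Arrays.toString(parseEdge("\t67\t89")));
        System.out.println(Arrays.toString(parseEdge("125 ; 89")));
    }
}
```

Note that \D+ also treats a minus sign or a decimal point as a separator, so this approach only suits inputs whose endpoints are plain non-negative integers; negative or decimal values would need a different pattern.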