Splitting a string using special characters and keeping them

I’m trying to split a string on special characters while keeping them, but I can’t get the parentheses to split properly. This is the code I’m trying:

class Ione
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String str = "g, i+, w+ | (d | (u+, f))+"; 
        String[] chunks = str.split(",\\s+|(?=\\W)");
        for (int q = 0; q < chunks.length; q++) {
            System.out.println(chunks[q]);
        }
    }
}

The regex does not separate the opening parenthesis ( from the character that follows it.

I’m trying to get the following output:

g,i,+,w,+,|,(,d,|,(,u,+,f,),),+

Could someone please help? Thank you.

Answer

You want split() to return every character as its own element, except spaces and commas. So split on runs of spaces/commas, and also on “nothing”: the zero-width position between two adjacent characters that are neither whitespace nor a comma.

String str = "g, i+, w+ | (d | (u+, f))+";
String[] chunks = str.split("[\\s,]+|(?<![\\s,])(?![\\s,])");
System.out.println(String.join(",", chunks));

Output

g,i,+,w,+,|,(,d,|,(,u,+,f,),),+
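To make the difference between the two patterns easier to see, here is a minimal, self-contained sketch that runs both against the same input. The class name `SplitDemo`, the constants `ORIGINAL` and `FIXED`, and the helper `tokenize` are made up for illustration. The original pattern only splits at the position before a non-word character, never after it, so a chunk such as `(d` stays glued together and bare spaces come back as chunks of their own.

```java
import java.util.Arrays;

public class SplitDemo {
    // Pattern from the question: split on ", " or at the zero-width
    // position BEFORE any non-word character (never after it).
    static final String ORIGINAL = ",\\s+|(?=\\W)";

    // Pattern from the answer: either consume a run of whitespace/commas,
    // or split at the empty position whose left and right neighbours are
    // both neither whitespace nor a comma.
    static final String FIXED = "[\\s,]+|(?<![\\s,])(?![\\s,])";

    // Hypothetical helper, only here so both patterns go through one code path.
    static String[] tokenize(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String str = "g, i+, w+ | (d | (u+, f))+";
        // Brackets make the stray-space chunks of the original pattern visible.
        System.out.println(Arrays.toString(tokenize(str, ORIGINAL)));
        System.out.println(String.join(",", tokenize(str, FIXED)));
    }
}
```

Note that this relies on the Java 8+ behaviour of split(), where a zero-width match at the very start of the input does not produce a leading empty string.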

Alternative: Search for what you want, and collect it into an array or List (Matcher.results() requires Java 9 or later):

// Needs java.util.regex.Pattern and java.util.regex.MatchResult
String str = "g, i+, w+ | (d | (u+, f))+";
String[] chunks = Pattern.compile("[^\\s,]").matcher(str).results()
        .map(MatchResult::group).toArray(String[]::new);
System.out.println(String.join(",", chunks));

Same output.

For older versions of Java, use a find() loop:

// Needs java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
String str = "g, i+, w+ | (d | (u+, f))+";
List<String> chunkList = new ArrayList<>();
for (Matcher m = Pattern.compile("[^\\s,]").matcher(str); m.find(); )
    chunkList.add(m.group());
System.out.println(chunkList);

Output

[g, i, +, w, +, |, (, d, |, (, u, +, f, ), ), +]

You can always convert the List to an array:

String[] chunks = chunkList.toArray(new String[0]);
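If you want the find() approach as something reusable, one possible shape is a small helper with the pattern compiled once. This is only a sketch: the class name `TokenCollector` and the method `tokens` are invented for the example, and main() simply checks the helper against the split() pattern shown earlier for the same input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenCollector {
    // One token = one character that is neither whitespace nor a comma.
    private static final Pattern TOKEN = Pattern.compile("[^\\s,]");

    // Hypothetical helper: collects every match of TOKEN, in order.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "g, i+, w+ | (d | (u+, f))+";
        List<String> viaFind = tokens(str);
        String[] viaSplit = str.split("[\\s,]+|(?<![\\s,])(?![\\s,])");
        System.out.println(viaFind);
        // Compare the two approaches on this input.
        System.out.println(viaFind.equals(Arrays.asList(viaSplit)));
    }
}
```

The helper works on Java 8 as well, since it only uses find() and group() rather than Matcher.results().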


Source: stackoverflow