
How to split a byte array around a byte sequence in Java?

How to split a byte[] around a byte sequence in Java? Something like the byte[] version of String#split(regex).

Example

Let’s take this byte array:
[11 11 FF FF 22 22 22 FF FF 33 33 33 33]

and let’s choose the delimiter to be
[FF FF]

Then the split will result in these three parts:
[11 11]
[22 22 22]
[33 33 33 33]

EDIT:

Please note that you cannot convert the byte[] to a String, split it, and then convert the pieces back, because of encoding issues. When you do such a conversion on a byte array, the resulting byte[] can differ from the original. Please refer to this: Conversion of byte[] into a String and then back to a byte[]


Answer

Note that you can reliably convert from byte[] to String and back, with a one-to-one mapping of bytes to chars, if you use the encoding ISO-8859-1 (available as StandardCharsets.ISO_8859_1 since Java 7).
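As a minimal sketch of that round trip (the class name Latin1RoundTrip and the helper roundTrip are made up for illustration), the following decodes every possible byte value as ISO-8859-1 and encodes it back:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Hypothetical helper: decode the bytes as ISO-8859-1, then encode them again.
    static byte[] roundTrip(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Build an array containing every possible byte value once.
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        // ISO-8859-1 maps each byte to the char with the same numeric value,
        // so the round trip reproduces the input.
        System.out.println(Arrays.equals(all, roundTrip(all))); // prints true
    }
}
```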

However, it’s still an ugly solution.

I think you’ll need to roll your own.

I suggest solving it in two stages:

  1. Work out how to find the index of each occurrence of the separator. Google for “Knuth-Morris-Pratt” for an efficient algorithm, although a more naive algorithm will be fine for short delimiters.
  2. Each time you find an index, use Arrays.copyOfRange() to get the piece you need and add it to your output list.

Here it is using a naive pattern-finding algorithm. KMP becomes worthwhile when the delimiters are long: it avoids backtracking in the input, yet still finds a delimiter that begins inside a partial match that failed at its last bytes.

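import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Returns true if pattern occurs in input starting at index pos.
public static boolean isMatch(byte[] pattern, byte[] input, int pos) {
    // Not enough bytes left for a full match; also avoids reading past the end.
    if (pos + pattern.length > input.length) {
        return false;
    }
    for (int i = 0; i < pattern.length; i++) {
        if (pattern[i] != input[pos + i]) {
            return false;
        }
    }
    return true;
}

// Splits input around every non-overlapping occurrence of pattern.
public static List<byte[]> split(byte[] pattern, byte[] input) {
    if (pattern.length == 0) {
        throw new IllegalArgumentException("pattern must not be empty");
    }
    List<byte[]> l = new LinkedList<byte[]>();
    int blockStart = 0;
    for (int i = 0; i < input.length; i++) {
        if (isMatch(pattern, input, i)) {
            l.add(Arrays.copyOfRange(input, blockStart, i));
            blockStart = i + pattern.length;
            // The loop's i++ runs next, so step back one to resume exactly at blockStart.
            i = blockStart - 1;
        }
    }
    // Whatever follows the last delimiter (possibly empty) is the final piece.
    l.add(Arrays.copyOfRange(input, blockStart, input.length));
    return l;
}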
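
For long delimiters, the Knuth-Morris-Pratt idea mentioned above could be sketched roughly as follows. This is only an illustration, not a tuned implementation: the class name KmpSplit and the helper failureTable are made up for the example, and it assumes a non-empty delimiter and non-overlapping matches scanned left to right.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KmpSplit {
    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] failureTable(byte[] pattern) {
        int[] fail = new int[pattern.length];
        int k = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (pattern[i] == pattern[k]) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    static List<byte[]> split(byte[] pattern, byte[] input) {
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        int[] fail = failureTable(pattern);
        List<byte[]> parts = new ArrayList<>();
        int blockStart = 0;
        int k = 0; // number of pattern bytes matched so far
        for (int i = 0; i < input.length; i++) {
            // On a mismatch, fall back in the pattern; i never moves backwards.
            while (k > 0 && input[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (input[i] == pattern[k]) {
                k++;
            }
            if (k == pattern.length) {
                int matchStart = i - pattern.length + 1;
                parts.add(Arrays.copyOfRange(input, blockStart, matchStart));
                blockStart = i + 1;
                k = 0; // start fresh so matches do not overlap
            }
        }
        parts.add(Arrays.copyOfRange(input, blockStart, input.length));
        return parts;
    }

    public static void main(String[] args) {
        byte[] input = {0x11, 0x11, (byte) 0xFF, (byte) 0xFF, 0x22, 0x22, 0x22,
                        (byte) 0xFF, (byte) 0xFF, 0x33, 0x33, 0x33, 0x33};
        byte[] delimiter = {(byte) 0xFF, (byte) 0xFF};
        for (byte[] part : split(delimiter, input)) {
            System.out.println(Arrays.toString(part));
        }
    }
}
```

Run on the example from the question, this yields the three parts [11 11], [22 22 22] and [33 33 33 33] (printed in decimal by Arrays.toString). Two delimiters in a row produce an empty piece between them.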
User contributions licensed under: CC BY-SA