What is the difference between the chars() and codePoints() methods in the CharSequence interface?

Question

I read the Javadoc but don't understand the difference; both of them return the same result. Also, can anyone explain what 'zero-extending' means? Javadoc of the chars() method: Returns a stream of int zero-extending the char values from this sequence. Any char which maps to a surrogate code point is…

Accepted Answer

A 'char' is a 16-bit unsigned value in Java, so there are 65536 possible chars. Unicode unfortunately now has more than 65536 characters, each of which is identified by a 'codepoint', which is a number from 0 up to 0x10FFFF. It is therefore not possible to represent every character as a single Java 'char'.

There are two choices available to the Java programmer for codepoints larger than 65535: a pair of chars (known as a surrogate pair) or else a single 32-bit integer codepoint. chars() gives you the former, one int per char; codePoints() gives you the latter, one int per codepoint. The difference between char and codepoint shows up only for codepoints larger than 65535, which is why both methods return the same result for ordinary text. Note that the 32-bit 'codepoint' value is not simply the concatenation of the two 16-bit 'char' values. The surrogate pair is appropriately decoded.
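To make the difference concrete, here is a minimal sketch that runs both methods over a string containing one character above 65535 (U+1F600, chosen only for illustration). The class name CharsVsCodePoints and its helper methods are made up for this example; only chars(), codePoints() and Character.toCodePoint() come from the JDK. It also touches on 'zero-extending': each 16-bit char is widened to an int by filling the upper bits with zeros, so the value never comes out negative.

```java
import java.util.Arrays;

public class CharsVsCodePoints {

    // One int per 16-bit char; surrogates are passed through undecoded.
    static int[] charValues(CharSequence s) {
        return s.chars().toArray();
    }

    // One int per Unicode codepoint; surrogate pairs are decoded.
    static int[] codePointValues(CharSequence s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which Java stores as the surrogate pair D83D DE00.
        String s = "a\uD83D\uDE00";

        System.out.println(s.length());                           // 3
        System.out.println(Arrays.toString(charValues(s)));       // [97, 55357, 56832]
        System.out.println(Arrays.toString(codePointValues(s)));  // [97, 128512]

        // The codepoint is decoded from the pair, not concatenated:
        // 0xD83DDE00 would be the concatenation, but the real value is 0x1F600.
        int decoded = Character.toCodePoint('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(decoded));         // 1f600

        // Zero-extension: the char 0xFFFF becomes the int 65535,
        // whereas a signed 16-bit short with the same bits becomes -1.
        System.out.println(charValues("\uFFFF")[0]);               // 65535
        System.out.println((int) (short) 0xFFFF);                  // -1
    }
}
```

For a string made up only of characters at or below 65535, such as "abc", the two arrays come out identical.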

Answer