How to make a Java containsIgnoreCase that works with all human languages

For example, I have this simple containsIgnoreCase method:

public static boolean containsIgnoreCase(String a, String b) {
    if (a == null || b == null) {
        return false;
    }
    return a.toLowerCase().contains(b.toLowerCase());
}

But it fails for some comparisons, such as ΙΧΘΥΣ & ιχθυσ.

So I switched to the Apache Commons Lang library, which is mentioned here:

import org.apache.commons.lang3.StringUtils;

which has its own method StringUtils.containsIgnoreCase:

public static boolean containsIgnoreCase2(String a, String b) {
    if (a == null || b == null) {
        return false;
    }

    return StringUtils.containsIgnoreCase(a, b);
}

Now it works for ΙΧΘΥΣ & ιχθυσ, but it fails for weiß & WEISS, tschüß & TSCHÜSS, ᾲ στο διάολο & Ὰͅ Στο Διάολο, ﬂour and water & FLOUR AND WATER.

So I wonder whether it is possible to write something that works for all languages, or am I missing some configuration in the Apache library?

I also saw that the icu4j library could be used, but I could not find an example:

<dependency>
    <groupId>com.ibm.icu</groupId>
    <artifactId>icu4j</artifactId>
    <version>72.1</version>
</dependency>

Any help or recommendation is appreciated 🙂

Answer

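toLowerCase() and toUpperCase() are not always symmetric: for example, "ß" uppercases to "SS", but "SS" lowercases to "ss" and never back to "ß". All of your examples work if you uppercase both strings instead: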

public static boolean containsIgnoreCase(String a, String b) {
    if (a == null || b == null) {
        return false;
    }
    return a.toUpperCase().contains(b.toUpperCase());
}
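
As a rough sketch of the same idea, the variant below passes Locale.ROOT explicitly. That is my addition, not something the question requires: without it, toUpperCase() uses the JVM's default locale, and under a Turkish locale "i" uppercases to "İ" rather than "I". The class name CaseInsensitiveContains is hypothetical, and the strings are written as Unicode escapes only so that the source stays ASCII.

```java
import java.util.Locale;

final class CaseInsensitiveContains {

    // Same approach as the method above, with an explicit locale so the
    // result does not depend on the JVM's default locale (an assumption
    // about what you want; drop Locale.ROOT for locale-sensitive matching).
    static boolean containsIgnoreCase(String a, String b) {
        if (a == null || b == null) {
            return false;
        }
        // Uppercasing applies the full one-to-many mappings, e.g. sharp s
        // becomes "SS" and the fl ligature becomes "FL".
        return a.toUpperCase(Locale.ROOT).contains(b.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Greek capitals vs. small letters with a non-final sigma
        System.out.println(containsIgnoreCase(
                "\u0399\u03A7\u0398\u03A5\u03A3", "\u03B9\u03C7\u03B8\u03C5\u03C3"));
        // sharp s vs. SS
        System.out.println(containsIgnoreCase("wei\u00DF", "WEISS"));
        System.out.println(containsIgnoreCase("tsch\u00FC\u00DF", "TSCH\u00DCSS"));
        // small alpha with varia and ypogegrammeni vs. capital alpha with
        // varia followed by a combining ypogegrammeni
        System.out.println(containsIgnoreCase("\u1FB2", "\u1FBA\u0345"));
        // fl ligature vs. plain capitals
        System.out.println(containsIgnoreCase("\uFB02our and water", "FLOUR AND WATER"));
    }
}
```

With the pairs from the question, each call should return true. Note that this is still not full Unicode case folding, and it does no normalization, so a precomposed character and its decomposed equivalent (a base letter followed by a combining accent) can still fail to match.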

Answer