I have a string and want to remove every non-alphabetic character (such as 0..9!@#$%^&*()_., …), keeping only the alphabetic characters.
After some searching and testing, I ended up with two regexes:
    String str = "123hello!#$% مرحبا. ok";
    str = str.replaceAll("[^a-zA-Z]", "");
    str = str.replaceAll("\\P{InArabic}+", "");
    System.out.println(str);
This should return “hello مرحبا ok”.
But of course this returns an empty string: the first regex removes every non-Latin character (including the Arabic ones), and the second then removes every non-Arabic character, which is everything that is left.
My question is: how can I merge these two regexes into one that keeps only Arabic and English characters?
Answer
Use a lowercase p, since the negation is already handled by ^. No quantifier is needed (though it wouldn’t hurt), because replaceAll replaces every match:
    String str = "123hello!#$% مرحبا. ok";
    str = str.replaceAll("[^a-zA-Z \\p{InArabic}]", "");
    System.out.println(str);
Prints:
hello مرحبا ok
Note: based on your expected result you want the spaces kept, so a space is included in the character class.
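For reference, here is a minimal, self-contained sketch that puts the two-pass attempt from the question next to the merged character class. The class name LatinArabicFilter and the helper keepLatinAndArabic are hypothetical names chosen for illustration. Precompiling the Pattern is just one reasonable way to package it, not something the answer requires.

```java
import java.util.regex.Pattern;

class LatinArabicFilter {
    // Matches anything that is NOT an ASCII letter, a space, or a
    // character from the Unicode Arabic block (U+0600..U+06FF).
    private static final Pattern UNWANTED =
            Pattern.compile("[^a-zA-Z \\p{InArabic}]");

    // Hypothetical helper: strips every unwanted character in one pass.
    static String keepLatinAndArabic(String input) {
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "123hello!#$% مرحبا. ok";

        // Two-pass approach from the question: the first pass already deletes
        // the Arabic letters, and the second deletes the remaining Latin ones.
        String twoPass = str.replaceAll("[^a-zA-Z]", "")
                            .replaceAll("\\P{InArabic}+", "");
        System.out.println(twoPass.isEmpty()); // true

        // Single merged class: Latin letters, spaces, and Arabic survive.
        System.out.println(keepLatinAndArabic(str)); // hello مرحبا ok
    }
}
```

Keep in mind that the Arabic block is not letters only: it also contains Arabic-Indic digits and punctuation such as the Arabic comma, so those would be kept as well.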