Java Regular Expression split keeping contractions

When using split(), what regular expression would keep all word characters and also preserve contractions such as don’t and won’t? It should keep any apostrophe that has word characters on both sides, but remove leading or trailing apostrophes, as in ’tis or dogs’.

I have:

JavaScript

but it keeps the leading and trailing punctuation.

The input 'Tis the season, for the children's happiness'.

should produce the output: Tis the season for the children's happiness

Any advice?

Answer

I would split on:

  • either an apostrophe followed by at least one non-word character: ['-]\W+,
  • or any other non-word character, followed by any further non-word characters: [^\w'-]\W*.

    JavaScript

Here I treat the hyphen (-) the same way as the apostrophe.

Result:

JavaScript
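A minimal sketch of the split described in the two bullets above, assuming Java's String.split and joining the two alternatives with |. The class name ContractionSplit and the method tokens are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ContractionSplit {
    // Two alternatives, as described above:
    //   ['-]\W+      an apostrophe or hyphen followed by at least one non-word character
    //   [^\w'-]\W*   any other non-word character, plus any non-word characters after it
    static final String DELIMITER = "['-]\\W+|[^\\w'-]\\W*";

    // Hypothetical helper: splits the text on the delimiter pattern.
    static List<String> tokens(String text) {
        return Arrays.asList(text.split(DELIMITER));
    }

    public static void main(String[] args) {
        String input = "'Tis the season, for the children's happiness'.";
        System.out.println(tokens(input));
        // prints ['Tis, the, season, for, the, children's, happiness]
    }
}
```

The contraction children's survives because its apostrophe is followed by a word character. The apostrophe at the very start of the input is not followed by a non-word character either, so this pattern alone leaves it attached to Tis.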

Adding the beginning and end of the input:

JavaScript

Result:

JavaScript

which yields an empty string at the beginning.
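One possible way to cover both ends of the input and drop the empty leading string is sketched below. The alternatives ^['-]+ and ['-]+$ and the filtering step are assumptions made for illustration, not necessarily the exact pattern used above. The class name TrimmedContractionSplit is likewise hypothetical. In Java 8 and later, String.split keeps an empty leading substring when a non-empty match occurs at the start of the input, so the sketch filters empty strings afterwards.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TrimmedContractionSplit {
    // Assumed pattern: the two original alternatives, preceded by
    //   ^['-]+   apostrophes or hyphens at the very start of the input
    //   ['-]+$   apostrophes or hyphens at the very end of the input
    static final String DELIMITER = "^['-]+|['-]+$|['-]\\W+|[^\\w'-]\\W*";

    // Hypothetical helper: splits, then discards any empty strings,
    // such as the one produced by a delimiter match at position 0.
    static List<String> tokens(String text) {
        return Arrays.stream(text.split(DELIMITER))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "'Tis the season, for the children's happiness'.";
        System.out.println(String.join(" ", tokens(input)));
        // prints Tis the season for the children's happiness
    }
}
```

Filtering after the split keeps the pattern simple. An alternative would be to strip leading delimiters from the input before calling split.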

User contributions licensed under: CC BY-SA