When using split(), what regular expression would keep all word characters and also preserve contractions such as don’t and won’t? An apostrophe with word characters on both sides should be kept, but leading or trailing apostrophes, as in ’tis or dogs’, should be removed.
I tried:
String[] words = line.split("[^\\w'+]+[\\w+('*?)\\w+]");
but it keeps the leading and trailing punctuation.
For example, the input
'Tis the season, for the children's happiness'
should produce an output of:
Tis the season for the children's happiness
I would split on either:
- an apostrophe followed by at least one non-word character, or
- any other non-word character, followed by any further non-word characters.
// requires: import java.util.Arrays;
String line = "'Tis the season, for the children's happiness'";
String[] words = line.split("(['-]\\W+|[^\\w'-]\\W*)");
System.out.println(Arrays.toString(words));
Here I added the hyphen (-) next to the apostrophe, so that it is treated the same way. This prints:
['Tis, the, season, for, the, children's, happiness']
Adding alternatives for an apostrophe or hyphen at the beginning and at the end of the line:
String[] words = line.split("(^['-]|['-]$|['-]\\W+|[^\\w'-]\\W*)");
[, Tis, the, season, for, the, children's, happiness]
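which yields an empty string as the first element, because the delimiter matched at the very start of the line has nothing in front of it. The matching empty string at the end does not appear, since split() discards trailing empty strings.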
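One possible way to avoid the empty first element is to strip non-word characters from both ends of the line before splitting, so that no delimiter can match at position 0. The following is only a sketch under that assumption: the class name ApostropheSplit and the helper splitWords are hypothetical names introduced here, and the split pattern is the first one from above.

```java
import java.util.Arrays;

class ApostropheSplit {
    // Hypothetical helper (not part of the original answer).
    static String[] splitWords(String line) {
        // Remove every non-word character at the start and end of the line,
        // so split() never finds a delimiter at index 0.
        String trimmed = line.replaceAll("^\\W+|\\W+$", "");
        if (trimmed.isEmpty()) {
            // "".split(...) would return a single empty string; return no words instead.
            return new String[0];
        }
        // Same delimiters as above: apostrophe/hyphen followed by non-word characters,
        // or any other non-word character followed by further non-word characters.
        return trimmed.split("['-]\\W+|[^\\w'-]\\W*");
    }

    public static void main(String[] args) {
        String line = "'Tis the season, for the children's happiness'";
        System.out.println(Arrays.toString(splitWords(line)));
        // prints [Tis, the, season, for, the, children's, happiness]
    }
}
```

With this approach the anchored alternatives ^['-] and ['-]$ are no longer needed, because the edges are cleaned before split() runs. Note that it strips all leading and trailing punctuation, not only apostrophes and hyphens.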