I know that regex is a pretty hot topic and that there's a plethora of similar questions; however, I have not found one that matches my needs.
I need to check that my string is formatted as follows:
- Every line must start with 5 digits.
- Characters 6 to 12 must be white space.
- Character 13 must be either white space or asterisk.
- If there is any period, colon or semicolon before the final period, that character must not be preceded by white space, but it must be followed by white space.
- Opening parentheses cannot be followed by white space.
- Closing parentheses cannot be preceded by white space.
I haven’t tried to implement the colon, semicolon or parentheses, but so far I’m stuck at just the period. These characters are optional so I can’t make a hard check, and I’m trying to catch them but I’m still getting a match in a case like
00000      *TEST .FINAL STATEMENT.    //Matches, but it shouldn't match.
00001      *TEST2 . FINAL STATEMENT.  //Matches, but it shouldn't match.
00002      *TEST3. FINAL STATEMENT.   //Matches, **should** match.
This is the regex I have so far:
^\d{5}\s{6}[\s*][^.]*([^.\s]+\.\s)?[^.]*\..*$
I really don’t see how this is happening, especially because I’m using [^.] to indicate I’ll accept anything except a period as a wildcard, and the optional pattern looks correct at a glance: If there’s a period, it should not have white space behind it and it should have white space after it.
Answer
Try this:
^\d{5}\s{6}[\s*]         # Your original pattern
(?:                      # Repeat 0 or more times:
  [^.:;()]*|             # Unconstrained characters
  (?<!\s)[.:;](?=\s)|    # Punctuation after non-space, followed by space
  \((?!\s)|              # Opening parentheses not followed by space
  (?<!\s)\)              # Closing parentheses not preceded by space
)*
\.$                      # Period, then end of string
https://regex101.com/r/WwpssV/1
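If you want to try this outside regex101, here is a minimal sketch in Python. Python is only an assumption here, since the question doesn't name a language. The names `LINE_RE` and `is_well_formed` are made up for illustration, and the sample lines are rebuilt with six spaces after the digits, which is what the pattern expects.

```python
import re

# Verbose form of the pattern above. re.VERBOSE ignores the layout
# whitespace and the trailing comments.
LINE_RE = re.compile(r"""
    ^\d{5}\s{6}[\s*]          # five digits, six whitespace, then whitespace or asterisk
    (?:
        [^.:;()]*             # unconstrained characters
      | (?<!\s)[.:;](?=\s)    # punctuation: no whitespace before, whitespace after
      | \((?!\s)              # opening parenthesis not followed by whitespace
      | (?<!\s)\)             # closing parenthesis not preceded by whitespace
    )*
    \.$                       # final period, then end of string
""", re.VERBOSE)


def is_well_formed(line: str) -> bool:
    """Return True if the line satisfies the formatting rules."""
    return LINE_RE.match(line) is not None


# The three lines from the question, without the trailing // remarks.
samples = [
    "00000" + " " * 6 + "*TEST .FINAL STATEMENT.",
    "00001" + " " * 6 + "*TEST2 . FINAL STATEMENT.",
    "00002" + " " * 6 + "*TEST3. FINAL STATEMENT.",
]
results = [is_well_formed(s) for s in samples]
print(results)  # [False, False, True]
```

Only the third line passes, because it is the only one where the inner period has a non-space character before it and a space after it.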
In the last part of the pattern, the characters with special requirements are .:;(), so use a negated character set to match anything but those characters: [^.:;()]*
Then alternate with:
if there is any period, colon or semicolon before the final period, the character must not be preceded by a white space, but it must be followed by a white space.
Fulfilled by (?<!\s)[.:;](?=\s), which matches one of those characters only if it is not preceded by a space and is followed by a space.
opening parentheses cannot be followed by a white space.
Fulfilled by \((?!\s)
closing parentheses cannot be preceded by a white space.
Fulfilled by (?<!\s)\)
Then just alternate between those 4 possibilities at the end of the pattern.
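To make the "four possibilities" concrete, here is a rough sketch that assembles the pattern from its parts and runs it against a few colon, semicolon and parenthesis cases. It assumes Python. The test lines and the names `PREFIX`, `ALTERNATIVES`, `PATTERN` and `check` are invented for illustration and are not from the question.

```python
import re

# Fixed prefix: five digits, six whitespace, then whitespace or asterisk.
PREFIX = r"^\d{5}\s{6}[\s*]"

# The four possibilities described above, in the same order.
ALTERNATIVES = [
    r"[^.:;()]*",           # anything without special requirements
    r"(?<!\s)[.:;](?=\s)",  # punctuation: no whitespace before, whitespace after
    r"\((?!\s)",            # "(" not followed by whitespace
    r"(?<!\s)\)",           # ")" not preceded by whitespace
]

# Alternate between the four, repeat, then require the final period.
PATTERN = re.compile(PREFIX + "(?:" + "|".join(ALTERNATIVES) + r")*\.$")


def check(body: str) -> bool:
    """Prepend a valid prefix to a hypothetical statement body and test it."""
    line = "00010" + " " * 6 + "*" + body
    return PATTERN.match(line) is not None


print(check("MOVE (A) TO B."))       # True
print(check("MOVE ( A) TO B."))      # False: space after "("
print(check("MOVE (A ) TO B."))      # False: space before ")"
print(check("LABEL: MOVE A TO B."))  # True
print(check("LABEL :MOVE A TO B."))  # False: space before ":", none after
print(check("A; B."))                # True
```

Because the first alternative can match the same run of text in several ways, a failing line makes the engine retry many splits before giving up. That is fine for short lines like these, but it is worth keeping in mind for very long input.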