My program can accept data that uses \n, \r\n or \r as newline characters (e.g. Unix, PC or Mac styles).
What is the best way to construct a regular expression that will match whatever the encoding is?
Alternatively, I could use universal_newline support on input, but now I’m interested to see what the regex would be.
Answers:
Method 1
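The regex I use when I want to be precise is "\r\n?|\n". It matches \r\n, a lone \r, or a lone \n, each as exactly one line ending.
When I’m not concerned about consistency or empty lines, I use "[\r\n]+", which treats any run of \r and \n characters as a single break. I imagine it makes my programs somewhere in the order of 0.2% faster.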
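As a rough illustration of the difference between the two patterns, here is a minimal Python sketch. The names PRECISE, LOOSE and sample are made up for this example and are not from the original answer.

```python
import re

# Precise: \r\n, a lone \r, or a lone \n each count as exactly one line ending.
PRECISE = re.compile(r"\r\n?|\n")
# Loose: any run of \r and \n characters is treated as a single break,
# so empty lines disappear.
LOOSE = re.compile(r"[\r\n]+")

# Hypothetical input mixing Unix, PC and old Mac endings, plus one empty line.
sample = "a\nb\r\nc\rd\n\ne"

precise_lines = PRECISE.split(sample)
loose_lines = LOOSE.split(sample)

print(precise_lines)
print(loose_lines)
```

With this input, the precise pattern keeps the empty line between "d" and "e" as an empty string, while the loose pattern collapses it.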
Method 2
The pattern can be simplified to \r?\n for a little performance gain, as you probably don’t have to deal with the old Mac style (OS 9 has been unsupported since February 2002). Note that this version no longer matches a lone \r.
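A small Python sketch of what the simplified pattern does and does not match. The names SIMPLE and text are illustrative only.

```python
import re

# Matches \n or \r\n; a lone \r (old Mac style) is NOT treated as a line ending.
SIMPLE = re.compile(r"\r?\n")

# Hypothetical input: Unix ending, PC ending, then an old Mac ending.
text = "one\ntwo\r\nthree\rfour"

simple_lines = SIMPLE.split(text)
print(simple_lines)
```

Here the last two pieces stay joined by the \r, so this pattern is only a safe choice if old Mac input really cannot occur.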
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.