Python regex: matching a pattern over multiple lines. Why isn’t this working?

I know that for parsing I should ideally remove all spaces and line breaks, but I was just doing this as a quick fix for something I was trying, and I can’t figure out why it’s not working. I have wrapped different areas of text in my document with wrappers like “####1” and am trying to parse based on these, but it’s not working no matter what I try. I think I am using multiline correctly. Any advice is appreciated.

This returns no results at all:

string='
####1
ttteest
####1
ttttteeeestt

####2   

ttest
####2'

import re
pattern = '.*?####(.*?)####'
returnmatch = re.compile(pattern, re.MULTILINE).findall(string)
return returnmatch

Answers:


Method 1

MULTILINE doesn’t mean that . will match a newline; it means that ^ and $ also match at the start and end of each line, not only at the start and end of the whole string.

re.M
re.MULTILINE

When specified, the pattern character ‘^’ matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character ‘$’ matches at the end of the string and at the end of each line (immediately preceding each newline). By default, ‘^’ matches only at the beginning of the string, and ‘$’ only at the end of the string and immediately before the newline (if any) at the end of the string.

re.S (or re.DOTALL) is the flag that makes . match newlines as well.

Source

http://docs.python.org/
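
To make the difference between the two flags concrete, here is a minimal sketch. The sample text and variable names are made up for illustration and are not from the question.

```python
import re

text = "first line\nsecond line"

# Without DOTALL, "." stops at the newline, so the pattern cannot span lines.
no_flag = re.findall(r"first.*second", text)

# MULTILINE only changes where ^ and $ may match; "." still stops at newlines.
multiline = re.findall(r"first.*second", text, re.MULTILINE)

# DOTALL lets "." match the newline, so the pattern can span both lines.
dotall = re.findall(r"first.*second", text, re.DOTALL)

# With MULTILINE, ^ anchors at the start of every line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)

print(no_flag)      # []
print(multiline)    # []
print(dotall)       # ['first line\nsecond']
print(line_starts)  # ['first', 'second']
```

The two flags are independent and can be combined with | if both behaviours are needed.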

Method 2

Try re.findall(r"####(.*?)\s(.*?)\s####", string, re.DOTALL) (works with re.compile too, of course).

This regexp will return tuples containing the number of the section and the section content.

For your example, this will return [('1', 'ttteest'), ('2', ' \n\nttest')].

(BTW: your example won’t run as written; for multiline strings, use ''' or """.)
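
Putting this method together, a runnable sketch might look like the following. It assumes the sample text has no trailing spaces after the markers (the exact whitespace in the original post may differ, which changes the raw captures), and the final strip() step is an addition for tidiness, not part of the answer above.

```python
import re

# Triple quotes allow a string literal to span several lines.
# Assumption: no trailing spaces after the "####" markers.
text = """
####1
ttteest
####1
ttttteeeestt

####2

ttest
####2"""

# DOTALL lets "." cross newlines; the lazy groups capture the section
# number and the text up to the whitespace before the next "####".
sections = re.findall(r"####(.*?)\s(.*?)\s####", text, re.DOTALL)

# Strip leftover whitespace from each captured body (an added convenience).
cleaned = [(number, body.strip()) for number, body in sections]

print(sections)  # [('1', 'ttteest'), ('2', '\nttest')]
print(cleaned)   # [('1', 'ttteest'), ('2', 'ttest')]
```

Text that sits between a closing marker and the next opening marker ("ttttteeeestt" here) is skipped, because findall resumes scanning after the end of each match.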


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
