How can I access captured groups if I call findall(r'regex(with)capturing.goes.here')?
I know I can do it through finditer, but I don’t want to iterate.
Answers:
Method 1
findall just returns the captured groups:
>>> re.findall('abc(de)fg(123)', 'abcdefg123 and again abcdefg123')
[('de', '123'), ('de', '123')]
Relevant doc excerpt:
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
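To make the rule in the excerpt concrete, here is a minimal sketch comparing a pattern with no groups, one group, and two groups against the same input. The variable names are illustrative only, and the patterns are variations on the one above rather than anything from the original answer.

```python
import re

text = 'abcdefg123 and again abcdefg123'

# No groups: each item is the full matched string.
no_groups = re.findall(r'abcdefg\d+', text)

# One group: each item is that group's string, not the full match.
one_group = re.findall(r'abcdefg(\d+)', text)

# Two groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'abc(de)fg(\d+)', text)

print(no_groups)   # ['abcdefg123', 'abcdefg123']
print(one_group)   # ['123', '123']
print(two_groups)  # [('de', '123'), ('de', '123')]
```

So the shape of the result depends only on how many capturing groups the pattern has.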
Method 2
Use groups freely. The matches will be returned as a list of group-tuples:
>>> re.findall('(1(23))45', '12345')
[('123', '23')]
If you want the full match to be included, just enclose the entire regex in a group:
>>> re.findall('(1(23)45)', '12345')
[('12345', '23')]
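Because every capturing group adds an entry to the result, a group that is only needed for alternation or repetition may be better written as a non-capturing group, (?:...). The sketch below uses made-up input to show the difference; it is an illustration, not part of the original answer.

```python
import re

sample = 'ab12 cd345'

# Capturing alternation: the prefix shows up in every result tuple.
with_capture = re.findall(r'(ab|cd)(\d+)', sample)

# Non-capturing alternation: only the digits group is returned.
without_capture = re.findall(r'(?:ab|cd)(\d+)', sample)

print(with_capture)     # [('ab', '12'), ('cd', '345')]
print(without_capture)  # ['12', '345']
```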
Method 3
import re
string = 'Perotto, Pier Giorgio'
names = re.findall(r'''
(?P<first>[-\w ]+),\s #first name
(?P<last> [-\w ]+) #last name
''', string, re.X|re.M)
print(names)
returns
[('Perotto', 'Pier Giorgio')]
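re.M is not required here: it only changes how ^ and $ match, so it would matter only if the pattern used those anchors on a multiline string. VERBOSE mode (equal to re.X) is required for the regex I’ve written, because the pattern is spread over several lines and contains whitespace and # comments that must be ignored.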
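Note that findall returns plain tuples even when the groups are named, so the names first and last do not appear in the result. If the names are wanted, one option is a list comprehension over finditer with groupdict(). This is a sketch that reuses the pattern above without re.M; it does still iterate, just in a single expression.

```python
import re

string = 'Perotto, Pier Giorgio'
pattern = re.compile(r'''
    (?P<first>[-\w ]+),\s   # text before the comma
    (?P<last>[-\w ]+)       # text after the comma
''', re.X)

# findall returns plain tuples; the group names are not part of the result.
as_tuples = pattern.findall(string)

# finditer yields match objects, whose groupdict() keeps the names.
as_dicts = [m.groupdict() for m in pattern.finditer(string)]

print(as_tuples)  # [('Perotto', 'Pier Giorgio')]
print(as_dicts)   # [{'first': 'Perotto', 'last': 'Pier Giorgio'}]
```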
Method 4
With a single group, findall returns a plain list of strings, which you can index or loop over:
>>> import re
>>> r = re.compile(r"'(\d+)'")
>>> result = r.findall("'1', '2', '345'")
>>> result
['1', '2', '345']
>>> result[0]
'1'
>>> for item in result:
... print(item)
...
1
2
345
>>>
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.