How to replace only part of the match with python re.sub

I need to match two cases with a single regular expression and do a replacement:

‘long.file.name.jpg’ -> ‘long.file.name_suff.jpg’

‘long.file.name_a.jpg’ -> ‘long.file.name_suff.jpg’

I’m trying to do the following

re.sub(r'(_a)?\.[^.]*$', '_suff.', "long.file.name.jpg")

But this cuts off the extension ‘.jpg’ and I’m getting

long.file.name_suff. instead of long.file.name_suff.jpg
I understand that this is because of the [^.]*$ part, but I can’t leave it out, because
I have to find the last occurrence of ‘_a’ to replace, or the last ‘.’

Is there a way to replace only part of the match?

Answers:

Method 1

Put a capture group around the part that you want to preserve, and then include a reference to that capture group within your replacement text.

re.sub(r'(_a)?\.([^.]*)$', r'_suff.\2', "long.file.name.jpg")
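As a quick check, here is a minimal sketch that runs the same pattern against both filenames from the question. The variable names (`pattern`, `names`, `results`) are only illustrative.

```python
import re

# Group 1 is the optional "_a"; group 2 is the extension without the dot.
pattern = r'(_a)?\.([^.]*)$'

names = ["long.file.name.jpg", "long.file.name_a.jpg"]

# \2 puts the captured extension back after the new suffix.
results = [re.sub(pattern, r'_suff.\2', name) for name in names]

for name, result in zip(names, results):
    print(name, '->', result)
```

Both inputs should come out as long.file.name_suff.jpg, because the extension is matched but then re-inserted through the backreference.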

Method 2

re.sub(r'(?:_a)?\.([^.]*)$', r'_suff.\1', "long.file.name.jpg")

?: starts a non-capturing group, so (?:_a) matches the _a but does not number it as a group; the question mark that follows makes it optional.

So in English, this says: match the final .<anything>, whether or not it is preceded by the pattern _a
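To see what "not enumerating" means in practice, the sketch below compares the group count of the capturing and non-capturing variants. The names `capturing` and `non_capturing` are just labels for this illustration.

```python
import re

# Same pattern twice: once with a capturing (_a), once with a non-capturing (?:_a).
capturing = re.compile(r'(_a)?\.([^.]*)$')
non_capturing = re.compile(r'(?:_a)?\.([^.]*)$')

# Pattern.groups reports how many numbered groups the pattern defines.
print(capturing.groups)      # 2
print(non_capturing.groups)  # 1

# With the non-capturing variant the extension becomes group 1.
m = non_capturing.search("long.file.name_a.jpg")
print(m.group(0))  # _a.jpg  (the whole match, which re.sub would replace)
print(m.group(1))  # jpg     (the part referenced as \1 in the replacement)
```

This is why the replacement in this method refers to \1 where Method 1 refers to \2.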

Another way to do this would be to use a lookbehind. I mention it because lookbehinds are super useful, yet I didn’t know about them for 15 years of writing REs.

Method 3

Just put the expression for the extension into a group, capture it and reference the match in the replacement:

re.sub(r'(?:_a)?(\.[^.]*)$', r'_suff\1', "long.file.name.jpg")

Additionally, using the non-capturing group (?:…) prevents re from storing too much unneeded information.
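If the suffix varies, the same pattern can be wrapped in a small function. The following is only a sketch: `add_suffix` and its `suffix` parameter are hypothetical names, and it passes a callable as the replacement so that the suffix text is inserted literally instead of being parsed for backslash escapes.

```python
import re

def add_suffix(filename, suffix="_suff"):
    """Hypothetical helper: replace an optional "_a" before the extension with `suffix`."""
    # Group 1 captures the dot plus the extension, e.g. ".jpg".
    # The callable builds the replacement from the suffix and the captured extension.
    return re.sub(r'(?:_a)?(\.[^.]*)$', lambda m: suffix + m.group(1), filename)

print(add_suffix("long.file.name.jpg"))
print(add_suffix("long.file.name_a.jpg"))
print(add_suffix("README"))  # no dot, so nothing matches and the name is unchanged
```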

Method 4

You can do it by excluding parts of the match from the replacement. In other words, you can tell the regex module: “match this pattern, but replace only a piece of it”.

re.sub(r'(?<=long.file.name)(_a)?(?=\.([^.]*)$)', r'_suff', "long.file.name.jpg")
>>> 'long.file.name_suff.jpg'

The long.file.name and .jpg parts are used for matching, but they are excluded from the replacement.
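The lookbehind above hard-codes the file name. A variant that relies on the lookahead alone might look like the sketch below; `swap_suffix` is a hypothetical helper, not part of the original answer. It passes count=1 as a precaution, because the pattern can also match the empty string just before the extension, and on recent Python versions an empty match right after the "_a" match would otherwise be replaced a second time.

```python
import re

def swap_suffix(filename):
    """Hypothetical helper: insert "_suff" before the extension, replacing a trailing "_a"."""
    # (?:_a)?          optionally consume "_a"
    # (?=\.[^.]*$)     require that the final ".ext" follows, without consuming it
    # count=1          stop after the first (leftmost) match
    return re.sub(r'(?:_a)?(?=\.[^.]*$)', '_suff', filename, count=1)

print(swap_suffix("long.file.name.jpg"))
print(swap_suffix("long.file.name_a.jpg"))
```

Because the extension is only looked at and never consumed, no backreference is needed in the replacement.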

Method 5

I wanted to use capture groups to replace a specific part of a string to help me parse it later. Consider the example below:

s= '<td> <address> 110 SOLANA ROAD, SUITE 102<br>PONTE VEDRA BEACH, FL32082 </address> </td>'

re.sub(r'(<address>\s.*?)(<br>)(.*?</address>)', r'\1 -- \3', s)
##'<td> <address> 110 SOLANA ROAD, SUITE 102 -- PONTE VEDRA BEACH, FL32082 </address> </td>'
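The same replacement can also be written with named groups, which some find easier to read than numbered backreferences. This is a sketch only; the group names `before` and `after` are made up for the illustration.

```python
import re

s = '<td> <address> 110 SOLANA ROAD, SUITE 102<br>PONTE VEDRA BEACH, FL32082 </address> </td>'

# (?P<name>...) defines a named group; \g<name> refers to it in the replacement.
# The <br> itself is matched but not captured, so it is dropped.
pattern = r'(?P<before><address>\s.*?)<br>(?P<after>.*?</address>)'

result = re.sub(pattern, r'\g<before> -- \g<after>', s)
print(result)
```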

Method 6

print(re.sub('name(_a)?','name_suff','long.file.name_a.jpg'))
# long.file.name_suff.jpg

print(re.sub('name(_a)?','name_suff','long.file.name.jpg'))
# long.file.name_suff.jpg


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
