Let's say I have:
a = r''' Example
This is a very annoying string
that takes up multiple lines
and h@s a// kind{s} of stupid symbols in it
ok String'''
I need a way to replace (or just delete) any text between “This” and “ok” so that when I call it, a now equals:
a = "Example String"
I can’t find any wildcards that seem to work. Any help is much appreciated.
Answers:
Method 1
You need a regular expression:
>>> import re
>>> re.sub('\nThis.*?ok', '', a, flags=re.DOTALL)
' Example String'
Method 2
Another method is to use string splits:
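def replaceTextBetween(originalText, delimiterA, delimiterB, replacementText):
    # Text before the first occurrence of delimiterA
    leadingText = originalText.split(delimiterA)[0]
    # Text between the first and second occurrence of delimiterB (or to the end)
    trailingText = originalText.split(delimiterB)[1]
    # The delimiters themselves are kept; only the text between them is replaced
    return leadingText + delimiterA + replacementText + delimiterB + trailingText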
Limitations:
- Does not check if the delimiters exist
- Assumes that there are no duplicate delimiters
- Assumes that delimiters are in correct order
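The first and third limitations can be handled with str.partition, which splits around the first occurrence only and returns an empty separator when the delimiter is missing. The following is a minimal sketch, not part of the original answer; the name replace_text_between and the choice to raise ValueError are assumptions made for illustration.

```python
def replace_text_between(original_text, delimiter_a, delimiter_b, replacement_text):
    """Replace the text between the first delimiter_a and the next delimiter_b."""
    # partition() returns (before, separator, after); separator is '' if not found.
    leading, sep_a, rest = original_text.partition(delimiter_a)
    if not sep_a:
        raise ValueError(f"start delimiter {delimiter_a!r} not found")
    # Search for the end delimiter only in the text after the start delimiter.
    _, sep_b, trailing = rest.partition(delimiter_b)
    if not sep_b:
        raise ValueError(f"end delimiter {delimiter_b!r} not found after start")
    return leading + delimiter_a + replacement_text + delimiter_b + trailing


a = r''' Example
This is a very annoying string
that takes up multiple lines
and h@s a// kind{s} of stupid symbols in it
ok String'''

result = replace_text_between(a, "This", "ok", "")
print(repr(result))  # ' Example\nThisok String'
```

Like Method 2, this keeps both delimiters in the result and only replaces what sits between them.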
Method 3
The DOTALL flag is the key. Ordinarily, the ‘.’ character doesn’t match newlines, so you don’t match across lines in a string. If you set the DOTALL flag, re will match ‘.*’ across as many lines as it needs to.
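A small sketch of that difference, using a made-up two-line string (the text and variable names are only for illustration):

```python
import re

text = "start one\ntwo end"

# Without DOTALL, '.' stops at the newline, so the pattern cannot reach "end".
without_flag = re.search(r"start.*end", text)
# With DOTALL, '.' also matches '\n', so the match spans both lines.
with_flag = re.search(r"start.*end", text, flags=re.DOTALL)

print(without_flag)               # None
print(repr(with_flag.group(0)))   # 'start one\ntwo end'
```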
Method 4
a = re.sub('This.*ok', '', a, flags=re.DOTALL)
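Note that this pattern uses the greedy '.*', while Method 1 uses the lazy '.*?'. The two behave differently when the end delimiter occurs more than once: the greedy form runs to the last "ok", the lazy form stops at the first. A short sketch with a made-up string (not from the question) shows the difference:

```python
import re

text = "keep This drop ok middle ok tail"

# Lazy: stops at the first "ok" after "This".
lazy = re.sub(r"This.*?ok", "", text, flags=re.DOTALL)
# Greedy: runs to the last "ok" in the string.
greedy = re.sub(r"This.*ok", "", text, flags=re.DOTALL)

print(repr(lazy))    # 'keep  middle ok tail'
print(repr(greedy))  # 'keep  tail'
```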
Method 5
Use re.sub: it replaces two delimiters (characters, symbols, or strings) and the text between them with the desired character, symbol, or string.
format: re.sub('A(.*?)B', P, Q, flags=re.DOTALL)
where
- A : character, symbol, or string that starts the span
- B : character, symbol, or string that ends the span
- P : character, symbol, or string that replaces A, B, and the text between them
- Q : input string
- re.DOTALL : makes '.' match newlines, so the match can span lines
import re
re.sub('\nThis(.*?)ok', '', a, flags=re.DOTALL)
output : ' Example String'
Let's see an example with HTML code as input
input_string = '''<body> <h1>Heading</h1> <p>Paragraph</p><b>bold text</b></body>'''
Target : remove <p> tag
re.sub('<p>(.*?)</p>', '', input_string, flags=re.DOTALL)
output : '<body> <h1>Heading</h1> <b>bold text</b></body>'
Target : replace <p> tag with word : test
re.sub('<p>(.*?)</p>', 'test', input_string, flags=re.DOTALL)
output : '<body> <h1>Heading</h1> test<b>bold text</b></body>'
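When the delimiters are symbols such as '{', '(' or '?', they have a special meaning inside a regular expression and need escaping. The following sketch wraps the pattern in a helper that escapes both delimiters with re.escape; the name remove_between and its parameters are hypothetical, not part of the original answer.

```python
import re


def remove_between(text, start, end, replacement=""):
    """Replace start, end, and everything between them with replacement."""
    # re.escape makes characters such as '{', '(' or '?' match literally.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    # A function as the replacement keeps backslashes in `replacement` literal.
    return re.sub(pattern, lambda _match: replacement, text, flags=re.DOTALL)


print(repr(remove_between("total {debug\ninfo} done", "{", "}")))  # 'total  done'
print(repr(remove_between("a (x) b (y) c", "(", ")", "_")))        # 'a _ b _ c'
```

Because the pattern is lazy, every start/end pair is replaced separately, as the second call shows.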
Method 6
If you want first and last words:
re.sub(r'^\s*(\w+).*?(\w+)$', r'\1 \2', a, flags=re.DOTALL)
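To see what the two groups capture, here is a short sketch run against the string from the question. It assumes the pattern is written with its backslashes (\s, \w, \1, \2); the variable names are only for illustration.

```python
import re

a = r''' Example
This is a very annoying string
that takes up multiple lines
and h@s a// kind{s} of stupid symbols in it
ok String'''

pattern = r"^\s*(\w+).*?(\w+)$"

# Group 1 is the first word after leading whitespace,
# group 2 is the last run of word characters before the end of the string.
match = re.match(pattern, a, flags=re.DOTALL)
first_word, last_word = match.group(1), match.group(2)
print(first_word, last_word)  # Example String

# \1 and \2 in the replacement refer back to those two groups.
result = re.sub(pattern, r"\1 \2", a, flags=re.DOTALL)
print(repr(result))  # 'Example String'
```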
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.