Remove text between () and []

I have a very long string of text with () and [] in it. I’m trying to remove the characters between the parentheses and brackets but I cannot figure out how.

The list is similar to this:

x = "This is a sentence. (once a day) [twice a day]"

This list isn’t what I’m working with but is very similar and a lot shorter.

Contents hide

Answers:

Method 1

Explanation

Method 2

Method 3

Method 4

Method 5

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

You can use re.sub function.

>>> import re 
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub("([([]).*?([)]])", "g<1>g<2>", x)
'This is a sentence. () []'

If you want to remove the [] and the () you can use this code:

>>> import re 
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub("[([].*?[)]]", "", x)
'This is a sentence.  '

Important: This code will not work with nested symbols

Explanation

The first regex groups ( or [ into group 1 (by surrounding it with parentheses) and ) or ] into group 2, matching these groups and all characters that come in between them. After matching, the matched portion is substituted with groups 1 and 2, leaving the final string with nothing inside the brackets. The second regex is self explanatory from this -> match everything and substitute with the empty string.

— modified from comment by Ajay Thomas

Method 2

Run this script, it works even with nested brackets.
Uses basic logical tests.

def a(test_str):
    ret = ''
    skip1c = 0
    skip2c = 0
    for i in test_str:
        if i == '[':
            skip1c += 1
        elif i == '(':
            skip2c += 1
        elif i == ']' and skip1c > 0:
            skip1c -= 1
        elif i == ')'and skip2c > 0:
            skip2c -= 1
        elif skip1c == 0 and skip2c == 0:
            ret += i
    return ret

x = "ewq[a [(b] ([c))]] This is a sentence. (once a day) [twice a day]"
x = a(x)
print x
print repr(x)

Just incase you don’t run it,
Here’s the output:

>>> 
ewq This is a sentence.  
'ewq This is a sentence.  '

Method 3

Here’s a solution similar to @pradyunsg’s answer (it works with arbitrary nested brackets):

def remove_text_inside_brackets(text, brackets="()[]"):
    count = [0] * (len(brackets) // 2) # count open/close brackets
    saved_chars = []
    for character in text:
        for i, b in enumerate(brackets):
            if character == b: # found bracket
                kind, is_close = divmod(i, 2)
                count[kind] += (-1)**is_close # `+1`: open, `-1`: close
                if count[kind] < 0: # unbalanced bracket
                    count[kind] = 0  # keep it
                else:  # found bracket to remove
                    break
        else: # character is not a [balanced] bracket
            if not any(count): # outside brackets
                saved_chars.append(character)
    return ''.join(saved_chars)

print(repr(remove_text_inside_brackets(
    "This is a sentence. (once a day) [twice a day]")))
# -> 'This is a sentence.  '

Method 4

This should work for parentheses. Regular expressions will “consume” the text it has matched so it won’t work for nested parentheses.

import re
regex = re.compile(".*?((.*?))")
result = re.findall(regex, mystring)

or this would find one set of parentheses, simply loop to find more:

start = mystring.find("(")
end = mystring.find(")")
if start != -1 and end != -1:
  result = mystring[start+1:end]

Method 5

You can split, filter, and join the string again. If your brackets are well defined the following code should do.

import re
x = "".join(re.split("(|)|[|]", x)[::2])

All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes

Article Rating