How can I delete all text between nested curly brackets in a multiline text file?

This question follows on from "How can I delete all text between curly brackets in a multiline text file?", which asks the same thing but without the nesting requirement.

Example:

This is {
{the multiline
text} file }
that wants
{ to {be
changed}
} anyway.

Should become:

This is 
that wants
 anyway.

Is it possible to do this with some sort of one-line bash command (awk, sed, perl, grep, cut, tr… etc)?

Answers:


Method 1

$ sed ':again;$!N;$!b again; :b; s/{[^{}]*}//g; t b' file3
This is 
that wants
 anyway.

Explanation:

  • :again;$!N;$!b again

    This reads in the whole file.

    :again is a label. N appends the next line of input to the pattern space, and $!N does so only if we are not already on the last line. $!b again branches back to the again label unless this is the last line, so the loop ends with the whole file in the pattern space.

  • :b

    This defines a label b.

  • s/{[^{}]*}//g

    This removes text in braces as long as the text contains no inner braces.

  • t b

    If the above substitute command resulted in a change, jump back to label b. In this way, the substitute command is repeated until all brace-groups are removed.
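
The loop described above can be tried on the sample text from the question. This is a minimal sketch, not the answer's exact invocation: strip_braces is a hypothetical helper name, and the script is split across -e options on the assumption that some sed implementations do not accept ; directly after a label.

```shell
# strip_braces is a hypothetical helper name, not part of the original answer.
# It runs the same sed script, with each command passed as its own -e option
# so that every label ends at the end of its argument.
strip_braces() {
  sed -e ':again' -e '$!N' -e '$!b again' \
      -e ':b' -e 's/{[^{}]*}//g' -e 't b'
}

# Sample input from the question, fed on standard input.
printf '%s\n' 'This is {' '{the multiline' 'text} file }' 'that wants' \
  '{ to {be' 'changed}' '} anyway.' | strip_braces
```

On the sample, the first pass of the substitution removes the two innermost groups, the second pass removes the groups that enclosed them, and the third pass changes nothing, so t b falls through and the result is printed.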

Method 2

A Perl approach:

$ perl -F"" -a00ne 'for (@F){$i++ if /{/; $i||print; $i-- if /}/}' file
This is 
that wants
 anyway.

Explanation

  • -a : turns on automatic splitting of each input record into the @F array, using the field delimiter given by -F.
  • -F"" : sets the field separator to the empty pattern, so each element of @F is a single input character.
  • -00 : turns on “paragraph mode”, where records are separated by one or more blank lines instead of single newlines. The sample file contains no blank lines, so the entire file is read as a single record. If your file can have many paragraphs and the brackets can span multiple paragraphs, use -0777 instead, which reads the whole file at once.
  • -ne : read the input record by record and apply the script given by -e to each record, printing nothing unless the script does so.

The script itself is actually quite simple. A counter is incremented by one every time a { is seen and decremented by one for every }. This means that when the counter is 0, we are not inside brackets and should print:

  • for (@F){} : do this for each element of @F, each character in the line.
  • $i++ if /{/; : increment $i by one if this character is a {
  • $i||print; : print the character only when $i is false, that is, 0 or not yet set.
  • $i-- if /}/ : decrement $i by one if this character is a }


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, or CC BY-SA 4.0.
