This question follows on from "How can I delete all text between curly brackets in a multiline text file?" (the same problem, but without the requirement to handle nesting).
Example:
This is {
{the multiline
text} file }
that wants
{ to {be
changed}
} anyway.
Should become:
This is that wants anyway.
Is it possible to do this with some sort of one-line bash command (awk, sed, perl, grep, cut, tr… etc)?
Answers:
Method 1
$ sed ':again;$!N;$!b again; :b; s/{[^{}]*}//g; t b' file3
This is
that wants
anyway.
Explanation:
- :again;$!N;$!b again
  This reads in the whole file. :again is a label. N reads in the next line, and $!N reads in the next line on the condition that we are not already at the last line. $!b again branches back to the again label on the condition that this is not the last line.
- :b
  This defines a label b.
- s/{[^{}]*}//g
  This removes text in braces as long as the text contains no inner braces.
- t b
  If the above substitute command resulted in a change, jump back to label b. In this way, the substitute command is repeated until all brace-groups are removed.
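To see why the t b loop matters, here is a small sketch that compares a single substitution pass with the looped version. The function name strip_braces_sed is made up for illustration, and the command is split into separate -e expressions (an assumption made for portability, so that it does not depend on GNU sed's handling of labels followed by semicolons); the logic is meant to match the one-liner above.

```shell
# Hypothetical helper: reads stdin, writes the text with all
# (possibly nested) brace groups removed to stdout.
strip_braces_sed() {
    sed -e ':again' -e '$!N' -e '$!b again' \
        -e ':b' -e 's/{[^{}]*}//g' -e 't b'
}

# A single pass without the loop only removes the innermost group:
printf 'a {b {c} d} e\n' | sed 's/{[^{}]*}//g'
# prints: a {b  d} e

# With the loop, the outer group (now free of inner braces)
# is removed on the next pass:
printf 'a {b {c} d} e\n' | strip_braces_sed
# prints: a  e
```

Each pass peels off one level of nesting, so the loop runs about as many times as the deepest nesting level in the input.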
Method 2
A Perl approach:
$ perl -F"" -a00ne 'for (@F){$i++ if /{/; $i||print; $i-- if /}/}' file
This is
that wants
anyway.
Explanation:
- -a: turns on automatic splitting of each input record, on the field delimiter given by -F, into the @F array.
- -F"": sets the input field separator to the empty string, which results in each element of @F being one of the input characters.
- -00: turns on "paragraph mode", where a "line" ends at two or more consecutive newline characters (that is, at a blank line) rather than at a single newline. The example file has no blank lines, so the entire file will be treated as a single line. If your file can have many paragraphs and the brackets can span multiple paragraphs, use -0777 instead.
- -ne: read an input file and apply the script given by -e to each line.
The script itself is actually quite simple. A counter is incremented by one every time a { is seen and decremented by one for every }. This means that when the counter is 0, we are not inside brackets and should print:
- for (@F){}: do this for each element of @F, each character in the line.
- $i++ if /{/;: increment $i by one if this character is a {
- $i||print;: print unless $i is set (0 counts as unset).
- $i-- if /}/: decrement $i by one if this character is a }
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.