What does :: do in a sed script?

I would like to know what it really does in the following scripts used with the sed command.

sed -e 's:<F0_M>:<o,f0,male>:' \
    -e 's:<F0_F>:<o,f0,female>:' \
    -e 's:([0-9])::g' \
    -e 's:<sil>::g' \
    -e 's:([^ ]*)$::' |

The first and second scripts look like they transform text of the form <F0_F> into <o,f0,female>. But what about the last three, which involve '::', 'g' and the '$' sign? Most documentation uses '/' in its scripts, but here ':' is used instead of slashes. Can someone explain the last three scripts?

Answers:

Method 1

The standard delimiter used in sed commands is /, as in commands like this:

sed -e s/foo/bar/g < input > output

However, if the s command is followed by a different character, that becomes the delimiter for that particular expression.

The use of non-/ delimiters is common when the delimiter itself needs to appear in the command, and so would require careful attention to escaping. For example, a / delimiter is annoying to deal with in scripts that deal with Unix paths.
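As a minimal sketch of why an alternative delimiter helps with paths, compare the two equivalent substitutions below. The path and the variable names are made up for illustration only.

```shell
# Hypothetical path, used only for illustration.
path='/usr/local/bin/tool'

# With / as the delimiter, every slash in the pattern and the
# replacement has to be escaped with a backslash.
with_slash=$(printf '%s\n' "$path" | sed -e 's/\/usr\/local/\/opt/')

# With : as the delimiter, the slashes are ordinary characters.
with_colon=$(printf '%s\n' "$path" | sed -e 's:/usr/local:/opt:')

# Both commands produce the same result.
printf '%s\n%s\n' "$with_slash" "$with_colon"
```

Both variables end up holding /opt/bin/tool; the second form is simply easier to read.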

That doesn’t appear to be the case here, so I assume the author of that command just prefers : as a delimiter in sed commands.

Your command has five expressions:

s:<F0_M>:<o,f0,male>:

This replaces the first instance of <F0_M> on each line of the input with <o,f0,male> in the output. If there is more than one match in the input on that line, the subsequent ones will be left alone.
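A quick way to check the "first instance only" behaviour is to feed sed a line that contains the tag twice. The input line here is invented for the demonstration.

```shell
# Hypothetical input line containing the tag twice.
line='<F0_M> hello <F0_M>'

# Without a trailing g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's:<F0_M>:<o,f0,male>:')

printf '%s\n' "$first_only"
```

The second <F0_M> on the line is left untouched.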

The single quotes just prevent the shell from interpreting any of the characters in the expression. They’re all passed literally to the sed command.

s:<F0_F>:<o,f0,female>:

Similar to the above case, only apparently for the other gender.

s:([0-9])::g

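Removes all single digits in parentheses from the input lines. By default sed uses basic regular expressions, in which unescaped parentheses match literal ( and ) characters.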

Unlike the previous two expressions, this expression affects all instances on each line because of the trailing g, which means “global”.

Note that it only works on single digits. It won’t do anything to (42), for example.
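The following sketch, run on a made-up input line, shows both points: the effect of the trailing g and the fact that multi-digit numbers are not matched. It assumes sed's default basic regular expressions (no -E or -r option).

```shell
# Hypothetical input line with two single-digit markers and one two-digit one.
line='one (1) two (2) many (42)'

# With g: every parenthesized single digit on the line is removed.
all_removed=$(printf '%s\n' "$line" | sed -e 's:([0-9])::g')

# Without g: only the first parenthesized single digit is removed.
first_removed=$(printf '%s\n' "$line" | sed -e 's:([0-9])::')

printf '%s\n%s\n' "$all_removed" "$first_removed"
```

In both results (42) survives, and the spaces that surrounded each removed marker remain, so the output contains doubled spaces.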

s:<sil>::g

Removes all <sil> instances from each line of the input when writing to the output.

s:([^ ]*)$::

Removes a parenthesized run of non-space characters at the end of the line, such as (abc). It also removes an empty pair of parentheses at the end of a line. A parenthesized group that contains a space is left alone.

There are whole books on these topics, sed and regular expressions. A single answer really is not the right place to learn the whole topic.

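The last expression above is actually a bit tricky in that regard: $ pins a regular expression (or regex for short) to the end of a line, and ^ normally pins it to the beginning, but the ^ in that expression means something different. As the first character inside square brackets, it negates the set, so [^ ] matches any single character that is not a space.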
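To see how the last expression behaves, here is a small sketch that runs it on a few made-up lines. The function name strip_trailing_tag is hypothetical and exists only to avoid repeating the sed command.

```shell
# Hypothetical helper wrapping the last expression from the question.
strip_trailing_tag() {
    sed -e 's:([^ ]*)$::'
}

# A parenthesized word at the end of the line is removed.
word=$(printf '%s\n' 'hello world (noise)' | strip_trailing_tag)

# An empty pair of parentheses at the end of the line is removed too.
empty=$(printf '%s\n' 'hello ()' | strip_trailing_tag)

# A space inside the parentheses prevents a match: [^ ] excludes spaces.
spaced=$(printf '%s\n' 'hello (two words)' | strip_trailing_tag)

# The $ anchor means a group that is not at the end of the line is kept.
leading=$(printf '%s\n' '(1) hello' | strip_trailing_tag)

printf '[%s]\n' "$word" "$empty" "$spaced" "$leading"
```

The brackets in the final printf only make the trailing space visible: the space that preceded the removed group is not part of the match, so it stays on the line.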

I recommend that you read Mastering Regular Expressions by Jeffrey Friedl.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
