Replacing pattern after nth match is found on each line?

I have a file containing lines:

india;austria;japan;chile
china;US;nigeria;mexico;russia

I want to replace every occurrence of the semicolon on each line with e.g. ;NEW;, but only starting from the 2nd occurrence. The result should look like this:

india;austria;NEW;japan;NEW;chile
china;US;NEW;nigeria;NEW;mexico;NEW;russia

I tried this with gsub, but it replaces all the occurrences:
awk '/;/{gsub(/;/,";NEW;") }{print}'

Answers:


Method 1

The awk solution is longer, but it is easier to generalize:

awk -F';' '{ for (i = 1; i < NF; i++) printf "%s;%s", $i, (i >= 2 ? "NEW;" : ""); print $NF }' replacefile
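To illustrate what "generic" could mean here, the starting occurrence can be passed in as an awk variable. This is only a minimal sketch: the function name replace_from_nth and the variable n are made up for illustration and are not part of the original answer.

```shell
# replace_from_nth N: read lines on stdin and insert NEW after every
# semicolon from the Nth one onward (hypothetical helper name).
replace_from_nth() {
    awk -F';' -v n="$1" '{
        for (i = 1; i < NF; i++)
            printf "%s;%s", $i, (i >= n ? "NEW;" : "")
        print $NF
    }'
}

printf '%s\n' 'india;austria;japan;chile' | replace_from_nth 2
# prints: india;austria;NEW;japan;NEW;chile
```

Calling it with 3 instead of 2 would leave the first two semicolons untouched.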

It is possible to do it with sed too, by looping with the t command and repeatedly replacing the 2nd (or whichever you want) separator with a temporary marker (usually \n):

sed ':b;s/;/\n/2;tb;s/\n/;NEW;/g' replacefile
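The same loop, spread over several lines with comments, may be easier to follow. This sketch assumes GNU sed, where \n in the replacement text stands for a newline; the function name mark_and_replace is just an illustrative wrapper.

```shell
# Assumes GNU sed. A newline can never occur inside a line sed has just
# read, which is what makes it a safe temporary marker.
mark_and_replace() {
    sed '
        :b
        # Turn the 2nd remaining semicolon into a newline marker
        s/;/\n/2
        # If that substitution succeeded, loop and try again
        tb
        # Only the 1st semicolon is left now, so expand every marker
        s/\n/;NEW;/g
    '
}

printf '%s\n' 'china;US;nigeria;mexico;russia' | mark_and_replace
# prints: china;US;NEW;nigeria;NEW;mexico;NEW;russia
```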

Method 2

There’s a flag for GNU sed’s s/// command that does this:

sed 's/;/;NEW;/2g' <<END
india;austria;japan;chile
china;US;nigeria;mexico;russia
END

outputs

india;austria;NEW;japan;NEW;chile
china;US;NEW;nigeria;NEW;mexico;NEW;russia

See https://www.gnu.org/software/sed/manual/sed.html#The-_0022s_0022-Command

The s command can be followed by zero or more of the following flags:

g

Apply the replacement to all matches to the regexp, not just the first.

number

Only replace the numberth match of the regexp.
Note: the POSIX standard does not specify what should happen when you mix the g and number modifiers, and currently there is no widely agreed upon meaning across sed implementations. For GNU sed, the interaction is defined to be: ignore matches before the numberth, and then match and replace all matches from the numberth on.

(emphasis mine)
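Since the combined 2g flag is not portable, one cautious option is to probe the local sed first and fall back to awk when the probe fails. This is a rough sketch, not something from the quoted manual; replace_from_second is a hypothetical name, and the fallback is the awk loop from Method 1.

```shell
# Probe whether this sed treats "2g" the GNU way, otherwise use awk.
replace_from_second() {
    if [ "$(printf 'a;b;c\n' | sed 's/;/X/2g' 2>/dev/null)" = 'a;bXc' ]; then
        sed 's/;/;NEW;/2g'
    else
        awk -F';' '{
            for (i = 1; i < NF; i++)
                printf "%s;%s", $i, (i >= 2 ? "NEW;" : "")
            print $NF
        }'
    fi
}

printf '%s\n' 'india;austria;japan;chile' | replace_from_second
# prints: india;austria;NEW;japan;NEW;chile
```

Both branches read standard input, so the function can be dropped into a pipeline either way.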

Method 3

I’d do this in two steps:

First, replace all semicolons with ;NEW;:

sed -e 's/;/;NEW;/g'

Then replace the first ;NEW; with a semicolon:

sed -e 's/;NEW;/;/'

You can use a pipe to do both replacements on one command line. Here’s an example:

$ more replacefile 
india;austria;japan;chile;
china;US;nigeria;mexico;russia
$ sed -e 's/;/;NEW;/g' replacefile | sed -e 's/;NEW;/;/'
india;austria;NEW;japan;NEW;chile;NEW;
china;US;NEW;nigeria;NEW;mexico;NEW;russia
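The two steps can probably also be combined into a single sed process by giving both expressions to one invocation, since sed applies them in order to each line. A small sketch (two_step_replace is a made-up wrapper name):

```shell
# Step 1 expands every semicolon, step 2 undoes the first expansion.
two_step_replace() {
    sed -e 's/;/;NEW;/g' -e 's/;NEW;/;/'
}

printf '%s\n' 'india;austria;japan;chile;' | two_step_replace
# prints: india;austria;NEW;japan;NEW;chile;NEW;
```

This uses only the plain g flag and an unflagged substitution, so it does not rely on the GNU-specific number-plus-g behaviour.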

Method 4

I can do it with more code but no loops!

Data

china;US;nigeria;mexico;russia
iindia;austria;japan;chile

Script

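BEGIN { FS = ";" }
{
    # param (set with -v) is the number of the field after which
    # replacement starts.
    insert = $param
    # Position just past that field. index() finds the first occurrence of
    # the field text, so this assumes the text does not appear earlier.
    ix = index($0, insert) + length(insert)

    if (NF >= param) {
        rest = substr($0, ix)              # from the param-th semicolon on
        gsub(";", ";NEW;", rest)
        line = substr($0, 1, ix - 1) rest

        gsub(";$", "", line)               # drop a trailing semicolon, if any
        print line
    } else {
        print
    }
}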

Example

 Microknoppix v # awk -f replaceNth.awk -v param=2 countries
 china;US;NEW;nigeria;NEW;mexico;NEW;russia
 iindia;austria;NEW;japan;NEW;chile
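For the specific case in the question (start at the 2nd semicolon), a loop-free variant could split each line at its first semicolon instead of searching for the field text. This is a sketch under that assumption only, it does not generalize to an arbitrary starting occurrence, and split_at_first is a hypothetical name.

```shell
# Keep everything up to the 1st semicolon, rewrite the semicolons after it.
split_at_first() {
    awk '{
        p = index($0, ";")          # position of the 1st semicolon (0 if none)
        head = substr($0, 1, p)     # text up to and including it
        rest = substr($0, p + 1)    # everything after it
        gsub(/;/, ";NEW;", rest)
        print head rest
    }'
}

printf '%s\n' 'china;US;nigeria;mexico;russia' | split_at_first
# prints: china;US;NEW;nigeria;NEW;mexico;NEW;russia
```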


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
