Comparing the first column of two files and printing the entire row of the second file if the first columns match

I have two files in these formats:

file1:

air
smell
hand
dude
road
cat

file2:

air,4,21,01,13,3,2
smell,21,4,2,5,6
dude,1,31,42,1
road,1,4,2,1,4
cat,1,5,6,3,1
hand,1,4,2,1,6
mouse,1,3,5,6,2

What I want to do is print the entire row of file2 if the first field of that row is found in file1, and I want to keep the order of file1.

Expected output:

air,4,21,01,13,3,2
smell,21,4,2,5,6
hand,1,4,2,1,6
dude,1,31,42,1
road,1,4,2,1,4
cat,1,5,6,3,1

Answers:


Method 1

This should do it:

awk -F, 'FNR==NR {a[$1]; next}; $1 in a' file1 file2

edit:

I used the wrong file for the ordering. A new attempt (requires gawk, if that is acceptable):

gawk -F, '
    BEGIN {PROCINFO["sorted_in"] = "@ind_num_asc"}  # walk a[] in line-number order
    FNR==NR {a[NR]=$1; next}
    {b[$1]=$0}
    END {for (i in a) if (a[i] in b) print b[a[i]]}
' file1 file2

edit 2:

With normal awk, and swapping the files around:

awk -F, 'FNR==NR {a[$1]=$0; next}; $1 in a {print a[$1]}' file2 file1
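
To make the logic of the last command easier to follow, here is a minimal sketch that runs it against the sample data from the question. The scratch directory and the array name rows are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so the sample files do not overwrite anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
air
smell
hand
dude
road
cat
EOF

cat > file2 <<'EOF'
air,4,21,01,13,3,2
smell,21,4,2,5,6
dude,1,31,42,1
road,1,4,2,1,4
cat,1,5,6,3,1
hand,1,4,2,1,6
mouse,1,3,5,6,2
EOF

# file2 is named first, so FNR == NR is true only while reading it:
# every full row is stored under its first field.
# file1 is read second: each word that exists as a key prints its stored row,
# so the output follows the order of file1.
result=$(awk -F, '
    FNR == NR { rows[$1] = $0; next }
    $1 in rows { print rows[$1] }
' file2 file1)

printf '%s\n' "$result"
```

Because the lookup table is built from file2, a key that appears more than once in file2 keeps only its last row; the sample data has no such duplicates.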

Method 2

join -t, -1 2 -2 1 <(nl -s, -ba -nrz file1 | sort -t, -k2,2) \
    <(sort -t, -k1,1 file2) | sort -t, -k2,2 | cut -d, -f1,3-

The lines in file1 are numbered and the result is sorted on the 2nd field (the word). This is then joined with file2 (sorted on its 1st field):

air,000001,4,21,01,13,3,2
cat,000006,1,5,6,3,1
dude,000004,1,31,42,1
hand,000003,1,4,2,1,6
road,000005,1,4,2,1,4
smell,000002,21,4,2,5,6

The result is then sorted on the 2nd field (i.e. the line numbers) to restore the original line order, and that 2nd field is then removed with cut:

air,4,21,01,13,3,2
smell,21,4,2,5,6
hand,1,4,2,1,6
dude,1,31,42,1
road,1,4,2,1,4
cat,1,5,6,3,1

Method 3

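A simple grep command also selects the matching rows, but it prints them in the order of file2, not file1, and it matches each word anywhere in the line rather than only in the first column: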

grep -f file1 file2
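
To limit the match to the first column, one option is to anchor each word before handing it to grep. The following is a minimal sketch, assuming the words in file1 contain no regular-expression metacharacters; the patterns file and the scratch directory are illustrative names, not part of the original answer. The output still follows the order of file2.

```shell
# Work in a scratch directory so the sample files do not overwrite anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
air
smell
hand
dude
road
cat
EOF

cat > file2 <<'EOF'
air,4,21,01,13,3,2
smell,21,4,2,5,6
dude,1,31,42,1
road,1,4,2,1,4
cat,1,5,6,3,1
hand,1,4,2,1,6
mouse,1,3,5,6,2
EOF

# Turn every word into a pattern such as "^air," so that it can only match
# a complete first field, not a longer word or a later column.
sed 's/.*/^&,/' file1 > patterns

# grep prints matching rows in the order they appear in file2.
anchored=$(grep -f patterns file2)

printf '%s\n' "$anchored"
```

Without the anchoring, a word such as air would also select a row whose first field merely starts with it, or a row that contains it in a later column.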


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
